We have now (January 2013) switched to Maven as described below. This page is only kept as an archive.
LanguageTool is planning to switch its build process to Maven after version 2.0 (released at the end of 2012). Versions 1.9 and 2.0 are already available as Maven artifacts, but only as one large module.
For those who don't know it, Maven (http://maven.apache.org/) is a so-called build automation tool: it compiles and packages your software and manages its dependencies. You only specify which other open source libraries you need, and Maven downloads them for you. The main reason LanguageTool (LT) should use Maven is not that it makes managing our own dependencies easier, but that building with Maven and being hosted on the central Maven repository makes it very easy for Java developers to use LT in their applications.
Switching to Maven is a chance for LT to come up with a modular setup. Our stand-alone ZIP is 50 MB in size, which is too much for developers who only want to add, say, English grammar and style checking to their software.
This is a possible plan for how to switch to Maven after LT 2.0.
LanguageTool should be split up into the following modules:
- languagetool-standalone: our stand-alone GUI, the command line program, and development tools. Depends on languagetool-core and languagetool-gui-commons.
- languagetool-office-extension: the OpenOffice/LibreOffice integration. Builds the *.oxt. Depends on languagetool-core and languagetool-gui-commons.
- languagetool-gui-commons: GUI classes used in both the stand-alone GUI and the OOo/LO extension.
- languagetool-en, -de, -fr etc: one module per language that includes the <Language>.java class (e.g. German.java), the language-specific XML rules, and the language-specific Java rules. Depends on languagetool-core.
- languagetool-core: Everything else, e.g. XML rule engine, interfaces. Depends on no other LT modules.
Having one module per language means having a lot of Maven modules (30 or so), but developers can then, for example, specify languagetool-en as a dependency and get everything they need to add English style and grammar checking to their application.
To build all of LT in one go, we should have one "LanguageTool" Maven module that includes all the other LT modules and contains no code itself (a so-called multi-module project).
What is affected?
- Every Maven module will get its own directory in SVN, i.e. almost all files will need to move in SVN.
- The Maven language modules (languagetool-en, …) depend on languagetool-core, but languagetool-core needs to initialize the language-specific classes. To avoid a circular dependency, this has to happen at runtime: languagetool-core needs to scan the classpath for classes in org.languagetool.language which extend the Language class.
- Everything we publish should be built with Maven, i.e. all Ant targets for running the unit tests and doing the build will be removed.
- The build process will also produce artifacts for Maven Central, which have to be uploaded. This means more work for the developer doing the release. (It doesn't have to happen at the same time as the end-user release, although that would be nice.)
- Jenkins will need to be re-configured (should be easy)
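The runtime lookup mentioned in the list above could look roughly like the following. This is only a minimal sketch, not LT's actual implementation: the class LanguageDiscoverySketch, the simplified Language base class, and the English class are hypothetical stand-ins, and the classpath scan itself is left out. The candidate class names are passed in as a list, and the code only shows the step of loading each class by name and keeping those that extend Language, so the core module needs no compile-time dependency on any language module.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: all names here are hypothetical stand-ins, not LT's real classes.
class LanguageDiscoverySketch {

    /** Simplified stand-in for the Language base class in languagetool-core. */
    public abstract static class Language {
        public abstract String getShortName();
    }

    /** Stand-in for a class shipped by a language module such as languagetool-en. */
    public static class English extends Language {
        @Override
        public String getShortName() {
            return "en";
        }
    }

    /**
     * Loads each candidate class by name and returns an instance of every
     * concrete class that extends Language. Classes that cannot be found are
     * skipped, because their module is simply not on the classpath.
     */
    public static List<Language> loadLanguages(List<String> candidateClassNames) {
        List<Language> languages = new ArrayList<>();
        for (String className : candidateClassNames) {
            try {
                Class<?> clazz = Class.forName(className);
                boolean isConcreteLanguage = Language.class.isAssignableFrom(clazz)
                        && !Modifier.isAbstract(clazz.getModifiers());
                if (isConcreteLanguage) {
                    languages.add((Language) clazz.getDeclaredConstructor().newInstance());
                }
            } catch (ClassNotFoundException e) {
                // The language module is not installed: ignore this candidate.
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not instantiate " + className, e);
            }
        }
        return languages;
    }

    public static void main(String[] args) {
        // In a real setup the names would come from scanning the classpath.
        List<String> candidates = Arrays.asList(
                English.class.getName(), "java.lang.String", "org.example.NoSuchLanguage");
        for (Language language : loadLanguages(candidates)) {
            System.out.println(language.getShortName());
        }
    }
}
```

In this example only the English stand-in is kept: java.lang.String does not extend Language, and org.example.NoSuchLanguage is not on the classpath. Java's java.util.ServiceLoader would be an alternative to scanning, but it is a different technique from the one proposed here.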
What is not clear yet?
- Should we use fewer modules, i.e. put languagetool-standalone, languagetool-office-extension, and languagetool-gui-commons into one module? This would then be one module which produces several artifacts (we advise against this).
- How to build the JNLP files (Java Web Start) with Maven?
- How to build the OXT with Maven?
Long-term: having even more modules
Once we have a clean Maven structure we can move more things to their own modules. This makes sense for parts that are useful to developers who do not need all of LT. Examples:
- The German morphology, which is not just a dictionary lookup but also splits compound words before doing the lookup.
- The FSA-based spell checking, which has no dependencies on Hunspell and is thus pure Java.