LanguageTool Standalone Changelog

New in LanguageTool Standalone 2.9 (Mar 30, 2015)

Catalan:
updated POS tag dictionary
added new rules
fixed false alarms
English:
Added a few rules and fixed a few false alarms
Added many new style rules contributed by Heikki Lehvaslaiho. As these may cause false alarms, they are not activated by default. You can activate them by turning on all rules in the new 'Plain English' category.
Esperanto:
added a few new rules
French:
updated POS tag dictionary and Hunspell dictionary to Dicollecte-5.3
German:
added a few new rules and fixed false alarms
Added a new rule that checks for subject-verb agreement. For now, only cases with 'ist', 'sind', 'war', and 'waren' are supported. Examples of errors that are detected: 'Der Hund sind schön.', 'Die Autos ist schnell.'
To make this rule work, phrases are now unified in disambiguation.xml: for example, 'Mann' in the phrase 'ein Mann' will only retain its nominative reading (SUB:NOM:SIN:MAS), whereas it used to have also accusative and dative readings (SUB:AKK:SIN:MAS, SUB:DAT:SIN:MAS).
Italian:
improved a few rules
Polish:
added several new rules
Portuguese:
added/improved several rules
3695 compound words (pre-reform) - the largest free database
Russian:
added and improved rules
Ukrainian:
big dictionary update
new grammar rules
new simple replace rule for soft suggestions
disambiguator improvements
compound tagging and spelling improvements
initials tagging improvements
sentence and word tokenizing improvements
improved handling of stress symbol and soft hyphen
Bitext rules:
added a simple rule for checking whether translations end with the same punctuation mark as the original (this includes only .?! characters).
it is now possible to add external bitext rule files on the command line by using the bitextrule option. The file path has to be absolute. Note: this also allows using bitext rules for languages that have no bitext rules included by default.
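The punctuation check described above can be illustrated with a minimal sketch. This is not LanguageTool's implementation: the class and method names are hypothetical, and the only behaviour taken from the changelog is that the final character of source and translation is compared and that only '.', '?' and '!' are considered.

```java
final class SameEndPunctuation {

    // Returns the final character if it is one of . ? ! and '\0' otherwise.
    static char endMark(String sentence) {
        String s = sentence.trim();
        if (s.isEmpty()) {
            return '\0';
        }
        char last = s.charAt(s.length() - 1);
        return (last == '.' || last == '?' || last == '!') ? last : '\0';
    }

    // True when source and translation end with the same mark (or both with none).
    static boolean sameEnding(String source, String translation) {
        return endMark(source) == endMark(translation);
    }

    public static void main(String[] args) {
        System.out.println(sameEnding("How are you?", "Wie geht es dir?")); // true
        System.out.println(sameEnding("How are you?", "Wie geht es dir.")); // false
    }
}
```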
Spelling:
The new files at /hunspell/spelling.txt can be used to add accepted words to the spell checker that are also considered when creating suggestions for misspelled words.
This is similar to the /hunspell/ignore.txt files, which list accepted words which are *not* used when creating suggestions for misspelled words.
API:
JLanguageTool.activateDefaultPatternRules() and JLanguageTool.activateDefaultFalseFriendRules() have been removed - all pattern rules and false friend rules (if a second language is specified) are now activated automatically when the constructor of JLanguageTool is called. Should you need a checker without the XML-based pattern rules, extend your language class (e.g. 'English') with one that overrides the getPatternRules() method and returns an empty list there.
ManualTagger.lookup() has been replaced by ManualTagger.tag() after being deprecated since the latest release
All static methods and fields from class 'Language' have been moved to the new class 'Languages'. For now, the methods/fields in class Language still exist but have been deprecated.
LanguageIdentifierTools has been removed. Use LanguageIdentifier instead.
Removed (Default)ResourceDataBroker.setResourceDir() and setRulesDir() as these can be set with the constructor
Cleaned up class Contributor, e.g. removing getRemark()
Category.setDefaultOff() has been removed, this can be set via constructor now
Renamed classes:
o.lt.rules.patterns.Element => o.lt.rules.patterns.PatternToken
o.lt.rules.patterns.ElementMatcher => o.lt.rules.patterns.PatternTokenMatcher
Other small API cleanups that shouldn't affect the common use cases, e.g. IncorrectExample.getCorrections() returns an unmodifiable list now, removal of deprecated methods.
Embedded server:
XML escaping has been fixed; the bug could cause invalid XML documents to be returned
new config file option 'maxWorkQueueSize' that lets you set the maximum size of the request queue - if it gets larger than this, requests will be rejected (503 Service unavailable)
The server now responds with more specific HTTP status codes to these error conditions:
413 Request Entity Too Large - if text exceeds maximum text size
503 Service Unavailable - if check exceeds maximum check time
GUI:
The stand-alone GUI can now take a plain text file as an argument; this file will then be loaded on startup (Github issue #232).
Command-line:
It is now possible to add an external rule file when calling LanguageTool from the command line. Use --rulefile to add a file. If the file name has a format that contains a language name, it will be used alongside other rules; otherwise, it will replace the rules. You can also load an external file with false friends by using the option --falsefriends. The file name should be an absolute file path, and false friend files are always added to the ones that are loaded for the language. (Github issue #192)
Rule syntax:
A rule may now have a single example sentence as long as it has a 'correction' attribute - this can save some redundancy if the only correct sentence is the same as the incorrect sentence with the correction applied. Before, a rule needed at least two example sentences.
'example' element: type="incorrect" is now optional if there's a 'correction' attribute. The 'correction' attribute implies that the sentence is incorrect.
'example' element: type="correct" is now optional. No 'type' attribute and no 'correction' attribute implies that the sentence is correct.
Internal:
We have switched from Apache Tika to language-detector for automatically identifying the text language. It should be faster and results should be more reliable.
Detection of Asturian and Galician had to be disabled because the detection quality was too low and also affected detection of Spanish.
Fixed a regression that made it impossible to load external rule files in the GUI.

New in LanguageTool Standalone 2.8 (Dec 30, 2014)

Asturian:
removed dependency on Hunspell, now uses Morfologik for spell checking
Breton:
added and improved a few rules
Catalan:
updated dictionary
added and improved rules
fixed false alarms
Dutch:
added and improved many rules
English:
some new rules
updated the tagger and synthesizer dictionaries, fixing issue #202
new filter to be used for matching the part-of-speech of parts of words
<filter class="org.languagetool.rules.en.EnglishPartialPosTagFilter" args="no:1 regexp:in(.*) postag_regexp:JJ"/>
This will only keep matches for words that start with 'in' and where the part after the 'in' is an adjective (POS tag 'JJ'). The 'no:1' is the token position, i.e. here the first (and only) matching token is referred to.
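To make the effect of the 'regexp' and 'postag_regexp' arguments above more concrete, here is a minimal sketch of the idea. It does not use LanguageTool's classes: the class name, the method and the tiny tag map standing in for a real POS tagger are all hypothetical, and the sketch assumes the regular expression contains one capturing group for the part of the word to be tagged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PartialPosTagSketch {

    // Keeps a match only if the word fits 'regexp' and the captured part
    // has a POS tag (looked up in the toy tagger) that fits 'postagRegexp'.
    static boolean keepMatch(String word, String regexp, String postagRegexp,
                             Map<String, String> toyTagger) {
        Matcher m = Pattern.compile(regexp).matcher(word);
        if (!m.matches()) {
            return false;
        }
        String part = m.group(1); // assumption: the regexp has one capturing group
        String tag = toyTagger.get(part);
        return tag != null && Pattern.matches(postagRegexp, tag);
    }

    public static void main(String[] args) {
        Map<String, String> toyTagger = new HashMap<>(); // made-up mini dictionary
        toyTagger.put("active", "JJ");
        toyTagger.put("form", "NN");
        System.out.println(keepMatch("inactive", "in(.*)", "JJ", toyTagger)); // true
        System.out.println(keepMatch("inform", "in(.*)", "JJ", toyTagger));   // false
    }
}
```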
French:
added and improved a few rules
German:
added and improved a few rules
Polish:
added and improved several rules
added and improved false friends with English
Portuguese:
added/improved several rules
Spanish:
removed dependency on Hunspell, now uses Morfologik for spell checking
Reformatted rules file
Added more rules
Tagalog:
removed dependency on Hunspell, now uses Morfologik for spell checking
the dash character ("-") is a delimiter now when tokenizing the text
Russian:
added and improved rules
added a few false friend rules (Russian/English)
Ukrainian:
many new rules (including agreement with nouns, time expressions etc)
rule coverage improvement
dictionary update (big improvements for proper nouns and vocative case)
new tag and rule to warn about alternative spelling
added word frequency information to improve spelling suggestions
some new disambiguator rules
Rule Syntax:
<short>...</short> can now be added to a rulegroup to affect all the rules of that group.
If you develop your own rules that are not part of LT you can now add external="yes" to your categories to prevent the rule link to community.languagetool.org from appearing in our stand-alone GUI (the link would not work for rules that are not part of the main distribution of LT). (Github issue #223)
If a rule group specifies default="off", the rules inside the rule group may not also specify default="on"/"off".
API:
Removed classes and methods that had been deprecated since 2.7 or longer
Embedded server:
The config file options 'requestLimit' and 'requestLimitPeriodInSeconds' can now also be used for the HTTP server (not just for the HTTPS server)
New config file option 'trustXForwardForHeader': set this to 'true' if you run the server behind a reverse proxy and want the request limit to work on the original IP addresses provided by the 'X-forwarded-for' HTTP header, usually set by the proxy. If you run behind a proxy but don't set this property to true, one user can use all the requests so other users will also get an error message because of the request limit.
Fix response of After the Deadline mode: <description>...</description> was sometimes empty, confusing the text check in WordPress
Bitext rules were not disabled properly, even if they were specified with a proper parameter for the server; now it's fixed
Fixed problem with improper positions for some bitext rules (issue #218)
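The reasoning behind 'trustXForwardForHeader' can be sketched as follows. This is an illustration only, not the server's code: the class and method names are hypothetical, and taking the first entry of a comma-separated header value as the original client is the usual proxy convention, assumed here rather than confirmed by the changelog.

```java
final class RequestLimitKey {

    // Picks the address the request limit should be counted against.
    // Without trusting the header, every request arriving via the proxy
    // is counted against the proxy's own address.
    static String limitKey(String remoteAddr, String forwardedFor, boolean trustHeader) {
        if (trustHeader && forwardedFor != null) {
            // The header may hold a chain "client, proxy1, proxy2".
            String first = forwardedFor.split(",", -1)[0].trim();
            if (!first.isEmpty()) {
                return first;
            }
        }
        return remoteAddr;
    }

    public static void main(String[] args) {
        // 10.0.0.1 plays the reverse proxy, 203.0.113.7 the original client.
        System.out.println(limitKey("10.0.0.1", "203.0.113.7, 10.0.0.1", true));  // 203.0.113.7
        System.out.println(limitKey("10.0.0.1", "203.0.113.7, 10.0.0.1", false)); // 10.0.0.1
    }
}
```

With the option left at false, all users behind the proxy share one counter, which is why one busy user can exhaust the limit for everyone.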
GUI:
A new 'errorColors' setting has been added to the languagetool.cfg configuration file. It can be used to set the background color of errors. For example, errorColors=typographical:#b8b8ff, style:#ffb8b8 will show 'typographical' errors with a blue background and 'style' errors with a red background in the upper part of the LT window. 'typographical' and 'style' are the types that are set in grammar.xml as "type=...". There's no user interface yet to configure these colors. Note that you should only edit the languagetool.cfg file when LT is not running.
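As an illustration of the 'errorColors' value format shown above, the following sketch parses such a value into a map from error type to color. It is not the GUI's own parser; the class name and the decision to skip malformed entries silently are assumptions.

```java
import java.awt.Color;
import java.util.LinkedHashMap;
import java.util.Map;

final class ErrorColorsParser {

    // Parses a value like "typographical:#b8b8ff, style:#ffb8b8".
    static Map<String, Color> parse(String value) {
        Map<String, Color> colors = new LinkedHashMap<>();
        for (String entry : value.split(",")) {
            String[] parts = entry.trim().split(":", 2);
            if (parts.length != 2) {
                continue; // assumption: malformed entries are ignored
            }
            colors.put(parts[0].trim(), Color.decode(parts[1].trim()));
        }
        return colors;
    }

    public static void main(String[] args) {
        Map<String, Color> colors = parse("typographical:#b8b8ff, style:#ffb8b8");
        for (Map.Entry<String, Color> e : colors.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```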
Internal:
Bugfix: rules inside a rule group had not been activated if a previous rule from the same rulegroup used default="off"
Words are not ignored anymore by the spell checker just because they occur in a rule's suggestion. If you want the spell checker to ignore words globally, add them to hunspell/ignore.txt. To ignore them depending on the context, add an 'ignore_spelling' rule to disambiguation.xml.
A file 'hunspell/prohibit.txt' can now be used to mark words as spelling errors even if the spell checker would normally accept them. This is useful to improve the LanguageTool spell checker without waiting for the upstream checker to be updated.
The 'prohibit.txt' file is the opposite of 'ignore.txt', which causes the spell checker to ignore words.
The part-of-speech tagger for most languages can now be extended by adding entries to the file org/languagetool/resource/XX/added.txt (XX being the language code).
The format is "fullform baseform postag", three columns separated by tabs.
This makes it easier for users (and developers) to extend the POS tagger, as they don't need to export, modify, and re-create the binary dictionary for every change.
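The three-column format of added.txt described above can be read with a few lines of code. The sketch below is only meant to show the format; the class name and the sample entry ("houses", "house", "NNS") are made up, and LanguageTool's own loader may handle details such as comments or encoding differently.

```java
final class AddedTxtLine {
    final String fullForm;
    final String baseForm;
    final String posTag;

    private AddedTxtLine(String fullForm, String baseForm, String posTag) {
        this.fullForm = fullForm;
        this.baseForm = baseForm;
        this.posTag = posTag;
    }

    // Parses one line of the form "fullform<TAB>baseform<TAB>postag".
    static AddedTxtLine parse(String line) {
        String[] columns = line.split("\t");
        if (columns.length != 3) {
            throw new IllegalArgumentException(
                "expected 3 tab-separated columns, got " + columns.length + ": " + line);
        }
        return new AddedTxtLine(columns[0].trim(), columns[1].trim(), columns[2].trim());
    }

    public static void main(String[] args) {
        AddedTxtLine entry = parse("houses\thouse\tNNS"); // hypothetical entry
        System.out.println(entry.fullForm + " -> " + entry.baseForm + " [" + entry.posTag + "]");
    }
}
```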

New in LanguageTool Standalone 2.7 (Sep 29, 2014)

Breton:
added and improved rules
New rule that checks if a weekday matches a date, e.g. detects "Gwener 28 a viz Eost 2014", as that date isn't a Friday.
Catalan:
added and improved rules
fixed false alarms
Dutch:
added and improved many rules
switched to Morfologik-based spell checker
English:
Do you want to be part of the team that develops the world's most powerful Open Source proofreading tool? We're looking for a maintainer for the English rules in LanguageTool.
All English dictionaries have been extended to contain word frequency classes to improve the spell checker suggestions (the frequency data is taken from https://github.com/mozilla-b2g/gaia/tree/master/apps/keyboard/js/ime/latin/dictionaries, as for other languages that already use this feature).
Better suggestions for English learners: irregular verbs, nouns, and adjectives now usually have a better suggestion. For example, 'thinked' suggests 'thought', 'womans' suggests 'women'.
More misspellings provide suggestions now, e.g. 'garentee' (guarantee), 'greatful' (grateful). This may cause a performance decrease of ~ 10% (more for texts with a lot of unknown words).
New rule that checks if a weekday matches a date, e.g. detects "Monday, 7 October 2014", as that date isn't a Monday. This rule will only work if it detects the date format in use. So far, these formats are supported:
"Monday, 7 October 2014"
"Monday, 7 Oct 2014"
Esperanto:
New rule that checks if a weekday matches a date, e.g. detects "Vendredon la 28-an de Aŭgusto 2014", as that date isn't a Friday.
French:
updated POS tag dictionary and Hunspell dictionary to Dicollecte-5.2
added a synthesizer - the agreement rule can now make suggestions for some errors
added/improved several rules
New rule that checks if a weekday matches a date, e.g. detects "vendredi 28/08/2014", as that date isn't a Friday.
German:
Fixed a rare NullPointerException and an ArrayIndexOutOfBoundsException
Fixed several false alarms
Added and improved rules
New rule that checks for sentences without a verb (turned off by default due to the risk of false alarms)
New rule that checks if a weekday matches a date, e.g. detects "Dienstag, 29.9.2014", as that date isn't a Tuesday.
Performance improvements for spell check suggestions
Persian:
added initial support for Persian (Farsi)
Polish:
added and improved some rules
new rule that checks if a weekday matches a date
Portuguese:
added/improved several rules
added many dozens of compound words
Russian:
added new rules
fix SourceForge feature request #38 (check for different quotation marks)
added a few false friend rules (Russian/English)
new rule that checks if a weekday matches a date
expanded Russian compound rule with new words from postag dictionary
Spanish:
Added new POS category Z (for spelled numbers, e.g. 'uno', 'dos', ...)
Spelled numbers can now be detected and managed both in disambiguation and rules.
Fixed some incorrect lemmas in POS dictionary.
Added Hybrid chunker-disambiguator.
Tamil:
Added initial support for Tamil. If the font for Tamil is not properly displayed on your computer and you're using Windows, you might need to apply the work around described here:
Ukrainian:
big update for POS dictionary (fixes and new words)
some POS tags renamed for consistency; new tags for abbreviations and rare words
many new rules and fixes for existing rules
new rule that checks if a weekday matches a date
token normalization performance improvement
LibreOffice integration:
Don't get confused by footnotes in LibreOffice 4.3 and later (it now provides us with the footnote positions as meta data, so we can ignore them).
API:
Major performance improvements for the multi-thread use case, where JLanguageTool gets created per thread, but the language object (e.g. 'German') gets created only once. Overhead for creating JLanguageTool should now be much lower.
Removed several classes and methods that had been deprecated since version 2.6
Removed DutchSpellerRule - use MorfologikDutchSpellerRule instead
The signature of Language.getRelevantRules() has changed
The JLanguageTool and MultiThreadedJLanguageTool constructors don't declare to throw an IOException anymore
WhitespaceRule has been renamed to MultipleWhitespaceRule (WhitespaceRule still exists but has been deprecated)
Deprecated some methods whose visibility will be decreased (e.g. from public to protected)
MorfologikSpellerRule.getRuleMatch(String, int) has been renamed to MorfologikSpellerRule.getRuleMatches(String, int)
The RuleMatch constructor now throws an exception if toPosition is not larger than fromPosition
Introduced a new abstract class TextLevelRule that extends Rule and that can be used for rules that cover more than single sentences.
Command line:
Enabling and disabling specific rules at the same time is now allowed.
In order to test only some rules (disabling all the rest), which previously was done with "--enable LIST_OF_RULES", now use "--enabledonly --enable LIST_OF_RULES" or "-eo -e LIST_OF_RULES".
Embedded server:
Two new options can be set in the properties file to make LanguageTool return the same XML format as After the Deadline (AtD). This way it can be used as a drop-in replacement for AtD:
mode - 'LanguageTool' or 'AfterTheDeadline'
afterTheDeadlineLanguage - code of default language if mode is set to 'AfterTheDeadline'
NOTE: the 'AfterTheDeadline' mode should be considered experimental for now.
The new option 'maxCheckThreads' allows setting the maximum number of threads working on requests in parallel. The default is 10, as it used to be.
Internals:
New abstract rule AbstractDateCheckFilter that allows checking whether a weekday and a date match. For example, "Tuesday, September 29, 2014" could be detected, as September 29, 2014 is not actually a Tuesday. This uses the new experimental RuleFilter interface that can be called from XML with the new 'filter' element. 'filter' takes these attributes:
'class': the fully-qualified name of a Java class that implements RuleFilter, e.g. "org.languagetool.rules.de.DateCheckFilter"
'args': a string like "year:\1 month:\2 day:\3 weekDay:\4", i.e. a space-separated list of key/value pairs, where \x gets resolved to the pattern's token value (as in the 'message' element)
The compound rule now ignores tokens that have been immunized in the disambiguation.xml
The "filter" action in the disambiguator is now applied only to POS tags that match the POS tag given. If they don't match, the rule is not applied.
If you're extending the XML rules as described at http://wiki.languagetool.org/tips-and-tricks#toc2, the external rule and disambiguation files can now be hosted on a password-protected server by specifying a URL like this: http://user:[email protected]/path/user-rules.xml
The em dash ("—") is now a tokenizing character for all languages
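The date check filter described earlier in this section combines two steps: resolving the 'args' string against the matched tokens, and comparing the claimed weekday with the real one. The sketch below illustrates both with java.time. It is not the AbstractDateCheckFilter code: the class name, the method names and the token order in the example are assumptions, and only English month and weekday names are handled.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.Month;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class DateCheckSketch {

    // Turns "year:\1 month:\2 ..." into a map, replacing \x by the x-th token (1-based).
    static Map<String, String> resolveArgs(String args, List<String> tokens) {
        Map<String, String> resolved = new HashMap<>();
        for (String pair : args.trim().split("\\s+")) {
            int colon = pair.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("expected key:value, got: " + pair);
            }
            String value = pair.substring(colon + 1);
            if (value.matches("\\\\\\d+")) { // a backslash followed by digits
                value = tokens.get(Integer.parseInt(value.substring(1)) - 1);
            }
            resolved.put(pair.substring(0, colon), value);
        }
        return resolved;
    }

    // True if the claimed weekday is the actual weekday of the given date.
    static boolean weekdayMatches(Map<String, String> a) {
        LocalDate date = LocalDate.of(
            Integer.parseInt(a.get("year")),
            Month.valueOf(a.get("month").toUpperCase(Locale.ENGLISH)),
            Integer.parseInt(a.get("day")));
        DayOfWeek claimed = DayOfWeek.valueOf(a.get("weekDay").toUpperCase(Locale.ENGLISH));
        return date.getDayOfWeek() == claimed;
    }

    public static void main(String[] args) {
        // Hypothetical token order, chosen to fit the args string from the text.
        List<String> tokens = Arrays.asList("2014", "September", "29", "Tuesday");
        Map<String, String> resolved =
            resolveArgs("year:\\1 month:\\2 day:\\3 weekDay:\\4", tokens);
        System.out.println(weekdayMatches(resolved)); // false: that day was a Monday
    }
}
```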
New feature - Use of language models:
LanguageTool can now make use of ngram data. ngram data is information about how often phrases occur in a language. Currently, this uses phrases of length 3.
The data is used by an English rule to find homophone errors, like mixing up coarse/course or flair/flare. LanguageTool had some rules of this kind before, but the new rule now supports about 900 of such word pairs/sets.
The data needed for this is huge (7GB for English) and thus not part of LanguageTool.
The data (English only for now) and more documentation is available at http://wiki.languagetool.org/finding-errors-using-big-data
Using ngrams makes LanguageTool slightly slower when the data is stored on an SSD.
If not stored on an SSD, the performance might drastically decrease.
Use the new --languagemodel option with the command line client to activate the rule that uses the data. That option is not yet available for the stand-alone GUI.
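The idea of using phrase counts to spot homophone errors can be sketched in a few lines. This is a toy model, not the rule shipped with LanguageTool: the class name is hypothetical, the counts in the example are invented, and comparing raw trigram counts with a fixed factor is an assumption made to keep the sketch short.

```java
import java.util.HashMap;
import java.util.Map;

final class TrigramChooser {
    private final Map<String, Long> counts = new HashMap<>();

    void add(String trigram, long count) {
        counts.put(trigram, count);
    }

    long count(String left, String word, String right) {
        Long c = counts.get(left + " " + word + " " + right);
        return c == null ? 0L : c;
    }

    // Returns the alternative if it is clearly more frequent in this
    // context (more than 'factor' times), otherwise the original word.
    String choose(String left, String word, String right, String alternative, int factor) {
        long original = count(left, word, right);
        long other = count(left, alternative, right);
        return other > original * factor ? alternative : word;
    }

    public static void main(String[] args) {
        TrigramChooser chooser = new TrigramChooser();
        chooser.add("of course not", 5000L); // invented counts, for illustration only
        chooser.add("of coarse not", 3L);
        System.out.println(chooser.choose("of", "coarse", "not", "course", 10)); // course
    }
}
```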

New in LanguageTool Standalone 2.6 (Jun 30, 2014)

Breton:
updated FSA spelling dictionary from An Drouizig Breton Spellchecker 0.12
updated POS dictionary from Apertium (svn r53329)
improved several rules
Catalan:
fixed false alarms
added and improved rules
updated tagger dictionary
Morfologik spellchecking rule is enabled for use in LibreOffice/OpenOffice extension. The Hunspell spellchecker should be manually disabled in LibreOffice/OpenOffice for the results to be visible
English:
The spelling rule now accepts words with hyphens if the parts are valid words
For example, "web-based" is accepted now. This avoids a lot of false alarms
fixed a thread-safety problem in the synthesizer
added/improved several rules
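The hyphen handling of the English spelling rule mentioned above can be pictured like this. The sketch uses a plain word set in place of the real dictionary, and the class name and the treatment of empty parts (as in "web-") are assumptions, not LanguageTool's actual behaviour.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class HyphenatedWordCheck {
    private final Set<String> dictionary;

    HyphenatedWordCheck(Set<String> dictionary) {
        this.dictionary = dictionary;
    }

    // Accepts a word if it is known, or if it contains hyphens and
    // every hyphen-separated part is itself a known word.
    boolean isAccepted(String word) {
        if (dictionary.contains(word)) {
            return true;
        }
        if (!word.contains("-")) {
            return false;
        }
        for (String part : word.split("-", -1)) {
            if (part.isEmpty() || !dictionary.contains(part)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> words = new HashSet<>(Arrays.asList("web", "based"));
        HyphenatedWordCheck check = new HyphenatedWordCheck(words);
        System.out.println(check.isAccepted("web-based")); // true
        System.out.println(check.isAccepted("web-basd"));  // false
    }
}
```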
Esperanto:
added several rules
French:
updated POS tag dictionary and Hunspell dictionary to Dicollecte-5.1
added/improved several rules
German:
fixed false alarm for words like "Stil-" in phrases like "Stil- und Grammatikprüfung" (issue #93)
added/improved several rules
detect wrong uppercase spelling in sentences like "Die Blaue Tür"
Greek:
added two new punctuation rules
Japanese:
added some rules
Polish:
added/improved many rules
improved disambiguation
improved spell-checking of foreign words with apostrophes
Portuguese:
added/improved several rules
a total of 3534 compound words
Russian:
added big wordlist to spellcheck dictionary.
Gabinski:
fixed wrong tag in tagger dictionary: ADJ_Comp --> ADJ_Sup, ADJ_S --> ADJ_Com
added/improved several rules
Spanish:
added and improved several rules
Asturian, Italian, Lithuanian, Malayalam, Swedish, and Tagalog have been switched to an SRX-based sentence tokenizer implementation
Wikipedia:
The deprecated options check-dump and wiki-index have been removed from org.languagetool.dev.wikipedia.Main. Please use check-data and index-data instead
API:
Almost all deprecated methods and classes have been removed. If you're upgrading from a version earlier than 2.5, we recommend upgrading to 2.5 first, fixing all deprecation warnings, and then upgrading to 2.6
If you extend the 'Language' class and don't implement the getSentenceTokenizer() method, your language now uses SimpleSentenceTokenizer. This is a very simple tokenizer and you probably want to implement getSentenceTokenizer() to return a LocalSRXSentenceTokenizer instead that's adapted to your needs
Rule.supportsLanguage() now also works for PatternRules (i.e. rules loaded from XML files). It used to always return 'false' for those rules.
The public field Language.DEMO has been removed. It is now only available internally for tests, together with the demo language itself and DemoChunker
JLanguageTool.printIfVerbose() has been changed from protected to private as it was not used anywhere and it's not really useful to extend JLanguageTool
JLanguageTool.getAllActiveRules() has been fixed to not return rules that have the default="off" attribute (unless they have been enabled explicitly)
StringTools.isAlphabetic() has been removed as it was a workaround for Java 6 which is not supported anymore
ContractionSpellingRule.isDictionaryBasedSpellingRule() now returns false
Specific rules can be enabled for use in LibreOffice/OpenOffice extension with the new useInOffice() method. This is done, for example, for enabling Catalan Morfologik spelling rule
Embedded server:
The HTTP server now also accepts a --config option so that the maximum request size can be limited with the 'maxTextLength' parameter (this used to be working only for the HTTPS server)
The HTTP and HTTPS servers now accept a new option 'maxCheckTimeMillis' in the property configuration file to specify a maximum duration of a single check
Checks that take longer (e.g. because generating spell corrections is slow for some languages) will stop with an exception
GUI:
Improved configuration dialog
Allow the user to change the font of the main editing area
Rule syntax:
It is now possible to specify case sensitivity of individual exceptions and tokens in a pattern (both in grammar and disambiguation XML rules)
Simply use case_sensitive="yes" or case_sensitive="no"
The 'url' element can now be added to the 'rulegroup' so that all rules of a rule group share the same URL
Internals:
Fixed bug with longer complex strings of tokens to be unified
Fixed bug in a disambiguator where tokens with min="0" were mislocated

New in LanguageTool Standalone 2.5 (Apr 1, 2014)

Breton:
a new rule checks that there is a space character between sentences
Catalan:
added/improved many rules
fixed false alarms
added hundreds of suggestions for barbarisms in a simple replace rule
Dutch:
remove a hack from the word tokenizer code
several new false friends
English:
added a new rule to handle corrections of mistakes in standard English contractions (wasnt, didnt)
changed word tokenization so that the hyphen at the end of a word is no longer its part
removed spelling replacement pairs that made LT work very slowly
fixed a large number of false alarms
added new rules to handle common contextual misspellings and redundant phrases as well as common grammatical mistakes. Now the number of rules for English exceeds 1000
made it possible to use the synthesizer to both add a determiner and make different replacement operations when creating suggestions: simply add \\+INDT or \\+DT as special keywords to the regex that creates the POS tag in the match element
a new rule checks that there is a space character between sentences
Esperanto:
added/improved several rules
a new rule checks that there is a space character between sentences
French:
updated POS tag dictionary and Hunspell dictionary to Dicollecte-5.0.2
added/improved several rules
a new rule checks that there is a space character between sentences
German:
fixed several false alarms
added/improved several rules
a new rule checks that there is a space character between sentences
Japanese:
added some rules
Polish:
updated POS tag dictionary to PoliMorfologik 2.1
cleaned up spelling rules and hyphenation rules, added some frequent misspellings to generate proper suggestions
added simple compounding support for compound words containing prefixes such as "anty" or "mini", and adjectives with numerals such as "trzynasto"
removed annoying false alarms
added a number of new rules
changed word tokenization so that the hyphen at the end of a word is no longer its part; multi-word expressions with a hyphen are also split into their component parts to avoid many false alarms in spell-checking
a new rule checks that there is a space character between sentences
Portuguese:
added/improved several rules
added dozens of compounds: it now has 3400+ compound words
a new rule checks that there is a space character between sentences
Russian:
added a few rules
added some new rules, thanks to Julia Semenenko from WebSpellChecker.net
added many false friends rules (Russian/English)
updated POS tag dictionary and synthesizer dictionary from AOT.ru(Seman) rev. 242
added frequency information for spell-checking dictionary from AOT.ru(Seman)
Slovenian:
fixed sentence detection so that sentences that begin with a lowercase character can now be detected
added common punctuation and mathematical characters (=#**×·+÷) to the set of tokenizing characters in most languages
added new element in XML rules to mark the parts of the unified sequence that are not checked (useful to ignore punctuation, connectives, or not inflected words)
added new 'antipattern' element (which can include several other rule elements), useful for marking complex multiword exceptions in rules and rulegroups (a single antipattern can be shared by multiple rules in the same group)
spell checking:
Immunized tokens are now ignored and don't cause a spelling error anymore
one can also add particular contextual expressions to ignored words by using a new action of the disambiguator: "ignore_spelling"
The words in ignore.txt are checked using case conversions allowed in the speller dictionary (in MorfologikSpeller-based dictionaries). This usually means that the word added in lowercase will be accepted also when it is found at the beginning of the sentence; but if it is added in uppercase, it won't be accepted when written in lowercase
updated Morfologik libraries to 1.9.0 to speed up suggestion generation in spell checks, not only when using replacement pairs but also for ignoring diacritics
stand-alone GUI:
The "More..." dialog now contains a link to community.languagetool.org with details about the matching rule
API:
incompatible change: changed the return type of Rule.getLocQualityIssueType() from String to ITSIssueType
incompatible change: changed the parameter of Rule.setLocQualityIssueType() from String to ITSIssueType
deprecated Rule.isSpellingRule(), please use Rule.isDictionaryBasedSpellingRule() instead

New in LanguageTool Standalone 2.4.1 (Jan 10, 2014)

New in LanguageTool Standalone 2.4 (Jan 3, 2014)

Breton:
SRX sentence tokenization
added/improved a few rules
fixed some false alarms
fixed incorrect suggestions thanks to added tests on corrections
Catalan:
added/improved several rules
fixed false alarms
made additions and fixes to the tagger dictionary
removed some words from synthesis dictionary (see filter-archaic.txt)
added frequency data to the tagger dictionary; frequency wordlist comes from the Gaia project, with Apache License, version 2.0
English:
added/improved a few rules
fixed some false alarms
French:
added/improved several rules
fixed some false alarms
German:
added/improved several rules
added a synthesizer - the agreement rule can now make suggestions for some errors (not all suggestions are correct, though)
Polish:
added/improved several rules, especially for hyphen and dash usage
added frequency information for spell-checking dictionary; frequency wordlist comes from the Gaia project, with Apache License, version 2.0
fixed some false alarms
Portuguese:
added/improved several rules (it now includes gender rules "a"/"o")
it now has 3400+ compound words
the JAR file has been renamed to languagetool.jar, from formerly languagetool-standalone.jar to avoid confusion about what 'standalone' means in this context
for languages with many rules (like French or German) performance on long texts has been increased by about 20-30%
fix for thread-safety (could cause hang in MultiWordChunker)
fixed a bug where chunk annotations were not tested in groups
fix: \1 and had not been evaluated in ...
fixed a bug in the unification mechanism that discarded some of the matching interpretations prematurely
added support for chunk annotations in the disambiguator, and fixed one bug in filtering tokens with chunk annotations
updated Morfologik libraries to 1.8.2 (bug fixes, stricter input sanity checking, add frequency data to dictionaries)
added the option of including frequency data in tagging or spelling dictionaries.
The expected format of the frequency wordlists is the one in the Gaia project, with Apache License, version 2.0
new command line tools to export and create binary dictionaries:
org.languagetool.dev.DictionaryExporter
org.languagetool.dev.POSDictionaryBuilder
LibreOffice/OpenOffice integration:
added a workaround for incorrect sentence detection for the case that a footnote appeared after a sentence full stop
stand-alone GUI:
The dialog opened by the "More..." item in the context menu of an error will now also display correct and incorrect example sentences
API:
SentenceTokenizer is now an interface, the implementation has been moved to RegexSentenceTokenizer, but this is deprecated and SRXSentenceTokenizer should be used instead
Some methods from org.languagetool.tools.StringTools have been moved to the org.languagetool.gui.Tools class in the languagetool-gui-commons project
LanguageToolListener.languageToolEventOccured() has been renamed to LanguageToolListener.languageToolEventOccurred()
org.languagetool.tools.SymbolLocator isn't public anymore (shouldn't affect anybody)
removed DanishSentenceTokenizer which had been deprecated for three years
Rule.getCorrectExamples() and Rule.getIncorrectExamples() don't return null anymore but an empty list if there are no examples. Consequently, setCorrectExamples() and setIncorrectExamples() don't accept null anymore.
Rule.getId() may return any string now, not just ASCII-only strings (actually this has been the case before, as the ASCII-only restriction was never enforced and only mentioned in the javadoc)
languagetool-wikipedia: the command line options for checking a Wikipedia dump have been simplified. The command can now be called like this:
java -jar languagetool-wikipedia.jar check-data -l en -f enwiki-20130621-pages-articles.xml
Call just "java -jar languagetool-wikipedia.jar check-data" to get a usage message.
More than one file can be specified with the -f option. In addition to Wikipedia XML dumps, CSV files from Tatoeba (http://tatoeba.org) are now also supported; they need to be filtered first to contain only the relevant language.

New in LanguageTool Standalone 2.3 (Oct 1, 2013)

Breton:
added/improved a few rules
fixed false alarms
updated POS dictionary from Apertium (svn r47282)
Catalan:
added support for language code ca-ES-valencia (Catalan Valencian),
to be used in LibreOffice 4.2.0
added a simple replace rule with hundreds of replacement suggestions
added/improved several rules
fixed false alarms
Chinese:
added a workaround for a StringIndexOutOfBoundsException (http://sourceforge.net/p/languagetool/bugs/186/)
English:
added replacement patterns for the spelling checker to make suggestions better (now offers 'taught' for 'teached')
added/improved a few rules
French:
added/improved a few rules
fixed false alarms
updated POS tag dictionary and Hunspell dictionary to Dicollecte-4.12
German:
added/improved several rules
Portuguese:
added/improved a few rules
it now has 3300+ compound words
Ukrainian:
added/improved several rules
the source code has been moved to github:
https://github.com/languagetool-org/languagetool
LanguageTool requires Java 7 now
LanguageTool makes use of multiple threads now for text checking on modern
hardware, improving performance (this affects the stand-alone version, the
command line version and the LibreOffice/OpenOffice extension)
Rule syntax:
preliminary support for new min/max attributes that allow matching an
element that appears the given number of times. For example:
<token min="0">foo</token> will match nothing or "foo", i.e. "foo" is optional
<token max="2">foo</token> will match "foo" or "foo foo"
<token min="0" max="2">foo</token> will match nothing, "foo", or "foo foo"
Use max="-1" to allow unlimited occurrences.
For min, only 0 or 1 is supported (1 is the default).
support for OR-statements. For example:
<or>
<token>a</token>
<token postag="V"/>
</or>
Internally, at run time, a rule containing OR-statements is converted into
several rules without OR-statements.
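The conversion of OR-statements into several plain rules can be pictured as a cross product over the alternatives at each pattern position. The following is only a minimal sketch of that idea, not LanguageTool's actual implementation: the class OrExpansionSketch, its expand() method, and the string notation for tokens are hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of LanguageTool: expands a pattern whose
// positions may hold several alternatives into plain patterns without OR.
final class OrExpansionSketch {

    // Each inner list holds the alternatives allowed at one pattern position.
    // A position with a single entry is an ordinary token.
    static List<List<String>> expand(List<List<String>> pattern) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<String>());   // start with one empty pattern
        for (List<String> alternatives : pattern) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String alternative : alternatives) {
                    // Copy the prefix and append one alternative to it.
                    List<String> extended = new ArrayList<>(prefix);
                    extended.add(alternative);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> pattern = Arrays.asList(
                Arrays.asList("the"),
                Arrays.asList("a", "postag=V"),   // stands for the <or> block
                Arrays.asList("b"));
        // Prints one line per generated OR-free pattern.
        for (List<String> plain : expand(pattern)) {
            System.out.println(plain);
        }
    }
}
```

A pattern with one OR block of two alternatives yields two plain patterns. Several OR blocks multiply, so the number of generated rules can grow quickly.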
English now has a chunker to detect, amongst others, singular and plural noun chunks.
This is documented at http://wiki.languagetool.org/using-chunks
standalone version:
The standalone version now underlines errors with a red (spelling errors) or
blue (other errors) line (Panagiotis Minos)
Remember the language selection for the next start
Improved window and dialog placement in a multi-monitor setup
embedded server: uses default port (8081) again if started without arguments
updated the morfologik-stemming library to version 1.7.1 to enable better suggestions,
including proper handling of diacritics and replacement patterns (equivalents of MAP
and REP features in hunspell dictionaries)
OpenOffice/LibreOffice integration:
fix: the "About" dialog didn't work in Apache OpenOffice 4.0
fix: country specific rules (like for British English) didn't work
API:
In class Language, getCountryVariants() has been renamed to getCountries(), and a new method
getVariant has been added.
Some methods have been deprecated
Some methods have been moved from the Tools class (languagetool-core) to the
new CommandLineTools class (languagetool-commandline)
AbstractRuleDisambiguator has been renamed XmlRuleDisambiguator and is not abstract anymore.
The RuleDisambiguator classes have been removed, XmlRuleDisambiguator can be
used directly instead.
A new method JLanguageTool.check(AnnotatedText) has been introduced that allows
you to check text with markup. Use AnnotatedTextBuilder to build up the input.
Thread-safety has been improved. The recommended use case is now to
create a new JLanguageTool object for each thread, but to create the
language only once (e.g. new English()) and use that for all JLanguageTool
instances. This changed the API of some public classes, but for the standard
use case of checking texts with the JLanguageTool object it shouldn't make a
difference. (patch by Stefan Lotties)
JLanguageTool.loadFalseFriendRules() now behaves like JLanguageTool.loadPatternRules():
it looks in the class path first, and then, if the given file is not found there, in
the filesystem
Introduced the Chunker interface that can assign chunks (also known as phrases)
to tokens. For example, for noun phrases like "a fast computer" the chunker could assign
an 'NP-singular' (noun phrase, singular) chunk to each of the tokens in that phrase.
In the grammar.xml, such a token can then be matched with this syntax:
<token chunk="NP-singular"></token>
The new class MultiThreadedJLanguageTool makes use of as many threads
as the computer has processors. In our tests this has improved text checking
time by about 70% on an Intel i7 processor when used on 30KB text.
AnalyzedTokenReadings now implements Iterable so it can be used in foreach loops
AnalyzedGermanTokenReadings has been removed, AnalyzedTokenReadings can be used instead
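The thread-safety recommendation in the API notes above (create the language once, create one checker object per thread) can be sketched as follows. This is an illustration under stated assumptions, not code that uses the LanguageTool library: SharedLanguage and Checker are hypothetical stand-ins for a language object such as new English() and for JLanguageTool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerThreadCheckerSketch {

    // Stand-in for a language object: immutable, built once, shared by all threads.
    static final class SharedLanguage {
        final String name;
        SharedLanguage(String name) { this.name = name; }
    }

    // Stand-in for the checker: it has mutable state, so each thread gets its own.
    static final class Checker {
        private final SharedLanguage language;
        private int checkedTexts = 0;   // deliberately unsynchronized
        Checker(SharedLanguage language) { this.language = language; }
        String check(String text) {
            checkedTexts++;
            return language.name + ":" + text.length();
        }
    }

    static List<String> checkAll(List<String> texts, int threads) {
        final SharedLanguage language = new SharedLanguage("en");   // created only once
        // Each worker thread lazily creates its own Checker on first use.
        final ThreadLocal<Checker> checkers =
                ThreadLocal.withInitial(() -> new Checker(language));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (final String text : texts) {
                Callable<String> task = () -> checkers.get().check(text);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());   // results stay in input order
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("checking failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The design point is that only the expensive, read-only part is shared, while everything with mutable state stays confined to a single thread.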
Embedded HTTP server: the server now uses 10 threads instead of 1 (thanks to
Panagiotis Minos)
text extraction from Wikipedia dumps has been improved

New in LanguageTool Standalone 2.2 (Jul 1, 2013)

Breton:
added/improved several rules
fixed some false alarms
updated POS dictionary from Apertium (svn r45122)
Catalan:
added/improved many rules
fixed false alarms
rules have been categorized according to the upcoming Internationalization Tag Set (ITS) Version 2.0 standard from W3C.
Dutch:
updated rules to fix false alarms, thanks to Ruud Baars
The Dutch spell checking has been switched back to Hunspell for now to avoid too many false alarms because of unknown compounds. Unfortunately, Dutch spell checking does not provide suggestions anymore, for performance reasons.
English:
added/improved a few rules
Esperanto:
added/improved several rules
fixed some false alarms
updated links to PMEG.
French:
added/improved several rules
fixed some false alarms
updated POS tag dictionary and Hunspell dictionary to Dicollecte-4.10
German:
added/improved several rules
Greek:
added a few rules (by Panagiotis Minos)
Italian:
small rule improvement
Japanese:
avoid an ArrayIndexOutOfBoundsException in the POS tagger
Khmer:
added some rules (by Nathan Wells)
Polish:
added a few new rules
Portuguese:
added/improved a few rules
it now has around 2000 compound words taken from a huge Porto Editora dictionary
Russian:
added some new rules (thanks for these rules to Julia Semenenko)
fixed some false alarms
added new segmentation rules
added false-friend rule
added bitext rule
added new style rules
Ukrainian:
new POS dictionary
new synthesizer dictionary
new spelling dictionary
new grammar rules
updated sentence tokenizer rules
disambiguator implemented
word tokenizer updated to ignore accent and soft hyphen and understand different apostrophes
HTTP server:
enabling and disabling rules at the same time (keeping the rest of the default options) is now allowed. To disable all the rules except those explicitly enabled, you can use the parameter enabledOnly=yes.
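One way to read the combination of enabled rules, disabled rules, and enabledOnly=yes is as simple set arithmetic over rule ids. The sketch below is an assumption about the semantics described above, not the server's code: the class RuleSelectionSketch and the method activeRules() are hypothetical, and so is the choice that explicitly disabled rules are still removed when enabledOnly is set.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of how one request's rule set could be derived.
final class RuleSelectionSketch {

    // Returns the ids of the rules that would run for one request.
    static Set<String> activeRules(Set<String> defaultOn, Set<String> enabled,
                                   Set<String> disabled, boolean enabledOnly) {
        Set<String> active = new TreeSet<>();
        if (!enabledOnly) {
            active.addAll(defaultOn);   // keep the rest of the default options
        }
        active.addAll(enabled);         // rules enabled explicitly by the request
        active.removeAll(disabled);     // rules disabled explicitly by the request
        return active;
    }
}
```

With enabledOnly set, the default rules are skipped entirely, so only the explicitly enabled ids remain.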
Fix bug "java.lang.StringIndexOutOfBoundsException" in DifferentLengthRule
Worked around the "There is an incompatible JNA native library installed on this system" error
Updated Tika (used for language detection) from 0.9 to 1.3
The "--version" parameter of languagetool-commandline.jar now also prints the build date
Several small bug fixes, code cleanups, and Javadoc improvements

New in LanguageTool Standalone 2.1 (Apr 2, 2013)

Breton:
added/improved several rules
fixed several incorrect suggestions thanks to added tests on corrections
spelling checker now ignores words with a POS tag, hence accepting words which are in Apertium or in An Drouizig spelling checker
updated FSA spelling dictionary from An Drouizig Breton Spellchecker 0.11
updated POS dictionary from Apertium (svn r43173)
Catalan:
added/improved several rules
fixed multiple false alarms
improved sentence and word tokenization
the tagger dictionary has been fixed and expanded (added 9000 place names; 115000 tagged forms from Softcatalà dictionary)
verbal forms have been tagged according to regional variants (with script tag_verbs.pl)
Hunspell dictionary (Softcatalà) has been replaced with LT tagger dictionary for spellchecking
spelling checker now ignores words with a POS tag (from multiwords.txt or disambiguation.xml)
English:
fixed some false alarms
added a few new rules
updated the tagger and synthesizer dictionaries to recognize more words, including plural forms of nouns, new verbs, and missing cardinal numerals (fixes Sourceforge bug #3560624)
updated the dictionary building script
Esperanto:
added/improved several rules
fixed several incorrect suggestions thanks to added tests on corrections
single quotes at beginning/end of word no longer cause false spelling error
French:
added/improved several rules
updated POS tag dictionary and Hunspell dictionary to Dicollecte-4.9
fixed several incorrect suggestions thanks to added tests on corrections
SRX sentence tokenization
single quotes at beginning/end of word no longer cause false spelling error
German:
added/improved several rules
The German spell checker now returns suggestions. We combine Hunspell and our own morfologik-based spell checker to create the suggestions for good performance, but they are far from perfect yet. First, some misspelled compound words fail to have a suggestion. Second, sometimes the best suggestion is not at the front of the list of suggestions.
org/languagetool/resource/de/hunspell/create_dict.sh can be used to create the three morfologik spell checker dictionaries (German, Austrian, and Swiss).
New language variant "Simple German", with rules contributed by Annika Nietzio. This variant uses 'de-DE-x-simple-language' as a language code.
Italian:
added/improved a few rules
Malayalam:
now uses the default sentence tokenizer, not the English one (because languages are now modules that don't depend on each other)
Polish:
false alarm fixes
added and improved some rules
major tagger and synthesis dictionary update (to Morfologik 2.0 PoliMorf), which also fixes sf.net bug #3554018
Portuguese:
added/improved several rules
it now has over 1500 compound words taken from a huge Porto Editora dictionary
Russian:
several new rules
bugfix: the uppercase sentence rule sometimes was not triggered
spell checking: URLs, if specified like "http://www.foo.org", are now ignored and don't cause a spelling error anymore
API: LanguageTool has been split up into several Maven modules. This causes some API changes:
No more language constants: "Language.GERMAN" now needs to be "new German()", similarly for all other languages
org.languagetool.gui.ResourceBundleWithFallback is now org.languagetool.ResourceBundleWithFallback
org.languagetool.gui.ContextTools is now org.languagetool.tools.ContextTools
LanguageIdentifierTools.ADDITIONAL_LANGUAGES has been removed, languages are detected at runtime now
rule syntax: Suggestions are now also allowed outside the <message> element.
bugfix: suggestions for compounds parts were missing sometimes
bugfix: Portuguese translation was not used
bugfix: fix false alarm of unpaired bracket rule on smileys :-) and ;-) (Sourceforge bug #3604367)
bugfix: testrules now report all cases when untouched examples were touched (Sourceforge bug #3600995)
bugfix: the whitespace rule now reports also combinations of non-breaking space and white space characters (Sourceforge bug #3608410)
stand-alone GUI: the very first check for languages with a lot of rules (e.g. German, French) should now be faster
embedded HTTPS server:
two new properties, to be set from the property configuration file, allow limiting the maximum number of requests: requestLimit - the maximum number of requests; requestLimitPeriodInSeconds - the time period in which the requests are considered, in seconds
bugfix: only the first line of a POST request's body was considered
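The two properties requestLimit and requestLimitPeriodInSeconds describe a limit on requests within a time window. A minimal sketch of one way to enforce such a limit is shown here. It is not the server's implementation: the class RequestLimiterSketch is hypothetical, and the sliding-window strategy is an assumption, since the notes above do not say how the period is measured.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sliding-window limiter, named after the two properties above.
final class RequestLimiterSketch {
    private final int requestLimit;
    private final long periodMillis;
    // Timestamps of the accepted requests that are still inside the window.
    private final Deque<Long> timestamps = new ArrayDeque<>();

    RequestLimiterSketch(int requestLimit, int requestLimitPeriodInSeconds) {
        this.requestLimit = requestLimit;
        this.periodMillis = requestLimitPeriodInSeconds * 1000L;
    }

    // Returns true if a request arriving at nowMillis may be served.
    synchronized boolean allow(long nowMillis) {
        // Forget requests that have fallen out of the time window.
        while (!timestamps.isEmpty() && nowMillis - timestamps.peekFirst() >= periodMillis) {
            timestamps.removeFirst();
        }
        if (timestamps.size() >= requestLimit) {
            return false;   // limit reached for the current window
        }
        timestamps.addLast(nowMillis);
        return true;
    }
}
```

Passing the current time as a parameter keeps the sketch easy to test. A real server would typically call it with System.currentTimeMillis().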
OpenOffice/LibreOffice integration:
Another try to fix the ConcurrentModificationException (Sourceforge bug #3572536)
Command line:
In verbose mode, the subId of disambiguator matched rules is displayed.
Testing rules:
Now regular expressions in disambiguation rules are heuristically tested (Sourceforge bug #3599002)

New in LanguageTool Standalone 2.0 (Jan 4, 2013)

Breton:
added several new rules and fixed false alarms
updated POS dictionary (fix wrong POS tags) from Apertium (svn r42124)
updated FSA spelling dictionary from An Drouizig Breton Spellchecker 0.10
updated classification of words tagged as plural nouns of persons
(these have a different mutation than other nouns)
Catalan:
many new rules
fixed false alarms
multiple fixes and additions in the tagger dictionary
Hyphen is used as a word separator, as this is very common in web pages. This way some ambiguities cannot be solved if the typography rule substituting hyphen for dash is disabled.
English:
uses the same word tokenizer again as other European languages
fix Sourceforge bug #3479817 (capitalization of the first item of a list not required)
a few rule updates
Esperanto:
few new rules
French:
updated POS tag dictionary and Hunspell dictionary to Dicollecte-4.8
fixed false alarms
German:
many new rules and rule updates
Portuguese:
Added hundreds of compound words taken from a huge Porto Editora dictionary
Added/improved several rules
Replaced the current dictionary with PT-pt (pre-language agreement).
To use the post-language agreement, use the PT-br one.
Russian:
improved some rules
several new rules
updated POS dictionary (fix wrong POS Tags)
Spanish:
several new rules
Ukrainian:
updated a few rules
OpenOffice/LibreOffice integration:
Fixed ConcurrentModificationException (Sourceforge bug #3572536)
stand-alone GUI:
the tray icon menu (reachable with the right mouse button on the tray
icon) now has a checkbox to enable the embedded HTTP server
the tray icon will show a small "S" symbol when the server is running
fixed bug "Tray icon too big sometimes" (Sourceforge #3573078)
API:
Language.getLanguageForShortName() now consistently throws an exception
if the given language code is not known
Tools.median() is now private (it was accidentally made public)
the Java API of HTTPServer has been modified in incompatible ways. You might get compile errors if you have used this class from your Java code.
HTTP API:
support for auto-detecting text language (parameter autodetect=1)
added HTTPSServer, a lightweight embedded HTTPS server which works like HTTPServer but supports SSL encryption. This server supports *only* https, not http.
the XML we return now contains a new attribute "locqualityissuetype", which
is the "Localization Quality Issue Type" in the upcoming Internationalization Tag Set (ITS)
Version 2.0 standard from W3C. This means errors are now categorized according to a standard, in addition to LanguageTool's own categories. Useful values are only returned for English for now.
Please consider this to be a prototypical implementation for now.
For rule developers: specify this using the new 'type' attribute. It is
inherited from category to rulegroup, and from rulegroup to rule. If a rule has its own 'type', it overrides the rulegroup's and category's 'type'.
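The inheritance of the 'type' attribute amounts to "the most specific level that sets a value wins". Here is a minimal sketch of that lookup, assuming an unset type is represented as null. The class IssueTypeSketch and the method effectiveType() are hypothetical and not part of LanguageTool.

```java
// Hypothetical illustration of the category -> rulegroup -> rule inheritance.
final class IssueTypeSketch {

    // Returns the effective type: the most specific non-null value wins.
    static String effectiveType(String categoryType, String ruleGroupType, String ruleType) {
        if (ruleType != null) {
            return ruleType;        // the rule's own type overrides everything
        }
        if (ruleGroupType != null) {
            return ruleGroupType;   // otherwise inherit from the rulegroup
        }
        return categoryType;        // otherwise inherit from the category (may be null)
    }
}
```

For example, a rule without its own type inside a rulegroup typed "style" resolves to "style", even if the enclosing category is typed "grammar".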

New in LanguageTool Standalone 1.9 (Jan 4, 2013)

Breton:
several new rules and fixed false alarms
the internal spelling engine can now additionally tokenize words if the source dictionary did not include compound versions (for example, hyphenated words).
updated classification of words tagged as plural nouns of persons (these have a different mutation than other nouns).
Catalan:
many new rules
fixed false alarms
Chinese:
hundreds of new Chinese rules
unit tests for the Chinese tokenizer and tagger
Danish:
major update to the dictionary and the rules
English:
fixed false alarm (sf bug #3543914)
some new rules and rule updates
Esperanto:
several new rules and fixed false alarms
updated list of transitive/nontransitive verbs
added links to REVO (Reta Vortaro) with permission from Wolfram Diestel.
French:
updated POS tag dictionary to use Dicollecte-4.6
several new rules and fixed false alarms
fixed: opening guillemet (U+00AB) followed by thin no-break
space (U+202F) was wrongly detected as an error (sf bug #3545050)
German:
added simple verb/subject agreement checker
several new rules and rule updates
fixed some false alarms by upgrading to jwordsplitter 3.4 (sf bug #3475553)
Japanese:
initial support with about 20 rules
Polish:
fixed bug with the word repetition rule (sf bug #3560925)
Portuguese:
several rule updates
Russian:
several new rules and rule updates
improve user interface translation
Swedish:
Support for Swedish has been re-enabled after it had been disabled in LanguageTool 1.7.
bugfix for command line: We removed XML from even plain text input. Now XML/HTML elements are only filtered out if the new --xmlfilter option is specified. Note that there's still a bug that can screw up position information with that option, thus it is deprecated.
command line: new option "--list" to list all available languages
introduced a file resources//hunspell/ignore.txt with words that the spell checker will ignore
development: new ant target generate-ignore-files: call "ant generate-ignore-files" if you want to re-generate the ignore.txt files, populated with those suggestions from the grammar files that the spell checker would complain about.
In other words, this makes sure "simple" suggestions (i.e. those without back references) made by the LanguageTool rules are never considered spelling mistakes.
If you have downloaded the stand-alone ZIP of LanguageTool, call this command for the same effect:
java -cp LanguageTool.jar org.languagetool.rules.spelling.SuggestionExtractor
stand-alone GUI: rules can now be disabled and re-enabled with a single click
stand-alone GUI: copy and paste from the result area now keeps line breaks
HTTP API: "+" was incorrectly removed from input
HTTP API: the XML returned by the "/Languages" URL (which returns a list of all supported
languages) has been extended to contain an 'abbrWithVariant' attribute which lists the language with its variant, e.g.:
<language name="Catalan" abbr="ca" abbrWithVariant="ca-ES"/>
HTTP API: small internal cleanup for better exception and charset handling (always expects UTF-8)
HTTP API: the embedded HTTP server can now be started with a "--public" parameter so it can be accessed from anywhere (not just localhost). Please be careful with this option, it is not recommended for production use:
java -cp LanguageTool.jar org.languagetool.server.HTTPServer --public
HTTP API and XML output: extended XML to include the version and build date of LanguageTool and the category and offset of each match
The word tokenizer now considers the following characters as a word separator: | (pipe) and ` (backtick).
bugfix: the column count was sometimes totally wrong

New in LanguageTool Standalone 1.8 (Jan 4, 2013)

Greek:
initial support for Greek with few rules for now
German:
many new rules and rule updates
added spellcheck dictionary for the standalone version
LanguageTool can now detect spelling errors, but it does not offer suggestions for performance reasons
added variants for Austria, Germany, and Switzerland
added several thousand words to the internal dictionary for better error detection and fewer false alarms
English:
several rule improvements and new rules
although a matter of debate, "an historic", "an historical", and "an habitual"
are no longer considered incorrect
fixes in the tagger dictionary and disambiguation rules to reduce false alarms
added British and American English spelling dictionaries
added some rules to detect typical American expressions in British English
Catalan:
major update, including many new rules and new tokenization
new synthesizer (given a part-of-speech, this generates the inflected word
forms)
added spellcheck dictionary for the standalone version
using Hunspell dictionary ca-valencia (avl) 2.3.0
Russian:
fixed a few false alarms
added URL element for some rules, linking to more information about the rule
added spellcheck dictionary for standalone version.
several rule improvements (adding links, improving messages)
new Java rule "repeat word in sentence" (disabled by default)
Italian:
several new rules
added spellcheck dictionary for the standalone version
French:
updated dictionary to use Dicollecte-4.5
several new rules and fixed false alarms
added spellcheck dictionary for the standalone version
using Dicollecte-4.5
Breton:
updated dictionary to use Apertium svn r38896.
several new rules and fixed false alarms
added some references to the Breton grammar:
"La grammaire bretonne pour tous".
added spellcheck dictionary for the standalone version
Polish:
added spellcheck dictionary for the standalone version
some new rules and fixed false alarms
major update of the tagger dictionary
Portuguese:
initial support
Esperanto:
several new rules and fixed false alarms
added links to PMEG for further information about errors
added spellcheck dictionary for the standalone version
LibreOffice / OpenOffice integration:
If a LanguageTool rule has a URL with more information, the
grammar checking dialog in LibreOffice will now offer a "More..." link
to that URL. That URL is also displayed on the command line and stand-alone user interface as "More info:", and as "url" attribute of the <error> element in the XML API format. This makes it possible to have external documentation about rules.
Bugfix #3526635: SingletonFactory now implements XServiceInfo
Bugfix #3534637: Fixed false alarm about word being uppercase when the previous sentence ended with a footnote
GUI:
made the result of "Tag Text" a bit more readable
Bugfix: when LT was hidden in the tray, two very fast mouse clicks could activate the check twice, and that caused errors. Right now only one checking thread is active.
Command line:
In verbose mode, the log of the rule-based disambiguator actions is displayed.
In the profiling rules' mode on the command-line, you can now enable
and disable rules.
API:
Deprecated some methods and the SentenceTokenizer class (SRXSentenceTokenizer should be used instead)
LanguageTool in the standalone version now supports spell-checking. Some of the languages use Hunspell.
There are now two distribution files: a .zip file for standalone use,
and an .oxt extension for LibreOffice/ApacheOpenOffice. All languages supported by LanguageTool, except for Chinese, have spelling dictionaries bundled with the standalone version. Spell-checking errors appear in red in the LanguageTool standalone version, other errors appear in blue.
LanguageTool now supports separate rules for different local variants of a language, for example American English and British English. To use them from the command line, simply use "en-US" or "en-GB" as the language code. This implements sf.net feature suggestion #3287388.
The configuration file now stores settings individually for all languages. This means that you can enable spellcheck for American English, disable it for Polish, and British English, and all these settings will be saved separately.
If you start the HTTP Server from the GUI, it now reads the configuration files
that are configured in the GUI (if the appropriate checkbox is set).
This way the user can control the behavior of the server easily.
Two new options for the HTTP Server have been added: "disabled" and "enabled", which are used to disable or enable rules in the same way as on the command-line.
The XML format for rules has been changed to use <marker>...</marker> tags instead of mark_from and mark_to attributes.
Overlapping rule matches are filtered now so that only the first match per is kept
It is now possible to suppress misspelled suggestions altogether in XML rules by applying
an attribute suppress_misspelled="yes" on the element, AND on the element. If only element has this attribute set to "yes", then the suggestion is displayed, but no content of is contained within (this might be a conditional part of the suggestion). Note: for this to work, the tagger dictionary needs to be fairly complete;
words without lemmas and POS tags are considered to be misspelled.
Improved startup speed
Some internal bug fixing in disambiguation and pattern rules, including problems with unification.
Update of morfologik-stemming library to 1.5.3 (bug-fix release).
Larger tagger dictionaries were encoded in cfsa2 binary format to save space.
Bugfix #3054895: fixed incorrect column reported by LanguageTool in command line mode or with --api option when error spanned new line.
Bugfix #3431788: When the message contained the reference to a previous token as the first character, the corrections generated via regular expression replacements on match elements were wrong.

New in LanguageTool Standalone 1.7 (Mar 26, 2012)

New in LanguageTool Standalone 1.6 (Feb 22, 2012)

New in LanguageTool Standalone 1.5 (Feb 22, 2012)

Added support for Chinese
Added support for Asturian
Added support for Tagalog
Added support for Breton
English:
Fixed some of the problems in the Sourceforge bug #3396850 (false alarms)
German:
Extended the internal part-of-speech dictionary with about 2000 word forms. This should improve coverage of the agreement rule. It mostly affects words that have a new spelling after the spelling reform.
Several new rules
French:
Updated dictionary to use Dicollecte-4.2
Added and updated a few rules. A few rules taken from Grammalecte-0.0.12 written by Olivier R.
Esperanto:
Added disambiguation for Esperanto
Improved SRX sentence tokenization rules
Added and updated a few rules
Ukrainian:
Updated translation
Initial SRX sentence tokenization rules
Belarusian:
Updated translation
Initial SRX sentence tokenization rules
Russian:
Fix SourceForge bug #3334498 (False warning about unpaired bracket)
Added and updated a few rules
Updated and improved part-of-speech dictionary
Dutch:
Fixes and updates to rules
Updated the dictionaries
Galician:
Added SRX sentence tokenization rules
Added new rules
Added more false friends
Updated dictionary to support enclitics
Added disambiguation support and rules
Added synthesizer
Khmer:
Updated coherency rules
Rule development:
Fixed testrules.sh to not throw an AssertionFailedError error if called with a language parameter
Fast Rule Evaluation is supported in "dev.index" package
LanguageToolGUI and LanguageTool now have support for automatic language detection. On the command-line, you can use -adl or --autoDetect to enable automatic language detection.
Internal changes:
The disambiguator has a new action "immunize" that protects a token from being matched by error-detecting rules (especially in XML rules).
The disambiguator allows removing interpretations by specifying just some attributes with the tag.
Allowed three character codes for languages, e.g. 'ast' for Asturian. This is the ISO 639-2 code, as Asturian doesn't seem to have an ISO 639-1 two-character code.
Fixed the bug with the special tag UNKNOWN: now it deals properly with the words at the end of sentence / paragraph.
Languages now initialize their classes (tokenizers, part-of-speech taggers etc.) lazily, i.e. only when the classes are actually needed. Improves startup performance a bit.
Switched to transifex.net for translation. Also changed our code to use the English string if no translation exists.

New in LanguageTool Standalone 1.4 (Feb 22, 2012)

New in LanguageTool Standalone 1.3.1 (Feb 22, 2012)

New in LanguageTool Standalone 1.3 (Feb 22, 2012)

New in LanguageTool Standalone 1.2 (Feb 22, 2012)

New in LanguageTool Standalone 1.1 (Feb 22, 2012)

New in LanguageTool Standalone 1.0 (Feb 22, 2012)

New in LanguageTool Standalone 0.9 (May 22, 2007)