VietOCR Changelog

New in VietOCR 5.7.5 (Jul 22, 2021)

  • Update dependencies

New in VietOCR 5.7.2 (Nov 30, 2020)

  • Support text-only PDF format

New in VietOCR 5.7.1 (Nov 16, 2020)

  • Update dependencies

New in VietOCR 5.7.0 (Oct 18, 2020)

  • Support multiple renderers and provide pre- and post-processing for Bulk/Batch ops
  • Various UI improvements
  • Proper cleanup of working intermediate files
  • Expand command-line interface (CLI) support
  • Update dependencies

New in VietOCR 5.6.5 Beta (Oct 12, 2020)

  • Provide pre- and post-processing to Bulk/Batch ops
  • Update dependencies

New in VietOCR 5.6.4 (Aug 24, 2020)

  • Update Tess4J 4.5.3
  • Update other dependencies
  • Update translations

New in VietOCR 5.6.3 (Aug 15, 2020)

  • Add commons-logging library

New in VietOCR 5.6.2 (Aug 13, 2020)

  • Update Tess4J 4.5.2
  • Update other dependencies

New in VietOCR 5.6.1 (Apr 18, 2020)

  • Fix locked file issues with batch process

New in VietOCR 5.6.0 (Jan 4, 2020)

  • Fix minor UI issues
  • Additional localized strings
  • Persist selected directory of PDF files
  • Update Tess4J 4.5.1 (Tesseract 4.1.1)
  • Update other dependencies

New in VietOCR 5.5.3 (Dec 15, 2019)

  • Update Tess4J 4.4.1
  • Add support for reading JPEG2000 images
  • Update translations

New in VietOCR 5.5.2 (Sep 29, 2019)

  • Update Hunspell & other dependencies

New in VietOCR 5.4.3 (Feb 25, 2019)

  • Add support for double-sided pages
  • Update Ghost4J to have Unicode filename/path support

New in VietOCR 5.4.2 (Jan 1, 2019)

  • Improve Convert PDF to TIFF for multiple files

New in VietOCR 5.4.1 (Dec 28, 2018)

  • Rebuilt Tesseract Windows executable without /arch:AVX flag
  • Update Tess4J 4.3.1

New in VietOCR 5.4.0 (Nov 1, 2018)

  • Upgrade to Tesseract 4.0.0

New in VietOCR 5.3.1 (Oct 19, 2018)

  • Update Tess4J 4.2.3 & pdfbox dependencies

New in VietOCR 5.2.1 (Sep 20, 2018)

  • Update Tess4J 4.2.2

New in VietOCR 5.2 (Aug 20, 2018)

  • Upgrade to Tesseract 4.0.0-beta.4 (fd49206)

New in VietOCR 5.1.1 (Jul 30, 2018)

  • Update Tess4J 4.1.1

New in VietOCR 5.1.0 Beta (Jul 4, 2018)

  • Upgrade to Tesseract 4.0.0-beta.3 (b502bbf) and language data
  • Upgrade to Tess4J 4.1.0-SNAPSHOT and Lept4J 1.10.0
  • Update dependencies

New in VietOCR 5.0.3 (Jul 4, 2018)

  • Add support for Convert PDF to TIFF

New in VietOCR 5.0.2 (May 11, 2018)

  • Update available language list to include scripts

New in VietOCR 5.0.1 (May 4, 2018)

  • Update Tess4J 4.0.2 and Lept4J 1.9.4

New in VietOCR 4.7.2 (Apr 17, 2018)

  • Update jai-imageio-core to 1.4.0 for Java 9 fixes
  • Update to Tess4J 3.4.7 and Lept4J 1.6.4

New in VietOCR 4.7.1 (Mar 26, 2018)

  • Update PDFBox dependencies

New in VietOCR 4.7 (Mar 25, 2018)

  • Update to Tess4J 3.4.5; remove bundled Ghostscript DLL and use PDFBox if Ghostscript not available on system

New in VietOCR 5.0 Alpha (Feb 18, 2018)

  • Upgrade to Tesseract 4.00alpha (ce7ee87) and language data
  • Upgrade to Tess4J 4.0.0-SNAPSHOT
  • Upgrade Tesseract 4.00 fast language packs
  • Autodeskew for batch and bulk processes

New in VietOCR 4.6.2 (Nov 15, 2017)

  • Update to Tess4J 3.4.2

New in VietOCR 4.6.1 (Sep 23, 2017)

  • Update Tesseract 3.05.01 (e2e79c4)
  • Upgrade to Tess4J 3.4.1

New in VietOCR 4.6 (Sep 5, 2017)

  • Upgrade to Tesseract 3.05.01 (2158661)
  • Upgrade to Tess4J 3.4.0

New in VietOCR 5.0 Alpha (Mar 7, 2017)

  • Upgrade to Tesseract 4.00alpha (b851d47) and language data
  • Upgrade to Tess4J 4.0.0-SNAPSHOT

New in VietOCR 4.5 (Feb 18, 2017)

  • Upgrade to Tesseract 3.05 (5afface)
  • Upgrade to Tess4J 3.3.0

New in VietOCR 4.4 (Jan 16, 2017)

  • Update Ghostscript to 9.20
  • Improvements:
      • Additional image filters
      • Expand support to include Regex text replacements from DangAmbigs.txt file
      • Hyphen replacements
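
The regex text replacements listed for 4.4 can be pictured with a minimal Java sketch. It assumes, hypothetically, that each rule is one line of the form pattern=replacement; the actual DangAmbigs.txt format and VietOCR's own classes may differ, and RegexReplacer is a made-up name used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper; not VietOCR's actual implementation.
class RegexReplacer {
    // Parallel lists keep the rules in file order.
    private final List<Pattern> patterns = new ArrayList<>();
    private final List<String> replacements = new ArrayList<>();

    // Assumed rule format: "pattern=replacement", one rule per line.
    RegexReplacer(List<String> lines) {
        for (String line : lines) {
            int eq = line.indexOf('=');
            // Skip comments, lines without '=', and entries with an empty pattern.
            if (line.startsWith("#") || eq <= 0) {
                continue;
            }
            patterns.add(Pattern.compile(line.substring(0, eq)));
            replacements.add(line.substring(eq + 1));
        }
    }

    // Applies every rule in order to the OCR output.
    String apply(String text) {
        for (int i = 0; i < patterns.size(); i++) {
            text = patterns.get(i).matcher(text).replaceAll(replacements.get(i));
        }
        return text;
    }
}
```

With the two illustrative rules (\w)-\s+(\w)=$1$2 and \brn\b=m, the text "exam- ple rn" becomes "example m". Splitting on the first '=' means a pattern cannot itself contain an equals sign in this sketch.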

New in VietOCR 4.3 (May 31, 2016)

  • Update Tess4J to 3.2.1
  • Convert WIA scanned image BMP to PNG

New in VietOCR 4.3 RC (May 25, 2016)

  • Implement remove lines & crop image function
  • Update Tess4J to 3.2
  • Update various dependency versions

New in VietOCR 4.2 (May 25, 2016)

  • Upgrade to Tesseract 3.04.01 (4ef68a0)
  • Upgrade to Tess4J 3.1
  • Update various dependency versions

New in VietOCR 4.1 (Jan 19, 2016)

  • Upgrade to Tesseract 3.04 (953523b)
  • Upgrade to Tess4J 3.0 and Lept4J 1.0.1
  • Image zoom with mousewheel and Ctrl key
  • Display segmented regions
  • Update translations

New in VietOCR 4.0 (Apr 2, 2015)

  • Upgrade to Tesseract 3.03 RC (r1127)
  • Upgrade Tess4J to v2.0
  • Add support for searchable PDF output in bulk/batch mode

New in VietOCR 3.6 RC (Feb 17, 2015)

  • Update JNA to v4.1.0
  • Update Ghost4J to v0.5.1
  • Update Tess4J to 1.4.1
  • Add Split TIFF function
  • Add thumbnail bar for ease of page navigation
  • Display useful info in statusbar
  • Update URL of OpenOffice dictionaries
  • Update Hunspell to v1.3.3 and fix an NPE
  • Add support for reading specific config files for setting control parameters

New in VietOCR 3.5 (Feb 18, 2014)

  • Additional translations

New in VietOCR 4.0 Beta (Feb 18, 2014)

  • Upgrade to Tesseract 3.03 RC (r1051)
  • Upgrade Tess4J library
  • Add support for searchable PDF output in bulk/batch mode

New in VietOCR 3.5 Beta (Nov 27, 2013)

  • Update Tesseract 3.02 to r866
  • Update Tess4J library
  • Update JNA to v4.0
  • Update JACOB to 1.17 version
  • Enhance Bulk ops with subdirectory support
  • Incorporate image filters
  • Implement Undo function
  • Additional localized UI data

New in VietOCR 3.4.3 Beta 3 (Jul 11, 2013)

  • Update Tesseract 3.02 to r861
  • Update Tess4J library
  • Update JNA to v3.5.2
  • Add localized UI data for Catalan, Turkish, and Czech

New in VietOCR 3.4.3 Beta 2 (Jul 2, 2013)

  • Update Tesseract 3.02 to r855
  • Update Tess4J library
  • Update JNA to v3.5.2
  • Add Catalan localized UI data

New in VietOCR 3.4.2 (Jan 7, 2013)

  • Add hOCR support for Bulk & Batch and command-line operations
  • Update links to dictionary files
  • Update Tesseract 3.02 to r820
  • Update JNA to v3.5.1

New in VietOCR 3.4.1 (Dec 13, 2012)

  • Add Bulk OCR process
  • Update Tesseract 3.02 to r806

New in VietOCR 3.4 (Oct 31, 2012)

  • Update Tesseract engine to v3.02 (r798)
  • Use Tesseract 3.02 language data packs
  • Enable text entry in the combobox for Tesseract 3.02's multi-language OCR support
  • Fit Image now retains image aspect ratio
  • Add optional support for using Tess4J library

New in VietOCR 3.4 Beta 4 (Jul 20, 2012)

  • Fix issue with setPageSegMode method involving Tess4J

New in VietOCR 3.4 Beta 3 (Jul 20, 2012)

  • Update Tesseract engine to v3.02 Alpha (r731)
  • Use Tesseract 3.02 language data packs
  • Enable text entry in the combobox for Tesseract 3.02's multi-language OCR support
  • Fit Image now retains image aspect ratio
  • Add optional support for using Tess4J library
  • Update JACOB to 1.16.1 version

New in VietOCR 3.4 Beta (Feb 29, 2012)

  • Update Tesseract engine to v3.02 Alpha (r684)
  • Enable text entry in the combobox for Tesseract 3.02's multi-language OCR support
  • Fit Image now retains image aspect ratio

New in VietOCR 3.3 Beta (Feb 13, 2012)

  • Update Tesseract engine to v3.02 Alpha (r671)
  • Use Tesseract 3.02 language data packs
  • Enable text entry in the combobox for Tesseract 3.02's multi-language OCR support
  • Update Hunspell to v1.3.2

New in VietOCR 3.2.2 (Jan 23, 2012)

  • Fix a context menu's font issue with displaying Unicode characters for spellcheck suggestions

New in VietOCR 3.2.1 (Jan 16, 2012)

  • Fix an issue with opening Help file on OS X
  • Update JACOB to 1.16-M2 version
  • Update JNA to 3.4.0 version

New in VietOCR 3.2 (Oct 24, 2011)

  • Update Tesseract 3.01 to r638 (final release version)
  • Remove unneeded liblept168.dll
  • Update lists of language codes
  • Update JACOB to 1.16-M1 version
  • Add PSM support to execution from command line

New in VietOCR 3.1.5 (Oct 3, 2011)

  • Update Tesseract 3.01 to r625
  • Provide Page Segmentation Mode options for Tesseract engine

New in VietOCR 3.1.5 Beta (Aug 23, 2011)

  • Update Tesseract 3.01 to r622
  • Provide PSM options to Tesseract engine

New in VietOCR 3.1.4 (Aug 2, 2011)

  • Update Tesseract 3.01 to r597

New in VietOCR 3.1.3 (Jun 6, 2011)

  • Refactoring
  • Improve program usability, enabling image navigation and manipulation with keyboard
  • Fix an EOL issue that broke Remove Line Breaks functionality on Windows
  • Fix an issue with restart notification after language pack downloads
  • Update Tesseract 3.01 to r585
  • Replace Vietnamese language pack with an improved version
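
The EOL fix in 3.1.3 concerns text whose lines end in \r\n rather than \n. As a rough illustration only (LineBreaks is a hypothetical name, and this is not necessarily how VietOCR implements Remove Line Breaks), a platform-independent version might normalize line endings before joining lines:

```java
// Hypothetical helper; not VietOCR's actual implementation.
class LineBreaks {
    // Joins the lines inside each paragraph; blank lines between paragraphs are kept.
    static String remove(String text) {
        // Normalize Windows (\r\n) and bare \r endings to \n first, so the
        // pattern below does not need a special case per platform.
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');
        // Replace a newline with a space only when no other newline is adjacent.
        return normalized.replaceAll("(?<!\n)\n(?!\n)", " ");
    }
}
```

A pattern written only for \n would leave a stray \r behind on Windows input, which is one way such a feature can break there.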

New in VietOCR 3.1.2 (May 30, 2011)

  • Incorporate deskew functionality using GMSE Deskew algorithm
  • Fix a MissingResourceException associated with Font dialog (Java only)

New in VietOCR 3.1 (May 30, 2011)

  • Port changes from version 2.0
  • Update Tesseract OCR engine to 3.01 (r551)

New in VietOCR 3.0 (May 30, 2011)

  • Upgrade Tesseract OCR engine to 3.0
  • Replace old format (2.0x) language data with new format (3.0) language data
  • Change datafile suffix from .inttemp to .traineddata

New in VietOCR 2.0.2 (May 30, 2011)

  • Incorporate deskew functionality using GMSE Deskew algorithm
  • Fix a MissingResourceException associated with Font dialog (Java only)

New in VietOCR 2.0.1 (May 30, 2011)

  • Fix a bug that hangs the program if x.DangAmbigs.txt contains entries starting with an equals sign
  • Improve postprocessing performance by caching the word list used; reload it only if it changes
  • Fix a bug that crashes the program when inline spellcheck offers suggestions on empty text (.NET only)
  • Incorporate Apple Java Extensions (Java only)
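
The word-list caching described for 2.0.1 can be sketched as follows. This is an assumption-laden illustration: it detects changes through the file's last-modified time, which the changelog does not specify, and CachedWordList is a hypothetical class name.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.List;

// Hypothetical helper; not VietOCR's actual implementation.
class CachedWordList {
    private final Path file;
    private FileTime loadedAt;   // modification time seen at the last load
    private List<String> words;  // cached file contents

    CachedWordList(Path file) {
        this.file = file;
    }

    // Returns the cached list, re-reading the file only when its
    // last-modified time differs from the one recorded at load time.
    synchronized List<String> get() throws IOException {
        FileTime current = Files.getLastModifiedTime(file);
        if (words == null || !current.equals(loadedAt)) {
            words = Files.readAllLines(file, StandardCharsets.UTF_8);
            loadedAt = current;
        }
        return words;
    }
}
```

Repeated calls to get() between edits return the same list instance, so postprocessing does not pay for file I/O on every OCR run.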

New in VietOCR 2.0 (May 30, 2011)

  • Upgrade JACOB library to version 1.15-M4 (Java only)
  • Add support for spellcheck suggestion in context menu
  • Improve program accessibility and usability
  • Add support for downloading and installing language data packs and appropriate spell dictionaries
  • Add UI localization for Lithuanian and Slovak
  • Refactor by breaking up large classes into smaller ones

New in VietOCR 1.9 (May 30, 2011)

  • Integrate a Java binding for Hunspell library to provide spellchecking and spellcheck-as-you-type functionality. Include English and Vietnamese dictionaries
  • Add support for a custom dictionary
  • List files generated from PDF conversion in the correct order
  • Upgrade JACOB library to version 1.15-M3
  • Preset Tesseract path on Linux to /usr/bin, the default install location of Tesseract

New in VietOCR 1.8 (May 30, 2011)

  • Display image information
  • Add Screenshot Mode, which rescales low-resolution images to 300 DPI to be more suitable for OCR operations
  • Read the subprocess's output and error streams to prevent it from blocking or deadlocking because of the limited buffer size of the standard streams (Java version)
  • Fix a problem in which paste (image) event fires twice (Java version)
  • Fix an issue with subimages generated by selection box on Linux (Java version)
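
The subprocess fix in 1.8 follows a common Java pattern: drain the child's stdout and stderr on separate threads so it never stalls on a full pipe buffer. The sketch below shows that general pattern; StreamGobbler and runAndDrain are illustrative names, not taken from the VietOCR source.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper; not VietOCR's actual implementation.
class StreamGobbler extends Thread {
    private final InputStream in;
    private final StringBuilder captured = new StringBuilder();

    StreamGobbler(InputStream in) {
        this.in = in;
    }

    @Override
    public void run() {
        // Keep reading until end of stream so the child never blocks on a full pipe.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                captured.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Call only after join(), which makes the captured text visible to the caller.
    String text() {
        return captured.toString();
    }

    // Runs a command while draining both streams concurrently; returns the exit code.
    static int runAndDrain(List<String> command, StringBuilder out, StringBuilder err)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        StreamGobbler stdout = new StreamGobbler(process.getInputStream());
        StreamGobbler stderr = new StreamGobbler(process.getErrorStream());
        stdout.start();
        stderr.start();
        int exitCode = process.waitFor();
        stdout.join();
        stderr.join();
        out.append(stdout.text());
        err.append(stderr.text());
        return exitCode;
    }
}
```

Reading the two streams one after the other on a single thread can still deadlock if the child fills the stream that is not currently being read, which is why each stream gets its own thread here.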

New in VietOCR 1.7 (May 30, 2011)

  • Add provision to load a UTF-8 text file into the textbox
  • Add Recent Files submenu
  • Add Save button on toolbar
  • Fix scale factor, offset issues in image manipulation
  • Improve postprocessing for Vietnamese
  • Add support for more VNI fonts to Vietnamese language data

New in VietOCR 1.6 (May 30, 2011)

  • Fix an image size issue and associated scale factor when toggling between Fit Image vs. Actual Size after (Java) resizing window or (.NET) scrolling in picturebox
  • Add unit test
  • Improve post-OCR correction for Vietnamese
  • Bundle Vietnamese language data for VNI & TCVN3 (ABC) fonts

New in VietOCR 1.5 (May 30, 2011)

  • Add support for execution from command line
  • Add support for pasting an image from the clipboard
  • Add support for JPEG2000 and PNM image types (Java version)

New in VietOCR 1.4 (Oct 28, 2009)

  • Publish interim OCR results for a more responsive UI, improving user experience
  • Support for cancellation of running OCR tasks
  • Merge PDF functionality

New in VietOCR 1.3 (Oct 28, 2009)

  • Improved exception handling with appropriate error messages
  • Improved handling of PDF documents that have many pages. Putting too many images extracted from a PDF into a single multi-page TIFF would eventually cause out-of-memory exceptions
  • Split PDF functionality

New in VietOCR 1.2 (Oct 28, 2009)

  • Integrated PDF support using GPL Ghostscript

New in VietOCR 1.1 (Oct 28, 2009)

  • Merge TIFF functionality

New in VietOCR 1.0.1 (Oct 28, 2009)

  • Refactored for improvements

New in VietOCR 1.0 (Oct 28, 2009)

  • Updated to Tesseract 2.04 engine (bundled Windows executable)
  • Added more language codes to ISO639-3.xml file
  • Added a pangram.xml file for displaying appropriate Preview text in the Font Dialog for the OCR language currently selected
  • Moved various settings to the Options dialog
  • Removed the option of Locating Tesseract on Windows. Current Tesseract is the executable bundled inside the program
  • Added support for custom text replacement in postprocessing

New in VietOCR 0.9.13 (Oct 28, 2009)

  • Updated to Tesseract 2.04RC engine
  • Added indeterminate progressbar for (more animated) task status
  • Added All Image Files filter
  • Removed Vietnamese-glyph font filter to now show all system fonts
  • Changed FontDialog's default Preview text to the standard English pangram to make it more universal
  • Modified SimpleFilter to accept multiple file extensions

New in VietOCR 0.9.12 (Oct 28, 2009)

  • Fixed the way the TESSDATA_PREFIX environment variable is handled on Linux
  • Clean up temporary files if errors occur during OCR operations
  • Fixed a regression EOL bug with output files in Windows
  • Display appropriate error message during batch process

New in VietOCR 0.9.11 (Oct 28, 2009)

  • Added text formatting functionality

New in VietOCR 0.9.10 (Oct 28, 2009)

  • Added watch folder functionality for Batch Processing support

New in VietOCR 0.9.9 (Oct 28, 2009)

  • Revamped localization codes
  • Added rudimentary support for English postprocessing

New in VietOCR 0.9.8 (Oct 28, 2009)

  • Minor fixes and various improvements

New in VietOCR 0.9.7 (Oct 28, 2009)

  • Implemented image rotation functionality

New in VietOCR 0.9.6 (Oct 28, 2009)

  • Fixed an error with path in Linux
  • Additional instructions for configuring Tesseract on Linux

New in VietOCR 0.9.5 (Oct 28, 2009)

  • Integrated scanning support via WIA Automation Library v2.0

New in VietOCR 0.9.4 (Oct 28, 2009)

  • Localized user interface

New in VietOCR 0.9.3 (Oct 28, 2009)

  • Proof-of-concept design
  • Support TIFF image formats
  • Added support for JPEG, GIF, BMP, PNG formats
  • Added post-processing for Vietnamese to improve accuracy
  • Added Vietnamese input methods
  • Added recognition of selected area on image
  • Added file drag-drop
  • Added a context menu for the textarea
  • Added support for selection of Look and Feel
  • Display appropriate message when Tesseract engine crashes
  • Fixed the issue involving filepaths containing spaces
  • Bundled JAI Image I/O 1.1 library
  • Use Java 6.0
  • Use Tesseract 2.03 OCR engine
  • Use Vietnamese language data for Tesseract 2.03 (data for 2.01 crashes frequently with Tesseract 2.03)