Bytescout PDF Extractor SDK Changelog

What's new in Bytescout PDF Extractor SDK 13.4.1 Build 4780

Jul 14, 2023

Enhanced text parsing
Improved image file rendering
Other minor fixes and improvements.

New in Bytescout PDF Extractor SDK 13.4.0 Build 4659 (Apr 10, 2023)

New in Bytescout PDF Extractor SDK 13.3.0 Build 4514 (Sep 27, 2022)

New in Bytescout PDF Extractor SDK 13.2.0 Build 4485 (Jun 7, 2022)

New in Bytescout PDF Extractor SDK 13.1.0 Build 4386 (Jan 25, 2022)

New in Bytescout PDF Extractor SDK 13.0.0 Build 4253 (Jan 25, 2022)

New column detection mode 'ColumnDetectionMode.ContentGroupsAI' that works better on tables without borders and on pages with multiple tables
Greatly improved table detection in 'TableDetector2'
Improved filtering of shadow-like text ('ExtractShadowLikeText' option)
Improved the 'LineGroupingMode.JoinOrphanedRows'
'DocumentMerger': Improved merging of PDF forms. Now it can link fields with matching names or rename them to avoid unwanted linking. See the property 'RenameMatchingFieldsDuringMerge'
'JSONExtractor' and 'XMLExtractor' now output the page size for each page
All extractor classes now support extraction of page ranges
Added properties 'DetectUnderlineTextStyle' and 'DetectStrikeoutTextStyle' to 'CSVExtractor' and 'XLSExtractor'. They help prevent underlined text from affecting the line grouping in table cells
Improved background color detection for the option 'ConsiderBackgroundColors'
Added property 'NormalizeText' to all extractors. It replaces Unicode spaces and hyphens in the extracted text with normal ' ' and '-' characters
'Remover2': fixed handling of PDF page rotation
'Remover2': pages are now made unsearchable only if they were edited
'XMLExtractor': Added property 'IndentedXML' to control indentation
'JSONExtractor': Added property 'IndentedJSON' to control indentation
'Stamper': fixed stamping of rotated pages
Added new OCR mode - 'OCRMode.AutoRepairFonts'. It automatically tries to detect PDF documents with corrupted text and forces OCR font repair for them. Works only for English texts
Added property 'PageSeparator' to CSV and XLS extractors
'XLSExtractor': improved detection of negative numbers
'TextExtractor.FindAll()' method was ignoring the case sensitivity option. Fixed now
Added property 'OCRDetectLines' that helps to detect table structure in scanned documents
'JSONExtractor' and 'XMLExtractor' now output the number of pages in the result and the number of pages for which OCR was performed
Added property 'OCRPageCount' to extractors. It contains the number of pages for which OCR was performed during the last extraction
'JSONExtractor': Added property 'OutputStructure' that selects the structure of the output JSON
'JSONExtractor': Added property 'OutputTransformation' that applies a JSONPath expression to the output JSON
Performance improvements
Improved parsing of PDF documents
Other minor fixes and improvements
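
The 'NormalizeText' entry in this list describes replacing Unicode spaces and hyphens with plain ' ' and '-' characters. As a rough illustration of that kind of normalization, here is a minimal Python sketch. It is not the SDK's implementation: the function name normalize_text and the set of hyphen-like characters are assumptions, since this changelog does not list the characters the SDK handles.

```python
import unicodedata

# Assumed set of hyphen-like characters to fold into '-'.
# The SDK's actual list is not given in this changelog.
HYPHENS = {
    "\u2010",  # hyphen
    "\u2011",  # non-breaking hyphen
    "\u2012",  # figure dash
    "\u2013",  # en dash
    "\u2014",  # em dash
    "\u2212",  # minus sign
}


def normalize_text(text: str) -> str:
    """Replace Unicode space separators and hyphen-like characters with ASCII."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Zs":  # any Unicode space separator
            out.append(" ")
        elif ch in HYPHENS:
            out.append("-")
        else:
            out.append(ch)
    return "".join(out)
```

Matching on the Unicode category "Zs" covers the no-break space, thin space and similar separators without listing each one, while the hyphen set stays explicit so it is easy to adjust.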

New in Bytescout PDF Extractor SDK 12.1.0 Build 4136 (May 18, 2021)

New in Bytescout PDF Extractor SDK 11.3.0 Build 3983 (Oct 26, 2020)

New in Bytescout PDF Extractor SDK 11.2.0 Build 3919 (Jun 30, 2020)

New in Bytescout PDF Extractor SDK 11.1.0 Build 3845 (Mar 19, 2020)

New in Bytescout PDF Extractor SDK 11.0.0 Build 3805 (Feb 12, 2020)

New in Bytescout PDF Extractor SDK 10.8.0 Build 3732 (Dec 4, 2019)

New in Bytescout PDF Extractor SDK 10.7.0 Build 3697 (Nov 2, 2019)

New in Bytescout PDF Extractor SDK 10.6.0 Build 3659 (Oct 1, 2019)

New in Bytescout PDF Extractor SDK 10.5.0 Build 3637 (Sep 2, 2019)

New in Bytescout PDF Extractor SDK 10.4.0 Build 3600 (Aug 7, 2019)

New in Bytescout PDF Extractor SDK 10.3.0 Build 3566 (Jul 2, 2019)

New in Bytescout PDF Extractor SDK 10.2.0 Build 3512 (May 29, 2019)

New in Bytescout PDF Extractor SDK 10.0.0 Build 3420 (Mar 22, 2019)

New in Bytescout PDF Extractor SDK 9.4.0 Build 3398 (Mar 12, 2019)

New in Bytescout PDF Extractor SDK 9.3.0 Build 3352 (Feb 4, 2019)

New in Bytescout PDF Extractor SDK 9.2.0 Build 3254 (Oct 24, 2018)

New in Bytescout PDF Extractor SDK 9.1.0 Build 3163 (Jul 19, 2018)

New in Bytescout PDF Extractor SDK 9.0.0 Build 3079 (Apr 12, 2018)

New in Bytescout PDF Extractor SDK 8.8.1.3025 (Jan 29, 2018)

New in Bytescout PDF Extractor SDK 8.8.0.3015 (Jan 23, 2018)

New in Bytescout PDF Extractor SDK 8.7.0.2980 (Nov 8, 2017)

New in Bytescout PDF Extractor SDK 8.6.0.2911 (Aug 6, 2017)

New in Bytescout PDF Extractor SDK 8.5.0.2855 (Jun 2, 2017)

New in Bytescout PDF Extractor SDK 8.4.0.2820 (Jun 2, 2017)

New in Bytescout PDF Extractor SDK 8.3.0.2792 (Jun 2, 2017)

New in Bytescout PDF Extractor SDK 8.2.0.2697 (Feb 1, 2017)

New in Bytescout PDF Extractor SDK 8.1.1.2606 (Nov 18, 2016)

New in Bytescout PDF Extractor SDK 8.1.0.2600 (Nov 18, 2016)

New in Bytescout PDF Extractor SDK 8.0.0.2523 (Nov 18, 2016)

New in Bytescout PDF Extractor SDK 7.00.0.2474 (Jul 7, 2016)

New in Bytescout PDF Extractor SDK 6.00.2071 (May 20, 2015)

PDF to XML, PDF To CSV, PDF To Text functionality improved
PDF To XLS command line sample added (based on VBScript)
PDF To HTML SDK adds new .DetectHyperLinks property (TRUE by default) to enable/disable automated links detection in the text
New SearchablePDFMaker (available for PRO licenses) to convert PDF into searchable PDF files
new properties in extractor: ConsiderFontNames, ConsiderFontSizes, ConsiderFontColors, ConsiderVerticalBorders in CFG files
header columns detection (when AutoAlighHeaderToColumns = true) improved
.DetectLinesInsteadOfParagraphs replaced with new .LineGroupingMode to control how lines are merged into paragraphs
IMPORTANT: PDF To XML fixes a long-standing issue with an incorrect Y coordinate for text objects (it pointed to the bottom left instead of the top left)
.TableXMinIntersectionRequiredInPercents and .TableYMinIntersectionRequiredInPercents properties added
C++ source code sample added
XML Extractor fixes missing empty columns in PreserveFormatting=true mode
Minor fixes in colors in some PDF files
support for multiple OCR languages added
PDF Multitool GUI: adds Copy to Clipboard button to TXT, CSV, XML and raster renderer dialogs
XLSExtractor: adds PageToWorksheet property to enable/disable generation of separate worksheets per page.
new .TextEncodingCodePage property
PDFViewerControl: adds ValidateContextMenu allowing user to add custom items to context menu
PDF Viewer control: adds properties ShowTextObjects, ShowImageObjects, ShowVectorObjects.
XMLExtractor now adds "OCRConfidence" attribute for recognized text
PDF/A checking functionality (in beta)
improving controls and text checking and alignment according to the original layout. The issue was caused by the shift of Y coordinates in controls while parsing: that was incorrect. The correct way is to shif...
XML Extractor updated: now produces tag for checkboxes and text fields
now uses the temp directory instead of the current directory.
checkboxes, radioboxes, editboxes and comboboxes are better supported
now allows partial trust callers.
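
The Y coordinate fix for PDF To XML in this list changes which corner of a text object the reported Y refers to (top left instead of bottom left). Code that stored coordinates from older output may need a similar shift. The Python sketch below shows one way to do it, under assumptions this changelog does not confirm: the TextBox record is hypothetical (not the SDK's XML schema), and Y is assumed to be measured from the top of the page and to grow downward.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    """Hypothetical text-object record; not the SDK's XML schema."""
    x: float
    y: float       # Y of the reference corner
    width: float
    height: float


def bottom_left_to_top_left(box: TextBox) -> TextBox:
    """Move the reference corner from bottom-left to top-left.

    Assumes Y is measured from the top of the page and grows downward,
    so the top edge lies `height` units above the bottom edge.
    """
    return TextBox(box.x, box.y - box.height, box.width, box.height)
```

If Y grew upward instead, the height would be added rather than subtracted, so the direction of the axis should be checked against real output before applying such a shift.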

New in Bytescout PDF Extractor SDK 5.80.1781 (Jan 29, 2015)

New in Bytescout PDF Extractor SDK 4.00.1487 (Jun 3, 2014)

New in Bytescout PDF Extractor SDK 3.30.1240 (Dec 11, 2013)

New in Bytescout PDF Extractor SDK 3.20.1209 (Dec 11, 2013)

New in Bytescout PDF Extractor SDK 3.20.1200 (Dec 11, 2013)

New in Bytescout PDF Extractor SDK 3.20.1179 (Oct 23, 2013)

New in Bytescout PDF Extractor SDK 3.20.1100 (Oct 23, 2013)

New in Bytescout PDF Extractor SDK 3.20.1092 (Aug 14, 2013)

New in Bytescout PDF Extractor SDK 3.20.1075 (Jul 15, 2013)

New in Bytescout PDF Extractor SDK 3.10.899 (May 17, 2013)

New in Bytescout PDF Extractor SDK 3.00.864 (Apr 13, 2013)

New in Bytescout PDF Extractor SDK 3.00.825 (Mar 14, 2013)

New in Bytescout PDF Extractor SDK 2.50.708 (Dec 13, 2012)

New in Bytescout PDF Extractor SDK 2.40.650 (Nov 2, 2012)

New in Bytescout PDF Extractor SDK 2.30.594 (Sep 24, 2012)

New in Bytescout PDF Extractor SDK 2.30.568 (Sep 24, 2012)

New in Bytescout PDF Extractor SDK 2.20.539 (May 23, 2012)

New in Bytescout PDF Extractor SDK 2.20.525 (May 23, 2012)

New in Bytescout PDF Extractor SDK 2.20.458 (Feb 9, 2012)

New in Bytescout PDF Extractor SDK 2.20.415 (Dec 29, 2011)

New in Bytescout PDF Extractor SDK 2.20.396 (Dec 12, 2011)

New in Bytescout PDF Extractor SDK 2.20.392 (Dec 12, 2011)

New in Bytescout PDF Extractor SDK 2.10.303 (Dec 12, 2011)

New in Bytescout PDF Extractor SDK 2.10.276 (Dec 12, 2011)

New in Bytescout PDF Extractor SDK 2.00.228 (Dec 12, 2011)

New in Bytescout PDF Extractor SDK 2.00.217 (Dec 12, 2011)

New in Bytescout PDF Extractor SDK 2.00.206 (Jun 8, 2011)

New in Bytescout PDF Extractor SDK 2.00.186 (Jun 8, 2011)

New in Bytescout PDF Extractor SDK 1.10.168 (Jun 8, 2011)

New in Bytescout PDF Extractor SDK 1.10.160 (Jun 8, 2011)

New in Bytescout PDF Extractor SDK 1.10.150 (Jun 8, 2011)

New in Bytescout PDF Extractor SDK 1.10.144 (Jun 8, 2011)

New in Bytescout PDF Extractor SDK 1.10.121 (Jun 8, 2011)