PDF Data Extractor Changelog

New in PDF Data Extractor 3.04 (Jan 25, 2023)

  • See what's new in help

New in PDF Data Extractor 3.03 (Sep 9, 2022)

  • updated custom pattern match for 'a' alpha to check a-z, A-Z
  • fixed a settings issue.
  • updated custom match for use in multi output methods.
  • added custom part match, e.g. a match on nn-nnn matches 12-345 and also takes anything that follows in the same word, e.g. 12-345E.1
  • changed the output limit for multi output from 200 to 800.
  • added XX-XXX to pattern match for alpha or numeric.
  • added column adjustment for (multi) and line feed with extra fields used.
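
The pattern entries above can be illustrated with a short Python sketch. It is not the product's implementation: the token meanings (a = letter, n = digit, X = letter or digit) are read off the changelog wording, and the function names and the whole-word rule for an exact match are assumptions made for the example.

```python
import re

# Assumed token meanings, inferred from the changelog entries:
#   a -> one letter (a-z, A-Z), n -> one digit, X -> one letter or digit.
TOKEN_CLASSES = {"a": "[A-Za-z]", "n": "[0-9]", "X": "[A-Za-z0-9]"}


def pattern_to_regex(pattern, part_match=False):
    """Translate a pattern such as 'nn-nnn' into a compiled regex (hypothetical helper)."""
    body = "".join(TOKEN_CLASSES.get(ch, re.escape(ch)) for ch in pattern)
    # A part match also takes whatever follows in the same word;
    # an exact match must end where the word ends (assumption).
    tail = r"\S*" if part_match else r"(?!\S)"
    return re.compile(r"(?<!\S)" + body + tail)


def find_matches(pattern, text, part_match=False):
    """Return every word in text that matches the pattern."""
    regex = pattern_to_regex(pattern, part_match)
    return [m.group(0) for m in regex.finditer(text)]


sample = "ref 12-345 and 12-345E.1"
exact = find_matches("nn-nnn", sample)
partial = find_matches("nn-nnn", sample, part_match=True)
```

With part_match=True the second word is returned whole, as 12-345E.1, which mirrors the behaviour described for the custom part match.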

New in PDF Data Extractor 3.02 (Nov 22, 2021)

  • Fix for log move viewer title.
  • Fix for permissions issues and monitor setup.
  • Fix for permission log issues.
  • Fixed an issue with batch process list from menu.
  • Fixed process button status when loading new setup.
  • Fixed an issue where a read-only file was not closed when permissions redirected output to the desktop.
  • Added clear all option to batch list menu.
  • Fix close buttons in log viewers when scaled.
  • Added F1 help to monitor.
  • Fix for when adding multiple files to batch.
  • Fix for enterprise monitor - report to one option.

New in PDF Data Extractor 3.01 (Oct 8, 2021)

  • fix for until text match, which now supports both case-sensitive and case-insensitive matching.
  • added new output option: if last exact data match, output everything after it until the end match (across all pages). For this to work, rules beyond the rule page are ignored, and both a start and an end match are required for any output.
  • added drag and drop of .pde files onto the source file(s) box to load them automatically.
  • added to extras: until text match can now match on multiple words separated by a pipe delimiter, e.g. Account|Statement
  • changed text limit per column to 4096

New in PDF Data Extractor 3.00 (Sep 24, 2021)

  • added smart setup highlight in Adobe for a quick and easy setup
  • added email search match output eg test@123.com
  • added Upper, Lower & Smart Upper-Lower to extras
  • added telephone number output: detects numbers within a text eg tel: +nn nnn nnnnn or tel: +n (nnn) nnnnnn, with 20+ different permutations detected
  • added two pattern matching options eg if the input data is like Account: AA12345-6789, you can take AA12345-6789 with pattern match anywhere: aannnnn-nnnn
  • improved offset match: it can now do a full text match, a first part match, or use [a], [n] or [an] for the first alpha match, the first numeric match, or either
  • added stop to process button for stopping at any point during the processing
  • improved File Save, to show saved when not default file
  • added [COL(n)] or [F(n)] to file naming, for using the extracted column data in the filename eg c:\report[F1].csv or multiple eg c:\report[F1]-[F2][F3].csv
  • added lookup file match to extras, so comma separated data can now be substituted with other data eg the lookup line 123,Canon EOS 500 matches the code '123' and outputs 'Canon EOS 500'
  • added font name and font size to list
  • added font name / size matching, so you can now extract eg any bold font match in an area, or match on font name and/or size
  • added yellow color in list to extract data for easier viewing
  • added smart setup color from PDF to list eg if a data highlight note is set as red in the Adobe PDF, then this color is used in the list
  • added quick header in header setup, tries to guess data titles on data & setup with one button click
  • added quick example loading from file menu
  • added presets for popular requests and common bills / invoices / bank statements eg BT, EE, MBNA, Barclays, Virgin Money etc.
  • added xls and xlsx output support
  • added xls / xlsx multi-sheet output support, with up to 5 sheets
  • added output (multi) (join + add space) for easier repeated lines of multiple words setup
  • fix for Save As issue where clicking on a file and then changing it would not save; also pre-populates the save-to filename with the one loaded
  • fixed issue with view PDF not picking up the change when the location of the PDF was changed / typed
  • added FDF output form file support eg output as FDF to import into another PDF template or batch print
  • added XML output support
  • added start at page option, so you can now process only the last page eg by entering 999 as the start, or skip a page by entering eg 2; 0 starts at the beginning
  • fixed issues with multiple rules per page for different outputs
  • added highlight rule exact match, eg IFPOSMATCH: 1 for if rule "h,v exact match data: 1" rule on that page h,v analysis
  • added IFPOSHMATCH: n and IFPOSVMATCH: n to highlight setup rules
  • added AFTER2 <data> <data> in highlight setup for after two words of data (note: AFTER <data> is for after the last word)
  • added Excel sums etc. to output header eg Vat=SUM(E[LINE]/100*20) outputs the header Vat, then =SUM(E2/100*20) for line 2 in the output
  • fixed sort not working
  • added light red color to rule matches for easy viewing
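
Two of the 3.00 entries describe simple token substitution: [F(n)] / [COL(n)] in file names and [LINE] in header formulas. The Python sketch below shows one way such substitution could work. The helper names are hypothetical, column numbers are assumed to be 1-based, and nothing here is taken from the product's own code.

```python
import re


def expand_filename(template, row):
    """Replace [F1], [COL2], ... with the matching 1-based column value of row."""
    def substitute(match):
        index = int(match.group(1)) - 1
        # Leave the token untouched when the column does not exist.
        return row[index] if 0 <= index < len(row) else match.group(0)

    return re.sub(r"\[(?:F|COL)(\d+)\]", substitute, template)


def expand_header_formula(header, line_number):
    """Split 'Vat=SUM(E[LINE]/100*20)' into a title and the formula for one line."""
    title, sep, formula = header.partition("=")
    if not sep:
        return title, None  # plain header with no formula
    return title, "=" + formula.replace("[LINE]", str(line_number))


name = expand_filename("report[F1]-[F2][F3].csv", ["INV001", "2021", "A"])
title, cell = expand_header_formula("Vat=SUM(E[LINE]/100*20)", 2)
```

For line 2 the formula becomes =SUM(E2/100*20), matching the example in the changelog entry. A real implementation would also need to strip characters that are not valid in file names, which this sketch leaves out.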

New in PDF Data Extractor 2.02 (Feb 19, 2021)

  • Fix for scan OCR issue.
  • Fix for output all on one line and batch, now adds line feed after each file automatically.
  • Some improvements to one line processing per file and positions.
  • Fix for potential line feed problem.

New in PDF Data Extractor 2.01 (Nov 25, 2020)

  • Added if last exact data match then output h,v range joined with space.
  • Added h>=(n) && h<=(n),v>=(n) && v<=(n) match output (last one) (join + add space)
  • Added take until text option in filter, so extraction can stop at certain text, e.g. from 'test 123' take only 'test'.
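
The h,v range output can be pictured as filtering positioned words and joining them with a space. The Python sketch below assumes each extracted word carries a horizontal (h) and vertical (v) position; the Word type, the units and the function name are hypothetical, and the "(last one)" selection mentioned in the entry is not modelled.

```python
from typing import NamedTuple


class Word(NamedTuple):
    text: str
    h: float  # horizontal position (assumed units)
    v: float  # vertical position (assumed units)


def join_in_range(words, h_min, h_max, v_min, v_max):
    """Join, in input order, the words whose h and v fall inside the inclusive range."""
    selected = [
        w.text
        for w in words
        if h_min <= w.h <= h_max and v_min <= w.v <= v_max
    ]
    return " ".join(selected)


words = [Word("Total", 100, 700), Word("due", 140, 700), Word("Page", 500, 20)]
joined = join_in_range(words, 90, 200, 690, 710)
```

Only the two words inside the range are joined, so joined is "Total due".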

New in PDF Data Extractor 2.00 (Apr 10, 2020)

  • changed c:\default.lst to default.lst
  • when no text on analyze show link to video: https://youtu.be/aNeNPp5EfVg
  • added option to add multiple PDF files to the batch list.
  • added pre OCR first option.
  • fixed no output: when nothing matched, the line was not being output to the CSV.
  • fixed number constraints for negative values in input boxes; the range is now -9999 to 9999
  • added alpha/numeric option filter.
  • fixed a money value figure not working correctly.
  • fix for delete key on multiple conditions.
  • verification fix for money filter: a value above 999 should contain , or . otherwise it is probably wrong data, so it is blanked.
  • fix for , in money value only.
  • added clear all settings option in setup.
  • fix - when drag and drop file, it clears output path.
  • fix for date formats when . in date e.g. 20.12.19 instead of slash /.
  • added 2000 to date output if below 1900 e.g. 20.12.19 to 20/12/2019
  • fix for joined dates, e.g. 20.12.1 with the 9 arriving later, now outputs correctly.
  • now checks for a bad date of more than 10 characters and blanks it, so the correct one can be picked up later.
  • added numeric filter fixing option e.g. O to 0, l to 1, S to 5, B to 8 etc. If you need any others, please let us know.
  • now when outputting in spreadsheet mode, i.e. multi rows, you need to use the (multi) output option.
  • changed so a line feed can be added after other output extras options, e.g. filter number fixing.
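
The numeric fixing and date entries above can be sketched in a few lines of Python. The character map only covers the four substitutions the changelog names, and the helper names, the accepted date shape and the choice to return an empty string for unusable input are assumptions for illustration.

```python
import re

# The four substitutions named in the changelog; the real filter may cover more.
OCR_DIGIT_FIXES = str.maketrans({"O": "0", "l": "1", "S": "5", "B": "8"})


def fix_numeric(text):
    """Replace letters commonly misread for digits."""
    return text.translate(OCR_DIGIT_FIXES)


def normalize_date(text):
    """Turn 20.12.19 into 20/12/2019; blank anything that does not look like a date."""
    text = text.strip()
    if len(text) > 10:
        return ""  # more than 10 characters is treated as a bad date
    match = re.fullmatch(r"(\d{1,2})[./](\d{1,2})[./](\d{1,4})", text)
    if match is None:
        return ""
    first, second, year = match.group(1), match.group(2), int(match.group(3))
    if year < 1900:
        year += 2000  # e.g. 19 becomes 2019
    return f"{first}/{second}/{year}"


amount = fix_numeric("l2O.S8")
date = normalize_date("20.12.19")
```

The order of the first two date fields is kept exactly as it appears in the input, so the sketch makes no assumption about day-first or month-first formats.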

New in PDF Data Extractor 1.06 (Nov 26, 2019)

  • Fix with loading rules limit when more than 20.

New in PDF Data Extractor 1.05 (Feb 23, 2017)

  • Added Sortation
  • Added Customize link email

New in PDF Data Extractor 1.04 (Jul 18, 2012)

  • added row adjust for floating row sizes, match on h text
  • changed text dump to include all fonts, embedded and subset fonts

New in PDF Data Extractor 1.03 (Jul 18, 2012)

  • fix for dialog resize
  • fixed some joining (join) parameter issues.
  • added column option
  • added rows options and size of rows on same page

New in PDF Data Extractor 1.02 (Jul 18, 2012)

  • added find box and button
  • if last-1 exact data match option to output
  • if last-2 exact data match option to output
  • if last-3 exact data match option to output
  • if last-3 && if last-3 exact data match option to output
  • added rename / copy script options, for batch renaming files on data extraction
  • added batch rename to dos options -b option
  • added page stop option

New in PDF Data Extractor 1.01 (Jul 18, 2012)

  • fix for command line rules missing