docx2txt Changelog

New in docx2txt 1.4 (May 16, 2014)

  • New feature:
  • Added configuration variable config_unzip_opts. This removes the dependency on the unzip program and allows other unzipping programs such as 7z, pkzipc, or winzip to be used as well.
  • Updates:
  • Fixed list numbering.
  • Improved list/paragraph indentation and corresponding code.
  • Updated README with brief guidance on how this utility can be used to recover text from a corrupted docx file.
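
As an illustration of what a configurable unzipper implies, the following Python sketch assembles an extraction command from a program name and an option string. It is not the utility's own code: the function name build_extract_command, the archive member word/document.xml as the default target, and the option strings "-p" and "x -so" are assumptions chosen for the example and should be checked against each tool's documentation.

```python
import shlex


def build_extract_command(unzip_program, unzip_opts, docx_path,
                          member="word/document.xml"):
    """Return an argument list that asks the unzipper to stream one member.

    unzip_program and unzip_opts stand in for a configured program and its
    options (hypothetical parameter names for this sketch).
    """
    # Split the option string shell-style so "x -so" becomes two arguments.
    return [unzip_program, *shlex.split(unzip_opts), docx_path, member]


if __name__ == "__main__":
    # Assumed option strings: "-p" for unzip, "x -so" for 7z.
    print(build_extract_command("unzip", "-p", "report.docx"))
    print(build_extract_command("7z", "x -so", "report.docx"))
```

Keeping the program and its options as separate values means a program path containing spaces never has to be parsed; only the option string is split.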

New in docx2txt 1.3 (Apr 8, 2014)

  • New features:
  • Added support for handling lists (bullet, decimal, letter, roman) along with (attempt at) indentation.
  • Updates:
  • Added configuration variable config_twipsPerChar.
  • Removed configuration variables: config_listIndent, config_exp_extra_deEscape.
  • Text output omits deleted text. This matters when changes are being tracked in the docx document.
  • Text output omits non-document_text content marked by wp/wp14 tags.
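
A twips-per-character setting suggests that indentation measured in twips (1/20 of a point, 1440 per inch) is mapped to a number of text columns. The Python sketch below shows one way such a mapping could work. The default of 120 twips per character, the use of floor division, and the function names are assumptions made for this example, not the utility's documented behavior.

```python
TWIPS_PER_INCH = 1440  # 1 twip = 1/20 point, 72 points per inch


def indent_columns(indent_twips, twips_per_char=120):
    """Convert an indent in twips to a whole number of text columns."""
    if twips_per_char <= 0:
        raise ValueError("twips_per_char must be positive")
    # Floor division is an assumption; negative indents are clamped to zero.
    return max(0, indent_twips // twips_per_char)


def indent_line(text, indent_twips, twips_per_char=120):
    """Prefix text with spaces matching the computed column count."""
    return " " * indent_columns(indent_twips, twips_per_char) + text


if __name__ == "__main__":
    # A half-inch indent (720 twips) under the assumed 120 twips per character.
    print(repr(indent_line("1. First item", 720)))
```

With the assumed value of 120, one inch of indentation corresponds to 12 columns; a larger value produces shallower indentation in the text output.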

New in docx2txt 1.2 (Jan 16, 2012)

  • New features:
  • Perl script usage is extended to accept a docx file from standard input. It also works with input/output redirection now. Please refer to the documentation for more information.
  • Script files and configuration file can be installed in separate directories on non-Windows systems using the Makefile for installation.
  • User-specific and system-wide configuration files can be maintained separately, even on Windows.
  • Updates:
  • "-h" has to be given as the first argument to Perl script to get usage help.
  • Added new configuration variable "config_tempDir".
  • Configuration file is uniformly looked for, in this order, in the current directory, the user configuration directory (APPDATA on Windows, HOME on non-Windows systems), and the system configuration directory (same location as script files on Windows; /etc or as set during installation on non-Windows systems).
  • Documentation has been updated with usage examples and information on how the text content of a .docx file can be viewed directly in the Vim and Emacs editors.
  • Improved handling of special (non-text) characters, along with support for more non-text characters like fractions.
  • Fixed Bug #3463033: added ' and " to docx specific escape character conversions.
  • Fixed incorrect code that had been committed with the earlier nullDevice fix for Cygwin.
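
The configuration file lookup order described in the updates above (current directory, then user directory, then system directory) can be sketched in Python as follows. The function names, the is_windows flag, and the file name docx2txt.config are assumptions for illustration; the default system directory of /etc reflects the note above and may differ if another location was set during installation.

```python
import os


def config_search_dirs(is_windows, env, script_dir, system_dir="/etc"):
    """Return the directories to check for the configuration file, in order."""
    dirs = [os.curdir]  # 1. current directory
    # 2. user configuration directory: APPDATA on Windows, HOME elsewhere
    user_dir = env.get("APPDATA" if is_windows else "HOME")
    if user_dir:
        dirs.append(user_dir)
    # 3. system configuration directory: the script location on Windows,
    #    otherwise /etc or whatever was chosen at installation time
    dirs.append(script_dir if is_windows else system_dir)
    return dirs


def find_config(filename, dirs, exists=os.path.isfile):
    """Return the first existing candidate path, or None if none is found."""
    for directory in dirs:
        candidate = os.path.join(directory, filename)
        if exists(candidate):
            return candidate
    return None


if __name__ == "__main__":
    search = config_search_dirs(os.name == "nt", os.environ,
                                os.path.dirname(os.path.abspath(__file__)))
    # "docx2txt.config" is an assumed file name for this sketch.
    print(find_config("docx2txt.config", search))
```

Because the first match wins, a configuration file in the current directory overrides the user-level file, which in turn overrides the system-wide one.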

New in docx2txt 1.1 (Dec 13, 2011)

  • New features:
  • Added a check for existence of unzip command.
  • Configuration file is looked for in HOME directory as well.
  • Updates:
  • Configuration variables now begin with config_.
  • Fixed bugs #3003903, #3082018 and #3082035.
  • Fixed nullDevice for Cygwin.
  • Superscripted cross-references are placed within [...] now.
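
A check for the existence of the unzip command, as mentioned in the new features above, could look roughly like the Python sketch below. It relies on the standard library's shutil.which and is only an illustration of the idea; the function name unzip_available is hypothetical and the utility's own check may work differently.

```python
import shutil


def unzip_available(program="unzip"):
    """Return True if the named program can be located as an executable."""
    # shutil.which searches PATH and returns None when nothing is found.
    return shutil.which(program) is not None


if __name__ == "__main__":
    if not unzip_available():
        print("unzip was not found; install it or configure another unzipper")
```

Failing early with a clear message is usually friendlier than letting the extraction step fail later with an opaque error.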