dtSearch Changelog

What's new in dtSearch 2023.02 Beta

Oct 25, 2023
  • Updated RAR file parser to the current version of the Rarlab source (6.2.10 released August 1, 2023). dtSearch uses source code from Rarlab to implement content extraction from RAR archives.
  • Note: Rarlab reports that the updated source fixes two security vulnerabilities. Based on the information available about these vulnerabilities, we do not believe they affect any dtSearch product.
  • (1) CVE-2023-40477 (out of bounds write) affects RAR recovery volumes. RAR recovery volumes have a .rev extension and a different binary header from RAR archives, so dtSearch will not invoke the RAR file processing code if it encounters a RAR4 recovery volume. Additionally, the code to process recovery volumes is disabled in the RAR extraction code that dtSearch uses. This Rarlab article states that unrar.dll is not affected by the vulnerability, and the unrar.dll source code is what dtSearch uses in its RAR file parser.
  • (2) CVE-2023-38831 (launching of an incorrect file) is associated with the WinRAR user interface and does not affect dtSearch.
  • dtSearch Desktop Index Manager index properties window now displays the open error message instead of blank index properties when an index cannot be opened
  • Fixed search performance bug that could cause reduced search speed during phrase searches involving extremely large documents
  • Fixed incorrect handling of the Unicode soft hyphen character (U+00AD), which should be ignored during searches because it is not indexed.
  • Fixed bug that could cause duplicate field name errors when documents include field names with certain CJK diacritical characters. Verifying an index will indicate if an index is affected. Affected indexes should be rebuilt.
  • Text log files history.ix, dtSearchIndexingHistory.log, and indexlog.dat now format dates YYYY/MM/DD instead of MM/DD/YYYY
  • File parser bug fixes affecting: DOCX, PDF
  • In dtSearch Desktop the Options > Create group policy dialog creates an MSI file that places the registry keys under HKMU instead of HKCU or HKLM, so you can control at installation time whether the installation is per-user or per-machine. To install per-machine, specify ALLUSERS=2 and run the .msi with administrator permissions.
  • Fixed time zone bug in the Linux indexer causing documents to be reindexed unnecessarily during an incremental index update.
  • Added option in dtSearch Desktop in Options > Preferences > Indexing Resources to "Ask Windows to keep computer awake during indexing" (to prevent automatic sleep from blocking scheduled updates).
  • Other bug fixes
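Scripts that read dates out of history.ix, dtSearchIndexingHistory.log, or indexlog.dat may need to accept both date formats after this change. The following is a minimal sketch in Python, not dtSearch code: it handles only a date token that has already been extracted, since the layout of the surrounding log line is not described here, and the function names are hypothetical.

```python
from datetime import datetime

OLD_FORMAT = "%m/%d/%Y"  # MM/DD/YYYY, written by earlier versions
NEW_FORMAT = "%Y/%m/%d"  # YYYY/MM/DD, written from this version on


def parse_log_date(text):
    """Parse a date token written in either the new or the old format."""
    for fmt in (NEW_FORMAT, OLD_FORMAT):
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # try the next format
    raise ValueError(f"unrecognized log date: {text!r}")


def to_new_format(text):
    """Normalize a date token to YYYY/MM/DD."""
    return parse_log_date(text).strftime(NEW_FORMAT)
```

Trying the new format first keeps the common case fast; a four-digit year at the start of the token cannot be mistaken for a month, so the two formats do not collide.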

New in dtSearch 2022.02 (Build 8775) (Dec 2, 2022)

  • dtSearch Desktop/Network:
  • Added built-in stemming rules and noise word lists for over 25 European languages.
  • Added option in Search dialog box to select language to use for stemming.
  • Added option in Options > Preferences > Letters and Words to select language to use for noise words.
  • Added option in Options > Preferences > Indexing Resources to control number of threads used for multi-threaded indexing
  • dtSearch Engine:
  • WebDemo sample ASP.NET Core application updated to require .NET Core 6 and current versions of Bootstrap and jQuery
  • Stemming rules and noise word lists for over 25 European languages are included in the dtSearch Data folder.
  • Fixes/Minor Enhancements:
  • Fixed bug in beta/preview multithreaded indexer that could cause a corrupt index to be generated if the underscore character is changed to a word break character in the alphabet settings.
  • Fixed dtSearch Desktop search dialog box bug causing search dialog box to close when a search finds no documents.
  • Fixed dtSearch Desktop search dialog box bug causing "More search options" dialog box to clear all criteria from previous search between searches.
  • Fixed dtSearch Desktop search results display bug causing modification times to incorrectly reflect daylight saving time (earlier or later by one hour)
  • Improved View > Document Information display, shown as a dockable sidebar instead of a pop-up and with better formatting of information.
  • File parser bug fixes affecting: DOCX, WordStar, PDF
  • Exceptions thrown from .NET 4.x DataSource API DocStream objects previously would fail the associated I/O operation after a retry; now any detected exceptions will immediately fail the document and report it through IIndexStatusHandler and WasDocError.
  • Other bug fixes

New in dtSearch 2021.02 (Build 8730) (Dec 7, 2021)

  • Index Groups:
  • Index groups provide a way to organize your indexes in the Search dialog box to make very large numbers of indexes easier to manage. To enable this option, click Options > Preferences > Search Options, and check the box to "Show indexes by group". When groups are enabled, if an index name contains a colon, then the part before the colon is considered to be the group. For example, if an index is named "Business: Records", then the group is "Business". In the Search dialog box, "Records" would appear under a collapsible "Business" group heading.
  • "Find indexes":
  • The new "find indexes" control in the Search dialog box lets you filter the list of indexes by name and quickly select indexes to search using the keyboard.
  • To enable it, click Options > Preferences > Search Options, and check the box to "Show index finder".
  • To use "find indexes", in the Search dialog box press Ctrl+F and type any part of an index name. As you type, the index list will update to show only matching index names. You can also use the * and ? wildcard characters to find indexes using wildcard matches.
  • Press Ctrl+Q to quickly check only the first listed index and return to the search request box, or press ENTER to just return to the search request box.
  • Better handling of font scaling on systems with multiple monitors.
  • Added sort direction arrows to the search results column headers.
  • Added Ctrl+Q hotkey for checking/unchecking checkboxes to select items in search results.
  • Added support for using 64-bit Adobe Acrobat/Adobe Reader to display PDF files. The 64-bit dtSearch PDF Search Highlighter plug-in is also needed for hit-highlighting to work.
  • The dtSearch Desktop/Network search shortcut now launches the dtSearch Desktop/Network version that corresponds to the Adobe Reader/Acrobat version installed, so if you have the 64-bit version of Adobe Acrobat or Adobe Reader, the shortcut will automatically launch the 64-bit version of dtSearch Desktop/Network.
  • Added option in Options > Preferences > Indexing Resources to "Store a copy of indexed Outlook items in the index". Without this option, you need to run the 32-bit version of dtSearch Desktop/Network if you are using the 32-bit version of Outlook, and the 64-bit version of dtSearch Desktop/Network if you are using the 64-bit version of Outlook. Checking this option eliminates that requirement, because dtSearch can display retrieved Outlook items by extracting them from the index.
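The index group naming rule and the "find indexes" filter described above can be pictured with a short Python sketch. This is an illustration only, not dtSearch's implementation: the function names are hypothetical, and splitting at the first colon, trimming spaces, and matching case-insensitively are all assumptions.

```python
from fnmatch import fnmatchcase


def split_group(index_name):
    """Return (group, display_name); group is None when there is no colon.

    Assumption: the name is split at the first colon and spaces are trimmed.
    """
    group, sep, rest = index_name.partition(":")
    if not sep:
        return None, index_name
    return group.strip(), rest.strip()


def find_indexes(index_names, typed):
    """Keep names containing the typed text; * and ? act as wildcards.

    Assumption: matching ignores case. Note that fnmatchcase also gives
    special meaning to [ ], which the feature described above may not.
    """
    pattern = "*" + typed.lower() + "*"
    return [name for name in index_names if fnmatchcase(name.lower(), pattern)]
```

With these definitions, `split_group("Business: Records")` yields the group "Business" and the display name "Records", matching the example in the notes above.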

New in dtSearch 7.96.8668 (Jun 10, 2020)

  • The 64-bit version of the dtSearch Engine now uses the Microsoft Universal CRT. The CRT is included with Windows 10 and is automatically installed by Windows Update on most other versions of Windows. It can be installed as described in this Microsoft article: The latest supported Visual C++ downloads.
  • The 64-bit version of the dtSearch Engine supports Windows Vista and later, so Windows XP is no longer supported.
  • The dtSearch PDF Search Highlighter plug-in has been updated for compatibility with the 2020 versions of Adobe Reader and Adobe Acrobat. The new highlighter can be downloaded here.
  • Fixed bug in .NET Standard DataSource API affecting modification and creation dates of documents and the DocDisplayName property.
  • Fixed hit highlighting error affecting EML with embedded IMG tag.
  • Fixed bug affecting the "not w/n" operator, causing it to retrieve a document that should not have been retrieved.
  • File parser bug fixes affecting: XLS, DOC, DOCX, PDF
  • Other bug fixes

New in dtSearch 7.95.8633 (Jan 30, 2020)

  • Fixed: in the .NET and .NET Core API, IndexInfo.UserFields did not delimit fields and values clearly. A quote-delimited string is used now, with double quote characters in field values converted to single quote characters to prevent ambiguity.
  • Fixed: Crash when the dtsConvertAutoUpdateSearch flag was used with dtsSearchWantHitsByWord in a field search.
  • Fixed: bug affecting fuzzy search in combination with wildcard near front of word.
  • Fixed: multicolor hit highlighting did not work with a search request that included a regular expression with the ( character as part of the regular expression pattern.
  • Fixed: Encoding error processing long UTF-8 text file (in version 7.94 only)
  • File parser bug fixes affecting: PDF, XLS, Unicode Filtering, MSG, CSV
  • Other bug fixes

New in dtSearch 7.84.8404 (Aug 26, 2016)

  • Fixes and minor enhancements:
  • dtSearch Desktop indexer now reports image-only PDF files in the index update log. (Metadata in these files is still indexed. This change only affects reporting in the index update log.)
  • Improved 32-bit indexing performance in low-memory conditions.
  • Fixed indexing crash in mso20win32client.dll in the latest update to Office 2016 when indexing Outlook messages.
  • Added API to change the score of a document in the dtsnSearchFound notification (C++) or ISearchStatusHandler.OnFound callback (.NET)
  • File parser bug fixes affecting: .pdf, .rtf, .emf, .doc, .xls, .ppt, .pst
  • Other bug fixes.

New in dtSearch 7.83.8353 (Aug 26, 2016)

  • Fixes and minor enhancements:
  • Improved formatting of WordPerfect documents.
  • Fixed file parsing bug causing extra line breaks to appear between paragraphs in .msg files.
  • File parser bug fixes affecting: .pdf, .msg, .xlsx, .jtd, .one
  • Other bug fixes.

New in dtSearch 7.82.8339 (Feb 10, 2016)

  • Fixes and minor enhancements:
  • All executables are code signed using SHA-2 in addition to SHA-1 (dual signed). All MSI files are signed using SHA-2 only, because MSI files do not support dual signing. Please see this Microsoft article for more information on SHA-1 deprecation.
  • Added option in dtSearch Desktop's Edit > Copy File function to automatically shorten excessively long filenames.
  • Added new C# sample, AjaxWordListBuilder, demonstrating how to use the dtSearch Engine's WordListBuilder object on a web page.
  • Added ixStepCommittingUpdate and ixStepRemovingDeletedFiles to the IndexingStep enumeration to separately identify these steps during an index update.
  • Added file parser support for a OneNote file format variant created by certain Microsoft online services.
  • Added experimental option in dtSearch Desktop to use the standard dtSearch Desktop "Next Hit" toolbar button to navigate hits in PDF files displayed in Adobe Reader (otherwise only the Ctrl+Shift+Space hotkey can be used). This option is in Options > Preferences > PDF View Options.
  • File parser bug fixes affecting: .pdf, .xlsx, .xlsb
  • Other bug fixes.

New in dtSearch 7.81.8281 (Feb 10, 2016)

  • Fixes and minor enhancements:
  • File parser bug fixes affecting: .doc, .pdf, .rar, .docx, .msg, .one, .pages, .qpw, .ppt
  • Fixed bug preventing "view as report" in dtSearch Desktop from working with PDF files opened in Adobe Reader
  • Fixed error reporting bug causing "Unable to access index %2" error message (without the index path) in dtSearch Desktop when an index could not be accessed to search.
  • Tested and compatible with Windows 10.
  • Added support for highlighting hits in Adobe Reader DC. An updated version of the dtSearch PDF Search Highlighter plug-in is also needed for Adobe Reader DC. Please see http://download.dtsearch.com/pdfhl to obtain the plug-in.
  • Other bug fixes.

New in dtSearch 7.80.8253 (Feb 10, 2016)

  • Added support for indexing PDF files with 128-bit RC4, 128-bit AES, and 256-bit AES encryption, as long as the file does not require a password to open and does not have the "copy text" permission disabled. Developer note: This is implemented in a new component, dtv_pdfCrypto.dll, that is subject to export restrictions. Please see the license.htm file accompanying this version for more information.
  • Fixes and minor enhancements:
  • Fixed dtSearch Desktop bug causing some PDF files to be opened in a separate Adobe Reader window when file is located on a network share.
  • Fixed extra "PBrush" and "Adobe Photoshop Image" captions in some Word documents with embedded images.
  • In the Linux version, the dtSearch Engine library (.so) files are installed in the dtsearch/bin and dtsearch/bin64 folders instead of the lib and lib64 folders.
  • Added RAR file parser (dtv_rar.so) to the Linux version of the dtSearch Engine
  • Fixed incorrect parsing of some .docx, .xlsx, and .pptx documents when document has missing or incorrect filename extension.
  • Other file parser bug fixes affecting: .mdb, .pdf, WordPerfect 4.2, WordStar, KeyNote, .tar
  • Other bug fixes.

New in dtSearch 7.79.8235 (Mar 18, 2015)

  • All products:
  • Added support for indexing Apple iWork 2009 Pages, Numbers, and Keynote files
  • Fixes and minor enhancements:
  • Fixed bug affecting cancellation of file conversion after either expiration of FileConverter.TimeoutSeconds or when OutputStringMaxSize exceeded when processing large binary input files with the dtsConvertInlineContainer flag.
  • File parser bug fixes affecting: *.xlsx, .pdf, .doc, .msg, .rtf, .wps
  • Added FileConverter.SetIndexCache() API to specify an IndexCache to be used with file conversion.
  • Updated March 10, 2015 to fix incorrect build number in version resource in build 8234.
  • Other bug fixes.

New in dtSearch 7.78.8215 (Nov 5, 2014)

  • Fixes and minor enhancements:
  • Fixed incorrectly rounded display of numeric value in Excel with conditional formatting
  • Fixed dtSearch Web search form bug causing "undefined" to appear in Filename field in Internet Explorer 8
  • Added IndexInfo.TotalDataSize to COM and .NET APIs
  • File parser bug fixes affecting: .docx (incorrect display of paragraph style; error handling non-breaking hyphens), .zip (hang indexing file deleted by antivirus software during indexing), .html
  • Other bug fixes.

New in dtSearch 7.77.8205 (Sep 3, 2014)

  • Fixed security issue reported in a third-party component, imgman32.dll, used in the dtimage.exe utility. See http://support.dtsearch.com/faq/dts0235.htm for more information.
  • Added support for indexing Outlook 2013 and 2010 OST files. Note: Microsoft has not officially documented the OST file format specification, so this support is based on unofficial non-Microsoft information about the OST file format.
  • Added support for indexing metadata in Adobe Photoshop images
  • Fixed bug causing "~dtpdf.tmp" filenames to appear in tabs in dtSearch Publish
  • Fixed PDF hit highlighting error in dtSearch Publish on systems with Adobe Reader versions 7-9 and Internet Explorer 10
  • Added "Images" field at the end of MIME messages listing names of inline image files.
  • Fixed incorrect handling of filename-only indexing option causing "Unsupported file format" errors
  • In the API, the flag dtsIndexCreateVersion6 is now ignored, so indexes will always be created in the current index format.
  • Fixed high-DPI scaling error in dtSearch Desktop causing checkbox lists to be drawn incorrectly
  • Fixed bug causing filename-only indexing option to instead report all files as inaccessible.
  • File parser bug fixes affecting: .xls, .doc, .msg
  • Other bug fixes.

New in dtSearch 7.76.8193 (May 21, 2014)

  • dtSearch Engine:
  • Added Options.TempFileDir to control location of temporary files created during file parsing.
  • Added dtsOptions2, dtssSetOptions2, and dtssGetOptions2 in the C++ API to replace dtsOptions, dtssSetOptions, and dtssGetOptions. dtsOptions remains supported for backward compatibility. dtsOptions2 replaces the fixed-length buffers used in dtsOptions with string pointers to eliminate length restrictions on string values in option settings.
  • Other bug fixes.
  • Fixes and minor enhancements:
  • Fixed bug causing incorrect XML conversion output from conversion of Word document with Ole10Native stream to it_ContentAsXml format
  • File parser bug fixes affecting: .xls, .doc, .rar, .pst
  • Fixed dtSearch Desktop indexer bug in "Update Multiple Indexes" dialog box causing case/accent sensitivity to be transferred between indexes when the "Clear index before adding documents" box was checked.
  • Other bug fixes.

New in dtSearch 7.75.8178 (Mar 17, 2014)

  • Fixes and minor enhancements:
  • Updated RAR file parser to support RAR 5
  • Reduced stack use when indexing very deeply-nested containers
  • dtSearch Web: Fixed bug in Build Search Form when generating a search form containing a custom field name
  • dtSearch Web: Fixed bug affecting highlighting of the selected hit when clicking Next Hit in Internet Explorer 8
  • Some Microsoft Photo Editor 3.0 objects embedded in Office documents were not recognized as image data
  • File parser bug fixes affecting: .mdb (Access 2003), .msg embedded in .rtf, .xls, .xlsx, .xlsb, .pdf
  • Other bug fixes.

New in dtSearch 7.74.8153 (Oct 15, 2013)

  • Added support for indexing iCalendar (*.ics) files

New in dtSearch 7.74.8152 (Oct 12, 2013)

  • dtSearch Desktop/Network:
  • Added support for indexing Outlook emails and other content using 64-bit versions of Microsoft Office. A 64-bit version of mapitool is also included.

New in dtSearch 7.74.8150 (Oct 5, 2013)

  • dtSearch Web/Publish:
  • Updated search form templates for dtSearch Web and dtSearch Publish. A new drop-down list in the "Build Search Form" dialog box lets you pick the template to use. The updated templates include frameless options and new HTML5 elements such as the Calendar control for date searching

New in dtSearch 7.73.8128 (Aug 13, 2013)

  • dtSearch Engine:
  • Added dtsConvertUseStyles flag in the ConvertFlags enumeration, to provide a way to use CSS styles to format content.
  • Added FileConverter.DocTypeTag, to provide a way to specify a DocType in HTML output.
  • dtSearchNetApi4.dll is now built using Visual Studio 2012 Update 3.
  • dtSearch Desktop:
  • Added DocStyles.css in the dtSearch templates folder to control the formatting of property tables and headings in retrieved files.

New in dtSearch 7.73.8126 (Jun 29, 2013)

  • Support for some older Windows versions is discontinued in dtSearch 7.73. Supported: Windows Server 2012, Windows Server 2008, Windows Server 2003 SP 2, Windows 8, Windows 7, Windows Vista, Windows XP SP 3. Not supported: Windows 2000, Windows ME, Windows 98, Windows 95, and Windows XP versions without SP 3. This change is a result of our transition to Visual Studio 2012, using the v110_xp platform toolset, to build all products.

New in dtSearch 7.73.8123 (Jun 8, 2013)

  • Fixes and minor enhancements:
  • FileConverter API - fixed missing line breaks when converting from HTML to .txt
  • FileConverter API - fixed missing ... tags around HTML metadata when converting from HTML to it_ContentAsXml

New in dtSearch 7.73.8121 (May 28, 2013)

  • Fixes and minor enhancements
  • File parser bug fixes affecting .pdf, .xls
  • FileConverter API - fixed missing line breaks when converting from HTML to .txt
  • Java API - faster garbage collection of strings passed through IIndexStatusHandler API reduces memory use during indexing
  • Reduced memory use when indexing .msg files with very large numbers of recipients or attachments
  • SearchReportJob API - fixed slow detection of timeout
  • dtSearch Desktop Indexer - fixed "not responding" message when indexing some very large documents

New in dtSearch 7.73.8120 (May 14, 2013)

  • Fixes and minor enhancements:
  • File parser bug fixes affecting .ppt, .pdf
  • Other bug fixes.

New in dtSearch 7.72.8095 (Mar 7, 2013)

  • Added dtsoFfSkipEmailProperties flag in Options.FieldFlags to suppress display of email properties such as sender, subject, etc.
  • Fixed XML structure errors in XML generated for it_ContentAsXml output by FileConverter
  • File parser bug fixes affecting .msg, .docx, .xfa, .pdf, .rtf

New in dtSearch 7.72.8091 (Feb 7, 2013)

  • Added detection of Windows PE and NE executables and Linux ELF executables (these formats are still indexed according to the binary files setting, with content either filtered or skipped)
  • Fixed bug causing use of dtsExoDoNotConvertAttachments in FileConverter.ExtractionOptions to generate an incorrect "File Encrypted" error for some documents during file conversion (not indexing or searching).
  • Fixed bug causing email headers to be indexed even if dtsoSkipEmailHeaders flag is set if filetype.xml is set up to index message bodies separately from attachments.
  • Added support for metadata extraction from HDPhoto images.
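The executable formats named above are conventionally recognized by their published header signatures. The Python sketch below shows that general technique; it is not dtSearch's detection code, and the function name is hypothetical.

```python
import struct


def detect_executable(data):
    """Return 'ELF', 'PE', 'NE', or None for the leading bytes of a file."""
    # ELF files start with the four bytes 0x7F 'E' 'L' 'F'.
    if data[:4] == b"\x7fELF":
        return "ELF"
    # PE and NE files start with an MS-DOS stub whose first bytes are 'MZ'.
    if data[:2] != b"MZ" or len(data) < 0x40:
        return None
    # Offset 0x3C of the stub holds the little-endian offset of the new header.
    (offset,) = struct.unpack_from("<I", data, 0x3C)
    signature = data[offset:offset + 4]
    if signature == b"PE\x00\x00":
        return "PE"
    if signature[:2] == b"NE":
        return "NE"
    return None
```

A caller would read the first few kilobytes of a file and pass them in; a truncated or malformed header simply yields None rather than raising.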

New in dtSearch 7.72.8089 (Feb 2, 2013)

  • Fixes and minor enhancements
  • Fixed XML structure errors in XML generated for it_ContentAsXml output by FileConverter
  • File parser bug fixes affecting .msg, .docx, .xfa
  • .NET API: Added FileConverter.InputStream to allow the input document to be passed as a .NET Stream object

New in dtSearch 7.71.8080 (Dec 4, 2012)

  • dtSearch Engine:
  • New support for highlighting hits using different colors for each search term. For API documentation on this feature, please see the article "Highlighting each term using different attributes" in dtSearchApiRef.chm.
  • Fixes and minor enhancements:
  • Added "Find Indexes" button in dtSearch Desktop's Index Library Manager to locate all indexes in a folder tree.
  • Fixed CSV file parser bug that caused "duplicate field id" error during index merge.
  • In dtSearch Web Setup, improved detection and reporting of IIS configuration problems such as missing IIS components
  • File parser bug fixes affecting MSG, WordPerfect 4.2, PPTX, XBase, PPT, XLSB
  • In Language Analyzer API, dtsLaJob.indexRetrievedFrom and dtsLaJob.alphabetLocation were not set during searches

New in dtSearch 7.70.8063 (Dec 4, 2012)

  • dtSearch Desktop:
  • New support for displaying images embedded in Office documents (DOC, DOCX, PPT, PPTX, XLS, XLSX, RTF, EML). To enable display of images in dtSearch Desktop, click Options > Preferences > Document display, and check the box to "Display images in documents".
  • Added new options in dtSearch Desktop to (1) hide MIME headers in emails, (2) show properties of images embedded in documents, and (3) control whether paths are indexed along with filenames when the "Index filenames as text" option is enabled. These options are in the Options > Preferences > Indexing Options dialog box.
  • dtSearch Engine:
  • Embedded attachments, objects and images in documents can be extracted using dtsExtractionOptions (C++) or ExtractionOptions (Java and .NET), which specify output locations and rules for filename generation. Currently the following are supported:
  • Attachments in EML, MSG, DBX, TNEF (winmail.dat), PDF, MDB and ACCDB (Access);
  • objects in DOC, DOCX, XLS, XLSX, PPT, PPTX, RTF;
  • images in DOC, DOCX, PPT, PPTX, XLS, XLSX, RTF, EML, MDB and ACCDB (Access).
  • New single-document option for indexing Access (*.mdb, *.accdb), XBase (*.dbf), and Comma-separated values (*.csv) files.
  • By default, dtSearch indexes each record of database files (*.mdb, *.accdb, *.csv, *.dbf) as a separate document. This new option provides a way to index all records in a database file as a single document. For more information, see dtSearchApiRef.chm (Overviews > Databases and Fields > Database files (*.mdb, *.dbf, *.csv))
  • Added dtsoFfShowImageProperties flag in Options.FieldFlags to display image properties (such as EXIF data) for images embedded in documents. Image properties are always indexed for images in separate files. This flag only affects images embedded in documents, such as a .jpg embedded in a Word file. A related change, made for consistency, affects the handling of image files embedded in .eml email files. Previously these properties were always extracted. Now they will only be extracted when the dtsoFfShowImageProperties flag is set, so .eml files will be handled consistently with other file formats.
  • Fixes and minor enhancements:
  • Eliminated use of FILE_FLAG_RANDOM_ACCESS, which could cause excessive memory consumption under Windows Server 2008 because of what appears to be a bug in Windows caching behavior (see http://support.microsoft.com/kb/2549369 for more information).
  • Zlib version updated to 1.2.7
  • dtSearch.Spider2.dll and dtSearch.Spider4.dll have new dependencies on zlib DLLs zlib_wapidll_{VC8/VC10}_{32/64}.dll to handle gzipped sitemap.xml files.
  • Added file parsers for Ichitaro word processor versions 5 and later.
  • File parser bug fixes affecting MSG, PDF, DOCX, PPTX, Excel 2, RTF
  • Message attachments to MIME emails are now indexed as attachments (so they can be handled consistently with other attachments in the new attachment-related features described above) rather than being merged with the text of the message.
  • Added reporting of PDF files that do not contain any page text. In dtSearch Desktop, these will appear in the index log with "Image Only" after the type name (click View Log in the Update Index dialog box to see the log of indexed files). In the API, the flag fiImageOnly will be set in IndexFileInfo (.NET, Java) or dtsIndexProgressInfo.fileInfoFlags (C++) during indexing.
  • Removed extra path information from headers in containers converted to text using FileConverter.exe or FileConvertJob with the dtsConvertInlineContainer flag
  • Removed "Document Properties" caption from Word, PowerPoint, and Excel 2003 file properties. For applications that require this flag for backward compatibility, use the new flag dtsoFfIncludeDocumentPropertiesCaption in Options.FieldFlags
  • Added new values to SearchReportJob.Header and SearchReportTemplate.rtf: %%Ordinal%%, %%DocId%%, %%Type%%
  • Added new dtsConvertIncludeBOM flag to FileConverter.Flags to add UTF-8 BOM to UTF-8 output
  • FileConverter with dtsConvertJustDetectType produces more specific type ids for image, music, and video files instead of it_Media
  • Fixed search/highlighting error affecting the pre/N and w/N operators
  • Added new dtsnIndexFolderInaccessible callback notification in IndexJob, logging in indexlog.dat, and logging in HTML index log of inaccessible folders during indexing
  • Fixed incorrect time zone adjustment of PDF built-in creation and modification date fields
  • Fixed too-long filenames generated for items extracted from PST files (names could be too long for some file systems when copied using Edit > Copy File in dtSearch Desktop)
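The difference between the default per-record handling of database files and the new single-document option in this release can be illustrated with a CSV file. The Python sketch below is conceptual only: it does not use the dtSearch API, and the function name and the representation of a "document" as a list of row dictionaries are assumptions made for the illustration.

```python
import csv
import io


def csv_documents(text, single_document=False):
    """Group CSV records into documents.

    Default: one document per record. With single_document=True, all
    records are gathered into one document.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    if single_document:
        return [rows]
    return [[row] for row in rows]
```

For a file with two data rows, the default call returns two one-row documents, while `single_document=True` returns one document holding both rows, so a search hit would refer to the whole file rather than to an individual record.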

New in dtSearch 7.68.8025 (Jun 14, 2012)

  • Outlook and MIME email files have a new, simplified header that will be consistent between the two formats with these fields: From, SentVia (for emails sent using a mailbox other than the "From" address), To, CC, BCC, Subject, Date.
  • The footer for these formats will include these additional fields for backward-compatibility: Sender (combines From and SentVia); Recipient (combines To, CC, and BCC); SentDate (msg files only); DeliveredDate (msg files only).
  • The Sender and Recipients fields provide a way to search across all senders or all recipients of a message. These fields will also allow searches using the field names used in older versions, such as "Sender contains example", to continue to work.
  • Documents indexed with dtSearch versions 7.67 and earlier will be displayed with the old header so the change will not cause hit highlighting problems with previously-indexed emails.
  • Includes new (August 9, 2011) security updates from Microsoft Bulletin MS11-025.
  • Added dtsoFfIndexArchiveFileLists flag. This option adds a searchable file named ArchiveFileList.html to ZIP and RAR archives during indexing. The original file is not modified but the ArchiveFileList.html file is searchable as if it were part of the ZIP or RAR file. The file consists of a list of the names of the files inside the archive.
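As a rough picture of what an archive file list contains, the Python sketch below builds an HTML list of the member names in a ZIP archive without modifying the archive. It is not dtSearch code: the function name and the HTML layout are assumptions, and RAR archives are omitted because the Python standard library does not read them.

```python
import html
import io
import zipfile


def archive_file_list_html(archive):
    """Return an HTML listing of the file names inside a ZIP archive.

    `archive` may be a path or a file-like object; it is opened read-only.
    """
    with zipfile.ZipFile(archive) as zf:
        names = zf.namelist()
    items = "\n".join(f"  <li>{html.escape(name)}</li>" for name in names)
    return f"<html><body>\n<ul>\n{items}\n</ul>\n</body></html>"


if __name__ == "__main__":
    # Build a small archive in memory and print its listing.
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("readme.txt", "hello")
        zf.writestr("docs/notes.txt", "world")
    buffer.seek(0)
    print(archive_file_list_html(buffer))
```

Because the listing is generated separately from the archive, searching it behaves like searching an extra file that travels with the archive, which is the effect the flag describes.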

New in dtSearch 7.67.7973 (Jun 14, 2012)

  • Includes security updates from Microsoft Bulletin MS11-025.
  • Added support for indexing PST files directly. Because Outlook locks the PST file that is currently in use, this will not work with the PST file that you are actively using in Outlook, and is primarily for use in situations where archived or forensically-obtained PST files are being searched.
  • Added plug-in for Adobe Reader X to enable hit highlighting in PDF files retrieved after a search.
  • .NET API: Added DataSource.DocStream to allow a document to be passed through the DataSource API as a stream
  • .NET API: Added sample code demonstrating indexing of Azure blob data (examples\cs4\AzureBlobDemo)
  • Fixes and minor enhancements

New in dtSearch 7.66.7936 (Jun 14, 2012)

  • Added .NET 4.0 versions of the .NET API (dtSearchNetApi4.dll, dtSearch.Spider4.dll) and sample code for C# .NET 4.0 and VB.NET 4.0
  • Added dtsSearchFastSearchFilterOnly search flag to enable much faster, optimized generation of a SearchFilter from a search when no other output is required from the search.
  • Added WordListBuilder.GetLastError to the C++, Java, and .NET APIs to provide better reporting of errors resulting from WordListBuilder calls.
  • Added new flag to enable caching of field values in WordListBuilder to make ListFieldValues calls faster. The flag is dtsWordListEnableFieldValuesCache (in the WordListBuilderFlags enumeration) and is passed to WordListBuilder using the new SetFlags method.
  • Added new .NET method Server.SetEnginePath to allow ASP.NET application deployment without administrative access
  • Added new .NET sample application, AzureDemo, demonstrating use of the dtSearch Engine in an Azure instance. For documentation explaining how to deploy in Azure, see:
  • Overviews > Installing the dtSearch Engine > Deployment steps: Azure 64-bit (in dtSearchApiRef.chm).
  • Added a way to disable file parsers using the file type table (filetype.xml) by setting the TypeId to the id of the parser to disable and the Flags value to 2.
  • Added a mechanism for a dtsInputStream to simulate an I/O error by returning a negative value from read() of less than 10,000. When this occurs, dtSearch will interpret it as an I/O error and halt processing of the current input file immediately, reporting an I/O error through the API.
  • Java and .NET API: Fixed IIndexStatusHandler bug causing PercentDone to remain zero during compression of an index
  • Added docId of document being removed from an index to IndexFileInfo reporting through IIndexStatusHandler
  • Fixed FileConverter bug that caused invalid XML to be generated from some conversions due to output of character code 128.
  • Added SearchJob.UnindexedSearchFlags in the .NET API and SearchJob.setUnindexedSearchFlags in the Java API to enable case and accent-sensitive unindexed searches in these APIs
  • Added .NET SearchFilter.GetItems() to provide access to an array of the doc ids selected in a SearchFilter
  • File parser bug fixes affecting Office XML drawings embedded in Word, PowerPoint, and Excel files; interpretation of OEM character codes (_x00NN_) in Excel 2007 files; dates prior to 1970 in MDB files; performance and memory use parsing MIME files; Word auto-numbering; PDF

New in dtSearch 7.65.7907 (Jun 14, 2012)

  • Fixed file parser bug affecting indexing of QuickBooks backup (*.qbb) files

New in dtSearch 7.65.7906 (Jun 14, 2012)

  • Added dtsoFfSkipEmailHeaders flag for Options.FieldFlags to suppress searching and display of headers in MIME and Outlook messages
  • Reduced memory requirements for parsing very large XLS files
  • Fixed bug that allowed XML output from saved search results and XML generated by conversion to it_ContentAsXml to contain the colon (":") character in tag names, which caused the generated XML to fail validation.
  • Fixed PDF hit highlighting error affecting documents that use the ActualText parameter.
  • Automatic date recognition has been changed to limit the scope of automatically recognized entities so they will not cross a field boundary.
  • Fixed error in HTML conversion causing some output to fail to word-wrap when displayed in a browser.
  • Fixed memory leak in searches that use the dtsSearchLanguageAnalyzerSynonyms flag.
  • ZIP file parser applies the default encoding specified in filetype.xml when interpreting ambiguous ZIP filenames, and applies automatic encoding detection if no default encoding is specified.
  • File parser bug fixes affecting HTML, XLS, DOC (paragraph numbering error), DOCX, PPT.
  • Reduced memory use when merging very large indexes
  • Fixed PDF hit highlighting errors in certain types of corrupt PDF files
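
The ZIP filename behavior described in the list above (use a configured default encoding for ambiguous names, otherwise detect the encoding automatically) can be sketched as follows. This is an illustration under stated assumptions, not dtSearch's implementation: the function decode_zip_name and its parameters are hypothetical, "ambiguous" is taken to mean that the entry's UTF-8 flag is not set, and the detection step is a simple stand-in (strict UTF-8 first, then code page 437) rather than the detection dtSearch actually performs.

```python
from typing import Optional


def decode_zip_name(raw: bytes, utf8_flag: bool,
                    default_encoding: Optional[str] = None) -> str:
    """Decode the raw bytes of a ZIP entry name.

    raw              -- filename bytes as stored in the archive
    utf8_flag        -- True if the entry declares its name as UTF-8
    default_encoding -- hypothetical configured default, or None
    """
    if utf8_flag:
        # The entry states its own encoding, so nothing is ambiguous.
        return raw.decode("utf-8", errors="replace")
    if default_encoding:
        # Ambiguous name: the configured default takes priority.
        return raw.decode(default_encoding, errors="replace")
    # No default configured: stand-in for automatic detection.
    try:
        return raw.decode("utf-8")   # strict decode rejects invalid UTF-8
    except UnicodeDecodeError:
        return raw.decode("cp437")   # every byte maps to a character


if __name__ == "__main__":
    print(decode_zip_name(b"\xe9.txt", False, "cp1252"))
    print(decode_zip_name(b"\x82.txt", False))
```

Trying the configured default before any detection keeps the result predictable for archives known to come from a single source, which is presumably why a default can be set at all.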

New in dtSearch 7.64.7876 (Jun 14, 2012)

  • Added dtsSearchLanguageAnalyzerSynonyms flag to enable using a language analyzer to generate morphological variations on a search term at search time. When this flag is set, the language analyzer is called for each word or phrase in the search request. The flag dtsLaInputIsSearchTerm is passed to the language analyzer in dtsLaJob.flags, so the language analyzer knows why it is being called.
  • Added dtssGetWordBreaker API function to provide direct access to the dtSearch Engine's internal word breaker using the language analyzer API. For sample code demonstrating how to use this API, see the WordBreak example in examples\vc8\WordBreak.
  • Added more structural information to the output generated by conversion to the it_ContentAsXml file format.
  • Added to COM interface: WordListBuilder.ListFieldValues, WordListBuilder.SetFilter, and IndexJob.EnumerableFields.
  • Added dtsListIndexSkipNoiseWords flag for ListIndexJob to list words in an index without including any noise words.
  • Added dtsoFfSkipDataSourceFields flag for Options.FieldFlags to prevent DocFields values from appearing in FileConverter output

New in dtSearch 7.63.7836 (Jun 14, 2012)

  • Fixed problem running dtSearch.exe on some systems after installing the kb973923 patch from Windows Update.
  • Fixed missing checkboxes in dtWebSetup64.exe under Windows Server 2008.

New in dtSearch 7.63.7835 (Jun 14, 2012)

  • Added IndexFileInfo.UserFields in .NET API to provide access to stored fields through the IIndexStatusHandler callback interface during indexing.
  • Added dtsnIndexDeletedFileRemoved, dtsnIndexListedFileRemoved, and dtsnIndexListedFileNotRemoved notifications to the indexing status callbacks to notify the calling application when files are removed from the index during indexing or when an attempt to remove a listed file fails.
  • Compatibility note for developers working with the .NET 2.0 API only: The DLL dependencies for dtSearchNetApi2.dll have changed due to the release of the Visual Studio .NET 2005 Service Pack 1 Security Update for ATL. Because dtSearchNetApi2.dll is built with the updated version of Visual Studio .NET 2005, it requires the updated MFC and CRT DLLs that are included with that version.

New in dtSearch 7.62.7804 (Jun 14, 2012)

  • Regular expression searching extended to support TR1 regular expressions
  • Java API: Added IIndexStatusHandler for monitoring IndexJobs
  • Java API: Added IndexInfo object for more efficient retrieval of index properties from an index
  • Java API: Added SearchFilter.SelectItems() with array of doc ids
  • .NET API: Added SearchFilter.SelectItems() with array of doc ids
  • Java API: Added SearchJob.WantResultsAsFilter
  • FieldFlags: Added dtsoFfHtmlSkipImgAlt and dtsoFfHtmlSkipInputValues
  • Language Analyzer API: Added dtsLaBlockWasSkipped to LanguageAnalyzerWordFlags, providing a way for a language analyzer to request that the internal dtSearch word breaker handle a block of text from the input.
  • C++ API: Added userFields to dtsIndexProgressInfo, providing a way to access stored fields from a document as it is indexed
  • Added dtsConvertAutoUpdateSearch flag to ensure consistent hit highlighting when a document has been modified since it was indexed, or was indexed by an older version of dtSearch than the one used to search it.

New in dtSearch 7.61.7769 (Jun 14, 2012)

  • New file parser added for RAR (*.rar) archives.
  • Added it_ContentAsXml output format for FileConverter. This format organizes document content, metadata, and attachments into a standard XML format for easier automated processing. It does not currently support hit highlighting and is designed for automated content extraction only.

New in dtSearch 7.54 Build 7680 (Mar 27, 2009)

  • .NET and Java enhancements