DocsCorp releases contentCrawler 2.1

Faster processing, easier administration and automated reporting

DocsCorp, a global leader in document productivity software for enterprise content management systems, today announced the release of contentCrawler 2.1, the newest version of its integrated analysis, processing and reporting software. contentCrawler gives document management professionals the peace of mind of knowing that their content is 100% searchable.

contentCrawler is used by a wide variety of organisations, including Marshall Dennehey, Cuatrecasas Gonçalves Pereira, Hugh James and the Law Society of British Columbia. Its automated end-to-end process examines image-based documents in a content repository and converts them to searchable PDFs, making them available to search technologies for indexing. The contentCrawler 2.1 release includes several usability and performance enhancements.

“I upgraded directly with no problems.  Former documents were transferred to the new version with no problems at all,” reported Pedro Monteiro, Support at Truewind-Chiron.  “This new version is faster, more informative and much lighter process wise to the server.  Documents are assessed and saved 10x faster comparing to the old version we had.”

Faster processing

Multi-OCR processing - contentCrawler uses multi-threading to optimise support for 4, 8, 16 and 32 CPU cores. For example, with 4 CPU cores, contentCrawler will be able to OCR 1 page per second, or 85,000 pages per day. This represents a significant improvement over other OCR solutions, and contentCrawler remains unique in its ability to OCR documents already stored in a DMS. With 16 CPU cores, it will be capable of OCR'ing 4 pages per second, or up to 350,000 pages per day.
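The daily figures quoted above follow from the per-second rates. The short Python sketch below works through that arithmetic, assuming OCR runs uninterrupted at a steady rate for a full 24 hours (an assumption made here for illustration, not a statement from DocsCorp). The exact products differ slightly from the rounded numbers in the release.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def pages_per_day(pages_per_second: float) -> int:
    """Pages processed in 24 hours of uninterrupted OCR at a steady rate."""
    return round(pages_per_second * SECONDS_PER_DAY)


# Per-second rates quoted in the release for 4-core and 16-core processing.
for cores, rate in [(4, 1), (16, 4)]:
    print(f"{cores} cores: {rate} page(s)/s = {pages_per_day(rate):,} pages/day")
```

At 1 page per second this gives 86,400 pages per day, and at 4 pages per second 345,600, which is in line with the rounded 85,000 and 350,000 figures.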

Apply Advanced Search filters - New Advanced Search filters provide users with greater control over document types to be processed. Users can exclude certain document types from the search to decrease processing time, including those saved as email message attachments.
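The release does not describe how the filters are configured, so the following Python sketch only illustrates the general idea of excluding document types before processing. The extension list and the `should_process` function are hypothetical and are not part of contentCrawler.

```python
from pathlib import PurePath

# Hypothetical exclusion list; contentCrawler's actual filter settings
# are not described in the release.
EXCLUDED_EXTENSIONS = {".msg", ".eml"}


def should_process(filename: str, excluded=EXCLUDED_EXTENSIONS) -> bool:
    """Return True if the file's extension is not on the exclusion list."""
    return PurePath(filename).suffix.lower() not in excluded


candidates = ["scan_001.tif", "contract.PDF", "thread.msg"]
to_ocr = [name for name in candidates if should_process(name)]
print(to_ocr)
```

Skipping excluded types up front means they never reach the OCR stage, which is where the saving in processing time would come from.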

Easy administration and reporting

Set up Service email notifications - Users can establish various email notifications to report on the progress of the crawl and request that the Service Statistics and Error reporting be emailed to them.

Monitor progress status - Users can instantly see the progress status of individual documents being processed at the OCR stage. This information is displayed to the user as a percentage.

Document information display - Provides document information such as total page number and size of documents being processed, including an overall total size of documents requiring OCR.


Configurable Multilingual OCR - Users can easily configure multilingual OCR'ing across all services. contentCrawler supports over 180 languages.

Export Report - Users can export processing reports as CSV files for analysis and review.

Configurable minimum disk space limit - Users can specify a minimum free-space threshold for the document cache directory.
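As a rough illustration of what a minimum free-space check involves, the Python sketch below compares the free space on the volume that holds a cache directory against a threshold. The directory, the 5 GiB value and the function name are assumptions chosen for the example; they do not reflect contentCrawler's own settings or behaviour.

```python
import shutil
import tempfile


def has_enough_free_space(cache_dir: str, min_free_bytes: int) -> bool:
    """Return True if the volume holding cache_dir has at least min_free_bytes free."""
    return shutil.disk_usage(cache_dir).free >= min_free_bytes


cache_dir = tempfile.gettempdir()  # stand-in for a document cache directory
min_free = 5 * 1024**3             # hypothetical threshold of 5 GiB

if has_enough_free_space(cache_dir, min_free):
    print("Enough free space: processing can continue.")
else:
    print("Below the free-space threshold: processing should pause.")
```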

20% of documents in content repositories are invisible to search

contentCrawler was developed to address the very real and serious issue of non-searchable content in enterprise content management systems. More than 20% of documents in a content repository are "invisible" to search technology. These documents are often profiled as a result of ingestion of legacy or litigation documents, saving emails with attachments, mobile technology and employee workarounds that bypass the OCR'ing process. Failure to produce documents on demand impacts the bottom line, workplace efficiency, regulatory compliance, and productivity, and exposes an organisation to unnecessary risks.

Download the contentCrawler 2.1 trial to see how much non-searchable content is in your content repositories. Contact DocsCorp by email for more information.

contentCrawler integration

contentCrawler integrates with HP Autonomy WorkSite, HP Records Manager (formerly HP TRIM), OpenText eDOCS DM, ProLaw, MS SharePoint as well as MS Windows file systems. Integration with OpenText Content Server and Worldox will be available soon. 

About DocsCorp

DocsCorp designs easy-to-use software and services for document professionals who use enterprise content management systems. We provide solutions for metadata removal, document processing, PDF manipulation, and document comparison. DocsCorp is a global brand with customers located in the Americas, Europe, Asia Pacific and beyond. Follow us on LinkedIn, Twitter, Facebook, and our blog.


Copyright © 2019 Legal IT Professionals. All Rights Reserved.
