The physical part of document capture -- scanning -- is essentially a “dumb” process that simply preserves corporate documents in accessible electronic files.
But increased software functionality and intelligence are enabling organizations to automate document indexing, quality checks and even the internal routing of documents, said Alan Weintraub, an analyst at Cambridge, Mass.-based Forrester Research Inc. In many cases, he noted, document capture “now kicks off the process for deriving value” from documents.
“This is a very vibrant time for capture, and I think it has crossed a chasm in terms of its application,” said Alan Pelz-Sharpe, a principal analyst and director at Olney, Md.-based consulting firm Real Story Group. Although document imaging and document capture technologies have been around as long as there have been scanning tools, their speed and accuracy have reached new levels, Pelz-Sharpe added. “At the high end, if used properly, they can be more accurate than humans, and that is an obvious breakthrough,” he said.
As an example of the degree to which document capture results have improved, Pelz-Sharpe cited a client that only a few years ago employed about 40 people to verify and correct scan problems on document images. Today, that head count has been reduced to two because of better software. “It isn’t so much the scanners themselves, though they have continued to improve,” he said. “It’s the software, which now often incorporates artificial intelligence.”
At a basic level, optical character recognition (OCR) tools, their intelligent character recognition (ICR) cousins and bar-code recognition software assess what a scanner “sees” in a document. Then, algorithms built into document capture software do more detailed content analytics, according to Pelz-Sharpe. “It’s now possible to recognize different languages [and] different writing systems and to detect and make allowances for spelling errors or variations,” he said.
And once the capture process has been completed, the tools can trigger a document management workflow. “Systems today can read a letter and not only determine to whom it is addressed but also make a judgment regarding which individual should most likely respond based on the meaning of the content,” Pelz-Sharpe said.
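As a rough illustration of content-based routing, the Python sketch below scores the text of a letter against keyword lists and picks a destination queue. It is a deliberately crude keyword-counting stand-in for the content analysis Pelz-Sharpe describes, not how any particular product works; the queue names, keyword lists and the `route` function are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical queues and trigger words; a real system would use far richer analysis.
ROUTING_KEYWORDS = {
    "billing": {"invoice", "payment", "refund", "charge"},
    "legal": {"contract", "patent", "liability", "claim"},
    "support": {"broken", "error", "replacement", "warranty"},
}


def route(text, default="general"):
    """Return the queue whose keywords occur most often in the text."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = {
        queue: sum(words[w] for w in keywords)
        for queue, keywords in ROUTING_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to a default queue when no keyword matched at all.
    return best if scores[best] > 0 else default


print(route("Please issue a refund for the duplicate charge on my invoice."))
```

A letter that matches none of the keywords falls through to the default queue, which is where a person would pick it up.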
From a document processing standpoint, the capture process usually generates files such as TIFFs or PDFs. In parallel, it also extracts key information about the content of documents and creates metadata records that can be integrated directly into other applications, such as enterprise resource planning (ERP) systems, Pelz-Sharpe said. “In fact, the TIFF document is becoming more of a backup, rather than a primary file,” he added.
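To make the idea of a metadata record concrete, here is a minimal Python sketch that takes OCR output for a scanned invoice and builds a small record pointing back to the archived image. The field names, the regular expressions and the `build_record` helper are assumptions made for illustration; real capture tools use their own extraction logic and record formats.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class CaptureRecord:
    image_path: str                 # the archived TIFF or PDF
    invoice_number: Optional[str]   # None when the pattern is not found
    total: Optional[float]


def build_record(image_path: str, ocr_text: str) -> CaptureRecord:
    """Pull a few key fields out of OCR text (hypothetical patterns)."""
    number = re.search(r"Invoice\s*(?:No\.?|#)\s*(\w+)", ocr_text, re.IGNORECASE)
    total = re.search(r"Total\s*:?\s*\$?([\d,]+\.\d{2})", ocr_text, re.IGNORECASE)
    return CaptureRecord(
        image_path=image_path,
        invoice_number=number.group(1) if number else None,
        total=float(total.group(1).replace(",", "")) if total else None,
    )


sample_text = "ACME Corp\nInvoice No. 10452\nTotal: $1,250.00"
record = build_record("scan_0001.tif", sample_text)
# Serialize the record so another application could consume it.
print(json.dumps(asdict(record)))
```

The downstream system works from the structured record, while the image file stays in the archive as the reference copy.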
Document management process enters business mainstream
The increased sophistication of the software creates the potential for document capture to become much more central to business processes as part of an organization’s document management strategy. For example, another of Real Story Group’s clients tracks patents globally. In the past, that was largely a manual process: Documents were scanned into a document management system to reduce paper handling, but interpreting them depended on a small army of people with specialized technical and language skills, Pelz-Sharpe said.
Now, he said, the company is able to leverage its document capture software to identify patent filings from different countries, interpret much of the content, and route and prioritize documents in minutes instead of weeks.
Pelz-Sharpe did offer two caveats about the accuracy of capture tools. First, they still aren’t perfect -- but, he said, “we humans are much less accurate than we think we are, so when a machine is 90% accurate or better, that’s actually very good.” Second, except on the most basic applications, the tools need some time to learn the ropes. “With a simple accounts receivable process, you might come up the learning curve in just a couple of days,” he said. “But for more complicated tasks, you have to allow several months for the system to achieve optimal accuracy.”
For organizations that want to move toward “straight-through” processing of documents such as invoices, capture technologies are capable of putting the information in them directly into general ledger applications and other business systems, said Melissa Webster, an analyst at Framingham, Mass.-based IDC. That doesn’t mean there won’t be any need for manual processing of documents, but the software can handle most of the work on its own while identifying problematic documents that require human intervention, she noted.
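The split Webster describes, between documents the software handles on its own and those that need a person, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `confidence` score, the 0.90 threshold, the required field names and the `triage` function are hypothetical rather than taken from any capture product.

```python
from dataclasses import dataclass


@dataclass
class CapturedInvoice:
    doc_id: str
    fields: dict        # field name -> extracted value
    confidence: float   # 0.0 to 1.0, hypothetical score from the capture engine


# Hypothetical list of fields an invoice must have before it can be posted.
REQUIRED_FIELDS = ("invoice_number", "total")


def triage(invoices, min_confidence=0.90):
    """Split invoices into straight-through items and exceptions for review."""
    straight_through, exceptions = [], []
    for inv in invoices:
        complete = all(
            inv.fields.get(name) not in (None, "") for name in REQUIRED_FIELDS
        )
        if complete and inv.confidence >= min_confidence:
            straight_through.append(inv)
        else:
            exceptions.append(inv)
    return straight_through, exceptions


batch = [
    CapturedInvoice("A1", {"invoice_number": "10452", "total": 1250.0}, 0.97),
    CapturedInvoice("A2", {"invoice_number": "10453", "total": None}, 0.95),
    CapturedInvoice("A3", {"invoice_number": "10454", "total": 80.0}, 0.62),
]
auto, manual = triage(batch)
print([inv.doc_id for inv in auto], [inv.doc_id for inv in manual])
```

Only complete, high-confidence documents go straight to the business system; anything with a missing field or a low score lands in the exception list.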
In addition, some document capture tools are able to pull data from ERP systems, customer relationship management software and other enterprise applications to help validate or complete information in documents as they’re being scanned and captured. “The whole document capture process is now a two-way street,” Webster said.
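A small Python sketch can show what this kind of two-way validation might look like. Here a dictionary stands in for vendor data pulled from an ERP system, and fuzzy matching from the standard library's `difflib` module is used to reconcile a misread vendor name. The vendor names, IDs, the 0.8 cutoff and the `complete_from_erp` function are all assumptions for illustration.

```python
import difflib

# Hypothetical vendor master data, standing in for a lookup against an ERP system.
VENDOR_MASTER = {
    "acme corp": "V-1001",
    "apex corp": "V-1002",
    "globex industries": "V-1003",
}


def complete_from_erp(captured, cutoff=0.8):
    """Return a copy of the captured fields with vendor_id filled in when the
    scanned vendor name is close enough to a known vendor."""
    result = dict(captured)
    name = captured.get("vendor_name", "").lower()
    matches = difflib.get_close_matches(
        name, list(VENDOR_MASTER), n=1, cutoff=cutoff
    )
    result["vendor_id"] = VENDOR_MASTER[matches[0]] if matches else None
    # Flag the document for a person to check when no vendor could be matched.
    result["needs_review"] = not matches
    return result


# "C0rp" (with a zero) simulates a typical OCR misread.
print(complete_from_erp({"vendor_name": "Acme C0rp", "total": 1250.0}))
```

The enterprise data both corrects the scanned value and supplies a field, the vendor ID, that was never on the page.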
ABOUT THE AUTHOR
Alan R. Earls is a Boston-area freelance writer focused on business and technology.