Business Information

Technology insights for the data-driven enterprise


Content analytics vs. predictive coding and beyond

Records management software needs to mature. But is the way forward through predictive coding methods, content analytics, search-based applications or some combination?

Some say records management tools are outmoded and unfit for today's information explosion, but technologies are emerging that may give them a much-needed boost. Predictive coding and content analytics address the ever-increasing volume of data that all organizations wrestle with.

"We do not go to work looking forward to tagging objects for purposes of compliance or records management," said Jason R. Baron, an attorney in the information governance and e-discovery practice at Drinker Biddle & Reath LLP in Washington, D.C., and the former director of litigation at the U.S. National Archives and Records Administration.

Predictive coding software tags and categorizes documents, which reduces the time and cost of manually sorting through millions of files. It creates a sample cluster of documents, and employees review those documents for accuracy. Subsequent rounds of coding may be required to "teach" the software further.
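The review-and-retrain cycle described above can be sketched in a few lines of Python. This is a rough illustration only, not how any particular product works: the article does not name an algorithm, so the naive Bayes scoring, the choice to send the least certain documents back to reviewers, and every name here (`train`, `score`, `predictive_coding`, the `review` callback) are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled):
    """Count words per label from (text, is_relevant) pairs tagged by reviewers."""
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for text, label in labeled:
        counts[label].update(tokenize(text))
        docs[label] += 1
    return counts, docs


def score(model, text):
    """Naive Bayes log-odds with add-one smoothing; positive means 'relevant'."""
    counts, docs = model
    vocab_size = len(set(counts[True]) | set(counts[False])) or 1
    totals = {label: sum(c.values()) for label, c in counts.items()}
    log_odds = math.log((docs[True] + 1) / (docs[False] + 1))
    for tok in tokenize(text):
        p_rel = (counts[True][tok] + 1) / (totals[True] + vocab_size)
        p_irr = (counts[False][tok] + 1) / (totals[False] + vocab_size)
        log_odds += math.log(p_rel / p_irr)
    return log_odds


def predictive_coding(documents, review, seed_size=2, rounds=2, batch=1):
    """Tag every document, asking the human `review` callback about only a few.

    Assumes document texts are unique, since they are used as dictionary keys.
    """
    # Round 0: reviewers tag a small seed sample.
    labeled = [(d, review(d)) for d in documents[:seed_size]]
    unlabeled = list(documents[seed_size:])
    for _ in range(rounds):
        if not unlabeled:
            break
        model = train(labeled)
        # Send the documents the model is least sure about back to reviewers.
        unlabeled.sort(key=lambda d: abs(score(model, d)))
        labeled += [(d, review(d)) for d in unlabeled[:batch]]
        unlabeled = unlabeled[batch:]
    model = train(labeled)
    human = dict(labeled)
    # Human decisions win; the model tags everything nobody reviewed.
    return {d: human[d] if d in human else score(model, d) > 0 for d in documents}
```

In this sketch `review` stands in for the employee: it is called only for the seed sample and for one batch per round, while the remaining documents are tagged by the model. A real deployment would also measure accuracy on a held-out sample before trusting the machine-assigned tags.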

Legal challengers, however, worry that the process -- a brute-force chunking method of sorts -- separates wheat from chaff at the risk of introducing errors: Relevant documents may be missed, or irrelevant ones included. Even so, predictive coding is far superior to the alternative, manual review, which is unmanageable given the growing volume of information. Having legal professionals devote hours to tagging documents is a "terrible state of affairs," Baron said.

Still others argue that the real goal is more sophisticated technologies, such as content analytics.

Sandra Serkes, founder and president of Valora Technologies Inc., a content analytics services provider in Bedford, Mass., said content analytics can address real trends and conduct business intelligence on all enterprise information, regardless of format.

"We look at everything there is to know about a file," Serkes said about Valora's PowerHouse content analytics software. "What language is it in, where did it come from, how does it match typical versions of the same genre, are there unique attributes?"

The future of records management may require content analytics, predictive coding and intelligent search. But that combination may take time to arrive: Single-purpose uses such as e-discovery and compliance already address enterprise pain points, and the return on investment of broader information management has been harder to establish. Then there are human obstacles.

"Technically, this is not that difficult," Serkes said. "It is the emotional side of information sharing that becomes problematic. Records managers have built empires for themselves because they are the only ones who know where information is."
