Are text analytics tools the future of records management?

In recent years, text analytics tools have simplified the time-consuming and onerous work of records management.

Transaction documents are a crucial byproduct of business, and records management is all about ensuring proper stewardship of those resources for compliance and operational planning. Whether paper or digital, records need to be organized and accessible to have any value. It was once thought the rise of computers would greatly ease and improve those tasks for businesses. Largely, that hasn't happened, but text analytics tools and auto-classification software may finally deliver on that promise.

Records management in the digital age has not been a resounding success. Ask a records manager if they have digital records identified and properly controlled, and many will deflect the question or respond with qualified answers. Don't blame them though; records management is hard to do right.

The good news is help is on the way. Emerging technology is opening doors for an age of information governance (IG), where business finally takes a wider view of compliance. There are hopeful conversations getting underway about treating information as an asset and not just a liability.

But, there is a problem. We still cannot accurately classify the thousands upon thousands of documents residing in the organization. Even classifying new documents going forward is a high hurdle, to say nothing of classifying past documents. When you get down to it, people do not want to be records managers and industry needs to remove that burden.

Paper worked

Organizations have kept and classified all sorts of records for decades. ARMA was founded 60 years ago to help records managers collaborate to perfect their profession. Remember the triplicate copy system? It worked because one of those copies went down to records, and was filed by a person who looked at the document and properly classified it.

Things changed when workplaces went digital. The clerical workers and secretaries were downsized, as businesses looked to save on supplies, storage and personnel. The theory was that document creators could file records quickly and, potentially, more accurately. There were savings, but at the expense of individual productivity.

Another drawback is that most employees do not want to be records managers. It isn't something they are trained for, nor is it something to which they aspire. If employees have to perform a task outside their core job, it needs to be simple, quick and require almost no training.

If you have used a records management system, you know that's typically not the case. It's only easy in cases where the organization invested the time and money to streamline the process so much that the people saving documents don't even have to think about it.

Unfortunately, investing in a custom interface is cost prohibitive for most organizations. That leaves many records managers with sleepless nights, as they ponder options to move the needle of progress forward.

Enter analytics

Text analytics tools are rapidly becoming a bigger piece of the discussion. The specific technology is new, but the premise is as old as digital records themselves: Use search to find what you want and automatically classify it.

Anyone who used search engines back in the 1990s will readily tell you that they were long on promise and short on delivery. They worked fine if you needed to find a keyword in a subset of documents. If you wanted to determine which of the thousand or so record categories a document belonged to, you were out of luck.

Over the past few years, those thousands of categories have been narrowed to hundreds, as people realize that fewer options, when married with other metadata, can create a more compliant organization. While the original plan was to make it easier for people to classify records, it has prepared the way for computers to auto-classify them.

Auto-classification doesn't depend upon reading a document and knowing what it says. A large effort to codify knowledge isn't necessary. The analytic engine learns where documents go through example, using similarities to group documents together. If you give the engine 1,000 documents and tell it that they are all budget documents, the engine can tell you, with some confidence, if document 1,001 is a budget document as well.
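To make the learn-by-example idea concrete, here is a minimal sketch in Python. It is not how any particular vendor's engine works. It assumes a simple bag-of-words profile per category and uses cosine similarity as a rough stand-in for confidence; the function names and the sample documents are invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_profile(examples):
    """Sum word counts over all example documents for one category."""
    profile = Counter()
    for doc in examples:
        profile.update(tokenize(doc))
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (0.0 to 1.0)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(document, profiles):
    """Return (best_category, similarity_score) for a new document."""
    words = Counter(tokenize(document))
    scores = {name: cosine_similarity(words, p) for name, p in profiles.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical training examples; a real engine would see far more.
training = {
    "budget": [
        "Quarterly budget forecast with projected revenue and expenses",
        "Annual budget allocation for department spending and expenses",
    ],
    "contract": [
        "Service agreement between vendor and client with termination clause",
        "Contract terms covering liability, indemnity and termination",
    ],
}
profiles = {name: build_profile(docs) for name, docs in training.items()}

category, score = classify(
    "Revised budget with updated expenses for next quarter", profiles
)
print(category, round(score, 2))
```

The score here is a similarity measure, not a calibrated probability, but it plays the same role: the more a new document resembles the examples for a category, the more willing you are to accept the engine's answer.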

This is the same learning mechanism that has been expanding in the e-discovery space. The most recent wave of vendors doesn't just depend upon keywords; it depends on being taught. After several days of watching an expert determine whether a document is compliant with a discovery request, the engine knows enough to classify documents with over 90% accuracy. Several days to a week may seem like a long time to train an engine, but given that the sheer volume of documents would typically take months to manually classify, it is a relative bargain.

By leveraging existing documents, records managers are already well down the road to having a set of records ready to train the analytic engines. When applied to existing repositories, the engines can classify huge backlogs and even correct documents that may have been improperly classified.
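One way to picture the cleanup of an existing repository is as a review queue: wherever the engine confidently disagrees with the label already on file, a records manager takes a second look. The Python sketch below assumes the engine returns a predicted label and a confidence score between 0 and 1 for each document. The `Prediction` structure, the 0.8 threshold and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    doc_id: str
    current_label: str      # label already stored in the repository
    predicted_label: str    # label suggested by the engine
    confidence: float       # engine's score, assumed to lie in [0, 1]


def flag_for_review(predictions, threshold=0.8):
    """Return IDs of documents the engine would confidently reclassify."""
    return [
        p.doc_id
        for p in predictions
        if p.predicted_label != p.current_label and p.confidence >= threshold
    ]


# Hypothetical engine output for three stored documents.
results = [
    Prediction("doc-001", "budget", "budget", 0.95),
    Prediction("doc-002", "contract", "budget", 0.91),
    Prediction("doc-003", "invoice", "contract", 0.55),
]
print(flag_for_review(results))
```

Only the second document is flagged. The third also disagrees with its stored label, but the engine's confidence falls below the threshold, so it stays as it is rather than adding noise to the review queue.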

Time matters

If you're a records manager, this is exciting news. If the technology can be applied to the proactive declaration of records, then the broader approach of information governance has a chance to take hold. Information governance works only if all information is identified and classified correctly, and that requires quality records management. Like any asset, if information cannot be found because it is misplaced or lost, it cannot provide any value to the organization.

Auto-classification is already being applied to email systems. Is your current suite of software partners ready to dive in and apply this technology to your existing infrastructure? If not, it is time to start looking more closely at vendors offering auto-classification services and determining if they can solve your organization's challenges.

Once we know what we have, we can finally start to use our information as an asset, and not just pay lip service to the concept.

This was last published in September 2015

2 comments

Laurence, thanks for making it as clear as it can be done on managing records in the digital age. It has been my goal to use text analytics to create a retention schedule and possibly smooth out a taxonomy. I am also working with an IT group that will use a tool to help track the use of records as it ages to validate retention times (exclusive of any law of course).

Unfortunately deciding on a change in certain corporate platforms has mucked things up. I also know that I will have to feed off an initiative as you don't do text analytics here just for a retention schedule.