Data archive strategy key to efficient enterprise content management

Digital archiving and the longevity of electronic information are being overlooked by enterprise content management professionals at the enterprise’s peril. Digital doesn’t necessarily mean permanent.

A lot of noise is being made these days about the importance of capturing content to enable business processes. Unfortunately, though, so much attention is focused on getting content into a repository to leverage it when needed that its long-term preservation is getting lost in the commotion. That isn’t to say that the practice of archiving is being overlooked -- in many organizations, it isn’t. But the need to establish a data archive strategy frequently comes as a revelation to people when you point out that digital doesn’t guarantee permanence.

It’s quite the opposite, in fact.

Take, for example, the group of young enterprise content management (ECM) trainees from a foreign national archives office who were stunned to learn that the historical documents and photos they had recently painstakingly digitized might not be readable a decade from now (for reasons we’ll get into in a moment). You’d have thought that the organization would have studied a data archive strategy given that its charter is to preserve material for use forever. But apparently, the question never came up.

Digital longevity -- how long electronic information lasts -- has two aspects. The first has to do with the speed with which digital storage media deteriorate, especially when compared with paper documents and images. Although the theoretical limit for high-quality digital media is claimed to be 50 or more years, five to 10 years is much more realistic.

The second has to do with hardware, software and media storage formats becoming obsolete. That can sometimes be mitigated by maintaining some level of backward compatibility, but obsolescence sets in nearly as fast as digital media deteriorate. Think about how long the hard disk in your last PC lasted: Did you get three, five or 10 years from it? That kind of lifespan might sound like a long time, but it’s the blink of an eye for an archives department or an organization with compliance imperatives to meet.

Another scenario to consider: How many times has a software upgrade or change required the manual or automatic conversion of data files because of compatibility issues? And consider the point at which you stopped backing up and sharing business information on floppy disks or ZIP drives -- and how often you ignore your collection of CDs in favor of MP3s.

The old mainstays are disappearing from the landscape, or have vanished already, and it’s folly to think that their replacements will last any longer.

Business impacts
The enterprise content management ramifications of all this obsolescence reach beyond the need for content to be kept for longer than the lifetime of the underlying technology. Unfortunately, the impacts often go unnoticed as organizations focus on other concerns, such as system security and data privacy.

One obvious example is the cost of converting older documents into formats that can be read when needed. Austerity measures nowadays often require that legacy systems be maintained for far longer than was originally intended to avoid new capital expenditures. As a result, information that might have been migrated to new systems now persists in its original form and often requires investments of time and money to remain useful.

Beyond that, the expense of maintaining older systems can be considerable. Just as with an old car, there is a point at which it makes sense to replace technology rather than continue to pay for its upkeep. And don’t forget, a rival organization adopting newer technology that leaves yours in the dust could hurt your competitive standing.

Best practices
If you are in the business of capturing and preserving information as part of your content management strategy, consider how long that information needs to be readable and match those needs with enabling technology. That is one reason film-based micrographics products still exist. In the worst case, a piece of film can still be held up to light and read by a human being.
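To make that matching exercise concrete, a rough calculation can help. The Python sketch below is only an illustration: the function name and the 50-year retention period are assumptions, while the five- and 10-year media lifespans are the realistic range cited earlier. It estimates how many times content would have to be copied to fresh media before its retention period ends.

```python
import math

def media_refreshes(retention_years: float, media_life_years: float) -> int:
    """Estimate how many times content must be copied to fresh media.

    Hypothetical helper: assumes the first set of media covers the first
    media_life_years and each refresh buys one more lifespan.
    """
    if retention_years <= 0 or media_life_years <= 0:
        raise ValueError("both arguments must be positive")
    # Total sets of media needed, minus the original set.
    return max(0, math.ceil(retention_years / media_life_years) - 1)

# Assumed 50-year retention requirement, checked against both ends
# of the five-to-10-year realistic lifespan.
for life in (5, 10):
    print(f"{life}-year media: {media_refreshes(50, life)} refreshes")
```

Under those assumptions, a 50-year requirement implies somewhere between four and nine media refreshes, each one an opportunity for error if it is not planned.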

Digital techniques, of course, provide a wide variety of benefits over their optical counterparts, including universal search, random access and simultaneous distributed viewing. But when a disk drive fails or next-generation media formats appear, data is at risk of being lost.

Steps should be taken to mitigate that risk. Here are some key best practices that can help you do so:

  • Migrate data to new formats as they appear to avoid the rush and inevitable associated errors of a forced migration at a later date.
  • Adhere to established standards such as PDF/A, the ISO-standardized archival variant of Adobe’s Portable Document Format (ISO 19005), which is designed for long-term preservation and presumably will continue to be supported in future systems.
  • Develop a formal workflow as part of your archiving and retention procedures so preservation preplanning can be institutionalized, formats can be cataloged and information migrations can be anticipated and managed.
  • Keep older software and hardware around to read older content, or create emulators to accomplish that.
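Cataloging formats, the heart of the workflow item above, can start small. The following Python sketch is one possible starting point rather than a prescribed method: the function names and the list of "preferred" extensions are hypothetical and would need to reflect your own retention policy. It counts the files in a repository folder by extension and lists the formats that are not on the preferred list, so migrations can be anticipated.

```python
from collections import Counter
from pathlib import Path

# Hypothetical policy list; adjust to your own standards.
# Note: PDF/A files normally carry a plain .pdf extension, so an
# extension check cannot confirm PDF/A conformance. That takes a
# dedicated validator.
PREFERRED_EXTENSIONS = frozenset({".pdf", ".tif", ".tiff", ".txt", ".xml"})

def catalog_formats(root):
    """Count files under root (recursively) by lowercase extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts

def migration_candidates(counts, preferred=PREFERRED_EXTENSIONS):
    """Return (extension, count) pairs not on the preferred list,
    most common first."""
    return [(ext, n) for ext, n in counts.most_common()
            if ext not in preferred]
```

Calling `migration_candidates(catalog_formats("/path/to/repository"))` (the path is a placeholder) yields a simple worklist. An extension is only a rough proxy for the real format, so treat the result as a prompt for review, not as a verdict.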

Regardless of the techniques you choose, it is important to remember that this process typically involves more than just words or images on paper. In most organizations, content now includes voice and video and is being captured with all kinds of devices, from smartphones to tablet PCs. Since so much content is created digitally, long-term preservation is becoming more challenging -- and whether you work for a national archives office or any other type of organization, it’s time to pay digital archiving some heed.

Steve Weissman provides guidance and professional training on content, process and information management. Weissman is president of the AIIM New England Chapter and principal consultant at Holly Group.
