Essential Guide

Analytics technologies lend enterprise content management a hand

A comprehensive collection of articles, videos and more, hand-picked by our editors

The challenges of dirty data

Data pros often talk at length about the importance of data cleaning and data governance to analytics initiatives. But with unstructured data becoming more central to the analysis of social media data, there is some debate about when to cleanse it. Many data scientists, for example, want to see the data unvarnished so they can identify outliers and trends.

At the AIIM New England chapter meeting, we discussed best practices for dealing with dirty data. Steve Weissman, of the Holly Group and president of the chapter, was on hand to offer some thoughts.

"It's not so much whether to clean dirty data, but when," he said. "There is value to getting all the raw data in, so that whoever is doing the analysis can make their own decisions about what biases are introduced by the dirty data. The alternative is clean it up first."

According to Weissman, unstructured data in particular may contain information that is difficult to clean up but important to retain. Audience members also discussed the possibility of segmenting outliers from the data, then reintegrating that segment once it has been analyzed.
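
The segment-then-reintegrate idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the meeting did not name a method for finding outliers, so the interquartile-range (IQR) rule used here is a stand-in, and the function names `split_outliers` and `reintegrate` and the sample readings are hypothetical.

```python
from statistics import quantiles


def split_outliers(values, k=1.5):
    """Split values into (inliers, outliers), keeping each value's position.

    Assumption: the IQR rule is used to flag outliers. Any other
    detection method could be substituted.
    """
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    inliers, outliers = [], []
    for index, value in enumerate(values):
        target = inliers if low <= value <= high else outliers
        target.append((index, value))
    return inliers, outliers


def reintegrate(inliers, reviewed_outliers):
    """Merge the reviewed outlier segment back in its original order."""
    merged = sorted(inliers + reviewed_outliers)
    return [value for _, value in merged]


# Made-up sample readings; the raw list is never modified.
raw = [12, 15, 14, 13, 400, 16, 11]
inliers, outliers = split_outliers(raw)

# The outlier segment can now be analyzed on its own, then merged back.
restored = reintegrate(inliers, outliers)
```

Because the raw values are kept alongside their positions, nothing is discarded: the analyst decides after review whether the outlier segment goes back in unchanged, corrected, or not at all.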

For more, check out this video.
