How a content tagging taxonomy improves enterprise search

Creating an enterprise taxonomy can help users more easily find the content they need when searching through files in a content management system.

As people continue to work from home, one challenge continually confronts remote workers: finding documents and other information within content management systems using enterprise search functions.

Even when documents are where they should be, search engines are only as good as the metadata tags that users assign to content -- keywords and phrases that describe a file. Improving the accuracy of metadata tags is something that end users must own, but it requires training.

Identifying enterprise keywords that correctly represent a business's content is part of creating an organizational taxonomy -- one step in remedying enterprise search issues. This taxonomy supports content management systems, processes, workflows and business intelligence platforms. This gives information workers the right information at the right time to do their jobs efficiently.

Taxonomies vs. folksonomies

There can be numerous ways to both tag and search for items. For example, perhaps a user is searching for all content items created in the United States of America. How would one go about searching for that? U.S.? USA? U.S. of A? America? United States? None of these are wrong; however, if the content creators didn't tag files with the same keywords, a user or process that consumes this information is unlikely to get good search results.

This is where "taxonomies" and "folksonomies" come into play.

Taxonomies are formal business classifications. They should start with two core systems: finance applications, which categorize how organizations track money and assets, and HR applications, which provide an employee structure for the people who deliver the business's product or service.

Businesses should use these core systems to formally define metadata types (e.g., "department"), along with the keyword values for each of those types -- such as accounting, engineering and IT. In the example above, both the HR and the finance systems would have a designated code for the United States of America.
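As a rough illustration of what a formal definition buys, the sketch below checks a document's tags against a controlled vocabulary. The `TAXONOMY` dictionary, the `validate_tags` function and the "US" code are assumptions made for this example, not values from any particular finance or HR system.

```python
# Hypothetical controlled vocabulary: each metadata type maps to its allowed keyword values.
TAXONOMY = {
    "department": {"accounting", "engineering", "it"},
    "country": {"US"},  # illustrative code only; a real system defines its own
}


def validate_tags(tags: dict[str, str]) -> list[str]:
    """Return a list of problems found in a document's metadata tags."""
    problems = []
    for metadata_type, value in tags.items():
        allowed = TAXONOMY.get(metadata_type)
        if allowed is None:
            problems.append(f"unknown metadata type: {metadata_type}")
        elif value not in allowed:
            problems.append(f"{value!r} is not a valid {metadata_type}")
    return problems
```

A document tagged with `{"department": "accounting", "country": "US"}` passes with no problems, while a free-typed value such as "USA" would be flagged before it reaches the search index.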

Folksonomies, while less formal, are important because they reflect how people communicate and think. Folksonomies most often become apparent from the results of indexing for enterprise search and through content tagging features that allow hashtags.

Blog posts are an example of content that uses hashtags for categorization and trends. Entire blogs or single posts may use the enterprise taxonomy, but the folksonomy that is created will surface in "heat maps," which graphically represent how people use certain keywords and their relevance. This evaluation can determine if an organization should add keywords to the formal taxonomy. Businesses should also include keywords from the formal taxonomy as hashtag values in the folksonomy.
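The evaluation described above can be approximated with a simple frequency count. This is a minimal sketch, assuming hashtags have already been extracted per post; `FORMAL_KEYWORDS`, `candidate_keywords` and the `min_uses` threshold are hypothetical names chosen for the example.

```python
from collections import Counter

FORMAL_KEYWORDS = {"accounting", "engineering", "it"}  # hypothetical formal taxonomy


def candidate_keywords(posts: list[list[str]], min_uses: int = 2) -> list[tuple[str, int]]:
    """Count hashtags across posts and return frequent ones missing from the formal taxonomy."""
    counts = Counter(tag.lower().lstrip("#") for tags in posts for tag in tags)
    return [
        (tag, uses)
        for tag, uses in counts.most_common()
        if uses >= min_uses and tag not in FORMAL_KEYWORDS
    ]


posts = [
    ["#Onboarding", "#IT"],
    ["#onboarding", "#engineering"],
    ["#onboarding", "#remote"],
    ["#remote"],
]
print(candidate_keywords(posts))  # [('onboarding', 3), ('remote', 2)]
```

Tags that people use often but that the formal taxonomy lacks are the ones worth reviewing for promotion. The threshold is a judgment call that each organization would set for itself.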

Both are important; however, if businesses use them incorrectly, they will sabotage content management efforts. When organizations clearly define taxonomies and folksonomies, the convergence of the two can provide important insights. Training end users on the importance of both and when it is best to use each is key to developing a content taxonomy.

Use keywords for tags

Organizations use tags primarily in folksonomies, while keywords result from formal taxonomies. Rules for managing keywords are important. First, organizations need to make sure they know the "source of truth" for each keyword list. This includes knowing who names a new office or product and who assigns office locations. A business's content taxonomy strategy should outline who owns what and who updates it.

All mature content management systems support taxonomies and folksonomies, though their functionality varies and often reflects the product's origins. For example, DocuWare started as a platform to capture scanned documents, and as a result has strong support for capturing metadata keywords through zonal optical character recognition.

Alternatively, Microsoft SharePoint is an outgrowth of the Office suite, and its genesis was to capture and control documents or data created in Word, Excel and other Office authoring tools. These systems, along with IBM Content Manager, all enforce formal taxonomies while encouraging folksonomies with several features, including:

  1. Application sources of truth. This is the ability to connect to corporate data sources -- such as finance and HR applications -- as sources of truth. This enables keywords to live in their system of origin and be dynamically available in the content management system or synchronized based on architecture and business needs.
  2. Global and local term stores. This is the ability to create a centralized list of keywords or terms. The global term store contains keyword values that are common for all users and groups within an organization. Local term stores enable local administrators or site owners to add metadata terms and keywords that are specific to their group or location. An example of this would be requiring a working location for each employee in a global term set, with a granular list of keywords with location names maintained in a local term set where a user creates this information.
  3. Content types. This enables administrators to create a metadata framework for content to enable guided tagging, including filtered keyword lists, required fields and approval workflows. An example of how businesses might use content types to enforce keyword and data collection is an invoice content type and a letter content type working together. Both types of content share some common information, such as date received and recipients. Organizations should consider this shared information in the core content type for their data collection. For each of the content items, businesses may want to collect additional information. For example, on the invoice, organizations may want to require the invoice number, due date and amount. An analysis of the metadata that is related to core content types, including the required site columns, is an important step to improving search using metadata.
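The invoice and letter example can be sketched as a small inheritance structure in which a core content type carries the shared fields. This is an illustration only: the `ContentType` class and the field names are assumptions for the example and do not mirror the SharePoint, DocuWare or IBM Content Manager data models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContentType:
    name: str
    required_fields: frozenset
    parent: Optional["ContentType"] = None

    def all_required(self) -> frozenset:
        """Required fields for this type plus those inherited from its parents."""
        inherited = self.parent.all_required() if self.parent else frozenset()
        return inherited | self.required_fields

    def missing_fields(self, metadata: dict) -> set:
        """Fields that still need a value before the item can be saved."""
        return {f for f in self.all_required() if metadata.get(f) in (None, "")}


# Shared information lives in the core type; each child adds its own requirements.
CORE = ContentType("core", frozenset({"date_received", "recipients"}))
INVOICE = ContentType("invoice", frozenset({"invoice_number", "due_date", "amount"}), parent=CORE)
LETTER = ContentType("letter", frozenset(), parent=CORE)
```

Calling `INVOICE.missing_fields(...)` on a partially tagged invoice returns the fields the user must still supply, which is the kind of guided tagging a content type is meant to enforce.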

Once administrators create keywords for data lists and document libraries, administrators can also suggest these keywords as a part of the folksonomy/tagging ecosystem.
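One way to picture this is a tag-suggestion lookup that merges global and local term stores. The sketch below is hypothetical throughout: `GLOBAL_TERMS`, `LOCAL_TERMS`, the site name and the location values are invented for illustration.

```python
# Hypothetical term stores: organization-wide values plus site-specific additions.
GLOBAL_TERMS = {"location": {"Americas", "EMEA", "APAC"}}
LOCAL_TERMS = {
    "emea-site": {"location": {"Berlin office", "Madrid office"}},
}


def suggest_terms(site: str, metadata_type: str, prefix: str = "") -> list[str]:
    """Merge global and site-local terms and return those matching a typed prefix."""
    terms = set(GLOBAL_TERMS.get(metadata_type, set()))
    terms |= LOCAL_TERMS.get(site, {}).get(metadata_type, set())
    return sorted(t for t in terms if t.lower().startswith(prefix.lower()))
```

A user on the hypothetical "emea-site" who types "b" would be offered "Berlin office", while users on other sites see only the global values.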

Using metadata for customization

Most common user interfaces have a collection of information organized around dates, topics or activities. When the keywords used to categorize enterprise content match the organizational taxonomy, businesses can build consistent, sustainable custom applications and reports that rely on the keyword values for filtering and grouping.

Content management systems take varying approaches to enforcing the use of taxonomy keywords when content is tagged. IBM Content Manager uses an item type designation to determine what additional metadata and keywords should be required on content. DocuWare uses forms to assign an initial categorization and require additional data to be included in the content store. SharePoint uses content types to do the same. When content management systems enforce the taxonomy in a logical framework, they can then use keywords to customize or build applications including:

  1. Dynamic user pages. An example of this would be home pages for different departments. These home pages would use the same page layout, with web parts that pull information for the department or office of the user who is logged in. For example, two users from different departments would see announcements for their department only.
  2. Dynamic customer or project pages. Businesses can use customer or project information to dynamically generate home pages that show information such as contacts, invoices and to-do lists to complement or replace custom applications. Businesses can further limit available customers or projects by the user logged in and the customers assigned to them.
  3. User-selected content delivery. Organizations can improve following and subscribing to information by using keywords and tags, enabling users to select topics rather than specific pages. This promotes end user engagement in the success of enterprise search.
  4. Workflow triggers. These are often executed based on the value of a keyword. A good example of this would be assigning a task to a user and that user receiving an alert. This type of keyword trigger could also look at the date of an item and calculate a due date to determine whether something is overdue and kick off the appropriate automated activities associated with that status.
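The workflow trigger idea in the last item can be sketched in a few lines. This is a simplified illustration, not how any specific content management system implements workflows; the field names, the action strings and the 30-day payment term are all assumptions.

```python
from datetime import date, timedelta

PAYMENT_TERMS_DAYS = 30  # hypothetical rule: invoices are due 30 days after receipt


def workflow_actions(item: dict, today: date) -> list[str]:
    """Decide which automated actions to start, based on an item's keywords and dates."""
    actions = []
    # Keyword trigger: a status value of "assigned" alerts the assignee.
    if item.get("status") == "assigned":
        actions.append(f"notify:{item['assignee']}")
    # Date trigger: calculate the due date and check whether the item is overdue.
    if item.get("content_type") == "invoice":
        due = item["date_received"] + timedelta(days=PAYMENT_TERMS_DAYS)
        if today > due:
            actions.append("escalate:overdue")
    return actions
```

Because both triggers read ordinary metadata values, they only work reliably when those values come from the enforced taxonomy rather than free text.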

One-stop searching

A solid enterprise search strategy has multiple entry points to initiating a search -- with one engine delivering results -- and a common formal content taxonomy across multiple systems. In situations where the underlying systems use different keyword values -- one uses USA and the other United States, for example -- businesses should map values to one common keyword. 
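A minimal sketch of that mapping follows, assuming each entry point translates its own values before the query reaches the shared engine. The system names, the `CANONICAL` table and the "US" keyword are hypothetical.

```python
# Hypothetical per-system mappings from local values to one common keyword.
CANONICAL = {
    "crm": {"USA": "US", "U.S.": "US"},
    "sharepoint": {"United States": "US", "America": "US"},
}


def to_canonical(system: str, value: str) -> str:
    """Translate a system-specific value to the shared keyword; pass unknown values through."""
    cleaned = value.strip()
    return CANONICAL.get(system, {}).get(cleaned, cleaned)


def build_query(system: str, field: str, value: str) -> dict[str, str]:
    """Build the search criteria sent to the single search engine."""
    return {field: to_canonical(system, value)}
```

With this in place, a search for "USA" from the CRM entry point and a search for "United States" from SharePoint produce the same criteria, so the engine can return the same results.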

Inconsistent search results have a negative effect on user confidence. Whether users initiate search in a CRM platform, SharePoint or any other system that serves as a portal, the metadata included in the search criteria should be identical and the results should be the same. That goal is attainable, and businesses can reach it in an iterative fashion that involves end users.
