In the past few years, enterprise search software has become hugely popular and vital to both corporate marketing on the Web and internal digital document research. Everyone is looking for ways to improve enterprise search. It's not hard to set up the tools, but it's also not something you can just install and then ignore.
In part two of this two-part interview, Guy Creese and Larry Cannell, both content management experts at Burton Group, a Midvale, Utah-based researcher, weighed in on some enterprise search basics and offered tips for IT managers.
SearchWinIT.com: How much development work is required with search technology?
Guy Creese: It's pretty easy. There are connectors to standard repositories, like [EMC] Documentum, Vignette, Interwoven and SharePoint. You may need additional development if you have a [custom] repository. Usually a company is up and running and can see most of its documents without trouble.
Larry Cannell: General Web search just crawls from one page to another. But that's not an efficient way to index documents. For one thing, links can be different every time. So connectors just extract data in an XML format that describes structure and content, and shove it into an indexer. The challenge of the integration is to either use an off-the-shelf connector or develop one yourself.
It's easier than people think. One user I know integrated 15 of their data sources into an Endeca engine within six months.
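As a rough illustration of the connector pattern Cannell describes, the Python sketch below turns records from a made-up repository into XML and feeds them to a toy indexer. Everything in it is an assumption for illustration: the record fields, the `extract_to_xml` function, the `ToyIndexer` class and the XML layout are hypothetical and do not reflect any vendor's actual connector or index format.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def extract_to_xml(record):
    """Hypothetical connector step: describe one repository record as XML."""
    doc = ET.Element("document", id=record["id"])
    ET.SubElement(doc, "title").text = record["title"]
    ET.SubElement(doc, "body").text = record["body"]
    return ET.tostring(doc, encoding="unicode")


class ToyIndexer:
    """Toy inverted index: maps each lowercase word to the IDs of documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def ingest(self, xml_text):
        # Parse the XML the connector produced and index every field's words.
        doc = ET.fromstring(xml_text)
        doc_id = doc.get("id")
        for field in doc:
            for word in (field.text or "").lower().split():
                self.postings[word].add(doc_id)

    def search(self, word):
        return sorted(self.postings.get(word.lower(), set()))


# Made-up records standing in for rows pulled from a custom repository.
records = [
    {"id": "po-1", "title": "Purchase order", "body": "Order for lab equipment"},
    {"id": "rx-7", "title": "Drug research", "body": "Trial results for lab samples"},
]

indexer = ToyIndexer()
for record in records:
    indexer.ingest(extract_to_xml(record))

print(indexer.search("lab"))
```

The point of the split is that the indexer never needs to know how a given repository stores its content; only the extraction function changes from one data source to the next.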
What are your recommendations for evaluating enterprise search vendors?
Creese: One thing you must do is be clear about what problem you are trying to solve. We are not yet at a point where any one search engine can do everything really well.
For example, e-commerce search is different from document search. On an e-commerce site, there is a lot of metadata about a product, such as size, weight and color. Endeca and Mercado are good at that. Other systems are better for when you are indexing documents without a lot of metadata. Google Web search is keyword based. When you search on myocardial infarction you will get documents with those two keywords. Keyword search hasn't figured out that it means the same thing as heart attack.
There are systems that look for word patterns and can figure out that these two phrases are equal. With Recommind's technology, for example, you can do a query [on myocardial infarction] and get [documents about] heart attack.
Sit down with people who will use the search. Ask, is it useful to you? Run a bake-off. Run queries that they will run. Do they get lost? Are they looking for other recommendations? Someone who is looking for a certain set of purchase orders is looking for something different than a researcher in a pharmaceutical company going through drug research. Both are search, but what works for one won't work for the other.
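The gap Creese describes between keyword matching and concept-aware search can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the documents are invented, the match is a simple phrase match, and the `SYNONYMS` table is written by hand. Systems that infer equivalence from word patterns would derive such relationships themselves, and this sketch does not represent how any named product works.

```python
# Invented sample documents for illustration only.
DOCS = {
    "d1": "patient admitted with myocardial infarction",
    "d2": "risk factors for heart attack in adults",
    "d3": "quarterly purchase orders",
}

# Hand-written equivalences (an assumption); a concept-aware engine
# would work these out from the content rather than from a fixed table.
SYNONYMS = {
    "myocardial infarction": ["heart attack"],
    "heart attack": ["myocardial infarction"],
}


def keyword_search(query, docs):
    """Return IDs of documents that contain the literal query phrase."""
    q = query.lower()
    return sorted(doc_id for doc_id, text in docs.items() if q in text.lower())


def expanded_search(query, docs, synonyms):
    """Search for the query phrase plus any phrases listed as equivalent."""
    phrases = [query.lower()] + synonyms.get(query.lower(), [])
    hits = set()
    for phrase in phrases:
        hits.update(keyword_search(phrase, docs))
    return sorted(hits)


print(keyword_search("myocardial infarction", DOCS))
print(expanded_search("myocardial infarction", DOCS, SYNONYMS))
```

Running both functions with the queries real users type is a small-scale version of the bake-off: the literal search finds only the document that uses the clinical term, while the expanded search also returns the one that says heart attack.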
Search technology has some care and feeding involved. How do companies organize the various tasks?
Creese: Companies go through stages. First they don't need search. Everyone knows where content resides, so why pay money for it? Then they realize they need it, so they think of search as a magic bullet. They might buy a search engine and install it and then realize they have to tune it. They need to get better at tagging their documents, maybe give different weights to documents. They may need to alter the user interface.
This is not onerous. Companies don't have huge armies [to do this work], but someone has to know the formats, whether to index, how quickly they refresh. If no one worries about this, then search becomes less effective. So beyond the eye candy, you have to know how to maintain and adjust your search.