Search technologies are hardly new, but a key challenge in enterprise search has been how to present users with a cohesive search experience from among multiple search engines and content stores. The results returned from multiple data sources aren't standardized or unified, a problem compounded by cloud adoption, which creates additional content stores and search experiences.
With the advent of technologies like SharePoint 2016 for content collaboration and Office 365, which was built for the cloud but can accommodate on-premises data stores, some of the historical issues with hybrid search are no longer onerous. SharePoint 2016 gives users access to hybrid search by extending the online search experience to include an on-premises environment. This allows Delve, a search service within Office 365, to uncover on-premises content and provide a cohesive, personalized content experience. A better-honed search tool can simplify not only a company's ability to perform e-discovery, but also its ability to search data repositories on a daily basis.
In principle, the act of search is simple and consists of two activities:
- Crawl content and build search indexes.
- Process user search queries and present neatly formatted and prioritized search results.
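The two activities above can be illustrated with a toy example. This is a minimal sketch, not how any SharePoint component is implemented: the document names, the `build_index` and `search` functions, and the term-count ranking are all assumptions made for illustration.

```python
from collections import defaultdict

def build_index(documents):
    """Crawl step: map each lowercase term to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, documents, query):
    """Query step: return ids of documents containing every query term, best first."""
    terms = query.lower().split()
    if not terms:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in terms))

    # Naive relevance (an assumption): count occurrences of the query terms.
    def score(doc_id):
        words = documents[doc_id].lower().split()
        return sum(words.count(t) for t in terms)

    return sorted(matches, key=lambda d: (-score(d), d))

# Hypothetical content store.
docs = {
    "a.docx": "quarterly budget report",
    "b.docx": "budget budget forecast",
    "c.docx": "holiday schedule",
}
index = build_index(docs)
print(search(index, docs, "budget"))  # ['b.docx', 'a.docx']
```

Real search engines add far more to both steps, such as content parsing, linguistic processing and richer ranking, but the split between building an index and querying it is the same.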
In complex environments with large volumes of content or geographically dispersed content sources, multiple instances of a search engine were often required, each building its own search index.
A traditional approach to addressing the dilemma from a user's search perspective was federated search. With federation, a search interface issues a request to multiple search engine indexes simultaneously and brings back multiple, disparate sets of search results.
Consider a search scenario using Google, Bing and Yahoo. A search page can be built that issues the same user request to all three search engines and displays the results from each in a separate area on the search page. A user sees all the results from the search engines on the same page, but the net effect is the same as if the user had visited each search engine separately and issued the same search request to each. Each search engine prioritizes its own results, but there is no prioritization across all search engine results, and the results are not interleaved.
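The limitation described above can be sketched in a few lines. The two engine functions, their URLs and their scores are hypothetical stand-ins, chosen only to show that each engine ranks on its own scale.

```python
def federated_search(engines, query):
    """Send the same query to every engine and keep each result list separate."""
    return {name: engine(query) for name, engine in engines.items()}

# Hypothetical engines: each returns (url, score) pairs ranked by its own scoring scheme.
def engine_a(query):
    return [("intranet/budget-2016.docx", 0.92), ("intranet/budget-faq.aspx", 0.40)]

def engine_b(query):
    return [("cloud/budget-summary.xlsx", 17.0), ("cloud/budget-notes.one", 3.5)]

results = federated_search({"on_premises": engine_a, "cloud": engine_b}, "budget")

# Each block of results is displayed separately. The scores are on different
# scales, so sorting the combined list by score would not be meaningful.
for source, hits in results.items():
    print(source, [url for url, _ in hits])
```

Because the engines share no common index or scoring model, the page can only show the result sets side by side, which is the behavior the scenario describes.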
Search aggregators are commonplace on the internet -- for example, on shopping and jobs websites. There are some caveats to this new hybrid search approach, including China-based data centers and government cloud customers.
When running Microsoft technologies, including SharePoint and OneDrive, and leveraging Microsoft search technologies, it might not be apparent why the federated search problem exists. There are several reasons, including the following:
- Large organizations spread across multiple countries and regions may require multiple on-premises SharePoint farms to manage the totality of their content.
- Organizations moving information to cloud apps, like SharePoint Online, Office 365 and OneDrive for Business, may continue to require on-premises versions of the apps to manage content not deemed suitable for cloud storage due to regulatory implications.
- Some organizations may require multiple cloud instances to segregate content in different geographic data centers.
The above scenarios can be addressed by using a federated search model, but for the reasons discussed, this is not the ideal solution. Microsoft's solution to the problem is the Cloud Hybrid Search Service Application, which was released last year.
The hybrid search service maintains a common search index in the Microsoft Cloud, even though content crawling can continue to occur locally in each disparate SharePoint farm. The service is available for SharePoint 2016 and SharePoint 2013.
Its operation is simple in concept, although its architecture is technically complex. SharePoint on-premises farms that implement the Cloud Hybrid Search Service continue to crawl and parse content locally; search index update requests, however, are made to the centralized SharePoint Online search engine in the Microsoft Cloud via the centralized search service indexing mechanism.
A search query issued from any of the SharePoint farms, in turn, uses the same cloud index to return search results. The Cloud Search Service Application can crawl SharePoint content from the 2007, 2010, 2013 and 2016 versions.
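The flow described above, local crawling with a single shared index, can be modeled conceptually as follows. This is a sketch under stated assumptions: `CloudIndex` and `Farm` are hypothetical classes invented for illustration and do not correspond to any Microsoft API.

```python
class CloudIndex:
    """Hypothetical stand-in for the single search index held in the cloud."""

    def __init__(self):
        self.entries = {}  # url -> (source farm name, set of terms)

    def update(self, farm_name, url, text):
        self.entries[url] = (farm_name, set(text.lower().split()))

    def query(self, term):
        term = term.lower()
        return sorted(url for url, (_, terms) in self.entries.items() if term in terms)


class Farm:
    """Hypothetical on-premises farm: crawls locally, indexes and queries centrally."""

    def __init__(self, name, content, cloud_index):
        self.name = name
        self.content = content  # url -> document text held in this farm
        self.cloud_index = cloud_index

    def crawl(self):
        # Content is read and parsed locally; only index updates go to the cloud.
        for url, text in self.content.items():
            self.cloud_index.update(self.name, url, text)

    def search(self, term):
        # Queries from any farm are answered from the one shared index.
        return self.cloud_index.query(term)


cloud = CloudIndex()
london = Farm("london", {"london/contract-a.docx": "supplier contract renewal"}, cloud)
sydney = Farm("sydney", {"sydney/contract-b.docx": "customer contract draft"}, cloud)
for farm in (london, sydney):
    farm.crawl()

print(sydney.search("contract"))
# ['london/contract-a.docx', 'sydney/contract-b.docx']
```

A user querying from the Sydney farm sees London content in the same result list, which is the difference from the federated model, where each index answers separately.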
Search-based applications offer far more capability than simply returning results for documents stored within SharePoint, OneDrive, file systems or line-of-business applications. By leveraging the search crawler's ability to crawl external data connections using SharePoint Business Connectivity Services, SharePoint can become the aggregator across all content stores within an organization, including line-of-business applications, such as ERP and customer relationship management; business support applications, such as human resources management systems and knowledge and collaboration applications; and social feeds.
Advantages of hybrid search include the following:
- Potentially reduce the complexity and cost of on-premises infrastructure.
- Improve the scale and performance of the online index.
- Leverage the Office 365 search experiences and incremental updates, as well as improvements in Delve and Office Graph. For Delve, imagine that documents find users based on relevance, rather than users finding documents.
- Provide a unified search experience across all content.
Challenges of hybrid search include the following:
- Performance and the need for reliable, always-on internet connectivity.
- Facilitating appropriate user directory synchronization and, hence, ensuring correct access to the correct search results.
- User training and support.
- Ensuring a correct global user and global content permissions model that enables the right access to the right content.
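The directory synchronization and permissions challenges listed above can be illustrated with a simplified model of result trimming. The account names, group names and `visible_results` function are hypothetical, and the sketch assumes each indexed item carries a set of groups allowed to see it.

```python
def visible_results(results, user, directory):
    """Return only the hits whose allowed groups overlap the user's synced groups."""
    groups = directory.get(user, set())  # an unsynced user resolves to no groups
    return [hit["url"] for hit in results if hit["allowed_groups"] & groups]

# Hypothetical synchronized directory: account -> group memberships.
directory = {
    "alice@contoso.example": {"finance", "all-staff"},
    "bob@contoso.example": {"all-staff"},
}

# Hypothetical hits from a unified index, each with its permitted groups.
results = [
    {"url": "onprem/payroll.xlsx", "allowed_groups": {"finance"}},
    {"url": "cloud/handbook.docx", "allowed_groups": {"all-staff"}},
]

print(visible_results(results, "alice@contoso.example", directory))
print(visible_results(results, "bob@contoso.example", directory))
print(visible_results(results, "carol@contoso.example", directory))  # not synced
```

In this model, an account missing from the synchronized directory sees nothing, and a wrong group mapping would expose or hide the wrong content, which is why both challenges affect the correctness of search results.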