Companies are moving to cloud computing incrementally, particularly with applications like SharePoint.
Sometimes for regulatory reasons, and sometimes out of concern about a data breach, companies often find that moving all of their SharePoint data to the cloud isn't feasible because of the security issues it poses. As a result, most companies have opted for a hybrid SharePoint implementation, with some data that resides in the cloud and some data that resides only on corporate servers.
But an incremental approach poses problems for companies that have SharePoint in-house and in the cloud. Previously, while it was relatively easy to search on-premises content, accurate search in the cloud was another story. To date, SharePoint's search capabilities haven't been able to span content from both locations. Thankfully, Microsoft is bringing hybrid search capabilities to Office 365, the cloud-based suite of office productivity services.
Today, hybrid search is available only as a preview feature. It is exposed through the August 2015 Public Update for SharePoint 2013 and through the SharePoint Server 2016 IT Preview and the SharePoint Server 2016 Beta 2 release.
Because hybrid search is still in beta testing, you may encounter two issues. First, if on-premises SharePoint data is crawled too quickly, Office 365 may throttle the feeding process that delivers data to the Office 365 index, slowing search considerably. Second, there is a soft limit of 2 million items of on-premises content that can be indexed. If you exceed this limit, the index is likely to stop working.
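To make the 2 million item soft limit concrete, the following Python sketch adds up the item counts of a few content sources and defers any source that would push the total over the limit. It is a planning aid only, not part of SharePoint or Office 365: the function name plan_crawl, the source names and the item counts are all hypothetical, and the greedy priority order is an assumption rather than anything Microsoft prescribes.

```python
from typing import List, Tuple

# Soft limit cited above for on-premises items in the hybrid search preview.
SOFT_LIMIT_ITEMS = 2_000_000


def plan_crawl(
    sources: List[Tuple[str, int]], limit: int = SOFT_LIMIT_ITEMS
) -> Tuple[List[str], List[str], int]:
    """Accept content sources in priority order while they fit under the limit.

    Returns the sources to index, the sources to defer and the running total.
    """
    included: List[str] = []
    deferred: List[str] = []
    total = 0
    for name, item_count in sources:
        if total + item_count <= limit:
            included.append(name)
            total += item_count
        else:
            # Indexing this source would exceed the soft limit, so skip it.
            deferred.append(name)
    return included, deferred, total


# Hypothetical content sources and item counts, highest priority first.
sources = [
    ("intranet", 850_000),
    ("team-sites", 600_000),
    ("archive-2010", 700_000),
    ("hr-portal", 300_000),
]

included, deferred, total = plan_crawl(sources)
print(f"Index: {included} ({total:,} items); defer: {deferred}")
```

With these made-up numbers, the archive is deferred because it would bring the total to 2.15 million items, while the smaller HR portal still fits.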
Hybrid SharePoint search setup
To make hybrid indexing work, you need a few things. First, you need a supported version of SharePoint running in your data center. You also need an Office 365 subscription with access to SharePoint Online, the cloud-based version of SharePoint. Beyond that, you need a member server that resides in the same Active Directory forest as your SharePoint servers and is configured to provide directory synchronization services. If you don't meet these requirements, search won't work, and the directory sync process will fail. Finally, you need a couple of PowerShell scripts, which you can get from the Microsoft site.
The key to making hybrid search work is to build a cloud search service application, which Microsoft refers to as the Cloud SSA. You need to create only one Cloud SSA per SharePoint farm. In addition to creating a Cloud SSA, you have to establish Active Directory synchronization between your on-premises forest and Office 365 and create a search service account.
Once the basic infrastructure requirements have been met, set up server-to-server authentication between on-premises SharePoint and your SharePoint Online subscription. You can accomplish this by installing the Microsoft Online Services Sign-In Assistant and the Azure Active Directory module for PowerShell on your on-premises SharePoint servers. Once these components have been installed, use one of the previously mentioned PowerShell scripts (Onboard-HybridSearch.ps1) to enable server-to-server authentication.
Crawling SharePoint content works much the way it always has. In fact, you probably won't even have to worry about upgrading any content farms, because you can index SharePoint 2007, 2010 and 2013 content farms.
Even though several components must be in place for hybrid search to work, the search architecture is surprisingly straightforward. As you may recall, the Cloud SSA is the key component that makes everything work. The Cloud SSA crawls on-premises SharePoint content and then adds it to the index. The index resides on Office 365, so you don't have to worry about maintaining an index in your data center. This same index is used for content crawled from SharePoint Online.
To prevent sluggish search, Microsoft recommends that users initiate search operations from the SharePoint Online Search Center. It is possible to perform a hybrid search from an on-premises site search box, but doing so involves more overhead than searching directly from SharePoint Online. If a user chooses to search from an on-premises site search box, the Cloud SSA acts as a proxy that sends the query to the SharePoint Online index and returns the results to the user.
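The flow described above can be pictured with a small toy model in Python. It is only a conceptual sketch: the class names Item, CloudIndex and CloudSSA, their methods and the sample titles are invented for illustration and do not correspond to any SharePoint API. The model shows one index in Office 365 that holds both online and on-premises items, with the Cloud SSA feeding crawled content into it and relaying queries from an on-premises search box.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    title: str
    origin: str  # "on-premises" or "online"


@dataclass
class CloudIndex:
    """Stand-in for the single search index hosted in Office 365."""

    items: List[Item] = field(default_factory=list)

    def add(self, item: Item) -> None:
        self.items.append(item)

    def query(self, term: str) -> List[Item]:
        # Naive case-insensitive title match; real ranking is far richer.
        needle = term.lower()
        return [item for item in self.items if needle in item.title.lower()]


@dataclass
class CloudSSA:
    """Stand-in for the Cloud SSA: it feeds the cloud index and proxies queries."""

    index: CloudIndex

    def crawl(self, titles: List[str]) -> None:
        # On-premises content is crawled locally but stored in the cloud index.
        for title in titles:
            self.index.add(Item(title, "on-premises"))

    def proxy_query(self, term: str) -> List[Item]:
        # A query from an on-premises search box makes an extra hop through here.
        return self.index.query(term)


index = CloudIndex()
index.add(Item("Budget 2016 draft", "online"))  # crawled from SharePoint Online
ssa = CloudSSA(index)
ssa.crawl(["Budget 2015 archive", "Vacation policy"])  # hypothetical on-premises docs

direct = index.query("budget")       # user searches from SharePoint Online
proxied = ssa.proxy_query("budget")  # user searches from an on-premises site
print([(item.title, item.origin) for item in proxied])
```

Both paths return the same results because there is only one index. The proxied path simply adds a hop, which is the overhead the recommendation to search from the SharePoint Online Search Center is meant to avoid.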
Hybrid search will finally allow users to query both on-premises and online content with a single search, even if a company has legacy SharePoint servers running in its data center. Best of all, Microsoft has indicated that hybrid search is compatible with the Office Graph and that Delve, an Office 365 search and discovery service, can be populated with data from an on-premises SharePoint farm and from SharePoint Online. The bottom line is that searching within hybrid SharePoint deployments is about to become much easier.