How Search Engines Might Use Knowledge Base Information for Underserved Queries
If I told you that the major search engines have a database bigger and richer than their index of the World Wide Web, would you believe me? Chances are that you’re one of the people who helped build it. The information that Google, Bing, and Yahoo collect about the searches, query sessions, and clicks that searchers perform on the Web covers an incredible number of searches a day. When Google introduced its Knowledge Graph this past May, it gave us a hint of the scope and usage of this database:
For example, the information we show for Tom Cruise answers 37 percent of next queries that people ask about him. In fact, some of the most serendipitous discoveries I’ve made using the Knowledge Graph are through the magical “People also search for” feature.
When someone performs a search for a query that doesn’t produce many results at Google or Bing, the search engines might remove some of the query terms to provide more results, or they might look for synonyms that could fill the same or a similar informational need. But chances are that such approaches still won’t produce the kinds of results that searchers want to see.
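To make those two fallbacks concrete, here is a minimal Python sketch of dropping terms and swapping in synonyms. It is only an illustration of the general idea, not how Google or Bing actually do it; the function name relax_query and the tiny SYNONYMS table are hypothetical, and a real engine would presumably learn synonyms from its query logs.

```python
# Hypothetical synonym table, for illustration only.
SYNONYMS = {"cheap": ["inexpensive", "budget"], "repair": ["fix"]}


def relax_query(query, synonyms=SYNONYMS):
    """Return looser variants of a query that returned too few results."""
    terms = query.lower().split()
    variants = []

    # Fallback 1: drop one term at a time (skip single-term queries,
    # which would otherwise become empty).
    if len(terms) > 1:
        for i in range(len(terms)):
            variants.append(" ".join(terms[:i] + terms[i + 1:]))

    # Fallback 2: replace each term that has known synonyms.
    for i, term in enumerate(terms):
        for alt in synonyms.get(term, []):
            variants.append(" ".join(terms[:i] + [alt] + terms[i + 1:]))

    return variants


if __name__ == "__main__":
    for variant in relax_query("cheap typewriter repair"):
        print(variant)
```

Each variant could then be run as its own search, though as noted above, loosening a query this way may still not surface what the searcher was really after.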
In How Google Might Suggest Topics for You to Write About, I covered a patent application (since granted; the Google patents are 8,037,063 and 7,668,823) co-authored by Jeffrey D. Oldham, Hal R. Varian, Matthew D. Cutts, and Matt Rosencrantz, which describes how Google might identify queries that don’t produce many relevant results, and what Google might do to increase the number of results shown.
A Microsoft patent granted last week covers similar ground. The patent describes how Bing might take advantage of a knowledge base made up of information gathered from query logs and search history to broaden the results available to searchers. It also describes a marketplace where those underserved queries might be made accessible to content creators, who could then produce relevant content for a fee. This echoes in many ways the approach that the Google patents describe.
In short, Bing might identify underserved queries, which are queries that return fewer than a certain threshold number of search results. It might find other similarly underserved queries that are related, and categorize those into a taxonomy category with a set of associated attributes. Content shown in search results might then match not just the query terms, but also associated queries from those related categories.
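The steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the method in the patent: the patent only says the queries are "heuristically related," so the shared-term grouping here is my own stand-in, and the names find_underserved, build_categories, STOPWORDS, and the threshold of 10 are all hypothetical.

```python
from collections import defaultdict

# Hypothetical stopword list; shared stopwords should not relate two queries.
STOPWORDS = frozenset({"a", "the", "for", "of", "in", "to", "how"})


def find_underserved(query_log, threshold=10):
    """query_log maps a query to its result count; keep those below threshold."""
    return [query for query, count in query_log.items() if count < threshold]


def build_categories(queries, stopwords=STOPWORDS, min_size=2):
    """Group queries that share a term (an assumed heuristic) into categories.

    Each category is keyed by the shared term and carries attributes:
    the other terms that appear in its member queries.
    """
    by_term = defaultdict(set)
    for query in queries:
        for term in set(query.lower().split()) - stopwords:
            by_term[term].add(query)

    categories = {}
    for term, members in by_term.items():
        if len(members) < min_size:
            continue  # a term seen in only one query relates nothing
        attributes = set()
        for query in members:
            attributes |= set(query.lower().split()) - stopwords - {term}
        categories[term] = {
            "queries": sorted(members),
            "attributes": sorted(attributes),
        }
    return categories


if __name__ == "__main__":
    # Made-up result counts, purely for demonstration.
    log = {
        "antique typewriter ribbon": 3,
        "typewriter platen repair": 5,
        "platen roller": 2,
        "weather": 100000,
    }
    for name, category in build_categories(find_underserved(log)).items():
        print(name, category)
```

In this sketch, the attributes attached to a category are what a content creator (or the search engine itself) would use to decide what targeted content might cover.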
We did get a glimpse of how Bing might generate and include knowledge base results within search results from a Microsoft patent application I wrote about in Should You be Doing Concept Research Instead of Keyword Research? The types of knowledge base results described in that approach would make search results much more useful by including information about the kinds of things that people commonly search for in connection with the original query. Bing Snapshot Results help expand search results in many useful ways. But how well do they help with long tail queries?
The Microsoft patent is:
Generating content to satisfy underserved search queries
Invented by Mark Looi
Assigned to Microsoft
US Patent 8,311,996
Granted November 13, 2012
Filed: January 18, 2008
Generating content to satisfy search engines queries is described. A knowledge base including a plurality of prior search queries for a search engine and corresponding prior search results provided by the search engine is accessed and a plurality of underserved search queries are identified, wherein each of the underserved search queries comprises a search query pattern having a below threshold number of search results.
Each of the underserved search queries are heuristically related to one another. The plurality of underserved search queries are aggregated into a taxonomy category having a set of associated attributes, the attributes descriptive of the plurality of underserved search queries. Targeted content is generated based on the attributes, wherein the targeted content is tailored to satisfy the underserved search queries.
Both the Microsoft and the Google approaches to underserved queries mention making it easy for people to find out which queries are underserved, and setting up some kind of marketplace where people might even be paid to create content for those queries. In some ways, both remind me of what Demand Media patented to try to capture underserved queries, as I described in How Demand Media May Target Keywords for Profitability.
I’m not sure that we are ever going to see either Google or Bing create such a marketplace. But I can see how each might provide better results by using its query logs and web search histories as a knowledge base: finding related search information, and identifying different aspects of long tail queries that might be answered by information from similar, related categories.