How Search Engines Might Use Knowledge Base Information for Underserved Queries

If I were to tell you that the major search engines have a bigger and richer database full of information than their index of the World Wide Web, would you believe me? Chances are that you’re one of the people who helped build it. The information that Google and Bing and Yahoo collect about the searches, query sessions, and clicks that searchers perform on the Web covers an incredible number of searches a day. When Google introduced their Knowledge Graph this past May, they gave us a hint of the scope and usage of this database:

For example, the information we show for Tom Cruise answers 37 percent of next queries that people ask about him. In fact, some of the most serendipitous discoveries I’ve made using the Knowledge Graph are through the magical “People also search for” feature.

When someone performs a search for a query that doesn’t produce many results at Google or Bing, the search engines might remove some of the query terms to provide more results, or they might look for synonyms that could help fill the same or a similar informational need. But chances are that such approaches still might not produce the kinds of results that searchers want to see.
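To make those two fallbacks concrete, here is a minimal Python sketch of query relaxation. It is only an illustration, not how Google or Bing actually do it: the `SYNONYMS` table and the `relaxed_variants` function are hypothetical names made up for this example.

```python
# Hypothetical synonym table; a real engine would draw on far richer data.
SYNONYMS = {"fix": ["repair"], "auto": ["car"]}


def relaxed_variants(query, synonyms=SYNONYMS):
    """Return looser versions of a query: term drops first, then synonym swaps."""
    terms = query.split()
    variants = []
    # Drop one term at a time, but never reduce the query to nothing.
    if len(terms) > 1:
        for i in range(len(terms)):
            variants.append(" ".join(terms[:i] + terms[i + 1:]))
    # Swap each term for each of its listed synonyms.
    for i, term in enumerate(terms):
        for alt in synonyms.get(term, []):
            variants.append(" ".join(terms[:i] + [alt] + terms[i + 1:]))
    return variants


print(relaxed_variants("fix auto radiator"))
```

Each variant could then be run as its own search, though as noted, a looser query is not guaranteed to return anything the searcher actually wanted.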

In How Google Might Suggest Topics for You to Write About, I covered a Google patent application (now granted as patents 8,037,063 and 7,668,823) co-authored by Jeffrey D. Oldham, Hal R. Varian, Matthew D. Cutts, and Matt Rosencrantz, which describes how Google might identify queries that don’t produce many relevant results, and what Google might do to increase the number shown.

A Microsoft patent granted last week covers similar ground. The patent describes how Microsoft might take advantage of a knowledge base made up of information gathered from query logs and search history to broaden the results available to searchers. It also describes a marketplace where those underserved queries might be made accessible to content creators, who could produce relevant content for a fee. This echoes in many ways the approach described in the Google patents.

In short, Bing might identify underserved queries, which are queries that return fewer than a certain threshold number of search results. It might find other similarly underserved queries that are related, and categorize those into a taxonomy with a set of associated attributes. Content shown in search results might then match not just the query terms, but also associated queries within those related categories.
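The steps in that summary can be sketched in a few lines of Python. This is a toy illustration under loud assumptions: the patent does not say which heuristic relates queries to one another, so term overlap (Jaccard similarity) is my stand-in, and the sample `QUERY_LOG`, the `THRESHOLD` value, and all function names are hypothetical.

```python
from collections import Counter

# Hypothetical knowledge base: prior queries and how many results each returned.
QUERY_LOG = {
    "repair vintage brass clarinet pads": 3,
    "vintage clarinet pads replacement": 5,
    "brass clarinet pads repair kit": 2,
    "tom cruise movies": 950_000,
    "replace moped carburetor gasket": 4,
}

THRESHOLD = 10  # assumed cutoff; the patent only says "below threshold"


def underserved(log, threshold):
    """Queries whose result count falls below the threshold."""
    return [query for query, count in log.items() if count < threshold]


def jaccard(a, b):
    """Term-overlap similarity between two sets of terms."""
    return len(a & b) / len(a | b)


def group_related(queries, min_similarity=0.3):
    """Greedily group each query with the first group whose seed query it resembles."""
    groups = []
    for query in queries:
        terms = set(query.split())
        for group in groups:
            if jaccard(terms, set(group[0].split())) >= min_similarity:
                group.append(query)
                break
        else:
            groups.append([query])
    return groups


def attributes(group, min_count=2):
    """Terms that appear in at least min_count queries of a group."""
    counts = Counter(term for query in group for term in set(query.split()))
    return sorted(term for term, n in counts.items() if n >= min_count)


for group in group_related(underserved(QUERY_LOG, THRESHOLD)):
    print(group, attributes(group))
```

With the sample data, the three clarinet queries end up in one group described by their shared terms, while the moped query sits alone. A real system would presumably lean on click and session data rather than bare term overlap.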

We did get a glimpse of how Bing might generate and include knowledge base results within search results from a Microsoft patent application I wrote about in Should You be Doing Concept Research Instead of Keyword Research? The types of knowledge base results described in that approach would definitely make those search results much more useful by including information about the kinds of things that people commonly search for that might be related to the queries originally searched for. Bing Snapshot Results help expand search results in many useful ways. But how well do they help with queries that might be long tail queries?

The Microsoft patent is:

Generating content to satisfy underserved search queries
Invented by Mark Looi
Assigned to Microsoft
US Patent 8,311,996
Granted November 13, 2012
Filed: January 18, 2008

Abstract

Generating content to satisfy search engines queries is described. A knowledge base including a plurality of prior search queries for a search engine and corresponding prior search results provided by the search engine is accessed and a plurality of underserved search queries are identified, wherein each of the underserved search queries comprises a search query pattern having a below threshold number of search results.

Each of the underserved search queries are heuristically related to one another. The plurality of underserved search queries are aggregated into a taxonomy category having a set of associated attributes, the attributes descriptive of the plurality of underserved search queries. Targeted content is generated based on the attributes, wherein the targeted content is tailored satisfy the underserved search queries.

Both the Microsoft and the Google approaches to underserved queries mention making it easy for people to find out which queries are underserved, and setting up some kind of marketplace where people could possibly even be paid to create content for those queries. In some ways, both remind me of what Demand Media patented to try to capture underserved queries, as I described in How Demand Media May Target Keywords for Profitability.

I’m not sure that we are ever going to see either Google or Bing create such a marketplace. But I can see how each might provide higher quality results by using its query logs and web search histories as a knowledge base, supplying related search information and locating different aspects of long tail queries that might fall into similar related categories.

13 thoughts on “How Search Engines Might Use Knowledge Base Information for Underserved Queries”

  1. It’s amazing their databases contain as much knowledge as you describe it shock me completely – But of course they have the knowledge are well gold! Could they just make their Google Translate better then I would be happy to just have Google translate and tried to read your blog in Danish, got quite sore head and eyes of all the strange phrases – So had to go back to English! (Hope you can read this post also written in Google translate) But thank you for an exciting SEO blog, it is bookmarked and I will follow your posts :)

  2. Hi Thomas,

    Google does collect an amazing amount of information before they ever leave their own query logs and go out onto the Web.

    Sorry about the headache. Your comment reads very well, so no issues with Google Translate there. :)

  3. This seems logical to me. I wasn’t surprised at all. What would be fascinating to know is what areas are the most “underserved”. By that I mean what industry or search topic. Keep up the great work. Still would LOVE a local search results article.

  4. Thanks Bill. One thing I am quite sure about is that the long tail keyword is going to disappear in time to come. People won’t need to put more effort into typing their search queries, as search engines will provide the best results with less effort.

  5. It’s no surprise that they hold a lot of contextual information regarding the searches. Data when mined is knowledge and any knowledge can be sold. It’s nice to hear your perspective on how this information can be used as a business model to generate revenue.

  6. Hi Robert,

    Not sure that it’s really something that can be broken down into specific categories or topics very easily. I’d imagine that there are many underserved queries spread out across a wide range of categories and topics.

  7. Hi Rajesh,

    I think that long tail queries will become more common rather than less, as people become more confident that they will receive good answers to more specific questions and queries.

  8. Hi Patrick,

    I didn’t write the patent, but we have a patent from Google and one from Microsoft now on how they could potentially pay people to create content that addresses some underserved queries. No signs from either that it’s something that they would actually do, but it’s interesting to see that the thought crossed their minds.

  9. Search engines paying people to produce content is an interesting concept (and a scary concept as well). I definitely wouldn’t be surprised if something like that was on the horizon.

    It would really take a lot of the excitement out of the internet and may discourage writers from wanting to continue writing content. I know one of the most exciting things for me has been writing a few articles here and there that show up on the first page of Google for underserved terms. It’s a fun way to get users to click through to your website.