How Google May Add to its Knowledge Base with Entities and Attributes from Search Queries

Millions of searches stream into Google every day as people try to meet their informational and situational needs. But those queries don’t disappear once they have been answered. They provide Google with some very interesting and useful information in return. For instance, they tell Google what people are interested in, in real time – right at this moment.

Those queries can help Google populate its knowledge base with more information as well.

When Google collects information about entities – people, places, and things, including products and brands – it might collect information about the entities themselves as well as about the attributes associated with those entities.

A couple of days ago, the Google Research Blog told us about how Google might include that kind of factual information in search results, in what it called Structured Snippets. In that post, Google told us that it finds information like this in tables across the web.

A query about Canada's Provinces with a table in the search snippet

I hadn’t purposefully set out to find a snippet that actually included a table in it, but I asked for “all” the provinces in Canada, and that’s what I got – a snippet that included an actual table.

Information about entities is all over the Web, and it also fills queries people perform when they search on the Web. Considering that these queries are representative of things people look for when they search, they seem like a good source of information to use to find out more about entities and associated facts about them.

A Google patent describes how it might treat entities and attributes in queries that include proper names or somewhat generic attributes (it might ignore them). But what I found interesting was how Google tries to use linguistic patterns to find, identify, and extract entities and the attributes associated with them.

This process can be done without human oversight or intervention. It involves using search query logs to see what queries people searched for.

Given the number of searches that people perform at Google every day, there’s no shortage of queries to use.

I took the following examples of linguistic patterns, which Google might use to identify entities and the attributes related to them, from the patent.

The patent is:

Inferring attributes from search queries
Invented by Alexandru Marius Pasca and Benjamin Van Durme
Assigned to Google
US Patent 8,812,509
Granted August 19, 2014
Filed November 2, 2012

Abstract

Systems, techniques, and machine-readable instructions for inferring attributes from search queries. In one aspect, a method includes receiving a description of a collection of search queries, inferring attributes of entities from the description of the collection of search queries, associating the inferred attributes with identifiers of entities characterized by the attributes, and making the associations of the attributes and entities available.

Linguistic patterns can be used to infer entity attributes from search queries. This can be done for a log of search queries to identify attributes for entities.

One extract pattern can be used to scan keyword-based queries for text that matches the format “what is the <attribute> of <entity>.”

Examples:

  • What is the capital of Brazil?
  • What is the airspeed velocity of an unladen swallow?

Another extract pattern can be used to scan keyword-based queries for text that matches the format “who is the <attribute> of <entity>.”

Examples:

  • Who is the mayor of Chicago
  • Who is the CEO of Google

A third extract pattern might look through queries for text that matches the format “the <attribute> of <entity>.”

Examples:

  • the capital of France
  • the manager of the Yankees

And a different extract pattern may look for text that matches the format “who is the <entity>’s <attribute>.”

Examples:

  • who is the Yankees’ manager
  • who is the airplane’s pilot

An extract pattern can also scan keyword-based queries for text that matches the format “<entity>’s <attribute>.”

Examples:

  • Rosemary’s baby
  • Michelangelo’s David

This isn’t an exhaustive list of extract patterns, but it should give you an idea of how effective this approach could potentially be.
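To make the idea more concrete, here is a minimal sketch of how the five formats above might be matched against a query log. This is my own illustration, not code from the patent: the regular expressions, the function names extract and infer_attributes, the lowercasing, and the stripping of leading articles are all assumptions. Note that a pattern like “<entity>’s <attribute>” will happily treat a film title such as Rosemary’s Baby as an entity and an attribute, which is presumably one reason a real system would count how often a pairing shows up across many queries.

```python
import re
from collections import Counter, defaultdict

# Hypothetical regexes mirroring the formats quoted above.
# An optional leading article ("the", "a", "an") is dropped from the entity.
ARTICLE = r"(?:(?:the|an?) )?"
PATTERNS = [
    # "what is the <attribute> of <entity>" / "who is the <attribute> of <entity>"
    re.compile(r"(?:what|who) is the (?P<attribute>.+?) of " + ARTICLE + r"(?P<entity>.+)"),
    # "who is the <entity>'s <attribute>"
    re.compile(r"who is (?:the )?(?P<entity>.+?)['’]s? (?P<attribute>.+)"),
    # "the <attribute> of <entity>"
    re.compile(r"the (?P<attribute>.+?) of " + ARTICLE + r"(?P<entity>.+)"),
    # "<entity>'s <attribute>"
    re.compile(r"(?P<entity>.+?)['’]s? (?P<attribute>.+)"),
]

def extract(query):
    """Return an (entity, attribute) pair, or None if no pattern matches."""
    text = query.strip().rstrip("?").strip().lower()
    for pattern in PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return match.group("entity"), match.group("attribute")
    return None

def infer_attributes(query_log):
    """Count how often each attribute is seen for each entity in the log."""
    table = defaultdict(Counter)
    for query in query_log:
        pair = extract(query)
        if pair:
            entity, attribute = pair
            table[entity][attribute] += 1
    return table

log = [
    "What is the capital of Brazil?",
    "Who is the mayor of Chicago",
    "the mayor of Chicago",
    "who is the Yankees’ manager",
    "Rosemary’s baby",
    "cheap flights",  # matches nothing and is skipped
]
for entity, attributes in infer_attributes(log).items():
    print(entity, dict(attributes))
```

The counts are the useful part of this sketch. A single query proves very little, but an attribute that turns up for the same entity again and again across a large log is a much stronger signal.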

It’s interesting that, with the kinds of patterns shown above, Google isn’t trying to extract this information from pages published on the Web, but instead from that vast stream of queries from many searchers.

6 thoughts on “How Google May Add to its Knowledge Base with Entities and Attributes from Search Queries”

  1. Hi Bill,
    I loved the Monty Python and The Holy Grail reference (“What do you mean, African or European… swallow?”).

    Do you think that Google is using extract patterns because of an implied veracity in the sheer volume of its accessible data, so that it no longer needs (human) website data to correlate, just the search queries and all previous data? Do you think that this is the rise of semantic search automation and possibly true AI contextual understanding in search?
    Thanks!

  2. As usual Bill, thanks for the explanation and breakdown with this Google Patent.
    How do you suggest having this work in your favor when it comes to the knowledge graph results for organic search terms specifically?

  3. Dear Bill,
    I love your detail, especially when you said that Google collects information about entities – people, places, and things, including products and brands – and that Google isn’t trying to extract this information from pages published to the Web. It’s really great for marketing to a target audience.
    Thanks

  4. Hi Bob,

    I think Google sees a benefit to having both question answering results and query answering results available to answer questions with. They can provide facts from knowledge bases where facts have been extracted (they don’t need to rely upon volunteers to enter them into Freebase for them either), but Google can also provide a query result, even within the same snippet or at least on the same search result page, and provide people with opportunities to click through to pages about what they searched for. I think providing options and alternatives gives people a richer experience than they would otherwise have. It’s difficult trying to understand the context of a search and of a searcher’s informational need at the time of a search, and having the ability to display both types of answers is good for Google.

  5. Bill, that’s a great post.

    Google is making great use of users’ queries to add a semantic flavor to its algorithms.
    Users’ queries help Google a lot in determining the relationships between entities and attributes.

    With the introduction of semantic search, there will be little or no traffic for pages in the inner pages of SERPs.

    Google is clearly living up to its motto,
    “FAST and accurate results to users”.
    The future of SEO is 1st position, not first page.

  6. When Googling our business, the incorrect address pops up! How do we go about updating the address and phone numbers?

    Thank you for any assistance you may have to offer.

    Yours Truly,
    Perry Todoulakis
