How the Google Knowledge Graph Updates Itself by Answering Questions

The Future of Search is in Providing Knowledge to Searchers through a Knowledge Graph

Those of us who do Search Engine Optimization (SEO) have been looking at URLs filled with content and at the links between that content. Algorithms such as PageRank (based upon links pointing between pages) and information retrieval scores (based on the relevance of that content) have determined how pages rank in search results in response to the queries that searchers enter into search boxes. Web pages connected by links are information points (nodes) connected by edges. This was the first generation of SEO.

Many of the methods that we have used to do SEO will remain the same as new features appear in a Knowledge Graph-based search, such as knowledge panels, rich results, featured snippets, structured snippets, search by photography, and expanded schema covering many more industries and features than it does at present.

Search has been transforming. In 2012, Google introduced the knowledge graph, which they told us would focus on indexing things instead of strings. By “strings,” they referred to words that appear in queries and documents on the Web. By “things,” they referred to named entities, or real and specific people, places, and things. Previously, when people searched at Google, the search engine would show Search Engine Results Pages (SERPs) filled with URLs to pages that contained the strings of letters that they were searching for. Google still does that, but it is also shifting toward showing search results about people, places, and things.

Before introducing the Google knowledge graph, Google started an annotation framework project, which included a precursor to the knowledge graph. I wrote about that in Google’s Browseable Fact Repository – an Early Knowledge Graph.
[Image: browseable fact repository, an early Google Knowledge Graph]

After Working On the Annotation Framework, Google Acquired Metaweb

In addition to the annotation framework, Google acquired the company Metaweb, which had built a knowledge directory called Freebase. Google used Freebase to populate the knowledge graph with entities, attributes of those entities, and relationships between entities. I wrote about that in Google Gets Smarter with Named Entities: Acquires MetaWeb.

[Image: Metaweb entities in a Google Knowledge Graph]

Google started showing us in patents how they were introducing entity recognition to search, as I described in this post:
How Google May Perform Entity Recognition

Google now uses information from the knowledge graph to show us knowledge panels in search results that tell us about the people, places, and things they recognize in the queries we perform. So, in addition to crawling web pages and indexing the words on those pages, Google is collecting facts about the people, places, and things it finds on those pages. That is the knowledge graph in action.

How The Google Knowledge Graph Updates Itself When It Collects Information About Entities

Google has filed a few patents that tell us about the knowledge graph. A Google Patent that was just granted in the past week tells us how the Google knowledge graph updates itself when it collects information about entities, their properties and attributes, and relationships involving them. This is part of the evolution of SEO that is taking place today – learning how Search Engines are changing from returning search-based results to showing knowledge-based results. Here is an example of part of a knowledge graph:

[Image: a portion of a knowledge graph]

What does the patent tell us about knowledge?

This is one of the sections of the patent that details what a knowledge graph is like, and the kind of information Google might collect when it indexes pages these days:

Knowledge graph portion includes information related to the entity [George Washington], represented by [George Washington] node. [George Washington] node connects to [U.S. President] entity type node by [Is A] edge with the semantic content [Is A], such that the 3-tuple defined by nodes and the edge contains the information “George Washington is a U.S. President.” Similarly, “Thomas Jefferson Is A U.S. President” has the tuple of [Thomas Jefferson] node 310, [Is A] edge, and [U.S. President] node. Knowledge graph part includes entity type nodes [Person], and [U.S. President] node. The person type has connections from the [Person] node. For example, the type [Person] has the property [Date Of Birth] by node and edge and has the property [Gender] by node and edge. These relationships define in part a schema associated with the entity type [Person].
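To make the 3-tuple idea from that passage concrete, here is a minimal sketch of how those facts could be held as (node, edge, node) triples. The entity names and the [Is A] edge come from the patent excerpt; the "Has Property" edge label and the two helper functions are my own assumptions for illustration, not anything the patent specifies.

```python
# Facts stored as (subject, edge label, object) 3-tuples.
# "Has Property" is an assumed label for the schema edges; the patent
# excerpt only says that [Person] has those properties "by node and edge".
TRIPLES = [
    ("George Washington", "Is A", "U.S. President"),
    ("Thomas Jefferson", "Is A", "U.S. President"),
    ("Person", "Has Property", "Date Of Birth"),
    ("Person", "Has Property", "Gender"),
]


def subjects_of(triples, edge, obj):
    """Return every subject connected to `obj` by an edge labelled `edge`."""
    return [s for s, e, o in triples if e == edge and o == obj]


def objects_of(triples, subject, edge):
    """Return every object that `subject` points to through `edge`."""
    return [o for s, e, o in triples if s == subject and e == edge]


presidents = subjects_of(TRIPLES, "Is A", "U.S. President")
person_schema = objects_of(TRIPLES, "Person", "Has Property")

print(presidents)
print(person_schema)
```

The second lookup is the part that matters for the rest of this post: the properties hanging off an entity type are what define, in part, the schema for that type.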

Notice that SEO is no longer just about how often certain words appear on pages of the Web, what words appear in links to those pages, page titles and headings, alt text for images, and how often certain words get used or related words appear. Google is looking at the facts about entities, such as entity types like a “person,” and properties, such as “Date of Birth,” or “Gender.” We see the knowledge graph moving into other aspects of search at places like Google Trends and reverse image search, which I wrote about in Image Search and Trends in Google Search Using Freebase Entity Numbers.

Note that the quote also mentions the word “Schema” as in “These relationships define in part a schema associated with the entity type [Person].” As part of the transformation of SEO from Strings to Things, the major search engines joined forces to offer us information on how to use Schema for structured data on the Web, which provides a machine-readable way of sharing information with search engines about the entities that we write about, their properties, and relationships.

I’m writing about this patent because I am participating in an online Webinar about the Google Knowledge Graph and how it is being used and updated. The Webinar is tomorrow at:
#SEOisAEO: How Google Uses The Knowledge Graph in its AE algorithm. I haven’t been referring to SEO as Answer Engine Optimization, or AEO, and it’s unlikely that I will start, but I see it as an evolution of SEO.

I’m writing about this Google Knowledge Graph Patent because it starts with the following line, which it titles “Background:”

This disclosure generally relates to updating information in a database. This works with user input.

This line contrasts the patent’s approach with updates that rely on user input: the approach no longer needs users to enter data into a knowledge base. Instead, it describes how Google’s knowledge graph may begin to update itself.

Updating the Google Knowledge Graph

I attended a Semantic Technology and Business conference a couple of years ago, where the head of Yahoo’s knowledge base presented. He answered several questions in a question-and-answer session after he spoke. For example, someone asked him what happens when information in a knowledge graph changes and needs updating.

He answered that the knowledge graph is updated manually with new information.

That wasn’t a satisfactory answer because it would have been good to hear that the information from such a source could be easily updated, and it was a little difficult to hear that a search engine would need to work as a newspaper would. This may have been the answer that the people from Yahoo believed was the proper answer, and I’ve been waiting for Google to answer a question like this to see what their answer would be. That made seeing a line like this one from this patent interesting:

In some implementations, a system identifies information missing from a collection of data. The system generates a question to provide to a question answering service based on the missing information and uses the response from the question answering service to update the collection of data.

[Image: updating a knowledge graph]

The collection of data in this case would be a knowledge graph, and the patent provides details using language that reflects exactly that:

In some implementations, a computer-implemented method is provided. The method includes identifying an entity reference in a knowledge graph, wherein the entity reference corresponds to an entity type. The method further includes identifying a missing data element associated with the entity reference. The method further includes generating a query based at least in part on the missing data element and the type of entity reference. The method further includes providing the query to a query processing engine. The method further includes receiving information from the query processing engine in response to the query. The method further includes updating the knowledge graph based at least in part on the received information.

How does the search engine do this? The patent provides more information that fills in such details.

The approaches to achieve this include:

…Identifying a missing data element comprises comparing properties associated with the entity reference to a schema table associated with the entity type.
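A rough sketch of that comparison step, assuming the schema table is a simple mapping from an entity type to the properties expected for it. The [Person] properties are the ones named in the patent excerpt earlier in this post; the function name and the dictionary layout are hypothetical, chosen only to illustrate the idea.

```python
# Assumed layout: entity type -> properties the schema expects for it.
SCHEMA_TABLE = {
    "Person": ["Date Of Birth", "Gender"],
}


def missing_elements(entity_type, known_properties, schema_table=SCHEMA_TABLE):
    """List schema properties that the entity has no value for yet."""
    expected = schema_table.get(entity_type, [])
    return [prop for prop in expected if prop not in known_properties]


# An entity of type Person for which only one property is recorded so far.
george_washington = {"Gender": "Male"}

missing = missing_elements("Person", george_washington)
print(missing)
```

Anything the schema expects for the type, but that the entity does not yet carry, becomes a candidate for a question.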

…Generating the query comprises generating a natural language query. This can involve selecting, from the knowledge graph, disambiguation query terms associated with the entity reference, wherein the terms comprise property values associated with the entity reference, or updating the knowledge graph by updating the data graph to include information in place of the missing data element.
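Here is one way that query generation might look, as a sketch only. The question templates, the function name, and the choice to append known property values as extra terms are my assumptions; the patent says only that disambiguation terms can be property values associated with the entity reference.

```python
# Hypothetical natural language templates, keyed by the missing property.
QUESTION_TEMPLATES = {
    "Date Of Birth": "when was {name} born",
    "Parents": "who are the parents of {name}",
}


def build_query(name, missing_property, known_values, max_terms=2):
    """Build a question for a missing property, plus disambiguation terms.

    `known_values` maps properties the graph already holds to their values;
    a few of those values are appended to help pin down the right entity.
    """
    template = QUESTION_TEMPLATES.get(
        missing_property, "{name} " + missing_property.lower()
    )
    question = template.format(name=name)
    extra_terms = [str(value) for value in known_values.values()][:max_terms]
    return " ".join([question] + extra_terms)


query = build_query("Frank Zappa", "Parents", {"Profession": "musician"})
print(query)
```

The extra term is there so that a query about one Frank Zappa is less likely to pull back answers about a different entity with the same name.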

…Identifying an element in a knowledge graph based at least in part on a query record. Operations further include generating a query based at least in part on the identified element. Operations further include providing the query to a query processing engine. Operations further include receiving information from the query processing engine in response to the query. Operations further include updating the knowledge graph based at least in part on the received information.

The Google Knowledge Graph updates itself in these ways:

(1) The knowledge graph may identify elements to update based on a record of one or more previously performed searches.
(2) The knowledge graph may work with a natural language query, using disambiguation query terms associated with the entity reference. The terms comprise property values associated with the entity reference.
(3) The knowledge graph may compare the properties associated with the entity reference against a schema to find missing data elements, and then fill those in with the information it receives.
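Put together, the whole loop could be sketched as follows. This is an illustration under stated assumptions, not Google’s implementation: the graph is a plain dictionary, the query is a bare keyword string, and the query processing engine is a stand-in function (`canned_engine`) that returns canned answers, since the patent does not tell us how the real engine is called.

```python
def update_graph(graph, entity_types, schema, ask):
    """Fill missing properties by asking a query engine.

    graph:        entity name -> {property: value}
    entity_types: entity name -> entity type
    schema:       entity type -> list of expected properties
    ask:          callable taking a query string, returning an answer or None
    Returns the (entity, property) pairs that were filled in.
    """
    filled = []
    for entity, properties in graph.items():
        for prop in schema.get(entity_types[entity], []):
            if prop in properties:
                continue  # nothing missing here
            answer = ask(f"{entity} {prop.lower()}")
            if answer is not None:
                properties[prop] = answer  # update the graph in place
                filled.append((entity, prop))
    return filled


def canned_engine(query):
    """Stand-in for a query processing engine (hypothetical)."""
    answers = {"George Washington date of birth": "February 22, 1732"}
    return answers.get(query)


graph = {
    "George Washington": {"Gender": "Male"},
    "Thomas Jefferson": {"Gender": "Male", "Date Of Birth": "April 13, 1743"},
}
entity_types = {"George Washington": "Person", "Thomas Jefferson": "Person"}
schema = {"Person": ["Date Of Birth", "Gender"]}

filled = update_graph(graph, entity_types, schema, canned_engine)
print(filled)
print(graph["George Washington"])
```

Passing the engine in as a function keeps the sketch honest about what we don’t know: everything interesting about answer quality lives behind that call.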

The patent that describes how the Google knowledge graph updates itself is:

Question answering to populate knowledge base
Inventors: Rahul Gupta, Shaohua Sun, John Blitzer, Dekang Lin, and Evgeniy Gabrilovich
Assignee: Google
US Patent: 10,108,700
Granted: October 23, 2018
Filed: March 15, 2013

Abstract

Methods and systems are provided for question answering. In some implementations, a data element to be updated is identified in a knowledge graph, and a query is generated based at least in part on the data element. The query is provided to a query processing engine. Information is received from the query processing engine in response to the query. The knowledge graph is updated based at least in part on the received information.

Nicolas Torzec tweeted me a link to a paper published on the Google AI Blog, which shares several authors with this patent. It was from 2014 (a year after the patent in this post was filed). The paper explains in more detail how a knowledge graph might become more complete. As the Abstract of the paper tells us:

We discuss how to aggregate candidate answers across multiple queries, ultimately returning probabilistic predictions for possible values for each attribute. Finally, we evaluate our system and show that it can extract many facts with high confidence.

The paper is Knowledge Base Completion via Search-Based Question Answering. I recommend reading this paper along with the patent. It presents a much more nuanced look at some of the issues that the people working on this problem came across and some of the solutions they found to address them. One of the problems that they use to illustrate how this system works involves identifying the parents of Frank Zappa (his band was “The Mothers of Invention,” which gave that task some unique issues as well).
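The aggregation step mentioned in the paper’s abstract can be pictured with a very simple stand-in: add up the scores each candidate answer earns across several queries, then normalize so the totals read as probabilities. This is not the method the paper uses (it is worth reading for that); the candidate names and scores below are placeholders I made up purely for illustration.

```python
from collections import Counter


def aggregate(candidates_per_query):
    """Combine scored candidate answers from several queries.

    candidates_per_query: one list per query of (candidate, score) pairs.
    Returns candidate -> share of the total score, which sums to 1.
    """
    totals = Counter()
    for candidates in candidates_per_query:
        for value, score in candidates:
            totals[value] += score
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {value: score / grand_total for value, score in totals.items()}


# Placeholder results from three differently worded queries
# about the same missing attribute.
results = [
    [("Candidate A", 3.0), ("Candidate B", 1.0)],
    [("Candidate A", 2.0)],
    [("Candidate B", 1.0), ("Candidate C", 1.0)],
]

predictions = aggregate(results)
best = max(predictions, key=predictions.get)
print(predictions)
print(best)
```

A system built along these lines would likely only write a value into the knowledge graph when its share clears some confidence threshold, which fits the abstract’s emphasis on extracting facts “with high confidence.”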

Updating a knowledge graph using questions and answers like this appears to be difficult, and the problem still faces some challenges. It is also interesting to see what stage we are at in having problems like this addressed, so read both the paper and the patent carefully.

We have seen other approaches that look at a knowledge graph from other directions, such as:

3 Ways Query Stream Ontologies Change Search – this is about Google looking at query stream information to identify data that it can extract from the Web to use to build ontologies. By looking at searchers’ queries, in effect, it is crowdsourcing information about topics that may help build those ontologies.

Constructing Knowledge Bases with Context Clouds – This tells us about how Google could look at unstructured content that it might be able to use to build up knowledge bases. We see statements like this from the patent the post is about:

Extending the number of attributes known to a search engine may enable the search engine to answer more precisely queries that lie outside a “long tail” of statistical query arrangements, extract a broader range of facts from the Web, and/or retrieve information related to semantic information of tables present on the Web.

We haven’t quite reached the point where you can fully automate the updating or building of a knowledge base. Manually updating some knowledge graph information, such as information about sensitive topics that change, may still be necessary. We do have some examples of approaches that are underway that make such automated updates a possibility.

I’ve written a few posts about named entities. These are some that I wanted to share:

Last Updated June 26, 2019.

91 thoughts on “How the Google Knowledge Graph Updates Itself by Answering Questions”

  1. This is a great post about the fundamental shift in SEO and how search engines are getting smarter as they strive towards complete natural language processing. It’s imperative to manage business data and ensure clarity about your brand around the web.

  2. Hi Dan,

    That shift from strings to things is not so much about matching keywords, but instead about answering questions involving things. A search engine should be capable of filling in knowledge gaps in a knowledge graph about properties of an entity included in that knowledge graph, and by updating that graph, it becomes capable of answering questions about things that weren’t originally included in that knowledge graph. It is going to be interesting seeing how Google evolves beyond where it is at now.

  3. The connectedness of all things! 🙂

    I love that Google’s AI may be asking questions on Q&A forums to fill in gaps! It does pose the new ethical question – are we the bots now?

  4. Hi Dixon,

    Maybe not a Q&A forum or an Ask Yahoo!, but lots of people ask Google questions daily. If queries can be rewritten and disambiguated, and new trending answers can be discovered, the knowledge graph can be updated quickly when there are gaps in knowledge. We will see how that works. 🙂

  5. Just wait until Google Duplex (the 2 way conversation phone calling bot)is calling professors with data from Google + to get the answers to those tough questions. Maybe suggesting they write a blog post about the topic.

  6. Hi Loren,

    That would be the phone-a-friend option? That sounds similar to what Google did around 2 years ago to start providing better health-related featured snippets. They hired a number of doctors to write out answers. We shall see what Google does.

  7. very informative post. It’s a new thing to my knowledge. Thanks for sharing..

    Wikipedia is the most common warehouse of knowledge. I like to grab a sentence from your post.

  8. Hi Emmanuel,

    Thank you for asking. We don’t know how much Google might use DBpedia. You may find some papers from some academics, but Google doesn’t refer to DBpedia in any patents that I can recall. If you read the patent I wrote about, it does talk about the knowledge graph as if it is an entity on its own that may use other informational sources. Google does refer to Wikipedia in at least one patent where they talk about how they may extract information from information found on Wikipedia, and if you are curious about that, you can find it here:

    Extracting Facts for Entities from Sources such as Wikipedia Titles and Infoboxes
    https://www.seobythesea.com/2014/08/extracting-facts-for-entities-from-sources/

    Wikipedia and Google are separate entities, and while they do communicate, they fill separate purposes on the Web.

  9. Great job on the Google’s Knowledge Graph article Bill! It’s fascinating to learn about some of the complexities behind how the Google’s search engine gathers and distributes useful information. Very complex. I’m glad that we have great minds like yours on top of it!

  10. Awesome, you picked the great topic. I always love to read you post. Really appreciate you. Thanks for commenting session.

  11. This is a great insight into the Google Knowledge Graph. Thanks for this information. Search engines are becoming smarter day by day and the Knowledge Graph is one of the tools, contributing to it.

  12. Hi,
    Enjoyed reading the article above , really explains everything in detail,the article is very interesting and effective.Thank you and good luck for the upcoming articles.

  13. Thank You so much for this informative post Bill. Thanks for sharing how you are doing it and I am sure a lot of people will be helped through the resource you shared.

  14. Hi Bill,

    Insightful as always; thank you.

    I’m particularly intrigued by this paragraph;
    “Note that SEO is no longer just about how often certain words appear on pages of the Web, what words appear in links to those pages, in page titles, and headings, alt text for images, and how often certain words may be repeated or related words may be used. Google is looking at the facts that are mentioned about entities, such as entity types like a “person,” and properties, such as “Date of Birth,” or “Gender.””

    There are a number of takeaways from this but one which I’m sure will stir up debate is the part about words appearing in links. Even today we still see websites where SEO’s are building lots of anchor text links from PBN sites, convinced that this will reap rewards. These same people insist that SEO is all about the number of times they can introduce a keyword into the page content, either visible or in the code (I was only looking at a really poor example this morning).

    I was wondering, if this shift reduces the reliance on ‘on-page’ cues, does that mean that poorly optimised websites, with the relevant information scattered on the page, will rank, even if they don’t provide a good user experience? In other words, do you think people can now schema their way to the top irrespective of UX?

    Thanks in advance.

    Jonathan

  15. Hi Jonathan,

    The focus is now less upon how relevant a page may be for a specific query term, and more upon how well a page may provide answers about specific entities, their properties, and relationships between entities. For instance, when someone asks Google, “In what year did Einstein publish his theory of relativity?” they are looking for an answer to when a specific person published a paper or book about physics. It is no longer about links or the use of Private Blog Networks (which are an attempt to manipulate ranking signals at Google). At one point, the focus of rankings of pages was upon information retrieval relevance scores, and authority scores based upon things such as PageRank. Those things still have value today, but we don’t know how much longer they might. Google is shifting to a different way of weighting the value of pages (one that won’t be manipulated by things such as PBNs) and the ability of those pages to answer queries. What role might User Experience have? That remains to be seen. It is likely not a matter of click-throughs and dwell time, as some people hypothesize, though Google will likely find value in sending searchers to pages that might have high quality scores.

  16. The SEO process is rapidly changing nowadays and google brings a lot of changes in 2018 that give website owner to rank high on SERP. The new updates from Google make us be more focused on the website.

  17. I think every website is different Bill, and what I mean by that is the competition etc, I have ranked our website for some very sought after keywords without any featured snippets, structured snippets or even Google maps, although there is one keyword I’m still after,you reckon featured snippets, structured snippets would help get us there? Any Tips Guru Bill? Awaiting your reply…Oh P.S: Greetings from Ireland 🙂

  18. Hi SEO,

    If you know of any questions that clients ask about the services that you offer, having a featured snippet that might answer that question may be worth pursuing. Find those questions, identify or create pages that answer them, and provide answers that will satisfy searchers. Include those questions in the ideal (or canonical) version that people are querying. You can use bulleted lists to do that, with enough list items that a featured snippet will not show the whole answer but will link to it, so that people would visit your page if your answer becomes the featured snippet. You can check whether the specific questions you choose already return a featured snippet, in which case the page you create might be returned instead (the potential exists to replace another featured snippet). A structured snippet is one created from a data-based table, so if you create tables that Google might choose answers from, and they rank in Google’s experimental table search results, they may become candidates for structured snippets. To build such a table, again look at the questions that people might ask that you might want to rank for, and create content that provides answers to those questions, using informative headings in the rows and columns of those tables (and captions and headings for the tables).
  19. Hi SEO
    You have describe very well in this post. great job Thanks for sharing this informative article.

  20. It is awesome. I appreciated it. Your blog information is very knowledgeable, and I like your style to explain your blogging skills. Your blogging toolkit information is excellent and genuine. Thank you for sharing.

  21. This is a wonderful piece of content to starts a day, This is the post I am reading in this morning and it actually helped me in unlocking the code “how Google’s Knowledge Graph works”

  22. Hi Bill,

    It was great to ready your post. So informative. Thanks for explaining every detail – it really helps is understanding the over all picture too.

    Cheers
    Daksh

  23. Amazing work. Please keep continue your good work and keep posting these interesting articles. this post is very helpful, Thanks you shared great content.keep it up

  24. Very clear explanation about knowledge graph… thanks for sharing the informative article…

  25. Hi,
    This is nice post for google knowledge graph and having right articles to see you here and thanks a lot for sharing with us.

  26. Incredible, mind blowing, awesome blog post. Your analysis is great for Google. I am so impressed with this post. Keep going and sharing your analysis in form of these wonderful blog post.

  27. Thanks for this article, Bill. You have focused on not so popular topic. Thanks to your text many people will learn about what the Google’s Knowledge Graph is.

  28. This is a great post about the fundamental shift in SEO and how search engines are getting smarter as they strive towards complete natural language processing. It’s imperative to manage business data and ensure clarity about your brand around the web.

  29. I like your blog post. Keep on writing this type of great stuff. I’ll make sure to follow up on your blog in the future.

  30. very informational and helpful post. It clear my concepts about google and its knowledge graph.Thanks for sharing..

  31. Wow Bill very interesting article about “that shift from strings to things”.

    I landed on your website from seoskeptic.com, first time I read your blog, I will come back for sure.

    From the paper published on the Google AI Blog, when it’s written:


    We discuss how to aggregate candidate answers across multiple queries, ultimately returning probabilistic predictions for possible values for each attribute. Finally, we evaluate our system and show that it is able to extract a large number of facts with high confidence.

    In other words, does it mean Google Knowledge Graph is browsing the WWW to compile structured or unstructured answers to a missing information and based on the authoritativeness and other factors decide which information is accurate and should be included in the Knowledge Graph?

  32. Amazing work. Please keep continue your good work and keep posting these interesting articles. this post is very helpful, Thanks you shared great content.keep it up

  33. very well. I really enjoy reading my blog and I will definitely bookmark it! Continue the interesting post

  34. I am reading a blog on this website for the first time and I would like to tell you that the quality of the article is up to the mark it is very well written.

  35. Thank you so much for this valuable and informative post.
    Really very meaningful blog with lots of Knowledge and Ideas

  36. Hey Bill,

    It was great to read your post. So instructive. Thanks for describing every detail – it surely helps us to understand the overall picture too.

    Cheers
    Drake

  37. The way you explain is like WoW.

    i almost read 3 blogs and every blog article is Superb.

    Nice Work

  38. Thank you for sharing your article on Knowledge Graph Updates which Most People ignore ,with us and it’s terribly helpful for Beginners. This website is informative web log and that i feel very lucky to read your Blog and keep share your articles and keep occurring.

  39. I really like your wonderful post. I loved to read such kind of article. and the first time I visit our website. and I happy to be here, thanks for sharing this amazing post

  40. Search engines are becoming smarter day by day and the Knowledge Graph is one of the tools, contributing to it. This is a great post about the fundamental shift in SEO and this is a great insight into the Google Knowledge Graph. Finally, we evaluate our system and show that it is able to extract a large number of facts with high confidence.

  41. Hi,
    This is nice post for how google updates itself answering questions and having right a articles to see you here and Thanks a lot for sharing with us.

  42. Your blog post gave an in-depth knowledge about how Google is becoming smarter day by day. Digital Marketing Analysts like me need such information to experiment in my profession.

    Thanks a lot for this information

  43. Thanks, for the blog…
    I like your knowledge about this topic and most important, way you explain things are really great. Believe me before reading your blog I have no idea about what is google knowledge graph is all about but now reading this I am sure that I can explain someone else too. 🙂

  44. Hi Bill Slawski
    I liked it lot. Your blogging style is amazing and very knowledgeable information you have written here. I appreciate it. Keep it up…
    Thanks!

  45. Hey Bill,

    thanks for the post it was a eye opener for me. So instructive. Thanks for describing every detail – it surely helps us to understand the overall picture too.

  46. It is true that they have the whole process automated. But when people report a certain issue, let’s say ‘Abraham Lincoln’ age is wrong or he is shown as alive in Google profile on search engines, they will resort to manual review. Because altering the algorithm will take a lot of time and might ruin other search results.

  47. Google has developed approaches involving normalization of data to try to avoid facts being incorrect. There isn’t always a need for a manual review. This is why NAP (Name, Address, Phone information in local SEO) is as important as it is.

  48. Your blog post gave an in-depth knowledge about how Google is becoming smarter day by day. Digital Marketing Analysts like me need such information to experiment in my profession.

    Thanks a lot for this information

  49. Knowledge graph is very informational resource for any topic. You have explained every minor detail related to the topic so i am really thankful to you for sharing your ideas and suggestions.

  50. This is a great post about the fundamental shift in SEO and how search engines are getting smarter as they strive towards complete natural language processing. It’s imperative to manage business data and ensure clarity about your brand around the web.

  51. Thanks, for the blog…
    I like your knowledge about this topic and most important, way you explain things are really great. Believe me before reading your blog I have no idea about what is google knowledge graph is all about but now reading this I am sure that I can explain someone else too. 🙂

  52. A search engine should be capable of filling in knowledge gaps in a knowledge graph about properties of an entity included in that knowledge graph, and by updating

  53. Bill, Awesome!

    its really great knowledge based articles. its about search engine should be capable of filling in knowledge gaps in a knowledge graph, Thanks for sharing informative articles!

  54. I am new in SEO and your post helps me a lot to know what is Google Knowledge Graph. Thanks for sharing the information in detailed.

  55. Hi Sangam,

    I find value in reading what Google is protecting as intellectual property when they write about processes that they are using to rank pages, and to display search results. Patents provide a chance to learn about the problems that search engineers are interested in solving, and they tell us what they are thinking about when they think about search, the Web, and searchers. There is value in their words. I don’t read patents because I am interested in how patents work, but because they are about search by people who are running a very large and influential search engine.

  56. Great read Bill! Over the last while, there’s really been a big shift away from quantitative factors, towards the qualitative. I think it’s the right direction, to gradually do away with all the raw metric exploitation that we saw and move towards a more “meaningful” user experience; a truly helpful outcome to every search query.

  57. Hi Nik,

    The knowledge base reconciliation process makes a lot of sense. Updating the knowledge graph from sources such as the news makes a lot of sense.

  58. Hey Billl
    The things you mentioned about knowledge graph is very important that every person who are doing SEO will know that as you covered every useful information about knowledge graph. I hope i will follow your techniques for the beneficial of my website & tell everyone of my team to read your blog
    Thankyou

  59. How does Google show some 3rd party sites in KG, while I search any generic term? I don’t know if Google trust those sites or there is any other way to get included into Google Knowledge Graph as reliable source. It shows something like (source: website name) for some the most competitive keywords and 3rd party sites are dominating in KG section.

  60. Hi Gaurav,

    It’s a little difficult to decide what you are actually asking: whether you are asking why there is a knowledge graph for something, or why a site like Wikipedia is quoted in a knowledge graph for a particular place or thing or business. Google often chooses information from knowledge bases to show about a topic in knowledge graphs, such as Wikipedia or IMDB, but they have been showing some information about the topic of a knowledge graph from sites that appear to be authoritative for the topic of a knowledge graph.

    Google decides to show knowledge graphs for entities, or specific people, places or things that may be in a query that someone performs. There are a few ways that Google may recognize that something is an entity, such as seeing a Wikipedia article about it (if it is considered notable enough to have one) or if it is a business that is verified in Google My Business.

Comments are closed.