How Google May Handle Question Answering when Facts are Missing


I wrote about a similar patent in the post Google Extracts Facts from the Web to Provide Fact Answers.

This one introduces itself with the following statement, indicating a problem that Google may have with answering questions from the facts it may collect from the Web to fill its knowledge graph:

Embodiments relate to relational models of knowledge, such as a graph-based data store, can be used to provide answers to search queries. Such models describe real-world entities (people, places, things) as facts in the form of graph nodes and edges between the nodes. While such graphs may represent a significant amount of facts, even the largest graphs may be missing tens of millions of facts or may have incorrect facts. For example, relationships, edges or other attributes between two or more nodes can often be missing.

That is the problem that this new patent is intended to solve. The patent was filed in November of 2017. The earlier patent I linked to above was granted in June 2017. That earlier patent does not anticipate missing or incorrect facts the way this newer one does. The newer patent tells us how Google might answer some questions even when some of the facts needed are missing from the graph.

It also reminds me of another patent that I recently wrote about on the Go Fish Digital website. That post is titled Question Answering Explaining Estimates of Missing Facts. Both the patent that post was about and this new patent list Gal Chechik, Yaniv Leviathan, Yoav Tzur, and Eyal Segalis as inventors (the other patent has a couple of additional inventors as well).

The earlier question answering with estimates patent talks about how Google might infer answers and provide explanations with those answers. This one also says Google might infer answers, but it doesn’t include the explanations:

Facts and/or attributes missing from a relational model of knowledge often can be inferred based on other related facts (or elements of facts) in the graph. For example, a search system may learn that an individual’s grandfather is a male parent of a parent. Accordingly, the system can determine with high confidence that an individual’s grandfather, even though there is no grandfather edge between nodes, is most likely a parent of a parent (given that there is a parent edge between nodes) with an additional check the parent of the parent is male. While this example uses one piece of supporting evidence (called a feature), inferring an individual’s grandfather, functions estimating missing facts are often more complex and can be based on several, even hundreds, of such features. Once the facts and/or attributes missing from a relational model of knowledge can be inferred, queries based on the facts and/or attributes missing from a relational model of knowledge can be resolved.

This question answering patent describes how Google may go about coming up with an answer to a question. It was filed after the patent that includes estimates of how answers were created, and it does not include that explanation step:

In one example embodiment, a computer system includes at least one processor and a memory storing a data graph and instructions. The instructions, when executed by the at least one processor, cause the system to generate a template sentence based on a fact including a first node, a second node and a string, wherein the first node and the second node exist in the data graph and the string represents a fact that is absent from the data graph, search the internet for a document including the template sentence, and upon determining the internet includes the document with the template sentence, infer the fact by generating a series of connections between nodes and edges of the data graph that together with the first node and the second node are configured to represent the fact, the series of connections defining a path, in the data graph, from the first node to the second node.
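To make those steps a little more concrete, here is a rough Python sketch of how I read them. It is only an illustration under my own assumptions: the triples, the one-item document list standing in for the internet, the "X is the ... of Y" template, and every function name are hypothetical and do not come from the patent.

```python
from collections import deque

# Hypothetical toy data graph: (subject, edge label, object) triples.
TRIPLES = [
    ("Ray Boone", "child", "Bob Boone"),
    ("Bob Boone", "parent", "Ray Boone"),
    ("Bob Boone", "child", "Aaron Boone"),
    ("Aaron Boone", "parent", "Bob Boone"),
]

# Stand-in for "the internet": a hypothetical list of document texts.
DOCUMENTS = [
    "Ray Boone is the grandfather of Aaron Boone, who also played in the majors.",
]


def template_sentence(first, string, second):
    """Fill a simple 'X is the <string> of Y' template (assumed form)."""
    return f"{first} is the {string} of {second}"


def found_in_documents(sentence, documents):
    """Return True if any document contains the template sentence."""
    return any(sentence.lower() in doc.lower() for doc in documents)


def find_path(triples, start, goal):
    """Breadth-first search for a chain of edge labels from start to goal."""
    neighbors = {}
    for subj, label, obj in triples:
        neighbors.setdefault(subj, []).append((label, obj))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for label, nxt in neighbors.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [label]))
    return None


def infer_fact(first, string, second):
    """Return the edge path that could stand in for the missing fact, or None."""
    sentence = template_sentence(first, string, second)
    if not found_in_documents(sentence, DOCUMENTS):
        return None  # no supporting document, so nothing is inferred
    return find_path(TRIPLES, first, second)


print(infer_fact("Ray Boone", "grandfather", "Aaron Boone"))
```

In this sketch, the missing "grandfather" string ends up represented by a path of two child edges from the first node to the second. A real system would presumably weigh many candidate paths and many documents; a plain breadth-first search is just the simplest thing that shows the idea.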

This process isn’t described in too much detail, but the patent does provide an example, which may be helpful in understanding how it may work. Here is that example:

For example, a node may correspond to a fact describing a parent-child relationship. For example, baseball player Bob Boone is the son of baseball player Ray Boone and the father of baseball players Aaron Boone and Bret Boone. Accordingly, the data graph may include an entity as a node corresponding to Bob Boone, which may include an edge for a parent relationship directed to Ray Boone and two edges for child corresponding, respectively, to Aaron Boone and Bret Boone. The entity or node may also be associated with a fact or an attribute that includes an edge (e.g., occupation) between Bob Boone as a node and baseball as a node. Alternatively, the node Bob Boone may include an attribute as a property (e.g., occupation) set to baseball.

However, there may be no edge in the entity (or the graph as a whole) corresponding to a grandparent relationship. Therefore, the relationship between Ray Boone and Aaron Boone may not be shown in the graph. However, the relationship between Ray Boone and Aaron Boone may be inferred from the graph so long as the question answering system knows (i.e., has been instructed accordingly) that there is such an entity as a grandparent.
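The Boone example can be written down as a tiny graph. This is a minimal sketch, assuming the data graph is stored as simple (subject, relationship, object) triples; that storage format and the helper names are my own choices for illustration, not something the patent specifies.

```python
# Edges of a toy data graph as (subject, relationship, object) triples,
# following the Boone example quoted above.
EDGES = {
    ("Bob Boone", "parent", "Ray Boone"),
    ("Bob Boone", "child", "Aaron Boone"),
    ("Bob Boone", "child", "Bret Boone"),
    ("Bob Boone", "occupation", "baseball"),
}


def objects(subject, relationship):
    """All nodes reached from `subject` over edges with this label."""
    return {o for s, r, o in EDGES if s == subject and r == relationship}


def has_relationship(relationship):
    """True if any edge in the graph carries this label."""
    return any(r == relationship for _, r, _ in EDGES)


def inferred_grandparents(person):
    """Parents of anyone who lists `person` as a child."""
    parents = {s for s, r, o in EDGES if r == "child" and o == person}
    return {gp for p in parents for gp in objects(p, "parent")}


print(has_relationship("grandparent"))       # no such edge is stored
print(inferred_grandparents("Aaron Boone"))  # found by chaining two edges
```

Nothing in the graph says "grandparent," yet chaining the child edge into Bob Boone with his parent edge reaches Ray Boone, which is the kind of inference the passage describes.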

The inference may be based on the joint distribution of one or more features, which represent facts in the data graph that are related to the missing information. The system may also be used to store the inferences (e.g., as functions or algorithms) and the semantically structured sentence (e.g., X is the attribute of Y) used to generate the inference. It then uses these entities to map new string that corresponds to relationships between nodes. By that system may be configured to learn new edges between existing nodes in the data graph. In some implementations, the system can generate an inference and its algorithm from a very large data graph, e.g., one with millions of entities and even more edges. The algorithm (or function) can include a series of connections between nodes and edges of the data graph. Accordingly, the algorithm can represent an attribute as an edge in a fact. The algorithm (or function) can also include a check of a property of a node (e.g., a gender property is male). While the system in FIG. 1 is described as an Internet search system, other configurations and applications may be used. For example, the system may be used in any circumstance where estimates based on features of a joint distribution are generated.
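One way to picture a stored inference, "a series of connections" plus "a check of a property of a node," is as a small reusable rule. The sketch below is an assumption-heavy illustration: the InferenceRule class, the gender properties, and the way new edges are produced are all hypothetical, and it leaves out the joint distribution of features that the patent says the inference may be based on.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class InferenceRule:
    """Hypothetical stored inference: an edge path plus an optional property check."""
    name: str                                   # label of the edge to learn
    path: Tuple[str, ...]                       # edge labels to follow, in order
    required_property: Optional[Tuple[str, str]] = None  # (property, value) on the end node


# Toy graph and node properties (assumed for illustration).
EDGES = {
    ("Aaron Boone", "parent", "Bob Boone"),
    ("Bret Boone", "parent", "Bob Boone"),
    ("Bob Boone", "parent", "Ray Boone"),
}
PROPERTIES = {
    "Bob Boone": {"gender": "male"},
    "Ray Boone": {"gender": "male"},
}


def follow(nodes, label):
    """Nodes reachable from any node in `nodes` over one edge with this label."""
    return {o for s, r, o in EDGES if s in nodes and r == label}


def apply_rule(rule, start):
    """Return the new edges the rule would add for `start`."""
    nodes = {start}
    for label in rule.path:
        nodes = follow(nodes, label)
    if rule.required_property is not None:
        key, value = rule.required_property
        nodes = {n for n in nodes if PROPERTIES.get(n, {}).get(key) == value}
    return {(start, rule.name, n) for n in nodes}


GRANDFATHER = InferenceRule("grandfather", ("parent", "parent"), ("gender", "male"))

print(apply_rule(GRANDFATHER, "Aaron Boone"))
```

Keeping the rule as data rather than hard-coded logic means the same function could apply other learned relationships by swapping in a different path and property check.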

The mentions of joint distributions in this patent are worth studying in more depth, because the relationships between properties of different entities may reveal information that is worth a system like the knowledge graph knowing about. The son of someone’s son is their grandson. If the knowledge graph doesn’t include that grandson property, then being able to make that connection means that a question answering system can start answering questions such as whether Aaron Boone is Ray Boone’s grandson. Relations beyond who is related to whom within a family can use this approach to answer questions as well.

The patent aimed at helping fill in missing and incorrect facts for question answering systems is:

Semi structured question answering system
Inventors: Yaniv Leviathan, Eyal Segalis, Yoav Tzur, and Gal Chechik
Assignee: GOOGLE LLC
US Patent: 10,346,485
Granted: July 9, 2019
Filed: November 8, 2017

Abstract

In one example embodiment, a computer system includes at least one processor and a memory storing a data graph and instructions. The instructions, when executed by the at least one processor, cause the system to generate a template sentence based on a fact including a first node, a second node and a string, wherein the first node and the second node exist in the data graph and the string represents a fact that is absent from the data graph, search the internet for a document including the template sentence, and upon determining the internet includes the document with the template sentence, infer the fact by generating a series of connections between nodes and edges of the data graph that together with the first node and the second node are configured to represent the fact, the series of connections defining a path, in the data graph, from the first node to the second node.

Last Update July 11, 2019.

9 thoughts on “How Google May Handle Question Answering when Facts are Missing”

  1. Hi Wolfgang,

    I think the questions are being asked by people. The Web may not be quite ready to answer many of them, and it is possible to see an evolution in how Google is working towards trying to answer such questions. Yes, the son of a son is a grandson – The Web may contain information about sons and not about grandsons yet, but if Google is able to make such a connection, and answer such questions, it is good to see.

  2. Hi AGenzia SEO,

    I think that is a very reasonable point. Spoken queries are likely to grow, and people will be asking more questions in the future, where they are looking for answers rather than just links to pages. I have a Google Speaker, which I ask questions most mornings, and it often provides me with answers and offers to send a link with more information to my phone. That is often a good experience for me.

  3. Hey Bill, these tips are absolutely incredible. Half of these things I didn’t even think about. I’m a beginner in online world and I found these very helpful. Thank you very much!

  4. Hi Nikhil,

    That is one of the reasons why I spend time looking at patents – because many of the topics covered in patents, I wouldn’t have thought about either.

  5. Hey Bill,

    Great tips, and really awesome. They will definitely help me better understand how Google executes queries and how it handles question answering. Thanks for sharing a great and valuable article!
