“How Long is Harry Potter?” is asked in a diagram from a Google Patent. The answer to this vague question is unlikely to have anything to do with a length related to the fictional character, but may have something to do with one of the best-selling books or movies featuring Harry Potter.
When questions are asked as queries at Google, sometimes they aren’t asked clearly or precisely enough to make an answer easy to provide. So how does Google answer vague questions?
Question answering seems to be a common topic in Google Patents recently. I wrote about one not long ago in the post, How Google May Handle Question Answering when Facts are Missing.
This post is also about question answering, but it focuses on issues with the questions rather than the answers, and particularly on vague questions.
Early in the description for a recently granted Google Patent, we see this line, which is the focus of the patent:
Some queries may indicate that the user is searching for a particular fact to answer a question reflected in the query.
I’ve written a few posts about Google working on answering questions, and it is good seeing more information about that topic being published in a new patent. As I have noted, this one focuses upon when questions asking for facts may be vague:
When a question-and-answer (Q&A) system receives a query, such as in the search context, the system must interpret the query, determine whether to respond and if so, select one or more answers with which to respond. Not all queries may be received in the form of a question, and some queries might be vague or ambiguous.
The patent provides an example query for “Washington’s age.”
Washington’s Age could be referring to:
- President George Washington
- Actor Denzel Washington
- The state of Washington
- Washington D.C.
For the Q&A system to work correctly, it would have to decide which of the Washingtons a searcher who typed that query into a search box was most likely interested in finding the age of. When I tried that query, Google decided that I was interested in George Washington.
The problem that this patent is intended to resolve is captured in this line from the summary of the patent:
The techniques described in this paper describe systems and methods for determining whether to respond to a query with one or more factual answers, including how to rank multiple candidate topics and answers in a way that indicates the most likely interpretation(s) of a query.
How might Google resolve the vague questions problem?
Narrowing the question to a specific topic can make a difference. For example, which topic are people most likely asking about?
It would likely start by trying to identify one or more candidate topics from a query. Then, it may try to generate, for each candidate topic, a candidate topic-answer pair that includes both the candidate topic and an answer to the query for the candidate topic.
It would obtain search results based on the query, one or more of which reference an annotated resource. An annotated resource is one that, based on automated evaluation of its content, is associated with an annotation that identifies one or more likely topics associated with the resource.
For each candidate topic-answer pair, it would determine a score based on:
(i) The candidate topic appearing in the annotations of the resources referenced by one or more of the search results
(ii) The query answer appearing in annotations of the resources referenced by the search results or in the resources referenced by the search results.
A decision would then be made on whether to respond to the query with one or more answers from the candidate topic-answer pairs, based on the scores for each.
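The scoring step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from the patent: the `SearchResult` shape, the function names, and the simple "count the hits" scoring are all hypothetical, since the patent does not spell out a formula.

```python
from dataclasses import dataclass, field


@dataclass
class SearchResult:
    """Hypothetical search result: page text plus topic annotations for the resource."""
    text: str
    annotations: set[str] = field(default_factory=set)


def score_pair(topic: str, answer: str, results: list[SearchResult]) -> int:
    """Toy score for one candidate topic-answer pair.

    (i) counts results whose annotations contain the candidate topic;
    (ii) counts results where the answer appears in the annotations or the text.
    Adding the two counts is an assumption made for illustration.
    """
    topic_hits = sum(1 for r in results if topic in r.annotations)
    answer_hits = sum(
        1 for r in results if answer in r.annotations or answer in r.text
    )
    return topic_hits + answer_hits


def rank_pairs(pairs, results):
    """Return (score, topic, answer) tuples, highest score first."""
    scored = [(score_pair(t, a, results), t, a) for t, a in pairs]
    return sorted(scored, key=lambda s: s[0], reverse=True)
```

For a query like “How long is Harry Potter?”, the pairs passed to `rank_pairs` might be a film topic with a running time and a book topic with a page count. The pair supported by more of the returned results would rank first.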
Topic-Answer Scores for Vague Questions
The patent tells us about some optional features as well.
- The scores for the candidate topic-answer pairs would have to meet a predetermined threshold
- This process may decide not to respond to the query with any of the candidate topic-answer pairs
- One or more of the highest-scoring topic-answer pairs might be shown
- A topic-answer might be selected from one of several interconnected nodes of a graph
- The score for the topic-answer pair may also be based upon a respective query relevance score of the search results that include annotations in which the candidate topic occurs
- The score for the topic-answer pair may also be based upon a confidence measure associated with each of one or more annotations in which the candidate topic in a respective candidate topic-answer pair occurs, which could indicate the likelihood that the answer is correct for that question
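Here is a rough sketch of how the threshold, query relevance scores, and annotation confidence measures from the list above might fit together. Everything in it is an assumption for illustration: the `AnnotatedResult` type, the 0 to 1 ranges, and multiplying relevance by confidence are my own choices, not details given in the patent.

```python
from typing import NamedTuple, Optional, Tuple


class AnnotatedResult(NamedTuple):
    """Hypothetical search result with a relevance score and annotation confidences."""
    relevance: float          # query relevance score of the result (assumed 0..1)
    topic_confidence: dict    # annotated topic -> confidence measure (assumed 0..1)


def weighted_score(topic: str, results) -> float:
    """Sum relevance * confidence over results annotated with the topic."""
    return sum(r.relevance * r.topic_confidence.get(topic, 0.0) for r in results)


def choose_answer(pairs, results, threshold: float) -> Optional[Tuple[str, str]]:
    """Return the best (topic, answer) pair, or None to decline to answer.

    No answer is returned when there are no pairs or when the best score
    falls below the predetermined threshold.
    """
    scored = [(weighted_score(t, results), t, a) for t, a in pairs]
    if not scored:
        return None
    best_score, topic, answer = max(scored, key=lambda s: s[0])
    return (topic, answer) if best_score >= threshold else None
```

With a low threshold, the “Washington’s age” query would return the pair for whichever Washington the highly relevant, confidently annotated results point to. With a high threshold, the function returns `None`, which matches the option of not responding with any of the candidate pairs.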
Knowledge Graph Connection to Vague Questions?
This vague question-answering system can include a knowledge repository that includes several topics, including attributes and associated values for those attributes.
It may use a mapping module to identify one or more candidate topics from the topics in the knowledge repository, which may be determined to relate to a possible subject of the query.
An answer generator may generate, for each candidate topic, a candidate topic-answer pair that includes:
(i) The candidate topic, and
(ii) An answer to the query for the candidate topic, wherein the answer for each candidate topic is identified from information in the knowledge repository.
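The knowledge repository, mapping module, and answer generator could be pictured with a toy example like the one below. The repository contents are placeholders rather than real data. The word-overlap matching in `map_topics`, and the assumption that the requested attribute (such as "age") has already been extracted from the query, are both simplifications of mine and are not described in the patent.

```python
from typing import Dict, List, Tuple

# Toy knowledge repository: topic -> attribute -> value (all values are placeholders).
REPOSITORY: Dict[str, Dict[str, str]] = {
    "George Washington": {"age": "age of George Washington"},
    "Denzel Washington": {"age": "age of Denzel Washington"},
    "Washington (state)": {"age": "age of Washington state"},
    "Harry Potter (film)": {"length": "runtime of the film"},
}


def map_topics(query: str) -> List[str]:
    """Mapping module sketch: topics that share at least one word with the query."""
    query_words = set(query.lower().replace("'s", "").split())
    return [
        topic
        for topic in REPOSITORY
        if query_words & set(topic.lower().replace("(", "").replace(")", "").split())
    ]


def generate_pairs(query: str, attribute: str) -> List[Tuple[str, str]]:
    """Answer generator sketch: one (topic, answer) pair per candidate topic
    that has a value for the requested attribute in the repository."""
    return [
        (topic, REPOSITORY[topic][attribute])
        for topic in map_topics(query)
        if attribute in REPOSITORY[topic]
    ]
```

Calling `generate_pairs("washington's age", "age")` on this toy data yields one pair for each of the three Washington topics, which are then the candidates that the scoring step has to choose among.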
A search engine may return search results based on the query, which can reference an annotated resource. Based on automated evaluation of the content of the resource, a resource is associated with an annotation that identifies one or more likely topics associated with the resource.
A score may be generated for each candidate topic-answer pair based on:
(i) An occurrence of the candidate topic in the annotations of the resources referenced by one or more of the search results
(ii) An occurrence of the answer in annotations of the resources referenced by the one or more search results or in the resources referenced by the one or more search results.
A front-end system at one or more computing devices can determine whether to respond to the query with one or more answers from the candidate topic-answer pairs, based on the scores.
The additional features above for topic-answers appear to be repeated in this knowledge repository approach:
- The answering system may decide to respond or not to the query based on a comparison of one or more of the scores to a predetermined threshold
- Each of the topics in the knowledge repository can be represented by a node in a graph of interconnected nodes
- The returned search results can be associated with a respective query relevance score, and the scoring module can determine the score for each candidate topic-answer pair based on the query relevance scores of one or more of the search results that reference an annotated resource in which the candidate topic occurs
- For one or more of the candidate topic-answer pairs, the score can be further based on a confidence measure associated with each of one or more annotations in which the candidate topic in a respective candidate topic-answer pair occurs, or each of one or more annotations in which the answer in a respective candidate topic-answer pair occurs
Advantages of this Vague Questions Approach
- Candidate responses to a query can be scored so that a Q&A system or method can decide whether to respond to the query.
- If the query does not ask a question or if none of the candidate answers are sufficiently relevant to the query, then no response may be provided
- The techniques described here may interpret a vague or ambiguous query and provide a response that is most likely to be relevant to what a searcher was looking for when submitting the query.
This patent on answering vague questions is:
Determining question and answer alternatives
Inventors: David Smith, Engin Cinar Sahin, and George Andrei Mihaila
Assignee: Google Inc.
US Patent: 10,346,415
Granted: July 9, 2019
Filed: April 1, 2016
Abstract
A computer-implemented method can include identifying one or more candidate topics from a query. For each candidate topic, the method can generate a candidate topic-answer pair that includes both the candidate topic and an answer to the query for the candidate topic. The method can obtain search results based on the query, wherein one or more of the search results references an annotated resource. For each candidate topic-answer pair, the method can determine a score for the candidate topic-answer pair for use in determining a response to the query, based on (i) an occurrence of the candidate topic in the annotations of the resources referenced by one or more of the search results, and (ii) an occurrence of the answer in annotations of the resources referenced by the one or more search results, or in the resources referenced by the one or more search results.
Vague Questions Takeaways
I am reminded of a 2005 Google Blog post called Just the Facts, Fast when this patent tells us that sometimes it is “most helpful to a user to respond directly with one or more facts that answer a question determined to be relevant to a query.”
The different factors that might be used to determine which answer to show, if an answer is shown, include a confidence level, which may indicate how likely it is that an answer to a question is correct. That reminds me of the association scores of attributes related to entities that I wrote about in Google Shows Us How It Uses Entity Extractions for Knowledge Graphs. That patent told us that those association scores for entity attributes might be generated over the corpus of web documents as Googlebot crawled pages extracting entity information, so those confidence levels might be built into the knowledge graph for attributes that may be topic-answers for a question-answering query.
A webpage that is relevant for such a query and that an answer might be taken from may be used as an annotation for a displayed answer in search results.
This article has completely changed the way I view the way Google answers questions.
Hi Christopher,
I was excited to write about this patent because it provided a lot of insight into how Google went about trying to understand what was being asked, and choosing (and scoring) topic-answers. I also like that they included information about using knowledge base information if available as well.
Great post. You are the master. I constantly search your website to find ideas for my own small business online.
Thank you, Nhan.
Very interesting article, but Google has always had a problem with the answers. Therefore, the way you were prompted was whether we meant a specific topic. However, I feel that I do not quite understand Google’s interpretation sometimes.
Hi Lukasoo,
Google is working on trying to provide the best answers they can, but sometimes the questions can be difficult to understand. I came across another patent recently from Google that is very similar, and I will be posting about it soon. Hopefully they will get better at figuring out what is being asked.
Wow that analysis came out quick! Thanks Bill.
Very interesting reading this Bill, very knowledgeable. I enjoy reading your posts!
I found this post quite by accident. I’m amazed at the time and detail you’ve put into this by actually obtaining the patents from Google. However, does Big G have to disclose exactly how their patent works? Since the Patent Office is a government office, I doubt very seriously that anyone there would have the brainpower to actually check out whether it actually worked the way they say it does or not. Remember, you have to take a test to get into Google so I doubt they didn’t have someone there that could fool with this enough to mask the actual plan or results.
Anyway, this is a very thought-provoking post, and I’m impressed.
Hi Elmo,
Patents are provided for in the US Constitution, which empowers Congress to pass laws protecting inventors, and the first US patent, granted in 1790, was reviewed and approved by George Washington. The USPTO receives thousands of patent applications a year, and those are reviewed by prosecuting attorneys. Patents are often reviewed in court when there are charges of infringement, and the US federal court system reviews patents that have been approved by the USPTO. These reviewers and judges are not required to be inventors or scientists, but many of the attorneys who prosecute the granting of patents do have training in fields like engineering.
There are over 100,000 employees at Google these days, many of them with advanced degrees, and they work with each other on inventions that are patented, which are reviewed by others including Google’s lawyers. Google’s most well known patent is the one that the founders of Google worked on as students at Stanford University, and were granted an exclusive license to use. I’ve written over 1,200 posts here, many about patents, many from Google, and those have gone through the process of being reviewed by the USPTO. The people running Google are smart enough to file patents that describe changes to the way it works.
Bill,
Thanks so much for the excellent response. I didn’t get a chance to go through your entire site, but it must be a vast repository of excellent SEO technical stuff.
But to get back to your response, does Google have to tell the patent office every time it alters its algorithm? I mean, couldn’t they theoretically change it somewhat the day after going through the patent process? Not only that but wouldn’t the formula for Google’s algorithm be very complex? You’d need a staff of really competent programmers to be able to understand or back-engineer what Google was doing. At least that’s the way it would seem.
Thanks again, Bill, for your response.
~Elmo
Hi Elmo,
When a patent filer decides to change the process behind a patent, and they still want the new version protected, they file a continuation patent, which normally has the same title and description as the original version, but has updated claims and usually a statement saying it is a continuation patent. Since the patent office looks primarily at the patent claims when deciding whether to grant a patent, the claims are what it looks at when deciding whether to grant a continuation patent, so that it knows whether or not the patent infringes upon an existing patent and meets the guidelines for a new patent (which start at: new, nonobvious, and useful). Since the changes to the patent reflect changes that they have implemented, or likely will implement, in the process described in the patent, and they are competent, I tend to look at continuation patents as hints that they have likely implemented a patent and have updated the processes behind how they are using it. I recently wrote about how Google News Ranking signals were updated by continuation patents for the 5th time, and how Google’s Universal Search was updated by continuation patents for the 4th time. So yes, there is a process for handling such changes. A continuation patent takes on the filing date of the original patent when it is filed, and I like seeing them when they appear.
This is a really interesting post. I have noticed of late Google getting far better at answering questions.
I’ve been an avid fan of SEO by the Sea for many years now Bill and it’s posts like this which explain why. THANK YOU SO MUCH!
Graham
Thanks, Graham.
Very happy to hear that you are enjoying SEO by the Sea.
Very interesting information you shared in the above article. It’s very fascinating to know how Google’s brain thinks like a human and does the calculations to show the right results. SEO is more technical and powerful these days compared to the last decade. It is more about the technical augmentation of the website for search engines.
I guess it has improved a lot from what it used to be before. As time goes by it will get more and more relevant data and improve itself.