Spoken Queries and Stressed Pronouns
Search on the Web will likely rely more and more on voice, as more people connect to the Web with phones and as Google has added a voice search interface to its search on desktop computers.
I recently ran across a patent that focuses on a problem that can arise with spoken searches. I thought it was worth writing about, because it is something we will need to become familiar with as voice search becomes more commonplace.
When Amit Singhal showed off Google’s Hummingbird update, he gave a presentation that showed Google handling searches involving pronouns. It’s worth watching for the information about Hummingbird, but also for what it shows about how Google is becoming more conversational and can handle things like stressed pronouns. The video is at:
I remembered that presentation about Hummingbird and a more conversational Google when I saw this patent from Google, which explains some of the technology behind aspects of conversational search:
Resolving pronoun ambiguity in voice queries
Inventors: Gabriel Taubman and John J. Lee
US Patent 9,529,793
Granted: December 27, 2016
Filed: February 22, 2013
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resolving ambiguity in received voice queries. An original voice query is received following one or more earlier voice queries, wherein the original voice query includes a pronoun or phrase. In one implementation, a plurality of acoustic parameters is identified for one or more words in the original voice query. A concept represented by the pronoun is identified based on the plurality of acoustic parameters, wherein the concept is associated with a particular query of the one or more earlier queries. The concept is associated with the pronoun. Alternatively, a concept may be associated with a phrase by using grammatical analysis of the query to relate the phrase to a concept derived from a prior query.
I have written about some papers from Google researchers on pronouns in the post Searching with Pronouns: What are they? Coreferences in Followup Queries.
But the patent granted this week has an example worth sharing, about an aspect of conversational search that those papers did not cover: stressed pronouns. Here is the example:
A voice query asks: “Who was Alexander Graham Bell’s father?”
The answer: “Alexander Melville Bell”
A followup voice query: “What is HIS birthday?”
The answer to the followup query: “Alexander Melville Bell’s birthday is 3/1/1819”
The point of the example is that the search engine decides to give the searcher the birthdate of the inventor’s father. It does so because the “HIS” in the second query was stressed, which indicates that it refers to the father and not to the son mentioned in the first query.
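To picture that decision, here is a minimal sketch in Python. It is my own illustration, not code from the patent. It assumes the conversation is kept as a simple list of concepts, oldest first, and that a stressed pronoun points at the most recent concept while an unstressed one points at the concept just before it. The function name `resolve_pronoun` and the "one step back" choice for unstressed pronouns are hypothetical simplifications.

```python
def resolve_pronoun(concepts, stressed):
    """Pick the concept a pronoun most likely refers to.

    concepts: concepts raised in the conversation so far, oldest first
              (an assumed, simplified representation of query history).
    stressed: True if the pronoun was spoken with stress.
    """
    if not concepts:
        return None  # nothing earlier in the conversation to refer to
    if stressed or len(concepts) == 1:
        # Stressed pronoun: assume it points at the most recent concept.
        return concepts[-1]
    # Unstressed pronoun: assume it points at the concept before that.
    return concepts[-2]


history = ["Alexander Graham Bell", "Alexander Melville Bell"]
print(resolve_pronoun(history, stressed=True))   # Alexander Melville Bell
print(resolve_pronoun(history, stressed=False))  # Alexander Graham Bell
```

With the example above, a stressed “HIS” picks the father, and an unstressed “his” falls back to the inventor.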
The patent tells us of a “stress score” for spoken words or phrases in a voice query that could include “volume, pitch, frequency, duration between each spoken words, and spoken duration of words or phrases,” and it tells us that “By comparing the stress score for the pronoun to a threshold, an implementation may determine that the stress score indicates that the pronoun is stressed or not.”
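The patent does not spell out how those parameters are combined into a single score, so the following is only a rough sketch under my own assumptions: each parameter for the pronoun is divided by the average for the other words, and the ratios are blended with made-up weights. The `WordAcoustics` class, the `WEIGHTS` values, and the `STRESS_THRESHOLD` of 1.2 are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WordAcoustics:
    """Assumed per-word measurements from a speech recognizer."""
    text: str
    volume_db: float
    pitch_hz: float
    duration_ms: float


# Hypothetical weights and threshold; the patent gives no specific values.
WEIGHTS = {"volume_db": 0.4, "pitch_hz": 0.3, "duration_ms": 0.3}
STRESS_THRESHOLD = 1.2


def stress_score(words, index):
    """Weighted average of the word's parameters relative to the other words."""
    target = words[index]
    others = [w for i, w in enumerate(words) if i != index]
    if not others:
        raise ValueError("need at least one other word to compare against")
    score = 0.0
    for name, weight in WEIGHTS.items():
        average = sum(getattr(w, name) for w in others) / len(others)
        score += weight * getattr(target, name) / average
    return score


def is_stressed(words, index, threshold=STRESS_THRESHOLD):
    """Compare the stress score to a threshold, as the patent describes."""
    return stress_score(words, index) > threshold


query = [
    WordAcoustics("what", 60, 120, 50),
    WordAcoustics("is", 60, 120, 50),
    WordAcoustics("his", 80, 150, 80),
    WordAcoustics("birthday", 60, 120, 50),
]
print(is_stressed(query, 2))  # True
```

A score of 1.0 would mean the pronoun sounds just like the rest of the query, so anything well above that is treated as stress in this sketch.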
The impact of a stressed pronoun? The patent says, “For example, if a pronoun is stressed, it may indicate that it refers to a concept from an immediately preceding query, while a pronoun that is not stressed may refer to a concept from a query that occurred earlier in a series of received queries.” It’s an interesting assumption, and it does seem to reflect how people actually convey information when they ask questions in conversation. The patent does tell us about some of the science behind this determination about stressed pronouns:
For example, if the absolute measure for the volume of the pronoun is 80 dB and the average volume for the other words in voice query is 60 db, the ratio of the volumes is 1.33. This relative volume measure for the pronoun indicates that the volume of the pronoun is 33% greater than the volume of the rest of voice query. Alternatively, the relative measures can be a difference between the acoustic parameters for the pronoun and the acoustic parameters for the other words in voice query. For example, if the absolute measure for the time duration of the pronoun is 80 ms and the average time duration of the other words in voice query is 50 ms, the difference in the time duration is 30 ms. This relative time duration measure for the pronoun indicates that the time duration of the pronoun is 30 ms more than the average time duration for the words in voice query. Alternatively, the relative measures of the acoustic parameters for the pronoun can be relative to the acoustic parameters for only the words that immediately precede and follow the pronoun.
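The arithmetic in that passage can be reproduced in a few lines of Python. This is just a sketch of the two relative measures the patent mentions (a ratio and a difference), plus the neighbors-only variant. The function names and the sample per-word values are my own, chosen to match the numbers in the quote.

```python
from statistics import mean


def relative_ratio(pronoun_value, other_values):
    """Pronoun's value divided by the average of the comparison words."""
    return pronoun_value / mean(other_values)


def relative_difference(pronoun_value, other_values):
    """Pronoun's value minus the average of the comparison words."""
    return pronoun_value - mean(other_values)


def neighbor_values(values, index):
    """Values for only the words immediately before and after the pronoun."""
    return [values[i] for i in (index - 1, index + 1) if 0 <= i < len(values)]


# Hypothetical per-word measurements for "What is HIS birthday?"
volumes_db = [60, 60, 80, 60]
durations_ms = [50, 50, 80, 50]
pronoun = 2  # position of "HIS"

other_volumes = volumes_db[:pronoun] + volumes_db[pronoun + 1:]
other_durations = durations_ms[:pronoun] + durations_ms[pronoun + 1:]

print(round(relative_ratio(volumes_db[pronoun], other_volumes), 2))   # 1.33
print(relative_difference(durations_ms[pronoun], other_durations))   # 30
print(neighbor_values(volumes_db, pronoun))                           # [60, 60]
```

The first two results match the patent's 1.33 volume ratio and 30 ms duration difference.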
The patent provides some other examples of how stresses might be understood, including how grammatical differences may play a role.
It is interesting that these types of things may influence spoken queries. If you’ve been wondering about how Google might understand pronouns, now you have an idea of how it could understand stressed pronouns.