Google Synonyms Update
Over at the Official Google Blog, Google’s Steven Baker just announced a major change in the way that Google handles search results by including synonyms for some words that may be used in queries, in the post Helping computers understand language.
I wrote about the change on December 22, 2009, in my blog post How Google May Expand Searches Using Synonyms for Words in Queries, which describes a patent published by Google, Determining query term synonyms within query context, naming Steven Baker as a co-inventor.
I also included in my post an example which showed a change in the way that Google highlights query terms to include terms that Google might consider to be “synonyms.” The Official Google Blog provides some information about the change, including the change in highlighting behavior (which wasn’t specifically mentioned in the patent) and my December post digs more deeply into the granted patent.
Google’s Matt Cutts also provides some advice for webmasters on what this change might mean and how to address it in More info about synonyms at Google.
If you blink, you might also miss a link in the Official Google Blog post to a post on the Google Public Policy Blog titled Making search better in Catalonia, Estonia, and everywhere else, which Steven Baker tells us is about other “techniques to extract synonyms.”
That post describes how Google is creating different statistical language models and may use them to “find alternatives for words used in searches.”
A Google patent application published in 2008 and a couple of white papers from Google describe how those language models can be used to expand queries using synonyms during web searches and in Question and Answer (Q&A) results. I wrote about those documents and processes in a December 2008 post titled How a Search Engine Might Find Synonyms to Use to Expand Search Queries.
The approach described in that patent filing explores how a word or phrase can be translated from one language into another, and then translated back into the original language, and the translation back may include more than one result. For instance, if you translate “automotive parts” from English into French, and then back into English, you might receive at least two possible phrases back – “automotive parts” and “car parts.”
Google may explore the use of the word “car” in the second phrase as a possible synonym, to decide whether or not to include results for both “automotive parts” and “car parts” in search results, or to offer “car parts” as a query suggestion to someone who might have searched for “automotive parts.”
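To make the round-trip translation idea a little more concrete, here is a minimal sketch in Python. The tiny phrase tables, the function names, and the French rendering are all made up for illustration; the patent filing describes statistical translation models, and nothing here should be read as Google's actual implementation.

```python
# Toy phrase tables standing in for real translation models.
# Every entry here is a hypothetical example, not real model output.
EN_TO_FR = {"automotive parts": ["pièces automobiles"]}
FR_TO_EN = {"pièces automobiles": ["automotive parts", "car parts"]}


def round_trip_candidates(phrase, forward, backward):
    """Translate a phrase out and back, keeping results that differ from it."""
    candidates = []
    for translated in forward.get(phrase, []):
        for back in backward.get(translated, []):
            if back != phrase and back not in candidates:
                candidates.append(back)
    return candidates


def word_substitutions(original, candidate):
    """Pair up the words that changed, if both phrases have the same length."""
    original_words = original.split()
    candidate_words = candidate.split()
    if len(original_words) != len(candidate_words):
        return []
    return [(a, b) for a, b in zip(original_words, candidate_words) if a != b]


query = "automotive parts"
for candidate in round_trip_candidates(query, EN_TO_FR, FR_TO_EN):
    print(candidate, word_substitutions(query, candidate))
```

In this sketch, the phrase that comes back different from the original is treated only as a candidate. A search engine would still have to decide whether to add results for it or show it as a query suggestion.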
What’s important to note about how Google approaches synonyms when expanding queries is that the context in which a word appears is very important.
For example, the word “pupil” has more than one meaning – it can be used to describe part of your eye, or a student. While the word “student” is a synonym for pupil in a number of contexts, you shouldn’t see pages returned in searches for “dilated pupils” that include the word “student” instead of “pupil.”
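Here is a rough Python sketch of what context-dependent synonym substitution could look like. The rule table, the context words, and the `expand_query` function are hypothetical choices of mine for illustration; the patent describes learning these relationships from data rather than from a hand-written list.

```python
# Hypothetical rules: a synonym applies only when at least one of the
# listed context words appears elsewhere in the query.
CONTEXT_SYNONYMS = {
    "pupil": [({"school", "teacher", "classroom"}, "student")],
}


def expand_query(query):
    """Return alternative queries whose synonyms fit the surrounding words."""
    words = query.lower().split()
    expansions = []
    for i, word in enumerate(words):
        # Context is every other word in the query.
        context = set(words[:i] + words[i + 1:])
        for required_context, synonym in CONTEXT_SYNONYMS.get(word, []):
            if context & required_context:
                expansions.append(" ".join(words[:i] + [synonym] + words[i + 1:]))
    return expansions


print(expand_query("school pupil records"))
print(expand_query("dilated pupil"))
```

With these made-up rules, a query about a school pupil gains a “student” variant, while a query about a dilated pupil is left alone because none of the required context words are present.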