Google Synonyms Update

Over at the Official Google Blog, Google’s Steven Baker just announced a major change in the way that Google handles search results by including synonyms for some words that may be used in queries, in the post Helping computers understand language.

I wrote about the change on December 22, 2009, in my blog post How Google May Expand Searches Using Synonyms for Words in Queries, which describes a patent published by Google, Determining query term synonyms within query context, naming Steven Baker as a co-inventor.

I also included in my post an example which showed a change in the way that Google highlights query terms to include terms that Google might consider to be “synonyms.” The Official Google Blog provides some information about the change, including the change in highlighting behavior (which wasn’t specifically mentioned in the patent) and my December post digs more deeply into the granted patent.

Google’s Matt Cutts also provides some advice for webmasters on what this change might mean and how to address it in More info about synonyms at Google.

If you blink, you might also miss a link in the Official Google Blog post to a Google Public Policy Blog post titled Making search better in Catalonia, Estonia, and everywhere else, which Steven Baker tells us is about other “techniques to extract synonyms.”

That post describes how Google is creating different statistical language models and may use them to “find alternatives for words used in searches.”

A Google patent application published in 2008 and a couple of white papers from Google describe how those language models can be used to expand queries using synonyms during web searches and in Question and Answer (Q&A) results. I wrote about those documents and processes in a December 2008 post titled How a Search Engine Might Find Synonyms to Use to Expand Search Queries.

The approach described in that patent filing explores how a word or phrase can be translated from one language into another, and then translated back into the original language, and the translation back may include more than one result. For instance, if you translate “automotive parts” from English into French, and then back into English, you might receive at least two possible phrases back – “automotive parts” and “car parts.”

Google may explore the use of the word “car” as a possible synonym in the second phrase to decide whether or not to include results for both “automotive parts” and “car parts” in search results, or to offer “car parts” as a query suggestion to someone who searched for “automotive parts.”
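
As a rough illustration of that round-trip idea, here is a minimal Python sketch. The two phrase tables and the function names `round_trip_candidates` and `synonym_candidates` are made up for this example and aren't taken from the patent filing; a real system would presumably rely on statistical translation models rather than a hand-typed dictionary.

```python
# Toy phrase tables. The entries are illustrative assumptions,
# not real translation data from Google or the patent filing.
EN_TO_FR = {
    "automotive parts": ["pièces automobiles"],
}
FR_TO_EN = {
    "pièces automobiles": ["automotive parts", "car parts"],
}


def round_trip_candidates(phrase, forward, backward):
    """Translate a phrase out and back; return back-translations in first-seen order."""
    seen = []
    for foreign in forward.get(phrase, []):
        for back in backward.get(foreign, []):
            if back not in seen:
                seen.append(back)
    return seen


def synonym_candidates(phrase, forward, backward):
    """Keep only the back-translations that differ from the original phrase."""
    return [p for p in round_trip_candidates(phrase, forward, backward) if p != phrase]


print(round_trip_candidates("automotive parts", EN_TO_FR, FR_TO_EN))
print(synonym_candidates("automotive parts", EN_TO_FR, FR_TO_EN))
```

In this sketch, anything that comes back from the round trip and isn't the original phrase is only a candidate. Whether it gets used to expand results or shown as a query suggestion would be a separate decision.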

What’s important to note about how Google approaches synonyms when expanding queries is that the context of the words in a query matters a great deal.

For example, the word “pupil” has more than one meaning – it can be used to describe part of your eye, or a student. While the word “student” is a synonym for pupil in a number of contexts, you shouldn’t see pages returned in searches for “dilated pupils” that include the word “student” instead of “pupil.”
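
To make the role of context concrete, here is a small Python sketch of a query expander that skips a synonym when the neighboring words point to a different meaning. The rule table, the blocking words, and the `expand_query` function are all hypothetical; the hand-written blocklist simply stands in for whatever statistical evidence a search engine might actually use.

```python
# Hypothetical rule table: term -> (synonym, context words that block the swap).
# The blocking words are hand-picked for illustration only.
EYE_WORDS = {"dilated", "eye", "iris"}
RULES = {
    "pupil": ("student", EYE_WORDS),
    "pupils": ("students", EYE_WORDS),
}


def expand_query(query, rules=RULES):
    """Return the query plus any variants whose synonym fits the surrounding words."""
    words = query.lower().split()
    variants = [" ".join(words)]
    for i, word in enumerate(words):
        if word not in rules:
            continue
        synonym, blockers = rules[word]
        context = set(words[:i] + words[i + 1:])
        if context & blockers:
            # The other query words suggest a different sense, so skip the synonym.
            continue
        variants.append(" ".join(words[:i] + [synonym] + words[i + 1:]))
    return variants


print(expand_query("dilated pupils"))
print(expand_query("pupils per teacher"))
```

With these toy rules, a search for “dilated pupils” is left alone, while “pupils per teacher” also picks up “students per teacher.”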


36 thoughts on “Google Synonyms Update”

  1. Pingback: google und der umgang mit den synonymen | blogger zeug
  2. Thanks Bill! I have seen much evidence of this already in searches, good read!

    Also, I haven’t had a chance to wish you Happy New Year, hope all is well with you and the family! With regards, Rob

  3. Google is way ahead of other search engines in keyword relationships/semantics. This shows in their dominance of the search engine industry, with a share of well over 60% (at least in the US).

  4. I just hope that this change in the way Google will now be handling search results is for the better. Like you said in the last part, synonyms can sometimes be really tricky, especially when a word has more than one meaning.

  5. Hi Chris,

    Good point. This change has the potential to make it easier for searchers to find information they may want to find, even if the words they use in their search don’t appear on pages that may contain that information. It may also get Web publishers to think more about what words people will use to try to find their sites.

  6. Hi Rob,

    I’ve seen some changes that were possibly based upon at least one of Google’s synonym processes for a while, but Google has published so many patent filings and white papers on query suggestion/revision processes over the past few years that it’s been hard to tell if the synonym processes in the two different patents I described above were responsible or not.

    For instance, there were a number of recent blog posts from SEOs in the UK about Google showing results for [search engine optimization] (with a “z”) when people used the alternative spelling of optimization, as in [search engine optimisation]. Was Google showing those results because it sees optimization as a synonym for optimisation, or was it doing it as part of Google’s spell correction algorithm?

    Likewise, when Google expanded a query like [ft. worth plumbers] to include results for [fort worth plumbers], was the synonym algorithm in use, or was Google using some kind of dictionary lookup of common place names along with an algorithm that determined a geographic intent in a search?

    The truth is, we really can’t be sure. Fortunately, Google’s announcements that they have started using these synonym algorithms give us an idea that at least in some instances where they expand queries, they may be using these synonym algorithms.

    Thanks for your kind wishes for the new year. I hope it’s a good one for you and your family as well.

  7. Hi Thomas,

    Google is doing some very interesting things, but it’s not easy to tell how “far ahead” they might be of the other search engines. There are some very smart and talented people working over at Yahoo and Bing, and Microsoft’s acquisition of Powerset’s technology brought them some interesting processes involving semantics.

  8. Hi Andrew,

    I’m going to second that hope. :)

    There were a couple of interesting stats in the Official Google Blog that I didn’t include in my post, and in hindsight wish I had:

    However, our measurements show that synonyms affect 70 percent of user searches across the more than 100 languages Google supports. We took a set of these queries and analyzed how precise the synonyms were, and were happy with the results: For every 50 queries where synonyms significantly improved the search results, we had only one truly bad synonym.

    The first – Synonyms affect 70 percent of user searches – is a very large number of searches. If that’s true, then this change could have a significant impact upon the search results that we see.

    The second – for every 50 queries they only had one “truly” bad synonym – even though that’s a small number, in context it’s still a lot of searches. I hope they find a way to reduce that number to a much smaller percentage.

  9. Yes, I agree Bill. One bad synonym out of every 50 is a high percentage, considering the volume of queries that Google gets. They should make this number much smaller if they want to address this issue successfully. It’d be interesting to compare how the synonyms system is used with the “broad” match terms within Google Adwords.

  10. Hi Paul,

    It is a large number, but I imagine that Google is spending a fair amount of energy on improving how well the user data and/or statistical language models identify synonyms.

    Great question on how this system might be used with Adwords. It’s probably worth doing some research and experimentation to try to learn more.

  11. This is one of the important factors for Google searchers, and it is why I actually like Google: write a word and you will get results for its sister words too. Thanks for such a nice update!

  12. I believe that when two or more synonyms come from different languages, the Google process is not the same, and the results change depending on your location.

  13. We already include synonyms in our SEO on-page efforts, plus variations of keywords like adding -s, -ed, and -ing.

    I don’t expect the search ranking to change much. Synonyms are treated as separate words by copywriters and SEOers. If Google begins to “merge” the keywords into one serp, then our pages would remain in the same position.

  14. Bill, you are right in that Google will have to pay strict attention to how they expand on searches where there is a possibility of returning words that are out of context.

    Returning ‘bad synonyms’ at a rate of 2% isn’t acceptable.

  15. Hi Allesandro,

    It’s quite possible that you’re right.

    For example, Google has also published a patent application for finding synonyms when a word or phrase is a transliteration of one word into the characters/script of another language.

    When it comes to interactions between multiple languages, language preferences set in your browser, or in your Google settings might also play a role in what you might see, as well as other possible factors.

  16. Hi Nathon,

    Good question. Including additional variations of keywords and synonyms may make it more likely that you could rank for those as well without these kinds of synonym query expansions. But, if Google does merge results based upon such query expansion, your use of those synonyms may not help you much. And your rankings may just be affected.

  17. hi all

    “Completely agree. Some of those 2% searchers might be disappointed enough to move over to another search engine.”

    I don’t agree. Even if some get disappointed, there is only a very small chance that they will turn to another SE.
    Why? Just because there is no better SE than G.

  18. Hi Chris,

    It can be pretty frustrating when you perform searches, and try a few different queries, and the results you get only bear a passing resemblance to what you might be looking for. I remember back when I stopped searching as much at Alta Vista, and started using a new search engine named Google more frequently. It could happen with Google as well…

  19. While I agree that even 2% is a little on the high side, I am not sure that people would move over to another search engine as a result. Fair enough, one or two might… but most could probably find it in them to enjoy a laugh at some of the out of context synonym results, while others (understanding the nature of computers and algorithms) will just try searching again!

  20. Hi Stacy,

    It’s possible that those of us who work on the web, and pay attention to what search engines are doing on a regular basis might be more likely to laugh when Google provides results that just don’t come close to being relevant for a query than people who are given a choice between Google and looking at another search engine.

    There’s an interesting paper from a couple of Microsoft researchers on switching behaviors that is worth a look, named Characterizing and Predicting Search Engine Switching Behavior (pdf). A short snippet from the paper:

    Of the 14.2 million users in our log sample, 10.3 million (72.6%) used more than one engine in the six-month duration of the logs, 7.1 million (50.0%) switched engines within a search session at least once, and 9.6 million (67.6%) used different engines for different sessions (i.e., engaged in between-session switching). In addition, 0.6 million users (4.4%) “defected” from one search engine to another and never returned to the previous engine.

    It isn’t unusual for people to switch from one search engine to another during a search session, and 4% of the searchers studied switched for good. It does happen.

  21. Hi Mike,

    I see Google’s expansion of queries to use synonyms to be a positive step – it makes it more likely that people can find information that they may be searching for.

    I agree with you that it is a good idea to carefully consider putting synonyms and related terms on the pages of your site as well. Most businesses should review and update their business models on a periodic basis – the world changes, and the needs and interests of people who might be your customers change as well.

  22. Is this such a bad thing? If you include many synonyms in your page, they all help to back up your main keywords and your site will be found more often. It could be an opportunity to reassess your whole business model.

  23. It’s not just synonyms which are problematic – it’s google suggesting americanised spellings of UK words. For many marketing companies it must be something of a struggle to decide whether to optimise for “search engine optimisation” or “search engine optimization”!

  24. Hi Jimbo,

    Good point. Another area where this is an issue is in transliteration, where text is converted from one writing system to another, based not upon the meanings or the words involved, but often rather the sounds of those words. Google has published another patent filing on synonyms based upon transliteration, which is a post I’ve started but never quite finished – I spent a great amount of time writing about background information to begin the post.

    I’m not sure what the ideal situation is when it comes to how Google and the other search engines handle dialectical differences. Testing, more testing, and even more testing isn’t a bad approach, followed by keeping up with possible changes in the way that they handle variants in spelling and presentation.

    Another somewhat related challenge comes with words that are spelled exactly the same, but have very different meanings based upon locations, such as football in the US, football in the UK, and football in Australia. All three are sports, but very different sports.

  25. Interesting article Bill. Shame I only stumbled across it nearly a year late! The 2% bad synonyms rate is the thing that really stood out for me. I wonder if they’ve improved on it a year on…

  26. Hi Brendan,

    I’ve been trying to pay attention to see how many results I actually see that provide useful synonyms. Not a scientific approach by any means, but I’ve been keeping an eye out. I’m not sure that I’ve really seen too many results that weren’t useful when synonyms were provided.

    Haven’t seen or heard anything from Google about the bad synonym rate on any of their blogs, or in whitepapers from them.

  27. Google has been highlighting synonyms in search results for a few years now. I have found that mainly, the synonyms matched the basic words suggested by Sets.

  28. Hi Sashwat,

    Google has been showing synonyms for a while, but chances are that Google Sets weren’t the place where they were getting them. The approach that Google Sets used is pretty well known. Google would identify lists found on Web pages, and collect that list information. When someone used Google Sets, they would identify at least two different list items, and Google would identify others based upon their index of list items.

    It’s not a surprise that you might have been seeing synonyms in some Google Sets results, since it wouldn’t be shocking for someone to construct lists on the Web that contain synonyms.

    My post does provide a number of links to Google Blog posts, posts I’ve written, and at least one patent that describe ways that Google does identify synonyms, especially synonyms within the contents of a query.

  29. Hi Bill.

    Very interesting article, but I really wish I had come across it sooner. I have recently been reading Matt Cutts’ blog and YouTube posts, and you are definitely on topic here.

    Thanks again.

    Sara x

  30. Hi Sara,

    Thanks. I’ve been spending a little more time looking at what Google is doing with synonyms and might have a followup post on this topic sometime soon.
