Playing the Odds: How Probable Meanings May Influence Search Engine Rankings

When someone types “George Washington” into a search box, they are probably more interested in the Revolutionary War general and President than some random George in Washington. A search for “Washington Hotels” is more likely looking for lodging in Washington than hotels named Washington. Searches for places with signs that say “Washington Slept Here” are probably not about hotels (and those searchers probably have too much time on their hands).

A lithograph of George and Martha Washington with two children, originally copyrighted in 1889 by Kurz and Allison.

When words used in search queries can have more than one meaning, a search engine may provide better results to searchers if it can calculate the probability of the most likely meaning of each word. That’s the focus of a patent granted to Yahoo this past week:

System for determining probable meanings of inputted words
Invented by David Richardson-Bunbury, Soren Riise, Devesh Patel, Eugene H. Stipp, Paul J. Grealish
Assigned to Yahoo!
US Patent 7,681,147
Granted March 16, 2010
Filed December 13, 2005

Abstract

A system is disclosed for determining probable meanings of words. An input of a word is obtained. Probable meanings of the word may be determined in accordance with a prior probability of probable meanings of the word and a context frequency probability of probable meanings of the word.

Examples in the patent primarily focus upon place names, but the inventors listed in the patent tell us that the processes described could be used for other terms that could be interpreted more than one way. So, a jaguar could be a kind of animal, a car, or an NFL football player from Jacksonville.

A search engine may attempt to calculate the probability that a search for “jaguar” is intended to match each of those meanings. If another term is added, those probabilities may be calculated differently based upon context. A search for “Jacksonville Jaguar” is more likely about someone playing football, while the odds are that a search for “Jaguar carburetor” isn’t.
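One way to picture the jaguar example is as a prior probability for each meaning that gets reweighted by a context term and then normalized. The sketch below is only an illustration under my own assumptions: the numbers, the `CONTEXT_WEIGHTS` table, and the function names are made up, and the multiply-and-normalize step is a simple Bayes-style stand-in rather than the formula the patent actually uses.

```python
# All numbers below are invented for illustration; they are not from the patent.
JAGUAR_PRIORS = {"animal": 0.4, "car": 0.4, "nfl_team": 0.2}

# Hypothetical context weights: how strongly a neighboring query term
# supports each meaning of "jaguar".
CONTEXT_WEIGHTS = {
    "jacksonville": {"animal": 0.05, "car": 0.05, "nfl_team": 0.90},
    "carburetor": {"animal": 0.01, "car": 0.98, "nfl_team": 0.01},
}


def meaning_probabilities(priors, context_term=None):
    """Weight each prior by the context term (if any), then normalize to sum to 1."""
    weights = CONTEXT_WEIGHTS.get(context_term, {})
    scores = {m: p * weights.get(m, 1.0) for m, p in priors.items()}
    total = sum(scores.values())
    if total == 0:
        return dict(priors)  # no usable evidence; fall back to the priors
    return {m: s / total for m, s in scores.items()}


def most_likely_meaning(priors, context_term=None):
    """Return the meaning with the highest probability after applying context."""
    probs = meaning_probabilities(priors, context_term)
    return max(probs, key=probs.get)


print(most_likely_meaning(JAGUAR_PRIORS, "jacksonville"))  # nfl_team
print(most_likely_meaning(JAGUAR_PRIORS, "carburetor"))    # car
```

With no context term, the made-up priors leave the animal and the car tied, which is roughly the situation a search engine faces with a bare one-word query.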

A web search at Google for Jaguar brings back pictures of cars and cats. The same search at Yahoo shows a couple of images alongside snippets for pages, one of a feline in the wild, and one of a stylized feline in a logo for the automobile.

How might a search engine such as Yahoo (and possibly Bing, if it acquires the rights to this patent) use statistical probabilities of the meanings of words? The patent’s authors give us the following list of ways in which the best estimate of the meaning of a word might be used:

  • Web pages may be indexed to a search.
  • News story locations may be plotted on a map.
  • Geographically relevant advertisements may be placed on a web page.
  • Enhanced statistics may be calculated for use in query analysis.
  • Search result listings may be presented to the user in accordance with the probabilities.
  • Ads may focus upon that meaning for pay-for-placement, cost-per-click, pay-per-call and pay-per-act type services.

Instead of attempting to match up queries with pages where those words may be keyword phrases that appear on those pages or in links to those pages, the search engine may rerank search results based upon probabilities that a searcher intended to see something related to one type of search rather than another.

So, someone with the last name “Ind” and the first name “Gary” could have a personal web page that might rank highest on a search for “Gary Ind.” But the search engine may calculate a higher probability that someone searching for “Gary Ind.” wants to see information about the City of Gary in the State of Indiana than the home page of Gary Ind. Based upon those probabilities, it might rerank search results for “Gary Ind.” to show pages about the City first.
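To make the Gary example a little more concrete, here is a rough sketch of reranking by probable meaning. Everything in it is an assumption on my part: the URLs, the base relevance scores, the meaning labels, and the probabilities are invented, and multiplying a base score by a meaning probability is just one plausible way to combine the two, not something the patent confirms.

```python
def rerank(results, meaning_probs):
    """Sort results by base score multiplied by the probability of their meaning."""
    return sorted(
        results,
        key=lambda r: r["score"] * meaning_probs.get(r["meaning"], 0.0),
        reverse=True,
    )


# Hypothetical results for the query "Gary Ind." with made-up keyword-match scores.
results = [
    {"url": "gary-ind-homepage.example", "score": 0.9, "meaning": "person"},
    {"url": "city-of-gary.example", "score": 0.7, "meaning": "city"},
]

# Hypothetical probabilities that the searcher means the person or the city.
meaning_probs = {"person": 0.1, "city": 0.9}

for result in rerank(results, meaning_probs):
    print(result["url"])
```

Even though the personal home page has the stronger keyword match in this toy data, the page about the City comes out on top once the probabilities are applied.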

If you live in the City of Bath in the UK, and you’re in need of a plumber, you may still have problems finding what you’re looking for when you search for “Bath plumber” (Good luck to you). We’re told about the City of Springfield:

For example if there are thirty different places called “Springfield”, then thirty-one prior probabilities may be generated, one for each place plus one for the possibility that it is not a place at all.
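The thirty-one prior probabilities can be sketched as a small table: one entry per place, plus one for “not a place.” The article doesn’t say how the patent weights the individual Springfields, so the equal weights and the 10% “not a place” share below are placeholders of my own, and `place_priors` is a hypothetical helper.

```python
def place_priors(place_weights, not_a_place_prob):
    """Build one prior per place plus one for 'not a place'; priors sum to 1."""
    total_weight = sum(place_weights.values())
    priors = {
        name: (1.0 - not_a_place_prob) * weight / total_weight
        for name, weight in place_weights.items()
    }
    priors["(not a place)"] = not_a_place_prob
    return priors


# Thirty hypothetical Springfields, weighted equally for simplicity.
springfields = {f"Springfield #{i}": 1.0 for i in range(1, 31)}

priors = place_priors(springfields, not_a_place_prob=0.1)
print(len(priors))  # 31
```

In practice the weights would presumably differ from place to place, but the shape of the result is the same: thirty places, one extra slot, and probabilities that add up to one.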

The patent does provide a number of examples as well as some details on how probabilities might be calculated for different words used both alone and within the context of other words. If you’re interested in how probabilities might be used to rerank search results, you may want to spend some time with this patent.

When someone searches for “Washington,” do they mean the State of Washington, the District of Columbia, a City named Washington, George, or something else completely? Probabilities, in addition to ranking signals based upon things such as relevance and quality and link analysis, may play a role in what pages show up where in search results.

47 comments to Playing the Odds: How Probable Meanings May Influence Search Engine Rankings

  • This is a really great patent. For words which can mean a lot of things, maybe it’s up to the searcher to include a clue word in his/her search term to make it more specific. When typing a search term on Google, there’s a drop-down that appears and lists the probable things you might want to search for. I think that can also help in refining searches.

  • Interesting to consider. I wonder if/how they may combine geographical factors into the probabilities. Does someone in Florida typing in “Jaguar” obtain the football results more often? My service provider routes me through Corpus Christi, so I show up as being in that city for some of these tracking programs, yet when I am testing keywords in searches, I frequently get local results for Pennsylvania. The idea of probabilities is great to improve results, but it would be nice to see how they can combine other factors to produce more relevant results.

  • Thanks for digging up and analyzing the patent, Bill.

    This patent definitely helps given the fact that nowadays the way searchers search is getting more complicated, given the social and live status streams.

    I came across the “Anatomy of a Large-Scale Social Search Engine” paper from Aardvark, which states that the way searchers search is more conversation or question based than the conventional keyword search of a couple of years ago.

    Would be interesting to see how the search engines present the data other than the conventional blue hyperlinks.

  • I am continually impressed with the advances that are coming out for web. This is really going to impact search engine optimization. I am from a small town, and I think something like this will really help the little businesses get their message across. With the geographical factors, it seems like the local businesses have a better chance of being seen. I am excited to see how it comes about.

  • Hi Andrew,

    I wonder if one of the best things that search engines could do to make it easier for searchers to find what they are looking for is to make search boxes much wider.

    Chances are that if they do, people would type in longer queries that might make it easier for the search engines to get a better idea of the intent behind a search, and the context of words within queries.

    The predictive suggestions that we see when that dropdown appears are often based upon histories of queries and query sessions, but I’m not quite sure that they completely follow the kind of logic that this patent contains. I do think there is some calculation of probabilities going on in those predictive search suggestions.

  • Hi Frank,

    Let’s take your example and make the geographic regions a little larger. Imagine a search for “football,” done by someone in the US, someone in the UK, and someone in Australia. Chances are that the US searcher will get NFL results that the UK and the Australian searchers won’t. The Australians will likely see Australian Rules Football results, and the UK will see results for what people in the US refer to as soccer.

    This patent really doesn’t address where someone is performing their search from, but there have been patents and whitepapers from the search engines where they describe how they might consider those kinds of factors. I’d guess that there’s some calculating of probabilities based upon whether or not someone in Florida searching for “jaguar” is more likely concerned about the Jacksonville team. I’m not sure though that this patent is trying to cover that situation. It is something that I did ask myself though as I was reading through it.

  • Hi Deric,

    Real-time search that incorporates microblogging tools like Twitter into a search engine’s index does present some serious challenges, such as indexing short content without much in the way of links to and from it. We have seen how Google is attempting to display tweets and other microblogging content, in sections of a page that automatically scroll and can be paused. That’s a serious departure from the old ten blue links. I’d love to see a study on how often people select something from that feature when it’s displayed.

  • Hi Ryan,

    I agree with you that if a search engine can be smarter about which place is which when a geographic reference is made in a query, that it can benefit small businesses and small towns. I’m excited about it, too.

  • A clever patent that would likely be useful in most cases, although it may be frustrating at other times and may actually have an adverse effect on the results. I may be wanting results for the less likely term and have to put more in to get the results I need. I suppose we will see if people get better results by the search market share figures. Interesting…

  • Hi Lee,

    It seems like a good approach to me as well, though I agree with you that it might present some problems sometimes when people type in queries that could possibly be interpreted as meaning something other than what a searcher intended. If the worst that happens is that you have to revise your query and possibly make it a little more specific, that might not be a bad thing.

    One thing that bothers me with Google’s local search is that if I search for one place in one location, and then try to search for another place in a different location, Google will sometimes try to continue to show me information about the first location, likely based upon a probability that my second query is related to my first one. Search engines basing the results they show upon probabilities can be useful, but sometimes they can create frustrating results as well.

  • I’ve noticed some changes in the search engines recently. Local search actually gives me completely different results at home vs. at my office, which is only about 20 minutes away.

  • Hi Jason,

    About five years ago, when I lived in Delaware and worked about 30 minutes away in Maryland, I would see some very different results as well. Some of them probably had something to do with the different locations I was searching from, but others may have had to do with me accessing a different data center at either location. I don’t know if either is the cause of the differences that you’re seeing, but they are a possibility.

  • I wonder how this probability search will interact with personalized results. It seems to me that if they are using probabilities it makes sense to calculate based on what they know of your search behavior.

  • Hi David,

    Interesting points.

    Personalized search probably is better if it uses a probability-based approach, too. It attempts to make a best guess at what you might want to see based upon past searching and browsing history and other information that it has collected about you.

    The probabilities described in this patent filing are also aimed at making it more likely that someone searching for something finds what they might want to locate, but it’s trying to do so based upon both information found on how language is used on the Web, and in an analysis of aggregated query information and searching and browsing behavior.

    Some combination of those approaches might work out well – as you called it, an “interaction” between the two.

    Someone who is a programmer and tends to look for information about Java programming should probably be shown pages about the programming language. But they may have a present interest in the island or the drink, so they should be shown a diverse set of search results that includes those as well, based upon some probability that they may be interested in something other than programming at the moment.

  • @jason: You don’t use Chrome at home, by any chance? And perhaps Firefox or Internet Explorer at work? I’ve seen something strange when using Chrome: the site I click the most after searching for any given phrase will actually rank better and better the more I search for it. Am I just paranoid or is this actually the case? A while back I worked on ranking a site, checking my positions in Chrome and sometimes clicking on my own site. I finally got to the first search result page, and after a while I took the top position. At least I thought so, but when checking my ranking in FF and IE I wasn’t on the first page at all; in Chrome I was in the very first place.

  • There is pure logic behind all that. It must have been difficult to program, as it is with every invention. Yet in the end, nobody truly knows what the human brain is capable of :)

  • Is that why personalized search results evolved in Google? So, how do we de-personalize our way of searching?

    @wczasy – I agree, it uses great mind power and skills! lol.

  • Hi Orville,

    So, how do we de-personalize our way of searching?

    I’m not sure that we ever can, now that the search engines have been incorporating personalization and customization of search results. They want to show us what they think we mean with our searches rather than just providing a list of pages that include the keywords that we enter in our queries.

  • Actually, you can easily de-personalize your Google search results!

    First of all, make sure you haven’t logged in. Make sure you signed out of your Google account.

    Visit the Google search engine and type in some search term. You’ll see a list appear as usual. You can see the View Customizations link on the right side above the search results. Click that link. Now you have all the personalization AND de-personalization options right in front of you!

  • Hi Martijn,

    You can see less of an impact from personalization by taking the steps that you outlined, but there are still likely some customizations that you have no control over. For instance, Google will still likely bias the search results you see based upon which country and which language and which location it thinks you prefer. It will still likely expand queries in some ways based upon aggregated user data, to do things like showing spelling corrections and synonyms for results. It may still show you different results based upon your location, the kind of device you’re using to connect to the Web, and in other ways as well.

  • This post is sooo deep. Context is everything. Sometimes I use Google Wheel, Quintura or keyword density analyzers to help me determine which terms to make sure I add to support my target keywords. VERY GOOD post. (As always)

  • Hi James,

    Thank you. Google Wheel and Quintura provide some interesting visualizations of terms that might be related. I’m less of a fan of keyword density analyzers. They’ve always seemed to me to be tools around which software makers created some folklore about how pages are ranked by search engines.

  • Google Algorithm for SEO improves from time to time and for me the way they sort page rankings is more accurate and more reliable.

  • Hi Mikaela,

    The search engines definitely are aiming to improve the results that they show to searchers, but face some interesting challenges.

    For instance, many of the words or phrases that you search for could potentially have more than one meaning, and it may be difficult to decide which results to show at the top for the different meanings. Case in point, someone searching for “java” may mean the software, the island, or the coffee. As a search engine, which pages do you show searchers first? :)

    Another problem that we are seeing some interesting approaches to from the search engines is that sometimes a synonym for a query term might provide a better search result than the term a searcher actually used. Google also looks like they are trying to address this problem. It is challenging, though.

  • This type of patent would help tremendously, in my opinion, not only from a searcher’s standpoint but also from a webmaster’s, so they can target their market better. Plus Google would like it since they can make more ad dollars off of more targeted queries. :)
    Thanks for the info.
    Kind regards,
    Jason

  • Hi Jason,

    I think it would be helpful as well. Chances are it would enable searchers to find pages that were more likely what they were interested in, and would enable the search engine to show more relevant advertising as well.

  • As to the discussion about Googles personalization, I can add this:

    I have found that searching with the same phrases from my home computer and the one at my work gives slightly different results. This is after clicking the de-activation of personalization, not being logged in to any Google accounts, and with cookies cleared.

    I expect that the difference must be based on IP address, but that is a bit strange, as the distance between these two computers is no more than 5 miles.

  • That is really interesting about what “Per H.” commented. I tried the same thing myself but the results were no different. Although my work is more than 5 miles away. Hmmm…..

    Thanks Bill!

  • Hi Per and Jason,

    I’ve experienced something similar between work and home computers, with a distance of about 14 miles at one job, and a 22 mile commute at another job. One of those jobs was in another state, and it definitely was giving me results from a different data center, but I would see different results at work 14 miles away as well, and that one was likely using the same data center.

  • Isn’t this already the nature of Google search?

  • Hi Ryan,

    The patent that I wrote about above is from Yahoo rather than Google, but the problem is the same for both search engines. Both of them want to provide the best answers possible, and face the challenge that they often only have a handful of words or fewer supplied to them in a query by a searcher.

    It’s much easier to just show searchers documents where the words from a search query either appear in the documents, in links to the documents, or both. It’s harder when those words may have more than one meaning, and the person searching is possibly more interested in one meaning of that word or another.

    It’s a problem that Google and the other search engines are trying to solve.

  • Max

    Hi Bill,

    Thanks for the awesome article as usual.

    I think it would make things easier, and hopefully we’ll get better results than before. I just wonder if Bing is going to adopt this patent, since it’s going to power Yahoo Search.

    Will Yahoo continue to come up with useful patents like this after being powered by Bing? Or will it leave it to Microsoft to do all the work?

  • I wonder if Google uses this system already. To me it seems they simply determine the most popular and reputable meaning without any other intelligence.

  • Hi Max,

    Some interesting questions.

    It’s hard to tell what Microsoft is actually getting in the deal to power Yahoo search. Will they gain access to technologies patented by Yahoo? I don’t know.

    Will Yahoo continue to come up with useful patents? It can take a few years for patents to go from being filed to being granted, so there are likely a good number still in the pipeline, and we should continue to see some from them. Yahoo also files patents covering a wide range of applications, including paid search and technologies associated with running a range of portal services. It’s likely that Yahoo will continue to file patents for those types of services.

  • Hi Clinton,

    I’ve written a very large number of posts here about different methods that Google might use that attempt to look beyond whether or not keywords appear upon pages in retrieving those pages, to attempt to understand the intent behind a search. If you look at some of my posts on how search engines might attempt to rerank search results, you’ll see a few examples.

  • I still don’t understand how they determine the probability of what I want to look at at the time. Is this based on the history of my previous searches and browsing?

  • Hi Comor,

    There’s a mix of different information that a search engine might consider when trying to decide upon the intent behind a search, when one of the terms in that search could have more than one meaning.

    Your search and browsing history might play a role in what you see in search results, but there’s other information that a search engine might look at as well.

    For instance, if you live in Florida, and there’s a very recent and popular news story on the Jacksonville Jaguars, you might be more likely to see some pages in your search results on the NFL team. If a jaguar recently escaped from a Florida zoo, you may see more results about the animal.

    Rather than just looking at your past searching or browsing history, the search engine may look at the search and browsing history of many other searchers as well.

  • I think the search engines should just show them a random page for not being able to construct a proper query :)

    It could work, but I can see there being so many problems with it guessing what you are trying to find. Surely the SERPs represent the pages the SEs think are the most relevant, based on the SEO, and news articles appear at the top of Google as well, while shopping results appear if you search for a purchasable item (e.g., Sony Laptop).

  • Hi Shane,

    I’ve been hearing from people about how hard it can be sometimes to come up with the right words to use to search for something.

    When you don’t know too much about a topic, and you’re trying to find information on it, there are times when a directory structure rather than a search engine can be invaluable.

    Search engines can attempt to guess at the meanings behind a query, provide query refinement suggestions, rerank results based upon what they perceive might be the intention behind a search, but it’s possible that allowing searchers ways to refine their queries by providing more interaction could be helpful. For instance, letting searchers see categories that the search engine might associate with a query could add a helpful level to letting searchers locate what they are looking for.

  • I think some emphasis has to go to the user. I personally include extra words if search terms have more than one meaning or would probably bring back results I am not looking for.

  • Hi Andy,

    You would usually expect people to refine their search queries to use terms that may make it more clear what they were looking for. That usually works fine when people want information on a topic that they know something about.

    But when they don’t know much about that topic, it might be a lot harder to include extra words in their queries that help return more meaningful results.

  • By the way, this posting is very interesting. I think the search engine should show options when displaying results, i.e., “Did you mean George Washington the American leader?” and so on. The sponsored results are even worse than organic. If the keyword was selected in broad match, the search results will show any match to it. I tried the keyword “laser hair removal sex,” and the Google ad was still showing even though I wanted to offer “laser hair removal services,” because I had selected broad match. It showed for so many combinations.

  • Hi Sargon,

    I believe that I’m seeing more query suggestions now than ever before when performing searches. There are the predictive query dropdowns that are shown as you type a query into a search box, as well as suggestions that sometimes show above search results as well as below them.

    As for broad match paid search, I believe that advertisers need to be very careful about how they choose to use paid search, and should carefully monitor their campaigns to make sure that they don’t show up for terms that they don’t want to appear for.
