Just What Are Related Queries?
Why does Google customize some search results based on previous related queries that you’ve performed? Is there a special relationship between those query terms, and if so, how did Google define that relationship? This is not from some third-party company. It is something that Google is doing – deciding that some queries are related queries.
Imagine searching for a “luxury car” at Google and then performing another search for “Infiniti.” On the second search, you find a page in the search results that looks like it will provide you with information that you are looking for, and you select a page.
Now imagine that many other people perform the same series of searches and select the same page.
An Example of Related Queries
Google might start considering the search for “luxury car” and the search for “Infiniti” as related queries. It’s also possible that the page selected in the second search for “Infiniti” might start ranking more highly for the query “luxury car.”
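As a rough illustration of that idea, here is a minimal sketch in Python. It is not the method from the patent: the click log, the related-query table, the 0.5 weight, and the function name `click_scores` are all assumptions made up for this example. It only shows how clicks recorded for one query could be partly credited to a related query.

```python
from collections import Counter

# Hypothetical click log: (query, clicked_url) pairs gathered from many searchers.
clicks = [
    ("infiniti", "infiniti.example/models"),
    ("infiniti", "infiniti.example/models"),
    ("infiniti", "infiniti.example/models"),
    ("infiniti", "dealer.example/infiniti"),
    ("luxury car", "carmag.example/luxury"),
]

# Hypothetical related-query table. The 0.5 weight is an assumption,
# not a value taken from the patent.
related = {"luxury car": {"infiniti": 0.5}}

def click_scores(query, clicks, related):
    """Sum clicks for `query`, plus weighted clicks from its related queries."""
    scores = Counter()
    for logged_query, url in clicks:
        if logged_query == query:
            scores[url] += 1.0
        elif logged_query in related.get(query, {}):
            scores[url] += related[query][logged_query]
    return scores

scores = click_scores("luxury car", clicks, related)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

In this toy data, the page that many searchers picked after searching for “Infiniti” ends up scoring above the page clicked directly for “luxury car,” which is the kind of effect described above.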
A patent filed by Google in 2003 got granted this week, and it explores how search rankings might be “improved” by looking at related queries. Google appears to be learning by watching searches and paying attention to related queries.
Methods and systems for improving a search ranking using related queries
Invented by Simon Tong, Mark Pearson, and Sergey Brin
Assigned to Google
US Patent 7,505,964
Granted March 17, 2009
Filed: September 12, 2003
The patent tells us that Google might use many different approaches to determining whether queries might become related and uses an example of queries performed back-to-back or consecutively as one of those approaches. Besides tracking which queries a searcher might perform, the patent tells us that it might track the behavior of searchers, such as which pages a searcher might click through in a set of search results:
For example, when a user types in a first search query such as “infinity auto” and then inputs a second search query such as “Infiniti” immediately afterward, the related query processor may define a relationship between the first search query and the second search query.
In this example, the relationship of proximity between search queries would become defined as “back-to-back” or consecutive.
Thus, for the query “infinity auto,” relationships to queries “Infiniti,” “luxury car,” “quality luxury car,” and “Japanese quality luxury car” may get defined if a user inputs these queries immediately following the initial query “infinity auto.” It would consider those to be related queries.
Other types of relationships or proximities can get defined according to the invention and stored by the related query database.
Relationships between queries might get determined and weighed differently based upon a few different considerations.
Weighting Relationships Between Queries
For instance, queries might be more closely related if they are typed in by a searcher consecutively than if there are one or more queries between them.
Or queries might be determined to be related if they are performed by a searcher within a certain period of time, such as within 30 minutes of one another. The patent provides many examples of how queries might get related, which include:
- Having been input as consecutive search queries by users previously (whether once or multiple times),
- Queries input by a user within a defined time range (e.g., 30 minutes),
- Misspellings
- Numerical relationships
- Mathematical relationships
- Translation relationships
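The first two items on that list can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 30-minute window comes from the patent’s example, but the two weight values, the session format, and the function name `relate_queries` are hypothetical, and the patent does not say how any weights would actually be set.

```python
from collections import defaultdict

WINDOW_SECONDS = 30 * 60  # the patent's example time range of 30 minutes

# Assumed weights: consecutive queries count more than queries
# that have one or more other queries between them.
CONSECUTIVE_WEIGHT = 1.0
SEPARATED_WEIGHT = 0.5

def relate_queries(session):
    """Weight query pairs from one searcher's session.

    `session` is a list of (timestamp_seconds, query) tuples in the order typed.
    """
    weights = defaultdict(float)
    for i, (first_time, first_query) in enumerate(session):
        for j in range(i + 1, len(session)):
            second_time, second_query = session[j]
            if second_time - first_time > WINDOW_SECONDS:
                break  # later queries are even further away in time
            if first_query == second_query:
                continue
            weight = CONSECUTIVE_WEIGHT if j == i + 1 else SEPARATED_WEIGHT
            weights[(first_query, second_query)] += weight
    return dict(weights)

session = [
    (0, "infinity auto"),
    (40, "Infiniti"),
    (300, "luxury car"),
    (4000, "Japanese quality luxury car"),
]
pairs = relate_queries(session)
print(pairs)
```

Here the last query falls outside the 30-minute window, so it is not related to any of the earlier ones, while “infinity auto” and “luxury car” get the weaker weight because another query sits between them.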
If you’ve performed a few searches on Google, you may have noticed a message at the top left of the search results that tells you that your results are “Customized based on recent search activity.” Following a “More Details” link next to that statement might tell you which previous query influenced the results you see. When I followed an example from the patent and searched for “infinity auto,” and then followed it up with a search for “Infiniti,” I received a message that:
Recent Searches as Related Searches
The following information has gotten used to improve your search results for Infiniti:
Recent Searches You or someone else recently searched for infinity auto using this browser.
A “learn more” link from that message told me that:
Recent searches: We consider whether a particular query followed on the heels of another query. Because recent search activity provides valuable context for understanding the meaning behind your searches, we use it to customize your results whenever possible, regardless of whether you’re signed in or signed out.
To customize your results and show you the customization details, we keep recent searches in a cookie on your browser for approximately 30 minutes. After approximately 30 minutes, this cookie can get removed from your browser. Completely closing your browser will remove this cookie immediately.
We don’t know for certain if this patent provides us with details of how this “related query” process works at present. It’s been more than five years since the patent was originally filed. But it is interesting to think about how queries might become related, and how those relationships might influence the search results that you see when you perform a search.
Reading this in my reader I was thinking the scariest part was that this was 5-6 years ago. Thinking of how things have changed and progressed since then makes me shudder. If they were doing this back then imagine what they are doing now 🙂
Rgds
Richard
Hi Richard,
I was a little surprised to see the 2003 date on this patent as well. 🙂 There are a number of academic whitepapers looking at chains of queries, and query sessions, and related user behavior that didn’t come out until 2005, like this one:
Query Chains: Learning to Rank from Implicit Feedback (pdf)
This quote from that 2005 document sounds a little like some of the description from the patent:
Yeah, what Richard said, that is super scary that they were already looking into this 5 years ago. I personally do not like the custom Google searches with this new algorithm. I have to sign out now to find what I want.
An odd thought, but I will let it spill over here.
I read a post about studies that Google did on tracking where a user’s eye focuses on a page, then mouse-over tendencies, and the fact that most visitors will click on the first three results. What if the results served were not particularly relevant, but a user clicks through because he feels one might be relevant, or out of simple curiosity? I would assume then that the relationship might be noted, but deemed irrelevant, unless users keep following the pattern?
Search engines are getting much smarter these days, and I believe that they track the behavior of users to show good results.
Interesting stuff, but it’s not something a typical search engine optimizer can affect. Eye tracking is a method used in advanced usability testing, so it doesn’t surprise me that a multinational company like Google would find similar uses for it.
Hi Florida,
I agree with you. I think search engines are paying more attention to the searching and browsing behaviors of searchers. It’s interesting seeing suggestions of different ways that they might be showing up in patent filings and white papers.
Hi Adam,
That is definitely one of the biggest challenges that web site owners face. Being aware that a search engine may be incorporating signals taken from looking at user behavior is a start in itself.
Trying to understand how that may impact the way that a search engine provides results means trying to get a better understanding of how audiences looking for what you offer might try to find it on the Web, and how a search engine might aim to understand and interpret the intent behind their searches.
Hi Frank,
An interesting question. One of the concerns that I’ve read in some white papers from different researchers writing about search results is that many searchers show a bias towards clicking on top results because they believe that those might be the most relevant results because the search engine placed them at the top, regardless of what the page title and description might say.
If a search engine is partially basing rankings of pages on which pages are being clicked through in search results, that might mean that popular pages become more popular, and less popular pages become buried, even though the less popular pages may be higher quality pages. Should some pages be randomly increased in ranking to limit that, as described in this paper:
Shuffling a Stacked Deck: The Case for Partially Randomized Ranking of Search Engine Results (pdf)
When you perform a search for something, and a search engine tells you that there are millions of possible results, how much better quality are results in the top three than in the top 100 or so? Should a search engine provide results that sometimes randomly move pages up and down in the top rankings?
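To make the question concrete, here is a minimal Python sketch of one way results could be partially shuffled. It is not the scheme from the paper above: the function name `partially_randomized` and its parameters are hypothetical, and it assumes each page appears only once in the list. It keeps the top result in place and promotes a randomly chosen lower-ranked page to just below it.

```python
import random

def partially_randomized(ranked, promote_slots=1, protect_top=1, rng=None):
    """Return a copy of `ranked` with a few lower-ranked pages promoted at random.

    The first `protect_top` results stay in place. `promote_slots` pages chosen
    at random from the rest of the list are moved up to sit right after them.
    Assumes every page in `ranked` is unique.
    """
    rng = rng or random.Random()
    head = list(ranked[:protect_top])
    tail = list(ranked[protect_top:])
    promoted = rng.sample(tail, min(promote_slots, len(tail)))
    remaining = [page for page in tail if page not in promoted]
    return head + promoted + remaining

results = ["page-a", "page-b", "page-c", "page-d", "page-e"]
shuffled = partially_randomized(results, rng=random.Random(0))
print(shuffled)
```

A page that would never collect clicks at position 50 gets an occasional chance near the top, so click data gathered afterward is a little less tied to where the search engine first placed each page.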
If a page that shows up in search results gets selected by a lot of searchers, that may be a better signal that the page may be meaningful for the query used, but I also think that researchers need to keep in mind (and probably do) that there is a bias on the part of searchers when they select certain results, and should consider basing decisions about things like how related queries might be on more information than just clickthroughs.
I can’t come to terms with how long ago this patent was filed (5 years ago). What the hell will they (Google and other search engines) be doing now to help with the best possible results for a search term? It’s scary. If this is used, it could be manipulated quite easily. Pick your top 30 keyword phrases and build a script to auto search, then bingo: association in Google’s eyes. I can’t see that it will influence results to a great extent, unless the terms that they look at are very common associations, but saying that, I could be wrong and probably am.
Bill another belter. Keep it up.
Hi Lee,
Thanks.
I’m not sure how easily this can be manipulated. Google collects a serious amount of data about how people use search engines in different locations, and on different topics. It’s hard to gauge how much user data would need to be seen by the search engines to influence other people’s searches, but there are ways that the search engines might use to try to identify scripts that are being used to attempt to influence search results, including looking at IP addresses, and patterns involving the speed of searches, the content of searches, and so on.