How May Click and Query Log Patterns Influence Search Results?
Imagine that a number of people use Google to perform a search for “orange,” and then “banana,” and then “pineapple” and then choose the web page “http://www.example.com/fruit.htm” in the search results they see.
Now imagine that Google looks at the information it collects about what people do when they search, and finds click and query log patterns showing that a statistically significant number of people search for “orange,” then “banana,” and then “pineapple,” or possibly the same search terms in a slightly different order, and tend to click on “http://www.example.com/fruit.htm.”
Google may also notice people searching for closely related terms during query sessions, such as consecutive searches for “banana,” “apple” and “pineapple.”
Since this second set of queries for “banana,” “apple” and “pineapple” is so similar to the query sessions that contained the search terms “orange,” “banana” and “pineapple,” where people were choosing the page “http://www.example.com/fruit.htm,” Google may choose to adjust the ranking of “http://www.example.com/fruit.htm” for people using those closely related terms in their search sessions.
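To make the idea of "similar query sessions" a little more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not how Google does it: the names `jaccard`, `KNOWN_PATTERNS`, and `pages_to_boost`, the use of Jaccard overlap as the similarity measure, and the 0.5 threshold are all hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of queries, ignoring the order they were issued in."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical pattern mined from click and query logs:
# the queries seen in a session, and the page searchers tended to choose.
KNOWN_PATTERNS = [
    (("orange", "banana", "pineapple"), "http://www.example.com/fruit.htm"),
]


def pages_to_boost(session_queries, patterns=KNOWN_PATTERNS, threshold=0.5):
    """Return pages whose logged query pattern is similar enough to this session."""
    return [
        url
        for queries, url in patterns
        if jaccard(session_queries, queries) >= threshold
    ]


# A session with closely related terms shares two of four distinct queries.
print(pages_to_boost(["banana", "apple", "pineapple"]))
```

Because the comparison uses sets, a session with the same terms in a slightly different order matches the logged pattern exactly, while an unrelated session returns no pages to adjust.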
Continue reading “Search Engines May Adjust Rankings based on Click and Query Logs Patterns”
When I was fairly young, my family pulled up roots and moved from New Jersey to Ohio. For a six-year-old, it was quite a culture shock. I remember how much more slowly people talked in the great Midwest, how polite they were, and how they had funny names for things, such as calling soda by the name “pop.”
Those half-dozen years in the Garden State were enough to indoctrinate me to the speaking habits of the region, and I remember in our new home fumbling with the fact that I spoke at a quicker rate than my classmates and the neighborhood kids. It wasn’t that they were slow, but rather that they just talked that way. Looking back, I realize that I probably cut off some conversations during pauses, because the delay between words was long enough that it seemed to signal a completed thought.
Seven years later, we found ourselves packing everything up and moving back to central Jersey, close again to our extended family and to a new business that my father had started up with some others in his industry. Seven years in the land of fields of corn and dairy, of Cincinnati Reds and riverboats, and I picked up some of the customs of my midwestern environment.
Returning to New Jersey meant experiencing a culture shock in reverse, where my classmates and neighbors talked much quicker than I did, and interrupted me when I talked. It wasn’t that I was slow, but rather that I just talked that way. I knew better than to ask for “pop” at the local pizzeria, because they would more likely have tried to help me find my dad than give me a soda.
Continue reading “The Importance of Listening”
Value in Being Able to Classify Search Query Traffic From Robots and Humans
Some of the visitors to search engines are people looking for information. Other visitors may have other purposes for visiting search engines, and might not even be humans.
Instead, those automated visitors may be attempting to check rankings of pages in search results, or conducting keyword research, or providing results for games, or even identifying sites to spam or altering click-through rates. It can be helpful for a search engine to be able to classify search query traffic, to understand if that traffic is coming from human searchers.
These non-human visitors can use up a search engine’s resources, as well as skew the user data that a search engine might consider using to modify search rankings and search suggestions.
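As a rough illustration of what classifying query traffic might involve, here is a small Python sketch. The signals it uses (query rate, and many queries with no clicks), the thresholds, and the names `SessionStats` and `looks_automated` are my own assumptions for the example, not anything a search engine has described. A real classifier would likely weigh many more signals, and a bot built to alter click-through rates would not be caught by the no-clicks check.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    """Hypothetical per-visitor summary drawn from a query log."""
    queries: int
    duration_seconds: float
    clicks: int


def looks_automated(stats, max_queries_per_minute=30.0, min_queries_for_click_check=20):
    """Flag a session as probably automated using two simple heuristics."""
    # Treat very short sessions as lasting at least one second to avoid dividing by zero.
    minutes = max(stats.duration_seconds, 1.0) / 60.0
    rate = stats.queries / minutes

    # Heuristic 1: queries arrive faster than a person could plausibly type them.
    if rate > max_queries_per_minute:
        return True

    # Heuristic 2: a long run of queries without a single click on a result.
    if stats.queries >= min_queries_for_click_check and stats.clicks == 0:
        return True

    return False


print(looks_automated(SessionStats(queries=600, duration_seconds=60, clicks=0)))
print(looks_automated(SessionStats(queries=5, duration_seconds=300, clicks=3)))
```

Traffic flagged this way could then be left out of the user data that feeds into rankings and search suggestions.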
For a number of years, Google has asked its visitors not to use automated programs like those. In the Google Webmaster Guidelines, they tell us:
Continue reading “How a Search Engine Might Classify Search Query Traffic from Bots and from Humans”
When visitors to search engines use abbreviations or expand abbreviations in queries, it’s possible that they might be missing out on some pages worth visiting.
For example, use Yahoo to search for [NASA Moon bombing] and compare the results to a search for [National Aeronautics and Space Administration moon bombing] and you’ll see some very different results.
Should those search results be more similar? NASA and National Aeronautics and Space Administration are the same organization. Then again, NASA is also an abbreviation for:
- North American Saxophone Alliance
- National Auto Sport Association
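The ambiguity above is easy to see in a small Python sketch of query expansion. This is just an illustration under my own assumptions: the `EXPANSIONS` table is a hypothetical hand-built dictionary holding only the three expansions named here, and `expand_query` is a made-up helper, not a description of how any search engine actually expands abbreviations.

```python
# Hypothetical abbreviation table, limited to the expansions mentioned in this post.
EXPANSIONS = {
    "nasa": [
        "National Aeronautics and Space Administration",
        "North American Saxophone Alliance",
        "National Auto Sport Association",
    ],
}


def expand_query(query, expansions=EXPANSIONS):
    """Return the original query plus one variant per known expansion of each term."""
    terms = query.split()
    variants = [query]
    for i, term in enumerate(terms):
        for long_form in expansions.get(term.lower(), []):
            variants.append(" ".join(terms[:i] + [long_form] + terms[i + 1:]))
    return variants


for variant in expand_query("NASA moon bombing"):
    print(variant)
```

The sketch returns every expansion it knows about, which shows the problem: without some way of telling which meaning fits the rest of the query, a saxophone alliance is as likely a candidate as a space agency.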
Continue reading “How Search Engines Might Expand Abbreviations in Queries”
If you’ve ever heard or seen the term “Trustrank” before, it’s possible that whoever was writing about it or referring to it was discussing a Yahoo/Stanford paper titled Combating Web Spam with TrustRank (pdf). While that TrustRank paper was the joint work of researchers from Stanford University and Yahoo, many writers have referred to it as Google TrustRank since its publication in 2004.
While Yahoo has a TrustRank approach, Google does not have a similar approach. The Yahoo approach is aimed at identifying spam on the Web, and it has been patented under the name Link Based Spam Detection. Because that Yahoo patent exists, the USPTO would not grant Google a patent that covers the same processes. However, there is a Google TrustRank.
The confusion over who came up with the idea of TrustRank wasn’t helped by Google trademarking the term “Trustrank” in 2005. That trademark was abandoned by Google on February 29, 2008, according to the records in the USPTO TESS database:
Continue reading “Google TrustRank”