How a Search Engine Might Adjust Rankings Based upon Patterns in Query and Click Logs

Imagine that a number of people use Google to perform a search for “orange,” and then “banana,” and then “pineapple” and then choose the web page “http://www.example.com/fruit.htm” in the search results they see.

Now imagine that Google looks at the information it collects about what people do when they search, and finds in its query logs and click logs that there are a large number, a statistically significant number, of people who search for “orange,” and then “banana,” and then “pineapple,” or possibly the same search terms in a slightly different order, and then tend to click on “http://www.example.com/fruit.htm.”

Google may also notice that there are people looking for some very related terms during query sessions, such as consecutive searches for “banana,” “apple” and “pineapple.”

Because this second set of queries for “banana,” “apple” and “pineapple” is so similar to the query sessions containing “orange,” “banana” and “pineapple,” in which people chose the page “http://www.example.com/fruit.htm,” Google may choose to adjust the ranking of “http://www.example.com/fruit.htm” for people who use those closely related terms in their search sessions.
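To make the idea concrete, here is a minimal sketch of how similar query sessions might be matched to the pages people clicked. It is only an illustration, not Google's method: the Jaccard overlap measure, the 0.5 threshold, the function names, and the “gadgets.htm” URL are all assumptions of mine.

```python
from collections import Counter


def jaccard(a, b):
    """Share of distinct query terms that two sessions have in common."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def clicked_pages_for_similar_sessions(logged_sessions, new_session, threshold=0.5):
    """Count clicks from logged sessions that overlap enough with new_session.

    logged_sessions is a list of (queries, clicked_url) pairs, a hypothetical
    stand-in for what a search engine might keep in query and click logs.
    """
    counts = Counter()
    for queries, clicked_url in logged_sessions:
        if jaccard(queries, new_session) >= threshold:
            counts[clicked_url] += 1
    return counts


logged = [
    (["orange", "banana", "pineapple"], "http://www.example.com/fruit.htm"),
    # Same terms in a different order still count as the same session pattern.
    (["banana", "pineapple", "orange"], "http://www.example.com/fruit.htm"),
    (["laptop", "tablet"], "http://www.example.com/gadgets.htm"),  # hypothetical
]

# A searcher using closely related terms in a new session.
candidates = clicked_pages_for_similar_sessions(
    logged, ["banana", "apple", "pineapple"]
)
```

Pages that collect enough clicks from similar sessions would be the candidates for a ranking adjustment. A real system would also need the statistical significance test mentioned above, which this sketch leaves out.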

Continue reading

The Importance of Listening

When I was fairly young, my family pulled up roots and moved from New Jersey to Ohio. For a six-year-old, it was quite a culture shock. I remember how much more slowly people talked in the great Midwest, how polite they were, and how they had funny names for things, such as calling soda “pop.”

Those half-dozen years in the Garden State were enough to indoctrinate me to the speaking habits of the region, and I remember in our new home fumbling with the fact that I spoke at a quicker rate than my classmates and the neighborhood kids. It wasn’t that they were slow, but rather that they just talked that way. Looking back, I realize that I probably cut off some conversations during pauses, because the delay between words was long enough that it seemed to signal a completed thought.

Seven years later, we found ourselves packing everything up and moving back to central Jersey, close again to our extended family and to a new business that my father had started up with some others in his industry. After seven years in the land of corn fields and dairy farms, of Cincinnati Reds and riverboats, I had picked up some of the customs of my midwestern environment.

Returning to New Jersey meant experiencing a culture shock in reverse, where my classmates and neighbors talked much more quickly than I did, and interrupted me when I talked. It wasn’t that I was slow, but rather that I just talked that way. I knew better than to ask for “pop” at the local pizzeria, because they would more likely have tried to help me find my dad than given me a soda.

Continue reading

How a Search Engine Might Distinguish Between Queries from Bots and from Humans

Some of the visitors to search engines are people looking for information. Other visitors may have other purposes for visiting search engines, and might not even be humans.

Instead, those automated visitors may be attempting to check rankings of pages in search results, or conducting keyword research, or providing results for games, or even being used to identify sites to spam, or to alter click-through rates.

These non-human visitors can use up a search engine’s resources, and can skew the user data that a search engine might consider using to modify search rankings and search suggestions.

Google has asked its visitors not to use programs like that for a number of years. In its Webmaster Guidelines, Google tells us:

Continue reading

How Search Engines Might Expand Abbreviations in Queries

When visitors to search engines use abbreviations or expand abbreviations in their searches, it’s possible that they might be missing out on some pages worth visiting.

For example, use Yahoo to search for [NASA Moon bombing] and compare the results to a search for [National Aeronautics and Space Administration moon bombing] and you’ll see some very different results.

Should those search results be more similar? NASA and National Aeronautics and Space Administration are the same organization. Then again, NASA is also an abbreviation for:

  • North American Saxophone Alliance
  • National Auto Sport Association
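As a rough sketch of why this is tricky, here is one way query variants could be generated from a table of known expansions. The table, the function name, and the approach itself are my own assumptions for illustration, not a description of how Yahoo or any other search engine handles abbreviations.

```python
# Hypothetical lookup table of abbreviations and their possible meanings.
EXPANSIONS = {
    "nasa": [
        "National Aeronautics and Space Administration",
        "North American Saxophone Alliance",
        "National Auto Sport Association",
    ],
}


def query_variants(query, expansions=EXPANSIONS):
    """Return the original query plus one variant per known expansion."""
    variants = [query]
    words = query.split()
    for i, word in enumerate(words):
        for expansion in expansions.get(word.lower(), []):
            # Swap the abbreviation for its expansion, keep the other words.
            variants.append(" ".join(words[:i] + [expansion] + words[i + 1:]))
    return variants


variants = query_variants("NASA Moon bombing")
for variant in variants:
    print(variant)
```

Every expansion yields a different query, so a search engine would still need some signal, such as the other words in the query, to decide which meaning the searcher most likely intended.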

Continue reading

Google Trust Rank Patent Granted

If you’ve ever heard or seen the phrase “Trustrank” before, it’s possible that whoever was writing about it or referring to it was discussing a paper titled Combating Web Spam with TrustRank (pdf). While the paper was the joint work of researchers from Stanford University and Yahoo!, many writers have attributed it to Google since its publication in 2004.

The confusion over who came up with the idea of trustrank wasn’t helped by Google trademarking the term “Trustrank” in 2005. That trademark was abandoned by Google on February 29, 2008, according to the records in the USPTO TESS database:

TESS search result for Trustrank showing a service mark claim abandoned on February 29, 2008.

Continue reading

Getting Involved Locally to Overcome Climate Change

Chances are that you’ve seen news on TV, in your newspaper, or on the Web about polar icecaps melting, or rising sea levels, or changing weather patterns. It’s easy to be an observer on the sidelines, and let the news happen on its own.

We can take steps on our own to live more energy-efficient lifestyles, like purchasing hybrid cars, or using public transportation more frequently, or buying energy-saving household appliances, or conserving energy more wisely where we work and where we live.

The problem of climate change can seem overwhelming though. Most of us aren’t in positions where we can influence public policy, or the actions of large corporations, or come up with scientific breakthroughs that bring clean energy to the world. But we can act locally, and we can help spread awareness, and become informed on the issues involved, and share that knowledge with others.

How much do you know about what is going on in your own community to combat climate change?

Continue reading

10 SEO Questions

I wrote a comment yesterday in response to a couple of blog posts that attacked SEO and the SEO industry. I attempted to illustrate to the author of the rants that search engine optimization brings a specialized skill set and a core body of knowledge that can help others, from small businesses with great ideas to larger organizations that can benefit from an independent voice with experience and knowledge about search engines.

Unfortunately, my comment went unpublished for whatever reason.

One of the underlying assertions of the post I responded to was that in the hands of a competent web developer, a site should rank well in search engines as long as the people behind the site created something great and beautiful, and told a couple of friends. Another of the underpinnings behind the rants against SEO was that search engine optimization wasn’t a legitimate form of marketing. A third postulated that SEOs were the force behind such things as the botnets, blog spam, and scraped and auto-generated content that appears on the Web.

Continue reading

Yahoo Web Page Segmentation: Distinguishing Noise from Information

In a recent interview with Priyank Garg, Director of Product Management for Yahoo! Search Technology, conducted by Eric Enge, we were told that Yahoo breaks pages down into template sections to distinguish between noisy, or boilerplate, content and unique content:

One of the things Yahoo! has done is look for template structures inside sites so that we can recognize the boiler plate pages and understand what they are doing. And as you can expect, a boiler plate page like a contact us or an about us is not going to be getting a lot of anchor text from the Web and outside of your site. So there is natural targeting of links to your useful content.

We are also performing detection of templates within your site and the feeling is that that information can help us better recognize valuable links to users. We do that algorithmically, but one of the things we did last year around this time is we launched the robots-NoContent tag, which is a tool that webmasters can use to identify parts of their site that are actually not unique content for that page or that may not be relevant for the indexing of the page.

If you have ads on a page, or if you have navigation that’s common to the whole site, you could take more control over our efforts to recognize templates by marking those sections with the robots-NoContent tag. That will be a clear indicator to us that as the webmaster who knows this content, you are telling us this part of the page is not the unique main content of this page and don’t recall this page for those terms.
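To picture the effect of that tag, here is a minimal sketch of an indexer that leaves out any section a webmaster has marked. It assumes the marker is applied as the HTML class value “robots-nocontent”; the class and function names are hypothetical, and this is not Yahoo’s actual implementation.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class MainContentExtractor(HTMLParser):
    """Collect page text, skipping anything inside a robots-nocontent element."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0  # greater than zero while inside a marked section
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.skip_depth or "robots-nocontent" in classes:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())


def indexable_text(html):
    """Return the text a sketch indexer would keep for this page."""
    parser = MainContentExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


page = """
<div class="robots-nocontent"><ul><li>Home</li><li>About us</li></ul></div>
<p>Unique article text about web page segmentation.</p>
<div class="ad robots-nocontent">Buy widgets<br>today</div>
"""
text = indexable_text(page)
print(text)
```

In this example only the paragraph text survives, while the navigation list and the ad are left out, which mirrors the idea of not recalling a page for terms that appear only in its boilerplate.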

Continue reading