How a Search Engine Might Rerank Search Results Based upon Time-Based Data in Query Logs

If you search at Yahoo for the phrase “world cup” (without the quotation marks), chances are good that the search engine will show you mostly pages about the 2010 World Cup, even though the tournament is held every 4 years and there may be many pages relevant for the phrase that don’t focus specifically upon a particular year.

How likely is it that when someone searches for “world cup,” they are looking for information about the upcoming tournament, taking place in South Africa between June 11th and July 11th, 2010? On the other hand, how likely might it be that they want to find information about the World Cup held in 2006? Or just general pages about the sporting event?

If I told you that the search engine was likely reordering those search results based upon time-based data, would it surprise you? Would you expect Yahoo, Google, or Bing to rerank search results in a manner like this when the searches have some temporal aspect to them, such as a search for the Olympics, or the World Series, or the World Cup?

It’s quite possible that a search engine would look through its query logs, and see if a particular query is often included in more specific searches that include some kind of temporal data such as a year, or month, or day or time of day, and rewrite a searcher’s query to include that time-based information. A recent Yahoo patent application explains one fairly simple approach towards showing such information. The patent application is:

Continue reading “How a Search Engine Might Rerank Search Results Based upon Time-Based Data in Query Logs”
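As a rough illustration of the query-log idea described in this excerpt, here is a minimal Python sketch. It is not the method from the patent application; the function names, the year-only pattern, and the `min_share` threshold are all assumptions made for the example. It looks for logged queries that equal the base query plus a year, and appends the dominant year to the query if one stands out.

```python
import re
from collections import Counter

# Assumption: only four-digit years (1900-2099) count as temporal data here.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def most_common_year(base_query, query_log, min_share=0.5):
    """Return the year most often added to base_query in the log, or None."""
    base_terms = base_query.lower().split()
    years = Counter()
    for logged in query_log:
        text = logged.lower()
        match = YEAR.search(text)
        if match is None:
            continue
        # Strip the year and check that what remains is exactly the base query.
        remaining = YEAR.sub(" ", text).split()
        if remaining == base_terms:
            years[match.group(0)] += 1
    if not years:
        return None
    year, count = years.most_common(1)[0]
    # Hypothetical threshold: only rewrite when one year clearly dominates.
    if count / sum(years.values()) < min_share:
        return None
    return year


def rewrite_query(base_query, query_log):
    """Append the dominant year to the query, if the log suggests one."""
    year = most_common_year(base_query, query_log)
    return f"{base_query} {year}" if year else base_query


log = ["world cup 2010", "world cup 2010", "2010 world cup",
       "world cup 2006", "world cup", "world series 2009"]
print(rewrite_query("world cup", log))
```

A real system would presumably also handle months, days, and times of day, and would weigh how recent each logged query is.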

How a Search Engine Might Identify Possible Query Suggestions

Like the information architects who organize the content on websites, search engine designers should aspire to provide users with scent at every step of their information-seeking process. Techniques like query suggestions, faceted search and results clustering all offer users the opportunity to make progress on their next step, rather than always having to restart the information-seeking process from scratch. Indeed, faceted search is a popular technique for offering users such guidance.

While users are ultimately responsible for expressing their information needs, it is the search engine’s job to act like a reference librarian and help the users in this process.

Reconsidering Relevance and Embracing Interaction
by Daniel Tunkelang

Continue reading “How a Search Engine Might Identify Possible Query Suggestions”

Google Studies How Search Behavior Changes When Searchers Are Faced with Difficult Questions

A paper by Google researchers Anne Aula, Rehan M. Khan and Zhiwei Guan published last month asks the question How does Search Behavior Change as Search Becomes More Difficult? (pdf)

The paper describes two studies in which participants were given informational tasks to perform – a mix of hard and easy questions – to see if searchers adopted different strategies when they were faced with questions that had definite answers, but where those answers might be difficult to find. An example of one of the difficult tasks (can you find the answer?):

You once heard that the Dave Matthews Band owns a studio in Virginia but you don’t know the name of it. The studio is located outside of Charlottesville and it’s in the mountains. What is the name of the studio?

In the first study, 23 people performed searches to find answers to questions like the one above, and the researchers examined the searches they performed and the pages they visited to see how they went about finding answers. The second study expanded to 179 searchers, and based some of its procedures on what the researchers learned from the first experiment. A general conclusion from the second study:

Continue reading “Google Studies How Search Behavior Changes When Searchers Are Faced with Difficult Questions”

Google Word Completion and Search Query Suggestions from Social Network Connections?

When you type a query into a search box at Google or Yahoo or Bing on your desktop computer, chances are a drop-down list of suggested query terms will appear below the search box.

If you use a smart phone, and start typing into a text box on your phone, your phone may also offer you some suggestions to complete the word you are typing.

In the case of a cell phone where you need to press numbers to represent alphabetical characters, those suggestions can help save you from typing a lot of keystrokes. The phone offers terms from a dictionary stored on your phone to help you complete those terms.
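To make the keypad case concrete, here is a minimal Python sketch of dictionary-based word completion on a numeric keypad. The helper names (`to_digits`, `build_index`, `suggest`) are made up for this example, and it assumes words contain only the letters a to z; it is not how any particular phone implements the feature.

```python
from collections import defaultdict

# Standard letter layout on a phone keypad.
KEYPAD = {"2": "abc", "3": "def", "4": "ghi", "5": "jkl",
          "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz"}
LETTER_TO_DIGIT = {ch: digit for digit, letters in KEYPAD.items() for ch in letters}


def to_digits(word):
    """Translate a word into the key presses that would type it."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def build_index(dictionary):
    """Group dictionary words by their key-press sequence."""
    index = defaultdict(list)
    for word in dictionary:
        index[to_digits(word)].append(word)
    return index


def suggest(index, pressed):
    """Return words whose key sequence starts with the keys pressed so far."""
    return sorted(
        word
        for digits, words in index.items()
        if digits.startswith(pressed)
        for word in words
    )


index = build_index(["home", "good", "gone", "hood", "cat"])
print(suggest(index, "4663"))
```

Several words can share one key sequence, which is why the contents of the dictionary matter so much for the quality of the suggestions.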

A recent patent application from Google describes how they might add words to a dictionary like that, taken from social networks where you might be a member. What’s interesting about that is how much information the search engine captures about your use of words on the Web, and that of people whom you might be connected to on the Web.

Why might Google look to social network information for this kind of information?

Continue reading “Google Word Completion and Search Query Suggestions from Social Network Connections?”

Google Defines Semantic Closeness as a Ranking Signal

This post may get you thinking about the benefits of using heading elements and lists on web pages for SEO purposes from a slightly different perspective than you may be used to.

Google uses a large number of signals to decide upon the order of pages shown in search results. Some of those signals measure the quality or importance of a web page, while others may indicate how relevant a page is for a particular search query entered into a search engine’s search box.

One fairly obvious relevancy signal is whether or not the words in a query actually appear upon a page that might be a search result for that query. If those words appear on the page more than once, the page might be considered even more relevant for that particular query than other web pages where the terms only appear once, or not at all.
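A minimal Python sketch of that term-occurrence signal might look like the following. The function names and the simple tokenizer are assumptions for illustration only, and a real search engine would combine counts like these with many other signals.

```python
import re
from collections import Counter


def term_counts(query, page_text):
    """Count how many times each query term appears in the page text."""
    # Assumption: a "word" is any run of lowercase letters or digits.
    words = Counter(re.findall(r"[a-z0-9]+", page_text.lower()))
    return {term: words[term] for term in query.lower().split()}


def contains_all_terms(query, page_text):
    """True if every query term appears on the page at least once."""
    return all(count > 0 for count in term_counts(query, page_text).values())


page = "The World Cup is held every four years. The next World Cup is in 2010."
print(term_counts("world cup", page))
print(contains_all_terms("world series", page))
```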

Another factor that might indicate how relevant a page is for a particular set of terms is how close those terms might be on a page. While you could easily count the number of words between individual query terms to determine how close they are to each other, the formatting of web pages presents some challenges to the approach of simply counting words between terms, such as in a list like the following:

Continue reading “Google Defines Semantic Closeness as a Ranking Signal”

Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data

Enter the Reasonable Surfer Patent

Not every link from a page in a link-based ranking system is equal, and a search engine might look at a wide range of factors to determine how much weight each link on a page may pass along.

A diagram showing different values for links passing amongst three different web pages.

One of the signals used by Google to rank web pages looks at the links to and from those pages, to see which pages are linked to by others. Links from “important” pages carry more weight than links from less important pages. An important page under this system is one that is linked to by other important pages, or by a large number of less important pages, or a combination of the two. This signal is known as PageRank, and it is only one of a large number of Google ranking signals used to rank web pages and determine how highly those pages show up in search results in response to a query from a searcher.
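The idea that importance flows from page to page through links can be sketched with a simplified power iteration in Python. This is a toy version, not Google's implementation: the `pagerank` function, the damping value of 0.85, the iteration count, and the handling of pages with no outgoing links are assumptions for the example. Note that it splits a page's weight equally among its links, which is exactly the assumption the reasonable surfer idea would relax.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a rank for each page from a dict of page -> list of linked pages."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Equal split: each outgoing link passes along the same weight.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its rank evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Three pages: "a" and "b" both link to "c", and "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this small graph, "c" ends up ranked highest because two pages link to it, and "a" benefits from being linked to by the important page "c".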

An early paper by the founders of Google, The Anatomy of a Large-Scale Hypertextual Web Search Engine, tells us:

Continue reading “Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data”