How Search Engines May Try to Match Searchers’ Intents from Analysis of Search Engine Query Logs

When you type a search query term into a search box at Google or Yahoo or Live.com, the search engines might go through their indexes, and try to find the most relevant and important pages in their databases for the word or phrase that you want to find out more about.

But those search engines might try to improve the results that they show to you by trying to understand the intent behind a search rather than just looking for pages that match keywords that you typed as a search query.

Search Engines and Searcher Intent

What do the search engines themselves reveal about the importance of considering the intent behind a search?

Google tells us on one of their search help pages that:

Search is rarely absolute. Search engines use a variety of techniques to imitate how people think and to approximate their behavior. As a result, most rules have exceptions. For example, the query [ for better or for worse ] will not be interpreted by Google as an OR query, but as a phrase that matches a (very popular) comic strip. Google will show calculator results for the query [ 34 * 87 ] rather than use the ‘Fill in the blanks’ operator. Both cases follow the obvious intent of the query.

Yahoo also mentions that they try to consider the intent behind a search on a help page of theirs titled How are web documents ranked:

Search engines don’t have the ability to ask questions, so they rely on the search terms you enter to interpret and determine the intent of your search.

Microsoft’s Satya Nadella, senior vice president of their search, portal and advertising platform group, also described how user intent plays a role in the results Live.com shows to searchers in a presentation from August of 2008:

I believe this notion of understanding user intent–being able to analyze (search queries) and come up with search patterns and use them to shape the search experience–is one of the most important areas for us.

One set of intentions behind searches was identified in a paper from 2002, A taxonomy of web search, which broke searches into three types: informational, transactional, and navigational.

Informational searches are conducted by a searcher to fill some kind of informational need that they might have. Transactional searches are made to help a searcher conduct a task on the web. Navigational searches are intended to help a searcher find a specific page.

Identifying the intent behind a search can be a difficult task, as Google Researcher Dan Russell has noted in a number of presentations, including one at The San Francisco Bay Area Chapter of ACM SIGCHI in 2006, where he described some of the approaches that Google takes to learn about intentions behind searches. One of those approaches involves looking at the log files of the search engines, and seeing how people refine the searches that they perform during search sessions.

A couple of newly published patent applications from Yahoo describe how that search engine might look at log files to identify whether some queries show an intent that might not be easily seen from just looking at the search query itself.

Explicit and Implicit Searcher Intent

The intent behind some searches may be easier for a search engine to interpret than the intent behind others. Those searches might be considered to have an explicit intent.

For example, if you’re searching for a new pair of shoes, or a camcorder, you might include some words in your search that tell a search engine about that intention, such as “cheap sneakers” or “buy a camcorder.” A search engine might see the use of the words “cheap” or “buy” as an explicit indication that you want to make a purchase online.

If your search includes the word “reviews,” it may signal to the search engine that you want informational pages about specific kinds of products or services. A search for a place type and a geographic location might indicate to a search engine that you are conducting a local search, and you may be shown local search results when you type something like “San Jose Library” into a search box.
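
A minimal sketch of how explicit cue words like these might be spotted in a query. The cue table and the intent labels are assumptions made for illustration (the only words taken from the examples above are “cheap,” “buy,” and “reviews”), and a real search engine would likely use far richer signals:

```python
# Hypothetical cue words and intent labels, for illustration only.
EXPLICIT_CUES = {
    "cheap": "purchase",
    "buy": "purchase",
    "reviews": "product information",
}


def explicit_intents(query):
    """Return the set of intents signalled by cue words found in the query."""
    words = query.lower().split()
    return {EXPLICIT_CUES[word] for word in words if word in EXPLICIT_CUES}


print(explicit_intents("cheap sneakers"))
print(explicit_intents("camcorder reviews"))
print(explicit_intents("olympics"))  # no cue words, so no explicit intent
```

A query such as “olympics” returns an empty set here, which is the kind of search where the intent would have to be inferred some other way.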

Other searches may not have such a clear intent. The patent applications from Yahoo provide an example of a search for the term [olympics]. The best results to show a searcher might involve showing results from the Olympics from a specific year, even though the search didn’t include a year.

The search engine might look through its log of search queries from previous searchers, and see that many of the search queries that people used to find out information about the Olympics included a year within the query, such as:

  • The 2004 Summer Olympics in Athens,
  • The 2006 Winter Olympics in Turin,
  • The 2008 Summer Olympics in China,
  • The 2010 Winter Olympics in Vancouver, or
  • The 2012 Summer Olympics in London.

Many searches are “time sensitive,” and mining search engine query logs to see a pattern like this might help a search engine understand the intent behind a search, and influence which search results are shown to searchers. It’s possible that a search engine might boost rankings for pages that might show the most popular intents, or that they might rerank search results to show a broad range of intents behind a query that has an implicit intent behind it.
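
One way to picture the log mining described above is a short script that counts the years appearing alongside a term in past queries. This is only a sketch: the sample log is made up, and treating any four-digit number starting with 19 or 20 as a year is my assumption, not something the patent filings specify.

```python
import re
from collections import Counter

# Assumption: a four-digit number beginning with 19 or 20 is a year.
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def year_counts(term, query_log):
    """Count the years that appear in logged queries containing `term`."""
    counts = Counter()
    for query in query_log:
        if term in query.lower().split():
            counts.update(YEAR.findall(query))
    return counts


# Made-up sample log, for illustration only.
sample_log = [
    "2010 olympics schedule",
    "olympics 2012 tickets",
    "winter olympics 2010",
    "olympics history",
    "2010 census",
]

print(year_counts("olympics", sample_log))
```

If many of the logged queries for a term carry a year, that might be a hint that the bare term is time sensitive.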

For example, if most of the people searching for the word “Olympics” tend to click on pages for the 2010 Olympics or refine their search query to include the year 2010, then the search engine might start boosting search results that are relevant for the 2010 Olympics.

An alternative approach might be to look at those search engine query logs, and see the percentages of people who click on results for Olympics associated with specific years or refine their search results for a specific year, and show a diverse mix of search results for each of the years.

So, if 50 percent of searchers looking for “Olympics” seem to be looking for the 2010 Olympics, and 30 percent appear to want to find out about the 2012 Olympics, and the remainder of searches for the term “olympics” don’t seem to have a specific date attached to them, then the top ten (or top 100, or some other number) of search results might be half filled with results about the 2010 Olympics, contain results about the 2012 Olympics for almost another third, and have more general pages about the Olympics, without necessarily having years attached to them.
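
The arithmetic in that example can be sketched in a few lines. The function name and the rounding rule (largest remainder, so the slots always add up to the page size) are my own choices for illustration, and the sketch assumes the percentages sum to 100:

```python
def allocate_slots(shares_pct, total_slots=10):
    """Split result slots among intents in proportion to their share of searches.

    Assumes the percentages sum to 100. Leftover slots from rounding down
    go to the intents with the largest fractional remainders.
    """
    exact = {intent: pct * total_slots / 100 for intent, pct in shares_pct.items()}
    slots = {intent: int(value) for intent, value in exact.items()}
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(exact, key=lambda i: exact[i] - slots[i], reverse=True)
    for intent in by_remainder[:leftover]:
        slots[intent] += 1
    return slots


shares = {"2010 Olympics": 50, "2012 Olympics": 30, "general": 20}
print(allocate_slots(shares))
```

With ten results and the shares above, five slots go to the 2010 Olympics, three to the 2012 Olympics, and two to general pages.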

The patent filings are:

Extracting Query Intent from Query Logs
Invented by Priyank S. Garg, Kostas Tsioutsiouliklis, Bruce T. Smith, and Timothy M. Converse
US Patent Application 20090043749
Published February 12, 2009
Filed August 6, 2007

Abstract

Techniques are provided for storing queries received by a search engine in a query log.

For a particular query term in the query, it is determined how many queries in the query log contain that particular query term and an intent-indicating term, and determined how many queries in the query log contain that particular query term without an intent-indicating term.

Based on the ratio between the number of queries in the query log that contain the particular query term and the intent-indicating term and the number of queries in the query log that contain the particular query term without the intent-indicating term, it is determined whether the particular query term is an intent-qualified query term.

In response to determining that the particular query term is an intent-qualified query term, data is stored in a computer-readable medium that identifies the query term as an intent-qualified query term.

Implicit-intent queries that contain the intent-qualified query term are processed based, at least in part, on the intent associated with the intent-qualified query term.
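
The ratio test in that abstract might look something like the sketch below. The abstract says only that the decision is based on the ratio, so the threshold (`min_ratio`), the function name, and the sample log are all assumptions added for illustration:

```python
def is_intent_qualified(term, query_log, intent_terms, min_ratio=1.0):
    """Compare queries pairing `term` with an intent-indicating term
    against queries that contain `term` without one.

    `min_ratio` is a hypothetical threshold; the filing does not give one.
    """
    with_intent = 0
    without_intent = 0
    for query in query_log:
        words = set(query.lower().split())
        if term not in words:
            continue
        if words & intent_terms:
            with_intent += 1
        else:
            without_intent += 1
    if with_intent == 0:
        return False
    if without_intent == 0:
        return True
    return with_intent / without_intent >= min_ratio


# Made-up sample log, for illustration only.
purchase_log = ["buy camcorder", "camcorder buy online", "camcorder", "buy shoes"]
print(is_intent_qualified("camcorder", purchase_log, {"buy"}))
```

Here “camcorder” appears twice with “buy” and once without, a ratio of 2 to 1, so the term would be flagged under the assumed threshold.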

Estimating the Date Relevance of a Query from Query Logs
Invented by Farzin Maghoul and Kostas Tsioutsiouliklis
US Patent Application 20090043748
Published February 12, 2009
Filed August 6, 2007

Abstract

Techniques are provided for maintaining data that indicates for a plurality of query terms whether the plurality of query terms are date-qualified query terms.

A query is received, and in response to receiving the query, the query is inspected to determine that the query contains a particular date-qualified query term.

Then it is determined that the particular date-qualified query term has been associated with a plurality of dates, and it is determined which of the plurality of dates with which to associate the date-qualified query term for the query, based at least in part on the frequency with which each particular date of the plurality of dates has been associated with the particular date-qualified query term.
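
A rough sketch of the lookup that abstract describes: find a date-qualified term in an incoming query, then pick the date most often associated with it. The table, its counts, and the function name are hypothetical, and choosing the single most frequent date is only one reading of “based at least in part on the frequency”:

```python
# Hypothetical data: date-qualified terms mapped to how often each date
# has been associated with them. The counts are made up.
DATE_QUALIFIED = {
    "olympics": {"2010": 50, "2012": 30, "2008": 5},
}


def most_likely_date(query, date_table=DATE_QUALIFIED):
    """Return (term, date) for the first date-qualified term in the query,
    choosing the date with the highest count, or None if no term matches."""
    for word in query.lower().split():
        dates = date_table.get(word)
        if dates:
            return word, max(dates, key=dates.get)
    return None


print(most_likely_date("olympics tickets"))
print(most_likely_date("pizza"))
```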

Conclusion

The patent filings focus upon providing examples of identifying time sensitive and date sensitive search intents, but the methods that they describe can be used to find other implicit intents behind queries.

We don’t know whether, or to what extent, the methods behind these two patent applications have been incorporated in Yahoo’s search results, but we do see that the intent behind some query terms in a search can influence the types of results that we receive at the major search engines, such as navigational queries showing a top result with sitelinks, or a local type query showing one box or ten box map results.

If you perform a search at Google or Yahoo or Live.com, chances are that they will be considering the intention behind your search, and may show you results that are influenced by what the search engine believes the intent behind your query might have been.

One place that a search engine might look is its query log files, to see if it can glean an implicit intent behind your search terms by seeing which results previous searchers chose, or by looking at how searchers rewrote or refined their search queries.

If you search for “Olympics” without including a year, the search results you see might focus upon the 2010 and 2012 Olympics, since it appears to be a time sensitive query.

If you’re a site owner or working with one, and you are performing keyword research on specific search phrases for the pages of a site, it’s also important to keep in mind that the search engines might be considering more when they return results to searchers than how many times a search phrase shows up on a page in title elements, headings, or text, or in anchor text pointing to those pages, or the PageRank (or link popularity or page quality) of those pages.

The search engines might be attempting to understand the intentions behind the search phrase, to show searchers the pages that they believe will match those intents.

23 thoughts on “How Search Engines May Try to Match Searchers’ Intents from Analysis of Search Engine Query Logs”

  1. Great article. Understanding searcher intent as it relates to keyword research is one of the main points of emphasis for any SEO.

    I often tell my clients that they need to understand the intent of a searcher who is looking for their site, just as they need to understand what keywords may be typed in to find them.

    It’s nice to know SE’s are thinking this way too!

  2. Interesting info. It is definitely something to consider when conducting keyword research. You can see Google monitoring things related to what you are talking about here.

    For example if you do a search for “SEO by the Sae” (misspelling)
    http://www.google.co.uk/search?hl=en&q=seo+by+the+sae

    You will notice that your site comes up twice as matches for “SEO by the Sea”. If you hover over the URL, you will notice that these are not direct links, but they are being routed through Google. I’d imagine this is some automatic mechanism that updates Google’s synonym definitions and once a certain number of people click on the re-directed links, Google will probably show results for “SEO by the Sea” as default – all without anything done manually at Google.

    You get similar URLs for the refinements when searching for Olympics. Whether Google are using this data or just aggregating it to see how they might use it is the question though.

  3. Hi Agent SEO,

    Thanks. I agree with you. Keyword research is one of the most important things that an SEO can do, and understanding searcher intent behind keywords is the most important part of that research.

    I get excited seeing something like these patent applications, which show us some of the ways that a search engine might be taking the limited information they receive in a query (often 2-4 words), and getting some idea of the intent behind them.

    Hi David,

    Thanks for the example of the URLs used to link to SEO by the Sea by Google in that set of search results. We can see in some parts of the URL information that Google might be collecting for that specific query search:

    Something to do with spell correction:

    spellmeleon_result

    The result number for that query:

    resnum=1

    The result number, with possibly domain collapsing being used:

    ct=result&cd=1

    It’s hard to tell what Google might be doing with all of the information that they collect, but I expect that they do collect a lot of information. I suspect that they might collect more than they need or can use, but figuring out what to do with some of it may be a good part of the fun behind making a search engine work. :)

  4. This is a really interesting post. I agree that at the heart of a good SEO strategy is still keyword research. For me the importance of this won’t be diminished as the importance of social etc. builds, as keyword research is knowing who your target market is and how they are looking for your services.
    Great information on this blog …

  5. Hi Kieran,

    Thanks for your kind words.

    I agree with you. Good SEO and a good marketing strategy start with understanding who your audiences may be and how your objectives can bring value to those audiences, and effective keyword research begins with that understanding.

  6. Hi Bill,

    Effective keyword research has a direct tie to revenue. For example, the website pizza.com. Pizza is a word with “links” to many, many relationship-marketing keywords. It’s been fascinating to watch the pizza.com website “evolve.” Any thoughts on who now owns this “keyword” — search engine?

  7. I’ve been reading your blog now for the last month and I really have enjoyed all the advice you give. A question I would like to ask is how you get all this up to date information? You really put some decent content in here and I don’t want to sound soppy or anything but you are appreciated for your intensive research. Keep it up and I’ll start to comment more often.

  8. The truth is that when you type “How SEO works?” a simple and obvious question – The first result is a site with the title “How SEO works” and not other better sites which explain the basics of SEO.

    It seems that today the keyword structure is more important than the search intentions.

    What do you think?

  9. Hi Sylvia,

    I think it’s a mistake to think of keywords as belonging completely to commercial interests, and tying them directly to revenue. The web may contain commercial sites, but it’s a medium and not an advertising platform. No one can “own” a keyword…

    Hi Jaques,

    Thank you. I subscribe to a very large number of feeds, and spend a fair amount of time looking through patent filings and whitepapers.

  10. 1. First time to this site. I think it’s great. I just subscribed to your feed through my Google homepage…

    2. I’ve found the best way to discern search intentions is to dig through your log files…people seem to be very specific about what they want…so, on some sites, you have to change your entire site layout to cater to 3 or 4 different ‘types’ of visitors…this can increase earnings pretty big if done right…

    You just got to do the data mining & analysis yourself, that’s all…!

  11. It’s nice to see search engines are still developing however this means that if you can’t find your site anywhere in search results you’re screwed. There’s so many factors that you wouldn’t even be able to tell what went wrong.

    But then again it’s easier for your average joe to find what he’s looking for so maybe that’s good.

  12. Hi Richard,

    Thank you. I appreciate your visiting and your very thoughtful comment.

    I definitely agree that looking through log files can be pretty informative. There might be some limitations if a site doesn’t have enough visits to generate a fair amount of data. But even if it doesn’t, log files can hold some really helpful ideas.

    For ecommerce sites, I also recommend paying attention to the questions and words that people contacting a site owner might use in emails or phone calls, as well as search logs for site search, in addition to any of the search referral terms that people might use to find the site. Those can also provide some ideas for the development of new content, as well as potential changes to site structure and layout.

    Very good point too, about there being “different” types of visitors as well. Knowing that different people want to see different things on a site can lead to some positive growth for the pages of that site.

    Hi Zasłony,

    Search engines have become an index that many sites rely upon to bring visitors to their sites, and it is important to try to understand how to develop a site that makes it easy for them to crawl and index. I think that anyone building a new site these days has to consider them in their marketing plan, but also should try to find other ways to bring visitors to their pages as well.

    It can be hard starting a new business that tries to take on older established businesses, online or offline. It can help to understand the area that you are trying to fill, and the places within it that offer the best possibilities of success. Often, those opportunities don’t exist in taking on those older businesses directly, but rather in finding the areas where there is a need that isn’t being met. And, it’s quite possible that in those areas, you can rank well for query terms that are underrepresented in search results. It can be a challenge to find those, but very helpful.

  13. Very well-written article. I find it a great thing that Google’s search engines are adapting and developing still. With the Internet still in early stages, there is much to be learned from the people putting content on the internet, as well as the people who are using the internet. Search engines trying to determine the intent of the searcher at the same time as determining relevancy to keywords will help make searches more specific and efficient. This may present more of a challenge to some smaller companies, but all it takes is lots of research to stay ahead of the curve.

  14. Excellent post, revealing the thinking of search engines. In my view, if search engines start comparing the history of queries performed by a user, they will be able to deliver results more relevant to that user’s interests.

  15. Thank you Shailendra,

    I’m not completely convinced that looking at past queries is the answer to matching a searcher’s intent behind a search. But I do like seeing the search engines exploring the idea.

  16. Maybe I am not the normal search engine user, but I would use a date for an event that happens annually or on a regular basis. Maybe you are just using an example, which makes sense here.

    I guess what I’m trying to say is that I have worked with computers all my life, including programming, so I know how to “think” like a computer and know what and how to search for something. Most internet users are probably not that savvy, so Google is trying to make the computer think like the human. Where will artificial intelligence be in 50 years? This may be just the beginning…

  17. Hi Jake,

    Good point. The example that I used was similar to one within the patent filing, and likely is an issue that the people working at the search engine see regularly. Trying to understand the intent behind a search isn’t easy – especially if people don’t provide much in their queries that might help a search engine point to the information that they really want to see as a result of their search.

  18. From what I have seen some of the keywords are specially qualified as informative or brand – like if you type in computers you will see sites like wiki, dmoz and yahoo directory pop up. Without human suppression or intervention no search engine AI will be able to know what a human means – at least not in the next tens of years. I wonder how Bing will compete with the Google supremacy.

  19. Hi Bilety,

    Trying to respond to a searcher’s intent isn’t an easy task. It’s interesting seeing how search engines are attempting to do so. Bing does have some interesting technology that they acquired from Powerset, and people who came to Microsoft from Powerset as well. Will that help? We may have to wait and see.

  20. Awesome post. I run a blog on marketing for pizza places. My trouble with SEO and searcher intent is the fine difference between someone who is looking to order pizza online (not my target audience) versus who is searching for info on local search marketing for pizzerias. Obviously words like “advertising”, “ads”, “marketing”, etc work well for me. But trying to use the long tail without those words gets a little trickier.

    Any thoughts?

  21. Hi Eddie,

    Yes, plenty of thoughts. But to make a possibly long answer short, it really comes down to knowing your audience well, and understanding what words they might use to find the services that you have to offer. For example, where do people who do things like attend Pizza Expos (http://www.chron.com/disp/story.mpl/life/hoffman/6945147.html) interact and communicate with each other, and share ideas?

    I suspect that you could also get a lot of ideas from talking to a lot of people who have run pizza places for years about the things that they have tried, and the things that they would like to have tried to market their businesses, from pizza eating contests to Karaoke and Pizza Nights to gluten-free pizzas to many others.