Looking at Users’ Final Landing Pages to Develop Suggestions for Query Refinements

It’s now common for search engines to suggest query revisions when someone performs a search.

One common query revision strategy is to examine the query sessions of previous searchers who used the same query and see how they refined their searches: spelling corrections, or words added or deleted in subsequent queries within the same session.

A paper from Microsoft researchers, Query Suggestion based on User Landing Pages, takes that approach, and looks at using it in conjunction with another approach that looks at what they call “final landing pages.”

This poster investigates a novel query suggestion technique that selects query refinements through a combination of many users’ post-query navigation patterns and the query logs of a large search engine. We compare this technique, which uses the queries that retrieve, in the top-ranked search results, the places where searchers end up after post-query browsing (i.e., the landing pages), with an approach based on query refinements from user search sessions extracted from query logs.

Here’s how those landing pages are found from Web activity logs:

1) Following the submission of a query at Google, Yahoo, or Live Search, the sequence of pages visited via clicks starting from that query’s results is recorded.

2) Those sequences of pages are referred to as search trails, and they terminate when other actions are taken, such as:

  • New search is started;
  • An activity indicating a search is completed happens – a return to a homepage, the checking of email, a visit to a service like MySpace, etc.;
  • A page is viewed for a long period of time without any activity;
  • The browser window is closed.

When one of those actions happens, it may be a sign that the searcher found what he or she was looking for, and the last page viewed is considered the landing page. Note that a query revision ends one of these search trails.

Query refinement suggestions are then generated by searching the engine’s logs for other queries for which those landing pages rank highly (i.e., appear in the top ten results).
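The steps above can be sketched in code. This is a hypothetical illustration, not the paper’s implementation: the event stream format (`browse_events`), the trail-ending action names, and the `top10_index` mapping from a URL to the queries it ranks in the top ten for are all assumptions made here for clarity.

```python
from collections import defaultdict

# Actions assumed (for illustration) to terminate a search trail,
# per the list above: a new query, a return to a homepage, checking
# email, visiting a social service, a long idle period, or closing
# the browser.
TRAIL_ENDERS = {"homepage", "email", "social", "timeout", "close"}

def extract_landing_pages(browse_events):
    """Split one user's chronological (action, value) event stream into
    search trails and return the last page viewed in each trail."""
    landing_pages = []
    current_page = None
    in_trail = False
    for action, value in browse_events:
        if action == "query":
            # A query revision ends the previous trail and starts a new one.
            if in_trail and current_page:
                landing_pages.append(current_page)
            in_trail, current_page = True, None
        elif action == "pageview" and in_trail:
            current_page = value
        elif action in TRAIL_ENDERS and in_trail:
            if current_page:
                landing_pages.append(current_page)
            in_trail, current_page = False, None
    if in_trail and current_page:  # stream ended mid-trail
        landing_pages.append(current_page)
    return landing_pages

def suggest_refinements(original_query, browse_events, top10_index):
    """Collect queries for which the trails' landing pages appear in the
    top ten results, excluding the query the user already tried."""
    counts = defaultdict(int)
    for page in extract_landing_pages(browse_events):
        for query in top10_index.get(page, []):
            if query != original_query:
                counts[query] += 1
    return sorted(counts, key=counts.get, reverse=True)
```

For example, a trail that starts with the query “jaguar”, passes through a couple of pages, and ends when the user checks email would yield the last page viewed as the landing page, and any other queries that page ranks in the top ten for become candidate refinements.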

Conclusion

It’s easy to see how this strategy for finding query refinements might locate a different range of suggested searches than looking at query revisions that people type into a search box.

Combining both approaches to generate suggested refinements makes sense – since it both offers a wider range of possible suggestions and examines user behavior where it may appear that a searcher is satisfied with the results that they viewed.
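One simple way such a combination might look (a sketch only; the paper compares the two approaches rather than prescribing a particular merge) is to interleave the ranked suggestion lists from each source, deduplicating as you go:

```python
def merge_suggestions(session_based, landing_based, limit=8):
    """Interleave ranked suggestions from two sources, deduplicating
    while preserving each source's own ordering."""
    merged, seen = [], set()
    for pair in zip(session_based, landing_based):
        for suggestion in pair:
            if suggestion not in seen:
                seen.add(suggestion)
                merged.append(suggestion)
    # Append whatever remains from the longer list.
    leftovers = session_based[len(landing_based):] + landing_based[len(session_based):]
    for suggestion in leftovers:
        if suggestion not in seen:
            seen.add(suggestion)
            merged.append(suggestion)
    return merged[:limit]
```

An interleaved merge is just one option; a production system might instead weight each source by historical click-through on its suggestions.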


7 thoughts on “Looking at Users’ Final Landing Pages to Develop Suggestions for Query Refinements”

  1. It’s an interesting idea. I wonder, though, how a search engine can tell whether the action taken meant success, failure, or that something else came up.

    You might check email for example because you found what you wanted and are moving on or because you didn’t find what you wanted and are tired of looking. You might have gotten an alert that a new email arrived or you might never have been all that interested in your search and just feel like checking email.

    It’s still an interesting idea, but I’m wondering how it might work in practice.

  2. Hi Steven,

    The results do seem to hinge on that assumption: that certain signals indicate a person is satisfied with the results of their query. What if those signals make satisfaction more likely, but offer no guarantee? I suspect these actions could sometimes be influenced by something other than satisfaction with what was found.

    But, what if the results of looking at these final landing pages still produced an interesting set of query refinements that mostly appear relevant and useful? I would suspect that those would be at least as good as some of the query refinements that people purposefully chose to use when they weren’t satisfied with the initial searches that they performed.

  3. Hi,

    I’m really glad that somebody blogs about these papers; it saves me having to dig them out! This technique isn’t astoundingly new but it’s good to see that it’s being actively researched. It could be great as part of (e.g.) a toolbar, which would provide even more accurate information regarding the “final” page required. There are problems in that if the information can be quickly read, it’s hard to automatically detect that it was found. Nevertheless, this should be a good help to search engines. Dare I suggest that changing algorithms to reduce landing page bounce rate could be a great way of filtering out spam?

  4. It’s getting pretty common for search engines to suggest query revisions when someone does a search these days.

    Yes, it is common, but this is a more complicated query suggestion process, involving a termination page at the end of a search trail and subsequently returning the top search queries that users made in accessing that termination page as query suggestions.

    Dare I suggest that changing algorithms to reduce landing page bounce rate could be a great way of filtering out spam?

    You would have to account for query types that naturally result in a high bounce rate, e.g. Q+A queries that are satisfied by Wikipedia, a dictionary, or even the SERPs themselves!

    Also, I suspect that a toolbar doesn’t quite cut it as a data-gathering device, which is a good reason why the major search engines are getting their hands dirty in the analytics area. That is, web analytics providers would be an excellent source for collecting search trail data.

  5. Hi Jonathan,

    I really like that we are seeing a lot of papers from Microsoft where they are studying different aspects of how user behavior can be used to improve how a search engine works.

    Looking at these landing pages does seem like a logical next step after starting to explore query revisions to offer query refinements. What they are doing does look like an attempt at a holistic approach. I’ll second your “way to go.”

  6. You would have to account for query types that naturally result in a high bounce rate, e.g. Q+A queries that are satisfied by wikipedia, dictionary or even the SERPs themselves!

    There’s no explicit mention of them accounting for those types of results, but they probably do need to account for result sets that include Question Answering and Definition results, and query sets that imply completely different avenues of investigation that may seem unrelated.

    Also, I suspect that a toolbar doesn’t quite cut it as a data-gathering device, which is a good reason why the major search engines are getting their hands dirty in the analytics area.

    The mix of different information gathering tools can provide more information than any single source – the difficulty is in coordinating that information. Analytics is the kind of thing that does make this kind of inquiry easier.
