Microsoft Tracking Search and Browsing Behavior to Find Authoritative Pages

Between December 2005 and April 2006, researchers from Microsoft collected information, with permission, about the searching and browsing activities of hundreds of thousands of Windows Live Toolbar users. Their goal was to learn about the final destination pages, sometimes unranked and unindexed, that searchers ended up at in response to queries entered at Google, Yahoo, and Microsoft’s Live.com.

So much of what search engines try to do when presenting relevant results to searchers is based upon assumptions found in algorithms like PageRank.

Can tracking actual user search and browsing behaviors better help a search engine understand which pages may best answer queries posed by searchers at search engines?

Microsoft on Final Destination Pages

Last year, a trio of Microsoft researchers were awarded the Best Paper Award at SIGIR’07 for a paper titled Studying the Use of Popular Destinations to Enhance Web Search Interaction (pdf) that looked at the searching and browsing behavior of a large number of people.

The focus of the research was to find pages that seem to be final destinations: stopping points where people may have found answers to the queries they submitted to the search engine. I wrote a little about the original paper before the conference in Microsoft Study Takes Navigational Sitelinks a Step Further.

What they told us then about those final destinations was that:

The destinations may not be among the top-ranked results, may not contain the queried terms, or may not even be indexed by the search engine. Instead, they are pages at which other users end up frequently after submitting same or similar queries and then browsing away from initially clicked search results.

Final Destinations as Authoritative Pages?

They’ve built upon the original research in a new paper, Leveraging Popular Destinations to Enhance Web Search Interaction. Here’s what they tell us in the abstract:

This article presents a novel Web search interaction feature that for a given query provides links to Web sites frequently visited by other users with similar information needs.

These popular destinations complement traditional search results, allowing direct navigation to authoritative resources for the query topic.

Destinations are identified using the history of search and browsing behavior of many users over an extended time period, and their collective behavior provides a basis for computing source authority.
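
As a rough illustration of the idea in that abstract, here is a minimal Python sketch that counts, for each query, the pages where users' browsing most often ended. It is an assumption-laden toy, not the method from the paper: the `popular_destinations` function, the `(query, pages)` log format, and the example URLs are all hypothetical, and the real system likely does far more (matching similar queries, filtering noise, and so on).

```python
from collections import Counter, defaultdict

def popular_destinations(trails, top_k=3):
    """Map each query to the pages where its browsing trails most often ended.

    `trails` is an assumed log format: (query, [pages visited in order]).
    """
    endpoints = defaultdict(Counter)
    for query, pages in trails:
        if pages:  # skip searches with no page views after the query
            # Normalize the query text so trivially different forms are pooled.
            endpoints[query.strip().lower()][pages[-1]] += 1
    return {
        query: [url for url, _ in counts.most_common(top_k)]
        for query, counts in endpoints.items()
    }

# Hypothetical toy log, for illustration only.
trails = [
    ("hubble telescope", ["a.example/news", "b.example/hubble"]),
    ("Hubble Telescope", ["c.example/space", "b.example/hubble"]),
    ("hubble telescope", ["a.example/news"]),
]
print(popular_destinations(trails))
```

In this toy log, `b.example/hubble` is the most common stopping point for the query even though no trail started there, which is the kind of page the researchers describe as a popular destination.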

The researchers looked for what they called search trails to follow a path from pages that searchers located in search results to pages where the searcher’s inquiry around that search seemed to stop for one reason or another. Their research provided some interesting statistics about search trails:

The statistics suggest that users generally browse far from the search results page (i.e., around five steps), and visit a range of domains during the course of their search. On average, users visit two unique (non search-engine) domains per query trail, and just over four unique domains per session trail.

This suggests that users often do not find all the information they seek on the first domain they visit. For query trails, users also visit more pages, and spend significantly longer, on the last domain in the trail compared to all previous domains combined. These distinctions of the last domains in the trails may indicate user interest, page utility, or page relevance.
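
Statistics like these could be computed from trail logs along the following lines. This is only a sketch under stated assumptions: a trail is taken to be an ordered list of URLs beginning with the search results page, a "step" is one page view after that results page, and the `SEARCH_ENGINES` host list is a made-up stand-in for however the researchers recognized search engine pages.

```python
from urllib.parse import urlparse

# Assumed list of search-engine hosts; the paper's actual list is not given here.
SEARCH_ENGINES = {"www.google.com", "search.yahoo.com", "www.live.com"}

def trail_stats(trail):
    """Return (steps, unique non-search-engine domains) for one trail of URLs."""
    domains = [urlparse(url).netloc for url in trail]
    visited = {d for d in domains if d and d not in SEARCH_ENGINES}
    steps = max(len(trail) - 1, 0)  # the first URL is the results page itself
    return steps, len(visited)

def average_stats(trails):
    """Average steps and unique domains over a non-empty list of trails."""
    stats = [trail_stats(t) for t in trails]
    n = len(stats)
    return sum(s for s, _ in stats) / n, sum(d for _, d in stats) / n

# Hypothetical trails, for illustration only.
trails = [
    [
        "https://www.google.com/search?q=kyoto",
        "https://a.example/kyoto",
        "https://a.example/temples",
        "https://b.example/gion",
    ],
    ["https://www.live.com/results?q=kyoto", "https://b.example/gion"],
]
print(average_stats(trails))  # (average steps, average unique domains)
```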

They also provided some interesting statistics about search queries:

For frequent queries, most popular destinations identified from Web activity logs could be simply stored for future lookup at search time.

However, we have found that over the six-month period covered by our dataset, 56.9% of queries are unique, while 97% queries occur 10 or fewer times, accounting for 19.8% and 66.3% of all searches respectively (these numbers are comparable to those reported in previous studies of search engine query logs [Silverstein et al. 1999, Spink et al. 2002]).
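
To make those two pairs of percentages concrete, here is a small Python sketch of how they might be derived from a query log. It assumes one reading of the quote: that the first figure in each pair is a share of distinct queries and the second is the share of all searches those queries account for. The `query_frequency_summary` function and the toy log are hypothetical.

```python
from collections import Counter

def query_frequency_summary(queries, threshold=10):
    """Summarize how much of a query log is made up of rare queries."""
    counts = Counter(q.strip().lower() for q in queries)
    if not counts:
        raise ValueError("query log is empty")
    total_searches = sum(counts.values())
    distinct = len(counts)
    once = [c for c in counts.values() if c == 1]          # queries seen exactly once
    rare = [c for c in counts.values() if c <= threshold]  # seen `threshold` times or fewer
    return {
        "distinct_once": len(once) / distinct,
        "searches_once": sum(once) / total_searches,
        "distinct_rare": len(rare) / distinct,
        "searches_rare": sum(rare) / total_searches,
    }

# Hypothetical toy log: one frequent query and three rare ones.
log = ["a"] * 12 + ["b"] * 3 + ["c", "d"]
summary = query_frequency_summary(log)
print(summary)
```

In this toy log, three of the four distinct queries are rare, yet they make up only 5 of the 17 searches. The same kind of gap between the two shares explains why simply storing destinations for frequent queries would leave most distinct queries uncovered.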

In addition to studying the query logs from the many Windows Live Toolbar users, the researchers brought a number of people into their labs to conduct studies with them. We are given a number of details about that study in the paper.

Conclusion

The “final destinations” in this study aren’t query refinement suggestions. They are pages that the actual searching and browsing behavior of searchers suggests may be relevant to the searches conducted.

The appendix in the paper shows some of the tasks given to the people tested in the Microsoft labs. What final destination pages do you end up on when looking for answers to questions like these, which are some of the ones listed?

Known-item task descriptions:

  1. Identify three positive achievements of the Hubble telescope since its launch in 1991.
  2. Find three hotels in Paris, France, that include a spa and health club.
  3. Identify three interesting things to do during a weekend in Kyoto, Japan.

Exploratory task descriptions:

  1. You have been talking to a friend about increases in size and diversity of the United States student population. You decide to find out how the student population has actually changed over the past five years.
  2. A colleague has recently been diagnosed with a dust allergy. You are curious about causes of dust allergies and medications that ease the symptoms, so you decide to learn more about them.
  3. You have to plan a five day vacation along the west coast of Italy. You want to find out what are the must-see sightseeing spots along the Italian west coast, and learn about Italian wine and the best vineyards in Tuscany to visit on your trip.

22 thoughts on “Microsoft Tracking Search and Browsing Behavior to Find Authoritative Pages”

  1. Thanks, Bill. Interesting possibilities (not new or earth shattering but…) both for structuring site search returns and content additional/optional reading suggestions by also accounting for the initial SE query that landed the visitor. Should be able to increase time on site and page views.

    I am not so happy about ‘systems offering popular destinations led to more successful and efficient searching’, as that might cement a given return, as becoming popular requires being offered as a return in the first place. Rather Catch-22. Non-SE popularity may be increasingly needed to overcome rigid ‘once an authority always an authority’ query rankings.

  2. You’re welcome.

    I think that this points out well that if you are going to try to optimize for a specific phrase, that you want to spend time making sure that the page optimized, or other pages on the site fulfill the needs of people searching for that phrase.

    Good point on the non-search engine popularity. I think that it was interesting that a number of the final destination pages being found through this process were unranked, and unindexed pages. Be interesting to see how this branch of research continues in the future.

  3. I never thought it was possible to track your browsing behaviour; in my opinion, this is how Microsoft is finding its authoritative websites. So if I search for a long time, from an IP other than mine, and end up at my website, Microsoft will think my website is authoritative.

  4. My brain hurts after reading this :p There was me thinking that all I had to do was optimise pages. I’ve just printed off the MSN paper to try and see what I can get from it.
    Thanks for the heads up

  5. Greetings Bill….superior post as per usual! Thanks for this. Am glad that the big players are looking beyond pagerank. What percentage of web pages are indexed / indexable vs total amount of pages out there? It never ceases to amaze me when I see how many pages are designed without search engines in mind.

  6. Hi Jacques,

    Thank you. It is encouraging to see a process from one of the major search engines that does look beyond PageRank.

    There are probably billions of pages out in the deep web that are purposefully uncrawlable, as well as sites that dynamically create new pages as they are asked for (such as search pages on a blog for visitor defined queries). Definitely more sites should be designed considering search engines.

  7. @ Mr Happy,

    It does look like the usability of pages, and the ability of visitors to find what they want, will play an increasing role in where search engines send people. I don’t think that is a bad change at all.

  8. @ William Slawski

    Google is doing that for authoritative sites. If you search for an authoritative site, you can see a search box within Google which gives you the ability to find something within that site. I heard about it last Saturday on a blog, and it is a new function in Google. I have to admit, I don’t like it.

  9. My own opinion is the information has to be useful, otherwise why would they go to the trouble of collating the data. That said Microsoft have not improved much in terms of clawing back the ground they have lost to Google, if the bid for Yahoo fails the monopolisation will continue for many years to come.

  10. Hi Pete,

    I think that Microsoft, Google, and Yahoo are all coming to some of the same conclusions regarding delivering people to pages that they may want to see when they conduct navigational searches. It does seem to be useful information. :)

  11. Interesting. So maybe Google and other search engines could look at the final destination page and award it a higher relevancy score for the initial search term?

    That would certainly reduce the power of spam pages.

  12. To me the word Teleportation brings some confusion. To me it means that it could bring you to a place where you didn’t intend to go…

    But the Searchbox itself really improves the quality of the SERP.
    From the official Google website one can read exactly what the search engine is all about and why this Searchbox gives better feedback to the users: [... presenting users with a search box as part of the result increases their likelihood of finding the exact page they are looking for].

    In other words, it is a Box that appears within some of the search results themselves. This feature will occur when Google detects a high probability that a user wants more refined search results within a specific site.

  13. I agree 100% with what Internet marketing experts said about this. I am also of the opinion that Teleportation would bring some confusion if Microsoft implemented it.

  14. Hi Internet marketing experts and meta pasban,

    Teleportation might not be a good choice of words considering how it’s been used in the past in places like the random surfer model in PageRank that indicates the small likelihood that someone moving around the web might just randomly go somewhere else at some point.

    But in the Microsoft paper, it’s used as a smaller possibility – the possibility that a searcher might decide to go back to a previously viewed page. In the Google search within a site, it’s used to indicate that the additional search allows visitors to search within a site before actually going into the site’s pages and using the site navigation.

    I don’t think that there will be much confusion over the term “teleportation” as used by either Google or Microsoft; it likely won’t be used much by either.

  15. I never thought it was possible to track your browsing behaviour; in my opinion, this is how Microsoft is finding its authoritative websites. So if I search for a long time, from an IP other than mine, and end up at my website, Microsoft will think my website is authoritative.

  16. @ MSN Hacken

    Well could be. Black hat I guess, but possible. I read some people do that with their Adsense competition too, just click click click, with different IP’s and they’re soon over their budget and out of the picture…

  17. Hi MSN Hacken,

    There might need to be a certain amount of activity from different people or IP addresses before the search engine might determine a page to be “authoritative” for a particular query. It’s likely that for fairly popular queries, there will be a lot of activity from a lot of different people and many different IP addresses.

    Hi vertaalbureau engels,

    There are people who do try to engage in some pretty underhanded activity on the Web. When a search engine comes up with something that is based upon measuring the behavior of users, they do need to keep in mind that there are people who will try to find ways to abuse those processes. I think most of them consider the possibility that there will be abuses, and try to find ways to filter those out.

  18. First, I fear that URL parameters which allow power users to easily refine their queries, e.g. pws=0, gl=US, etc, will disappear. This seems to be already underway as evidenced with Google’s Adwords preview tool. When originally launched, Google documented URL parameters which allowed a power user to simulate their geographic location. But currently there’s no mention of these parameters in the relevant documentation. If you’re not in the know, you need to do your best with the AJAX interface. Worse, Google could decide to deprecate or obsolete the URL parameters. I do hope I’m very wrong on these points….

  19. Hi Maqool,

    Honestly, I rarely use those URL parameters. There are too many possible content filters and reranking processes that can come into play to make them completely useful. I’d rather spend time looking at the analytics for a site, and see what is happening there.
