User Intent and Characteristics of Search Queries

One of the short posters at the recent WWW 2007 Conference in Banff, Alberta, Canada, provides an in-depth look at classifying search queries, based on a sample of more than 5 million queries taken from the transaction logs of three different search engines.

The authors used that data to develop a classification algorithm, which they then applied to a “separate Web search engine transaction log of over a million queries submitted by several hundred thousand users.” The results are interesting.

The article is Determining the User Intent of Web Search Engine Queries, from Bernard J. Jansen and Danielle L. Booth of Pennsylvania State University, and Amanda Spink of the Queensland University of Technology.

Their findings indicated that approximately 80 percent of the queries classified were informational in nature, with the remaining queries being split almost equally between navigational and transactional queries.

As a follow-up, they manually coded 400 more queries to compare against those results, and note that the accuracy of their classification was about 74 percent. They tell us that within the remaining queries, “the user intent is generally vague or multi-faceted, pointing to the need to for probabilistic classification.”

As part of this process, they defined characteristics for the different types of queries: informational, transactional, and navigational. For example, here are a few of the characteristics that they noticed for informational queries:

  • Queries using question words (e.g., “ways to,” “how to,” “what is”)
  • Queries containing informational terms (e.g., list, playlist)
  • Queries where the searcher viewed multiple results pages
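
As a rough illustration, characteristics like these could be checked with a few simple rules. This is only a sketch and not the authors' algorithm: the function name and the two term lists are made up for this example, and the lists contain nothing beyond the handful of examples quoted above.

```python
# Hypothetical term lists, built only from the examples quoted in this post.
QUESTION_PHRASES = ("ways to", "how to", "what is")
INFORMATIONAL_TERMS = {"list", "playlist"}


def informational_signals(query: str, result_pages_viewed: int = 1) -> list[str]:
    """Return the names of the informational characteristics a query matches."""
    # Normalize case and whitespace so phrase matching is predictable.
    text = " ".join(query.lower().split())
    padded = f" {text} "  # pad so phrases only match on whole words
    signals = []
    if any(f" {phrase} " in padded for phrase in QUESTION_PHRASES):
        signals.append("question_words")
    if set(text.split()) & INFORMATIONAL_TERMS:
        signals.append("informational_terms")
    if result_pages_viewed > 1:
        signals.append("multiple_result_pages")
    return signals


print(informational_signals("how to bake bread"))
print(informational_signals("80s playlist", result_pages_viewed=3))
```

A query that matches none of these is not necessarily navigational or transactional; the paper defines separate characteristics for those types, which are not reproduced here.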

The “separate Web search engine transaction log” that they reviewed was from Dogpile, and they point to another longer paper that describes the study of that transaction log, which goes beyond identifying classifications for search queries. The cited paper is:

Jansen, B. J., Spink, A., Blakely, C. and Koshman, S. Forthcoming. Web Searcher Interaction with the Dogpile.com Meta-Search Engine. Journal of the American Society for Information Science and Technology.

They compare the results of this study of Dogpile queries to studies of non-meta search engines. Some interesting statistics from that study are shown in a table within the paper. Here’s a glimpse at some of them:

Session size

1 query – 288,231 – 53.9%
2 queries – 88,875 – 16.6%
3+ queries – 157,401 – 29.4%

Results Pages Viewed Per Query

1 page – 1,052,554 – 69.07%
2 pages – 253,718 – 16.6%
3+ pages – 217,521 – 14.2%
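
The percentages above can be checked against the counts. Here is a quick sketch, assuming the three rows in each breakdown cover the whole log, so that each total is simply the sum of the counts listed (the function name is made up for this example):

```python
def percent_shares(counts: list[int]) -> list[float]:
    """Return each count as a percentage of the sum of all counts."""
    total = sum(counts)
    return [100 * n / total for n in counts]


# Counts copied from the two breakdowns above, in row order.
session_size_counts = [288_231, 88_875, 157_401]
pages_viewed_counts = [1_052_554, 253_718, 217_521]

for name, counts in [("Session size", session_size_counts),
                     ("Results pages viewed", pages_viewed_counts)]:
    print(name, [f"{share:.2f}%" for share in percent_shares(counts)])
```

Under that assumption, the computed shares agree with the listed percentages to within a tenth of a point.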

The rest are worth a close look.

Understanding user intent during a search can be an important part of delivering relevant results to searchers. The percentage of informational search queries in this report is higher than in previous studies I’ve seen on the subject. We aren’t told whether that is because the logs used were from a metasearch engine, but it’s still a result worth considering.

Other papers cited as references in the WWW 2007 document:

  • Baeza-Yates, R., Calderon-Benavides, L. and Gonzalez-Caro, C. 2006. The Intention Behind Web Queries. In Proceedings of String Processing and Information Retrieval (Spire 2006). Glasgow, Scotland, 98-109.

9 thoughts on “User Intent and Characteristics of Search Queries”

  1. Bill –

    Very interesting indeed, and thank you so much for rounding out this subject and for providing some good focused resources to look at.

    Well – at least we know now that there are still some searches that are transactional in nature!! It consoles me to know that people still do look to transact business online. Hooray, we’re still here for a reason.

    I’d like to see a study on how one can determine how far along someone is in a transaction based on the search query. I suppose it comes down to length of the query – right? Like – longer tail = later in the transaction process….?

  2. Hi Jake,

    Thank you. Hopefully some of those informational searches and navigational searches are leading to transactional searches, too.

    I’m not sure that longer searches are always good indications of whether or not a query is transactional. Sometimes people shorten their queries when they use a long one and don’t find relevant results.

  3. Good stuff Bill. Lasting data that still transcends the nature of search queries and the intention behind them. Taking the time for in-depth keyword research can lead to a great content strategy that tailors to both informational queries and commercial queries (tastefully of course).
