Study on the Structure of Search Queries
Would it surprise you to learn that over 40 percent of the terms entered into search boxes at search engines are proper nouns, such as the names of specific people or places or things?
Or that proper nouns and nouns together might make up over 70 percent of query terms?
At least those are a couple of the conclusions from researchers at Yahoo who are trying to find effective ways to better understand the structure of search queries used by searchers.
Their paper, The Linguistic Structure of English Web-Search Queries (pdf), took a close look at queries entered into Yahoo’s search engine in August of 2006, and tried to get an understanding of the way that people phrase what they are looking for when they search.
The researchers behind the study came up with some interesting information about the queries that people use, and the structure of those queries.
For example, when people perform a second search related to a previous search, one of the most common things that they change in their search terms is a number. Someone looking for information about the first Spiderman movie might type into a search box the phrase “spiderman 1.” Following up, they may then type in “spiderman 2.”
The type of word most likely to be reformulated is “number.” Examples included changing a year (“most popular baby names 2007” → “most popular baby names 2008”), while others included model, version and edition numbers (“harry potter 6” → “harry potter 7”) most likely indicating that the user is looking at variants on a theme, or correcting their search need.
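To make the idea of a number reformulation concrete, here is a minimal sketch in Python. It is not the method used in the paper. The function name `number_reformulation` is hypothetical, and it assumes a reformulation counts only when two queries have the same number of words and differ in exactly one purely numeric word.

```python
def number_reformulation(first, second):
    """Return (old, new) when two queries differ only in one numeric word.

    Hypothetical helper for illustration: queries are lowercased and split
    on whitespace, and anything other than a single number swap is ignored.
    """
    a = first.lower().split()
    b = second.lower().split()
    if len(a) != len(b):
        return None
    # Collect the positions where the two queries disagree.
    diffs = [(x, y) for x, y in zip(a, b) if x != y]
    if len(diffs) == 1 and diffs[0][0].isdigit() and diffs[0][1].isdigit():
        return diffs[0]
    return None


print(number_reformulation("most popular baby names 2007",
                           "most popular baby names 2008"))
print(number_reformulation("harry potter 6", "harry potter 7"))
```

A real query log would need more care, such as grouping queries by session and handling numbers attached to words, but the sketch shows how simple the pattern is to spot.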
The study also looked at the use of capitalization in search queries. While many people search for proper nouns at a search engine, only a small percentage of those searches use capital letters for the first letter of names:
However, the great majority of queries contain no informative capitalization, so the great majority of proper nouns in search queries must be uncapitalized. We cannot, therefore, rely on capitalization to identify proper nouns.
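One rough way to see this in your own query data is to count how many queries mix upper and lower case at all. The sketch below is an assumption on my part, not the paper's definition of "informative capitalization": it treats a query as possibly informative only if it contains both uppercase and lowercase letters, so all-lowercase and ALL-CAPS queries are set aside. The function names are hypothetical.

```python
def has_mixed_case(query):
    """True when a query contains both uppercase and lowercase letters."""
    return any(c.isupper() for c in query) and any(c.islower() for c in query)


def share_with_mixed_case(queries):
    """Fraction of queries whose capitalization might carry information."""
    if not queries:
        return 0.0
    return sum(has_mixed_case(q) for q in queries) / len(queries)


sample = [
    "new york seafood",
    "New York seafood restaurants",
    "HARRY POTTER 7",
    "spiderman 2",
]
print(share_with_mixed_case(sample))
```

In this small made-up sample, only the second query would count, even though all four contain proper nouns.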
The paper provides some insight into how tagging different words in a query as a proper noun, or noun, or verb, or number, or conjunction (or other parts of speech) might help people working at search engines to understand how people search, and how people reformulate their searches.
When you perform a search at a search engine, how do you formulate your query?
- Do you ask questions? (where are seafood restaurants in new york)
- Do you include punctuation in your query? (where are seafood restaurants in new york?)
- Do you type in a string of words in any order? (seafood new york)
- Do you use grammatically correct phrases? (new york seafood restaurants)
- Do you capitalize proper nouns? (New York seafood restaurants)
- Do you include adjectives with your search terms? (best new york seafood restaurants)
- Do you place conjunctions in your searches? (new york and seafood restaurants)
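The questions in the list above can be turned into a few simple checks on a query. This is only a sketch under stated assumptions: the word lists `QUESTION_WORDS` and `CONJUNCTIONS` are small, hand-picked examples, the function name `describe_query` is hypothetical, and none of it does real part-of-speech tagging of the kind the paper describes.

```python
# Small, hand-picked word lists (assumptions for illustration only).
QUESTION_WORDS = {"who", "what", "where", "when", "why", "how"}
CONJUNCTIONS = {"and", "or", "but"}


def describe_query(query):
    """Report a few surface features of a search query."""
    tokens = query.split()
    # Lowercase each word and drop punctuation stuck to its edges.
    lowered = [t.strip("?!.,;:").lower() for t in tokens]
    return {
        "question": bool(lowered) and lowered[0] in QUESTION_WORDS,
        "punctuation": any(ch in "?!.,;:" for ch in query),
        "capitalized": any(t[0].isupper() for t in tokens),
        "conjunction": any(w in CONJUNCTIONS for w in lowered),
    }


print(describe_query("where are seafood restaurants in new york?"))
print(describe_query("New York and seafood restaurants"))
```

Running checks like these over a set of queries gives a quick count of how many are phrased as questions, how many carry capitals, and so on.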
If you own a website, you probably want to understand how people use it, and how you might improve it so that people are more likely to return and take advantage of what you have to offer. It’s not surprising that search engines would want to gain more insight into how people search, and how they can provide better results based upon the way that people perform searches.
This paper is interesting in that it provides us with a window into one aspect of how a search engine might be trying to better understand its audience.
While this study might provide people who do keyword research and search engine optimization with some questions to ponder about the keyword research that they do, it should also provoke people who provide information or services or goods on their websites to question how they might work to better understand their audiences.
If you own a website with a site search, do you look at what people are searching for, and how they are searching, to try to understand better what they might be trying to find on your site?
Do you take a look at the query terms that people use to find your site, as shown in referrals from search engines in your log files or analytics programs? Do you question those terms, and consider whether the people finding your site from the major search engines are getting their informational or transactional needs addressed when they arrive?
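If your log files record referrer URLs that still carry the search terms, pulling those terms out can be sketched with Python's standard library. This assumes the query sits in a `q` parameter (as on Google and Bing) or a `p` parameter (as on Yahoo); the function names and sample URLs are hypothetical, and many referrers today will not include the query at all.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Parameter names assumed to hold the search terms.
QUERY_PARAMS = ("q", "p")


def search_terms(referrer):
    """Return the lowercased search terms from a referrer URL, or None."""
    params = parse_qs(urlparse(referrer).query)
    for name in QUERY_PARAMS:
        if name in params:
            return params[name][0].strip().lower()
    return None


def top_queries(referrers, n=10):
    """Count the most common search terms across a list of referrers."""
    terms = (search_terms(r) for r in referrers)
    return Counter(t for t in terms if t).most_common(n)


referrers = [
    "https://search.yahoo.com/search?p=new+york+seafood",
    "https://www.google.com/search?q=New%20York%20seafood",
    "https://www.google.com/search?q=harry+potter+7",
    "https://example.com/some-page",
]
print(top_queries(referrers))
```

The counts are only a starting point. The more useful step is reading the terms themselves and asking whether the pages people land on answer them.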
If you receive emails or phone calls about what is offered on your site, do you pay attention to the questions and queries that people might have, and see if your site might (or could) answer some of those questions in a helpful way?
What do people ask for from your site, and how do they do it?