When you search at Google, Yahoo, or Bing, you’ll see a set of search results, each of which includes a page title, a summary or search snippet of the page, and a URL indicating the page’s address.
Often, that combination of title, snippet, and URL will decide whether or not someone clicks through search results to a page.
The search snippet performs a couple of functions – it gives you a summary of what the page is about, and it shows you the context within which your query terms might appear on a page.
Sometimes a search engine will show you the Meta Description that the publisher of the page has come up with for a page, especially if the Meta Description contains the words found in the query.
Sometimes a search engine will show you a description that isn’t even found on the page. That can happen when it decides that the page is relevant for a query, but that the description for the page at the Yahoo Directory or DMOZ makes a better search snippet than the meta description or any of the content found on the page.
It’s also quite common for a search engine to use content found on a page as the snippet for that page. Chances are that search engines show text from a page’s content as a search snippet more often than they show either the meta description or a description from an alternative source such as the Yahoo Directory or DMOZ.
If a snippet shown to a searcher isn’t very informative, searchers may click on pages in search results that don’t contain the information they are looking for, or they may not click on pages that may be helpful. Poorly chosen search snippets can lead to bad searching experiences.
A recent Yahoo patent application describes how they might make that experience better when deciding which search snippet to show searchers, looking at the quality of choices they have for a snippet within the text on a page, how relevant those choices might be, and how well that text might match up to the intent behind the search.
Ranking Lines of Text on a Page
Imagine that a search engine considered each line of text that appears upon a page as a possible snippet to show in search results.
Would the search engine rank each of those lines based upon some quality signal that doesn’t depend upon the query used to find the page? Or would it rank those lines based upon how relevant they might be to the intent behind the query that the page was found for? Or would it combine both a quality ranking and a relevance ranking to decide what to show to searchers?
The Yahoo patent filing tells us that it could look at the following for each line on a page to come up with a score for each line to use as a snippet:
- A query-independent relevance for each line of text – a degree to which the line of text of the document summarizes the document.
- A query-dependent relevance of each of the lines of text – a relevance of the line of text to the query.
- The intent behind a query.
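Here is a minimal sketch of how those three kinds of signals might fit together, assuming each line already has a query-independent score and a query-dependent score between 0 and 1. The weights, the two intent labels, and the function names are illustrative assumptions, not values or names from the patent filing.

```python
# Hypothetical sketch: blend a query-independent score and a query-dependent
# score for each line, with the blend chosen by the intent of the query.
# The weights and intent labels are illustrative assumptions only.

INTENT_WEIGHTS = {
    # intent: (weight for query-independent score, weight for query-dependent score)
    "navigational": (0.7, 0.3),
    "informational": (0.3, 0.7),
}


def combined_score(query_independent, query_dependent, intent):
    """Return a weighted blend of the two relevance scores for one line."""
    w_independent, w_dependent = INTENT_WEIGHTS[intent]
    return w_independent * query_independent + w_dependent * query_dependent


def rank_lines(scored_lines, intent):
    """Sort (line, query_independent, query_dependent) tuples, best first."""
    return sorted(
        scored_lines,
        key=lambda item: combined_score(item[1], item[2], intent),
        reverse=True,
    )
```

With weights like these, a line that summarizes the page well would tend to win for a navigational query, while a line that matches the query terms closely would tend to win for an informational one. A search engine could just as easily learn such weights from data rather than set them by hand.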
The patent application is:
System and Method for Automatically Ranking Lines of Text
Invented by Tapas Kanungo and Donald Metzler
Assigned to Yahoo
US Patent Application 20090292683
Published November 26, 2009
Filed: May 20, 2008
Disclosed are apparatus and methods for ranking lines of text. In one embodiment, an intent of a query is ascertained. The relevance of each one of the plurality of lines of text of a document is determined based upon the intent of the query, content of the query, and content of each of the plurality of lines of text. The plurality of lines of text may then be ranked according to the determined relevance of each of the plurality of lines of text.
We’re given some examples of features that might be included in each of these three different kinds of signals.
Query Independent Relevance
A web page might be broken up into lines of text, and query-independent features might be identified for each line. Since these features are independent of the query used in a search, they focus upon quality rather than potential relevance to any specific query that a searcher might submit.
Some examples of query-independent features:
- How common one or more words in the line are (a frequency with which various words are typically used, possibly excluding words such as “the,” “and,” and “or.”)
- The number of names in the corresponding line. For instance, the existence of one or more names may indicate greater relevance of the line to the page.
- A position of a line within the page, which may indicate the importance and therefore the relevance of the line to the document, such as the beginning, middle, or end of a document, the first, middle, or last line of a paragraph, the first line of a document, etc.
- The number of words in the line of text, and/or;
- The number of common words (e.g., a, an, the) in the line of text.
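Several of the features above could be computed with very little code. The sketch below is one possible approach, under some loud assumptions: the stopword list is a tiny illustrative sample, and counting capitalized words after the first word is only a crude stand-in for real name detection. The function and key names are hypothetical.

```python
import re

# Small illustrative list of common words; a real system would use a larger one.
COMMON_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in"}


def query_independent_features(lines, index):
    """Compute simple query-independent features for lines[index]."""
    line = lines[index]
    words = re.findall(r"[A-Za-z']+", line)
    lowered = [w.lower() for w in words]
    # Crude stand-in for name detection: capitalized words after the first word.
    name_like = [w for w in words[1:] if w[0].isupper()]
    return {
        "word_count": len(words),
        "common_word_count": sum(1 for w in lowered if w in COMMON_WORDS),
        "name_like_count": len(name_like),
        "is_first_line": index == 0,
        # 0.0 for the first line of the page, 1.0 for the last line.
        "relative_position": index / (len(lines) - 1) if len(lines) > 1 else 0.0,
    }
```

None of these features look at the query, so they could be computed once when a page is indexed and reused for every search that returns the page.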
Query Dependent Relevance
A web page might be broken up into lines of text, and query-dependent features might be identified for each line. Since these depend upon the query used by a searcher, the features focus upon indications of how relevant a possible line might be to a specific query.
Some examples of query-dependent features:
- A percentage of the query terms that are found in each line.
- A number of times a particular query term is found in the line, and/or;
- Whether the query is a substring of the line of text.
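The three query-dependent features listed above might be sketched as follows. The tokenizer here is a simplifying assumption (lowercased runs of letters, digits, and apostrophes), and the function and key names are hypothetical rather than taken from the patent filing.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def query_dependent_features(query, line):
    """Compute simple query-dependent features for one line of text."""
    line_terms = tokenize(line)
    unique_terms = set(tokenize(query))
    found = {term for term in unique_terms if term in line_terms}
    return {
        # Share of distinct query terms that appear somewhere in the line.
        "fraction_of_query_terms_found": (
            len(found) / len(unique_terms) if unique_terms else 0.0
        ),
        # How many times each query term occurs in the line.
        "term_counts": {term: line_terms.count(term) for term in sorted(unique_terms)},
        # Whether the whole query appears verbatim (ignoring case) in the line.
        "query_is_substring": bool(query.strip()) and query.lower() in line.lower(),
    }
```

For a query like "blue widget" and a line that contains that exact phrase, all three features would point toward the line being a strong snippet candidate.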
The Intent of a Query
Features and patterns that might indicate the intent of a query may be identified for each line on a page.
Some examples of features used to identify the intent of a query:
- Whether the query includes one or more names (organizational or product) – which may indicate that the query is navigational rather than informational. If the query contains the name of a business or product, it might be more navigational in nature.
- Click characteristics associated with a query – which may indicate how often someone might click on a page corresponding to the name provided in the query when the query is submitted, and/or;
- The number of words in the query.
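As a rough illustration of how features like these might feed a guess about intent, here is a small sketch. Everything in it is an assumption for the sake of the example: the set of known names, the click data (the share of clicks for a query that go to a single result), the thresholds, and the function names are all hypothetical and do not come from the patent filing.

```python
# Hypothetical list of business or product names a search engine might know.
KNOWN_NAMES = {"acme", "adobe"}


def query_intent_features(query, click_share_by_query):
    """Collect simple intent features for a query.

    click_share_by_query is a hypothetical mapping from a lowercased query to
    the share of clicks (0 to 1) that go to one result for that query.
    """
    words = query.lower().split()
    return {
        "contains_name": any(word in KNOWN_NAMES for word in words),
        "top_result_click_share": click_share_by_query.get(query.lower(), 0.0),
        "word_count": len(words),
    }


def guess_intent(features, click_threshold=0.6, max_navigational_words=3):
    """Label a query navigational or informational using illustrative rules."""
    short_named_query = (
        features["contains_name"]
        and features["word_count"] <= max_navigational_words
    )
    clicks_concentrated = features["top_result_click_share"] >= click_threshold
    if short_named_query or clicks_concentrated:
        return "navigational"
    return "informational"
```

A hand-written rule like this is only meant to show the shape of the idea. A search engine would be more likely to learn how to weigh such features from its own query and click logs.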
These are only some of the features that might be used, and a search engine might look for more than the examples included in the patent filing. In July, I wrote about another Yahoo patent filing, which shares an inventor with this one and includes some other features that a search engine might use to determine the quality of a snippet. The post is at: Search Engines Evaluating Snippets in SERPS.
I’ve written some other posts on snippets here as well. Here are some of them:
- How does Google Pick Snippets for Your Pages to Show in Search Results?
- The Influence of Search Result Listings (Captions) on Clickthroughs
- Search Result Snippets and the Perception of Search Quality
Search Snippets Conclusion
Snippets do play an important role in whether or not someone will click through a link that they see for a page in search results, and it can be worth spending time and effort on the meta description that may sometimes be shown to searchers.
It’s also not a bad idea to check the analytics for a site to see which query terms are being used to find pages, and to check what a search engine might have chosen to show as a snippet for that page, for those queries. Editing content on a page in response to a search engine’s decision of what to show as a snippet may result in a better snippet shown for that page for that query.
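If you collect the snippets you see for your pages by hand, a few lines of code can help sort them by where they seem to come from. This is a minimal sketch, assuming you already have the snippet text, the page’s meta description, and the page’s visible text as plain strings. The function names and labels are hypothetical, and the matching is deliberately simple.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace so comparisons ignore formatting."""
    return re.sub(r"\s+", " ", text).strip().lower()


def snippet_source(snippet, meta_description, page_text):
    """Guess whether a snippet came from the meta description or page content."""
    # Snippets are often truncated or stitched together with ellipses,
    # so compare each fragment separately.
    fragments = [normalize(part) for part in re.split(r"\.\.\.|…", snippet)]
    fragments = [part for part in fragments if part]
    if not fragments:
        return "unknown"
    meta = normalize(meta_description)
    page = normalize(page_text)
    if all(part in meta for part in fragments):
        return "meta description"
    if all(part in page for part in fragments):
        return "page content"
    return "other source"
```

Running this for each query that brings visitors to a page gives a quick picture of which queries trigger your meta description and which ones pull text from the page itself.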
Not much has been known about how a search engine decides what content to show as a snippet when it decides to show content from a page rather than from a meta description, except that the content will usually contain as many of the query terms used as possible, to show searchers how those words are used within the context of a page.
There might be other features, based upon quality, relevance, and intent, that search engines may use, such as how readable a snippet might be, how close to each other query terms are in sentences that appear on a page, how “spammy” a snippet might appear, and more. If you read some of my earlier posts on snippets, you’ll find many of those signals covered there.
It’s also possible, and not unusual, to see snippets from search engines that show more than one line from a page, so that the use of those query terms might be displayed to searchers.
If you’re a site owner, how often do you look at search results that pages from your site appear in, to see what the snippets are like when your pages are shown?
Those snippets may determine whether or not someone visits your site.
28 thoughts on “How a Search Engine May Choose Search Snippets”
Thanks Bill. I have often wondered how search engines like Google and others determine where to pull the snippet from for their search results.
Nice. This really explains a lot about why snippets are important. Site owners really need to read this if they want to have more visits to their sites.
Hi People Finder.
Between this Yahoo patent, and my earlier post about Google Snippets linked to above, we have some ideas of what they may be looking for. But I still find myself surprised sometimes at some of the snippets I see.
great post bill,
lots of content in there that’s very useful. i would say that tweaking your site and testing different snippets is definitely a good way for determining which keywords are working well and making users click on your link. as with seo it’s a question of experimenting with different methods and techniques and seeing what works best.
As always an excellent post and great in depth information!
I included some of this information and a link back to this post on a blog post that I just wrote about Yahoo and Google snippets:
Great post that’s gonna take a while to digest. In the early days it was easy to control snippets for the most part (the very early days), but it’s a different game now. They’re still just as important, though. With the Google Personalization announcement, getting them to click becomes even more important, imho.
Great tip re looking at analytics data for keywords & seeing what snippets are actually shown. It won’t always be what we expect or what we designed to happen, and that snippet is so important in getting the click.
Thanks. It is worth spending the time looking, and like you say, it isn’t always what you might expect, but it definitely is important.
Actually, I have a very similar conception of how snippets are chosen. As far as I remember, Matt Cutts talked about it in one of his videos.
The more one analyzes what guides the search engines in crawling and ranking pages, the less we know about what rules they actually follow.
It is interesting that you repeatedly use the word “may” rather than “do” throughout your piece. This makes it more objective than those pieces where “gurus” make authoritative statements about things that most people know little about.
Great analysis. I noticed that they sometimes pull from DMOZ, which is interesting. It makes you want to make sure that your DMOZ listings are accurate, keyword rich, and as compelling as they are allowed to be.
One thing I like to do for my blog is to use Google Alerts to search for my website name. When I post something, those alerts are emailed to me, and in those emails are not just the title of my post, but the snippet that Google has chosen to use…my description or theirs. It’s kind of like a reminder system to see if I need to fix my description.
Thank you. There’s more to showing up in search results than just ranking well. You also hope that people will click your link, and paying attention to what shows up as a snippet can help.
There’s definitely no question about the value of testing and experimenting, and paying attention to how search engines might be treating your site. Thanks.
Thank you – I liked your post.
The technology described in the patent doesn’t seem that advanced on its face, but I like patent filings that take you step-by-step through a process, and make you think about something that a search engine might be doing in a way that you might not have before. This one did that for me. It was nice to see that there are probably more elements to choosing a snippet by a search engine than just whether or not keywords appear within the snippet selected. Especially the part about trying to match the intent behind the query – whether informational or navigational.
Good point regarding personalization. I’ve been wondering how personalization might play a role in snippet selection. Will things like customizations based upon past searches and browsing history, that might influence the rankings of pages also influence which snippets are shown? The information collected for personalization might provide some help in determining searcher intent, and if searcher intent plays a role in deciding which snippets are presented, then it could.
I seem to remember that video, but I’m going to have to hunt it down and watch again. Thanks.
Thanks. I try to be careful when it comes to patents, and descriptions of what they contain. While they are from the search engines directly, that’s no guarantee that a search engine might be doing what they describe in their patent filings. I do think that patents often raise more questions than they might answer – but often those questions are worth asking, and they provide ideas to test and experiment with that we otherwise might not have come across.
Same here. I’ve been seeing Google pull descriptions from DMOZ for a number of years now, like Yahoo pulling descriptions from the Yahoo Directory.
Unfortunately, in the case of DMOZ, I don’t think it was ever intended to help a search engine like Google supplement what might be shown in search results.
Thank you – that’s a useful approach. But, if your pages rank well for other queries, the search engines might show other snippets – that’s where things get interesting. 🙂
While sometimes I don’t understand why Google uses the snippets it does, more often than not I can’t complain. It seems like if they have nothing else they just use keywords, unless you have individual meta tags on every page. Even though I don’t like having mid sentence snippets that don’t really make sense, that is where the keywords are. So I guess without certain snippets we wouldn’t even rank well for that search.
There are instances where a page ranks well for a specific keyword phrase, and the words from that phrase don’t even appear on the page. For example, the Adobe Reader download page often shows up at the tops of search results for the phrase “click here” in Google, and yet those words don’t appear on that page.
Another example is when you perform something like a “site” search for a page, such as “site:www.example.com”
In instances like that, it’s not uncommon for a search engine to try to choose a snippet that describes the content of the page rather than looking for content on the page that might match the query.