What Are Anchor Text Snippets?
Somewhere out there is a universe that looks exactly like this one, and appears to run exactly like this one. Except something’s a little different. A little off. It’s as if search engines took a left turn instead of a right turn, back in the early 2000s. Instead of using only meta descriptions and possibly body text from web pages as descriptive text, or snippets, for those pages in search results, they learned a new trick. Imagine that the content surrounding anchor text in a link to a page was collected and evaluated based upon a quality score, and that this associated and usually descriptive text was used to generate snippets instead.
My thought on the possibility is that often anchor text doesn’t do the best job of describing a page, and often links to a page are from a third party who might not have the same interest in writing text that might make a good snippet for a page. But Google filed an Anchor Text Snippets patent for such an approach back in 2003. And it was granted this week, so they pursued what was described within the patent for over a decade. The Anchor Text Snippets patent does mention that headings on pages might also be used as potential snippets for pages, and provides the following example: “Computers > Algorithms > Compression”. But that’s a small part of the Anchor Text Snippets patent. They don’t limit it to anchor text that a site might provide itself, like in breadcrumb trail navigation for a page.
There’s also a part of this approach that recognizes that many pages have more than one link to them, so a choice would need to be made as to the best “snippet” to show.
The anchor text and text surrounding it is called a “web quote” in this version of the Anchor Text Snippets patent. The patent refers to an earlier version of the patent (a provisional version) that doesn’t use the term “Web Quote” to refer to the text associated with anchor text. In that earlier version, U.S. Provisional Application No. 60/363,559, filed Mar. 13, 2002, there’s an alternative description of that kind of text:
In one technique for improving the quality of a document index, additional terms found near hyperlinks in documents are used to enhance the description of the linked document. The premise of this technique is that web authors tend to described or comment about the content of other web pages in the descriptive text located near the link to the other page. This descriptive text may be used to enhance the quality of the index.
That earlier version of the Anchor Text Snippets patent tells us that using such text might help in the creation of a more comprehensive document index. It also tells us that this kind of associated text often accurately summarized the linked web page being pointed towards.
The older version of the Anchor Text Snippets patent doesn’t use the term “web quote”, but the idea that text near a link to a page could be very useful in creating a description of that page is the same in both.
Meta Descriptions as Snippets
During audits for websites, one recommendation that I frequently make is for clients to review and rewrite meta descriptions for pages. If they are well written, contain the query terms used to find a page, and are good fits for the page, a search engine might use them as a snippet to describe that page. If they are engaging and persuasive, they might influence people to click through from search results to the pages they describe. I’ve written a few posts in the past year about when Google might also decide to use content from pages as snippets:
- Why Google May Change Search Result Snippets
- How Google Might Generate Snippets for Search Results
- How Classification of Page Elements and Search Results May Influence Alternative Titles and Snippets Displayed in Google
None of those even begin to hint at the possible use of anchor text and text associated with or surrounding it, as possible snippets for those pages.
Web Quotes as Snippets
In some cases, neither meta descriptions nor content from the pages provide the best choices as snippets for a page in Google’s search results. Would Google instead use anchor text (and text that might be associated with that anchor text) pointed to the page from another page as a description of a document in a snippet? In 2002 – 2003, when this patent was being developed, it sounded like an idea worth exploring. If there’s any chance that the use of anchor text might be a good choice, that would explain why Google wouldn’t just abandon the idea, and abandon the patent. Then again, Google’s patent for the Google Directory was granted almost two years after Google retired the Google Directory.
One aspect of this Anchor Text Snippets patent that I find interesting is collecting “Web Quotes” as possible snippets based upon both anchor text and text surrounding it that might be within the same paragraph, and that follows some other rules indicating it might be a good choice for a snippet. There might be multiple choices of “Web Quotes” to use as a snippet for a page, since a page can have a lot of links pointing to it from across the Web. These Web Quotes might be ranked based upon a “quality metric associated with the web quotes of the examined documents.”
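To make the idea of collecting Web Quotes a little more concrete, here is a minimal sketch in Python of how the paragraph around a link might be gathered. It is not taken from the patent: the class name `WebQuoteCollector`, the function `collect_web_quotes`, and the rule "use the whole enclosing paragraph" are my own assumptions for illustration, and it relies only on Python’s standard `html.parser` module.

```python
from html.parser import HTMLParser


class WebQuoteCollector(HTMLParser):
    """Hypothetical collector: pairs anchor text with its enclosing paragraph."""

    def __init__(self, target_url):
        super().__init__()
        self.target_url = target_url
        self.quotes = []                 # finished candidate web quotes
        self._in_paragraph = False
        self._in_target_link = False
        self._paragraph_parts = []       # all text seen in the current <p>
        self._anchor_parts = []          # text inside the current target link
        self._anchors_in_paragraph = []  # anchor texts found in the current <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self._paragraph_parts = []
            self._anchors_in_paragraph = []
        elif tag == "a" and self._in_paragraph:
            # Assumption: only exact href matches count as links to the page.
            if dict(attrs).get("href") == self.target_url:
                self._in_target_link = True
                self._anchor_parts = []

    def handle_data(self, data):
        if self._in_paragraph:
            self._paragraph_parts.append(data)
        if self._in_target_link:
            self._anchor_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_target_link:
            self._in_target_link = False
            self._anchors_in_paragraph.append("".join(self._anchor_parts).strip())
        elif tag == "p" and self._in_paragraph:
            self._in_paragraph = False
            # Collapse runs of whitespace so the quote reads as one line.
            text = " ".join("".join(self._paragraph_parts).split())
            for anchor in self._anchors_in_paragraph:
                self.quotes.append({"anchor": anchor, "quote": text})


def collect_web_quotes(html, target_url):
    """Return candidate web quotes for target_url found in one linking page."""
    collector = WebQuoteCollector(target_url)
    collector.feed(html)
    collector.close()
    return collector.quotes
```

Running this over every page that links to a document would produce one candidate per link, which is where the need to choose among many possible Web Quotes comes from.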
The Anchor Text Snippets patent lists as inventors several search engineers who have been with Google since some of its earliest days, and a number of them are still with Google. (I think there’s a typo in the patent, and the last name listed should be Georges R. Harik, who was instrumental in launching Google AdSense and AdWords, and worked on projects such as “Gmail, Google Talk, Google Video, Picasa, Orkut, Google Groups and Google Mobile.”) Here’s the patent:
Using text surrounding hypertext links when indexing and generating page summaries
Invented by Jeffrey A. Dean, Martin Farach-Colton, Sanjay Ghemawat, Benedict Gomes, Georges R. Hank
Assigned to Google Inc.
US Patent 8,495,483
Granted July 23, 2013
Filed: March 12, 2003
Abstract
Web quotes are gathered from web pages that link to a web page of interest. The web quote may include text from the paragraphs that contain the hypertext links to the page of interest as well as text from other portions of the linked web page, such as text from a nearby header. The obtained web quotes may be ranked based on quality or relevance and may then be incorporated into a search engine’s document index or summary information returned to users in response to a search query.
What to look at in the Anchor Text Snippets patent
I don’t think that Google will decide to start using anchor text and text associated with it as snippets for search results in the future. The three links I listed above all describe some of the factors that Google might look for within the content of a page to decide what to show as a snippet for that page.
But if you enjoy alternative histories and science fiction like Philip K. Dick’s The Man in the High Castle, which describes an America in the 1960s that has been occupied by Germany and Japan after they won World War Two, you might get that feeling of being in a left-handed universe while reading through the patent. There are some descriptions of how a search engine worked back then, and how this approach to snippets might improve upon the experience.
Could Web Quotes ever be used by Google to provide an alternative snippet for pages, based upon links to the page instead of content that appears on the page itself?
I’m going to experiment a little with the idea, and try to create hypertext links to pages about things like Google using alternative titles and snippets in some cases, from sources that might be like this paragraph on a page that links to other pages on my site.
The patent does tell us that it might filter Web Quotes using a quote generator that might look for certain features in those Web Quotes. Here are the features listed in the patent:
- The web quote’s length
- Punctuation
- Use of verbs
- Positions of verbs
- Use of adjectives
- Etc.
- How similar or different a Web Quote on one page pointed to a particular page might be compared to Web Quotes on other pages pointed to the same page
- How similar a Web Quote might be to other Web Quotes with links pointed to multiple Pages
- The PageRank of the page upon which the Web Quote appears
Does that last one mean that a Web Quote with a link pointed to a page is more likely to be a snippet for that page if it’s on the front page of the New York Times than if it were on the front page of my local paper’s website (The Fauquier Times)? Maybe. The patent does tell us that how relevant a Web Quote is to the search terms used is a stronger consideration than a quality value like PageRank.
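Here is a rough sketch of how that kind of ranking could work, with relevance to the query weighted more heavily than a quality value. The patent doesn’t give a formula, so everything here is an assumption of mine for illustration: the names `WebQuote`, `quality_score`, `relevance_score` and `rank_web_quotes`, the 0 to 1 PageRank-like value, the length limits, and the 0.8 relevance weight.

```python
from typing import NamedTuple


class WebQuote(NamedTuple):
    text: str
    source_pagerank: float  # hypothetical 0..1 quality value of the linking page


def quality_score(quote, min_words=5, max_words=40):
    """Hypothetical quality value: source PageRank, halved for odd lengths."""
    word_count = len(quote.text.split())
    length_ok = min_words <= word_count <= max_words
    return quote.source_pagerank * (1.0 if length_ok else 0.5)


def relevance_score(quote, query):
    """Fraction of the query terms that appear somewhere in the quote."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    words = {w.strip(".,;:!?\"'()") for w in quote.text.lower().split()}
    return len(terms & words) / len(terms)


def rank_web_quotes(quotes, query, relevance_weight=0.8):
    """Sort quotes best-first, letting relevance outweigh the quality value."""
    def combined(quote):
        return (relevance_weight * relevance_score(quote, query)
                + (1.0 - relevance_weight) * quality_score(quote))
    return sorted(quotes, key=combined, reverse=True)


candidates = [
    WebQuote("Read the latest news and opinion from our reporters today", 0.9),
    WebQuote("A clear guide to data compression algorithms for beginners", 0.2),
]
best = rank_web_quotes(candidates, "compression algorithms")[0]
```

With these made-up numbers, the quote from the lower-PageRank page comes out on top for the query “compression algorithms” because it actually contains the query terms, while the quote from the high-PageRank page does not.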
I will be keeping an eye out for the use of web quotes – anchor text plus text associated with that anchor text – as snippets in search results. If you see any being used by Google, please let me know. 🙂
Easy to see how an approach like that could help Google better understand a document and what it is about.
Since they are comparing web quotes from different pages to one another, it seems reasonable that they might use web quotes with the Key Word In Context (KWIC) algorithm that generates the description as well. For example, today different queries produce different snippets in the search results for a given URL and using what is described in the patent – different web quotes could help inform Google about what is the best description to show for a given query instead of having the same description for every query a given page might be relevant for.
I think one place you could try an isolated test with this would be to create a page that is blocked by robots.txt. Link to the page using link text like “click here” but surround the link itself with nouns as an example. Since Google can’t crawl the page, they fall back to third party signals – but since you’re not giving any useful signals in the anchor text – it’d be interesting to see if Google might use the surrounding text, such as the nouns, as the title or description of the document. I doubt you’ll see it in the description, from what I’ve noticed – Google appear to be quite consistent with how they display the description in the SERPs for a page that is blocked by robots.txt but it might flow through to the title.
Surprised this was not already patented actually.
Hi Bill,
Great post. I have been waiting for someone to come up with this analogy of snippets for a while. I know there is more info underneath the covers…but a good start.
Cheers
Virginia
Bill,
This is a brilliant start to uncovering the future of anchor text variation/snippets. Thanks a lot for taking the time to follow google patents.
Gregory Smith
Hi Bill,
Nice as always. As we know, Google is already using different methods to gauge and represent the value of web pages, like the meta description tag, body text, the Open Directory Project, and even different titles for brand-specific search terms. So I strongly feel that this one will be taken into account too, and would be an important factor in link building.
I was sure I saw a case like this quite a while back. I Googled “Tiggerito snippet” and it came up third. Now I know why I have a unique handle 🙂
link no longer available
So in 2010 I believe they were doing this.
Tiggerito, I like your case study. It really drives the point home. Bill, thanks for staying on top of the patents. While what they are doing is nothing new, we always find new “snippets” of information (pun intended) in these patent filings. Thanks.
Thanks, Nick
You’re welcome. Sometimes we learn more from some patents than from others, but I’ve been doing this for a long time, and I usually always learn something new, even if it’s a different perspective on things that we might be taking for granted sometimes. 🙂
Great snippet conversation, Bill your post nails this topic. I think we as marketers could take away that different types of writings and information can be viewed differently based on the intent of the publisher. Not every publisher uses quotes, or anchor text in links in quotes, using such anchor text in quotes could be construed as citing a source of the quote, thus in certain instances have additional “value”.