Systems and methods consistent with the principles of the invention may provide a reasonable surfer model that indicates that when a surfer accesses a document with a set of links, the surfer will follow some of the links with higher probability than others. This reasonable surfer model reflects the fact that not all of the links associated with a document are equally likely to be followed. Examples of unlikely followed links may include “Terms of Service” links, banner advertisements, and links unrelated to the document.
PageRank under the Random Surfer Model
Google’s PageRank algorithm is based on what its inventor called the Random Surfer Model, which ranks pages on the Web by the probability that a person following links at random would end up on a particular page:
The rank of a page can be interpreted as the probability that a surfer will be at the page after following a large number of forward links. The constant α in the formula is interpreted as the probability that the web surfer will jump randomly to any web page instead of following a forward link.
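To make the random surfer idea concrete, here is a minimal sketch of it in Python. It is only an illustration, not Google's implementation: the function name `pagerank`, the default `alpha=0.15`, the fixed iteration count, and the three-page example site are all assumptions made for this example. Here `alpha` plays the role of the constant α from the quote above, the probability of a random jump, and every link on a page is treated as equally likely to be followed.

```python
def pagerank(links, alpha=0.15, iterations=50):
    """Random surfer sketch. links maps each page to the list of pages it links to.

    alpha is the assumed probability that the surfer jumps to a random page
    instead of following a link.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives an equal share of the random jumps.
        new_rank = {page: alpha / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Random surfer: each link on the page is equally likely to be followed.
                share = (1 - alpha) * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank evenly over all pages.
                share = (1 - alpha) * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical three-page site: both inner pages link back to the home page.
example_links = {"home": ["about", "blog"], "about": ["home"], "blog": ["home"]}
for page, score in pagerank(example_links).items():
    print(page, round(score, 3))
```

In this toy graph the home page ends up with the highest score, because both other pages link to it, and the scores of all pages add up to 1.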
The Reasonable Surfer Model Replaced the Random Surfer Model
The Reasonable Surfer Model is an update to the original Random Surfer Model at Google. It estimates how likely a person is to click upon a specific link, based upon features associated with that link. The amount of PageRank a link might pass along is based upon that probability.
Those link features can include a wide range of factors, such as the color, size, and style of the fonts, the anchor text used in the links, and a number of others. The Reasonable Surfer Model tells us that the average visitor to a web page does not click on links at random but is more likely to click upon certain links, and the model reflects that probability based upon the features related to each link.
I wrote about the Reasonable Surfer model and the many features involved in a post from 2010, which I titled Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data.
Patents do sometimes get updated by the people who originally file them. These updates often take the shape of changes to the claims within the patents. That has happened to the Reasonable Surfer model.
These changes may reflect a change in the way that the processes described within the patent operate.
It’s the claims section that changes when one of these continuation patents is filed, because examiners at the patent office look at the claims and compare them to the claims of other patents, to make sure that the new claims don’t copy other granted patents and couldn’t be said to infringe them.
A continuation patent is called that because it “continues” the protection given by the original version of the patent and is given a date of coverage that begins with the original filing date of the original version of the patent.
The continuation Reasonable Surfer model patent is:
Ranking documents based on user behavior and/or feature data
Inventors: Jeffrey A. Dean, Corin Anderson, and Alexis Battle
Assigned to: Google
US Patent 9,305,099
Granted April 5, 2016
Filed: January 10, 2012
A system generates a model based on feature data relating to different features of a link from a linking document to a linked document and user behavior data relating to navigational actions associated with the link. The system also assigns a rank to a document based on the model.
As I pointed out in my original post about the Reasonable Surfer patent, it changes the amount of PageRank that might flow through a link based upon different features associated with that link. If a link is in the main content area of a page, uses a font and color that make it stand out, and uses anchor text that makes people likely to click upon it, then it could pass along a fair amount of PageRank. On the other hand, if it combines features that make it less likely to be clicked upon, such as being in the footer of a page, using the same color and font as the rest of the text on that page, and using anchor text that doesn’t interest people, it may not pass along a lot of PageRank.
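A small sketch can show the difference this makes. The function `split_passed_rank`, the link names, and the weights 6, 3, and 1 below are all made up for illustration, and the patent does not publish actual values. The sketch only assumes that each link gets a weight reflecting how likely it is to be clicked, and that the PageRank a page passes along is divided in proportion to those weights.

```python
def split_passed_rank(passed_rank, click_weights):
    """Divide the rank a page passes along in proportion to per-link weights.

    click_weights maps each outgoing link to a non-negative weight that stands
    in for how likely a visitor is to click it (hypothetical values).
    """
    total = sum(click_weights.values())
    if total == 0:
        # No link is ever clicked, so nothing is passed along.
        return {link: 0.0 for link in click_weights}
    return {link: passed_rank * weight / total
            for link, weight in click_weights.items()}


# Hypothetical weights: a prominent main content link versus a footer link.
reasonable = {"main content link": 6, "sidebar link": 3, "footer link": 1}
# Random surfer for comparison: every link has the same weight.
random_surfer = {link: 1 for link in reasonable}

print(split_passed_rank(1.0, random_surfer))
print(split_passed_rank(1.0, reasonable))
```

With equal weights, each of the three links passes along a third of the rank. With the assumed weights, the main content link passes along six times as much as the footer link.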
So, how have the claims for this patent changed, and how does that change the Reasonable Surfer model?
The new claims refer to anchor text more frequently, and to how much weight might be passed along based upon the probability that people might click upon a link. Here is some language that stands out to me, from the first new claim in this new Reasonable Surfer patent:
… a rank for a particular document, generating the rank including: determining particular feature data associated with a link to the particular document, the particular feature data identifying one or more attributes of the link, determining a weight indicating a probability of the link being selected, the weight being determined based on the particular feature data and selection data, the selection data identifying user behavior relating to links to other documents …the weight indicating a higher probability of the link being selected when the particular feature data corresponds to feature data associated with the one or more links than when the particular feature data corresponds to feature data associated with the one or more other links…words in anchor text associated with the links, and a quantity of the words in the anchor text
The claims in the original version of Ranking documents based on user behavior and/or feature data are different, and these newer claims put more emphasis on the idea that the weight passed along by a link is based upon the probability that people will click upon that link on a page.
It’s no longer a “random” probability, and it now seems to be even more “reasonable” than it was in the first version of the Reasonable Surfer patent.
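The claim language quoted above ties the weight to both feature data and selection data, meaning user behavior on links to other documents. The patent does not spell out a formula in that claim, so the following is only a toy sketch under my own assumptions: the function names `click_rates` and `link_weight`, the two features used (link position and number of words in the anchor text), and the sample log are all hypothetical. It estimates a weight as the observed click rate of previously seen links that share the same features.

```python
from collections import defaultdict


def click_rates(selection_log):
    """Estimate a click rate for each feature combination.

    selection_log is an iterable of (features, was_clicked) pairs, where
    features is a tuple such as (position, anchor_word_count).
    """
    shown = defaultdict(int)
    clicked = defaultdict(int)
    for features, was_clicked in selection_log:
        shown[features] += 1
        if was_clicked:
            clicked[features] += 1
    return {features: clicked[features] / shown[features] for features in shown}


def link_weight(features, rates, default=0.0):
    """Weight for a new link: the click rate of links with matching features."""
    return rates.get(features, default)


# Hypothetical selection data: (position, anchor word count), clicked or not.
log = [
    (("main content", 4), True),
    (("main content", 4), True),
    (("main content", 4), False),
    (("footer", 1), False),
    (("footer", 1), True),
    (("footer", 1), False),
    (("footer", 1), False),
]
rates = click_rates(log)
print(link_weight(("main content", 4), rates))
print(link_weight(("footer", 1), rates))
```

In this made-up log, a link whose features match the main content links gets a higher weight than one that matches the footer links, which is the behavior the claim describes. A real system would presumably use many more features and far more data than a simple lookup table like this.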
I’ve written a few posts about links. These were ones that I found interesting:
5/30/2006 – Web Decay and Broken Links Can be Bad for Your Site
12/11/2007 – Google Patent on Anchor Text Indexing and Crawl Rates
1/10/2009 – What is a Reciprocal Link?
5/11/2010 – Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data
8/24/2010 – Google’s Affiliated Page Link Patent
7/13/2011 – Google Patent Granted on PageRank Sculpting and Opinion Passing Links
11/12/2013 – How Google Might Use the Context of Links to Identify Link Spam
12/10/2014 – A Replacement for PageRank?
4/24/2018 – PageRank Update
Last Updated July 1, 2019.