Google's Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data

Not every link on a page in a link-based ranking system is equal, and a search engine might look at a wide range of factors to determine how much weight each link on a page may pass along.

A diagram showing different values for links passing amongst three different web pages.

One of the signals Google uses to rank web pages looks at the links to and from those pages, to see which pages are linked to by others. Links from “important” pages carry more weight than links from less important pages. An important page under this system is one that is linked to by other important pages, or by a large number of less important pages, or a combination of the two. This signal is known as PageRank, and it is only one of a large number of ranking signals that are used by Google to rank web pages and determine how highly those pages show up in search results in response to a query from a searcher.

An early paper by the founders of Google, The Anatomy of a Large-Scale Hypertextual Web Search Engine, tells us:

PageRank can be thought of as a model of user behavior. We assume there is a “random surfer” who is given a web page at random and keeps clicking on links, never hitting “back” but eventually gets bored and starts on another random page. The probability that the random surfer visits a page is its PageRank.

Under that approach, every link on a page might carry the same amount of weight, or importance, to the page it points to.
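To make the equal-split idea concrete, here is a minimal sketch of a random surfer calculation. It is not Google's implementation. The function name, the toy three-page web, the damping value of 0.85, and the way pages without links are handled are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # The "bored" surfer jumps to a random page with probability 1 - damping.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Every outgoing link gets an equal share of the page's rank.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no links spreads its rank over all pages.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = pagerank(toy_web)
```

In this toy web, page C ends up with the highest score because it receives all of B's weight plus half of A's. The link from A to B and the link from A to C each pass along exactly the same amount.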

A Google patent filed in 2004 and granted today takes a somewhat different approach to the value that links might have when they appear on the same page:

Systems and methods consistent with the principles of the invention may provide a reasonable surfer model that indicates that when a surfer accesses a document with a set of links, the surfer will follow some of the links with higher probability than others.

This reasonable surfer model reflects the fact that not all of the links associated with a document are equally likely to be followed. Examples of unlikely followed links may include “Terms of Service” links, banner advertisements, and links unrelated to the document.

The patent is:

Ranking documents based on user behavior and/or feature data
Invented by Jeffrey A. Dean, Corin Anderson and Alexis Battle
Assigned to Google Inc.
United States Patent 7,716,225
Granted May 11, 2010
Filed: June 17, 2004

Abstract

A system generates a model based on feature data relating to different features of a link from a linking document to a linked document and user behavior data relating to navigational actions associated with the link. The system also assigns a rank to a document based on the model.

In this “reasonable surfer” model, not every link that appears upon a page is equal in value. Different features associated with links, and the pages they appear upon and point to, may determine how much value those links pass on to the pages to which they link.

Features of Links and Documents

Under this patent, when a search engine crawls and indexes pages on the Web, it may create a model to help rank those pages. That model looks at features associated with the source pages that links appear upon, the target pages that links point to, and the links themselves. The search engine may also collect data about how visitors use those pages, such as which links they click upon, what query terms they use to find pages, and other information that could be collected from a web browser or a browser add-on, such as a toolbar.

The following lists are examples only. Not every feature listed may be used, and other features could be considered as well.

Examples of features associated with a link might include:

  1. Font size of anchor text associated with the link;
  2. The position of the link (measured, for example, in an HTML list, in running text, above or below the first screenful viewed on an 800 x 600 browser display, side (top, bottom, left, right) of document, in a footer, in a sidebar, etc.);
  3. If the link is in a list, the position of the link in the list;
  4. Font color and/or other attributes of the link (e.g., italics, gray, same color as background, etc.);
  5. Number of words in anchor text of a link;
  6. Actual words in the anchor text of a link;
  7. How commercial the anchor text associated with a link might be;
  8. Type of link (e.g., text link, image link);
  9. If the link is an image link, what the aspect ratio of the image might be;
  10. The context of a few words before and/or after the link;
  11. A topical cluster with which the anchor text of the link is associated;
  12. Whether the link leads somewhere on the same host or domain;
  13. If the link leads to somewhere on the same domain,
    • whether the link URL is shorter than the referring URL; and/or
    • whether the link URL embeds another URL (e.g., for server-side redirection)

Examples of features associated with a source document might include:

  1. The URL of the source document (or a portion of the URL of the source document);
  2. A web site associated with the source document;
  3. A number of links in the source document;
  4. The presence of other words in the source document;
  5. The presence of other words in a heading of the source document;
  6. A topical cluster with which the source document is associated; and/or
  7. A degree to which a topical cluster associated with the source document matches a topical cluster associated with anchor text of a link.

Examples of features associated with a target document might include:

  1. The URL of the target document (or a portion of the URL of the target document);
  2. A web site associated with the target document;
  3. Whether the URL of the target document is on the same host as the URL of the source document;
  4. Whether the URL of the target document is associated with the same domain as the URL of the source document;
  5. Words in the URL of the target document; and/or
  6. The length of the URL of the target document.
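Several of the URL-related features in the lists above are simple to compute. The sketch below shows one possible way to do it with Python's standard library. The function names, the feature names, and the naive "last two labels" definition of a domain are my own assumptions, not anything described in the patent.

```python
from urllib.parse import urlsplit


def naive_domain(host):
    # Assumption: treat the last two labels as the domain. This is wrong for
    # suffixes such as .co.uk; a real system would need a public suffix list.
    return ".".join((host or "").lower().split(".")[-2:])


def url_features(source_url, target_url):
    """Return a few URL-based features for a link from source_url to target_url."""
    src, tgt = urlsplit(source_url), urlsplit(target_url)
    # Anything after the host that itself looks like a URL suggests a redirect.
    tail = (tgt.path + "?" + tgt.query).lower()
    return {
        "same_host": src.hostname == tgt.hostname,
        "same_domain": naive_domain(src.hostname) == naive_domain(tgt.hostname),
        "target_url_length": len(target_url),
        "target_shorter_than_source": len(target_url) < len(source_url),
        "embeds_url": "://" in tail or "%3a%2f%2f" in tail,
    }


# Hypothetical URLs on the reserved example.com / example.org domains.
internal = url_features("http://www.example.com/blog/post.html",
                        "http://www.example.com/")
redirect = url_features("http://www.example.com/blog/post.html",
                        "http://shop.example.com/r?u=http://other.example.org/")
```

Here the first link points from a blog post to the home page of the same host, so it is marked as same host and shorter than the referring URL. The second link stays on the same domain but carries another URL in its query string, which is the kind of pattern a server-side redirect might use.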

User behavior data associated with documents and links may also be considered, such as:

  1. Information about how people access and interact with documents, such as navigational actions (e.g., links selected, web addresses entered, forms completed, etc.),
  2. The language of the users,
  3. Interests of the users,
  4. Query terms entered,
  5. How often a link is selected,
  6. How often links aren’t selected when one link is chosen,
  7. How often no links are selected on a page,
  8. etc.

This user behavior data could be obtained from a web browser or a browser assistant program such as Google’s Toolbar.
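As a rough illustration of how click data like this might be turned into numbers, the sketch below estimates how often each link on a page is followed from a log of page views. The log format, the function name, and the add-one smoothing are assumptions made for the example. The patent does not spell out a formula like this.

```python
from collections import Counter


def click_probabilities(link_ids, page_views, smoothing=1.0):
    """Estimate P(link followed | page viewed) from a list of page views.

    Each entry in page_views is the id of the link that was clicked, or None
    when the visitor left the page without clicking any link.
    """
    known = set(link_ids)
    # Ignore clicks on anything that is not in the list of links we track.
    views = [v for v in page_views if v is None or v in known]
    clicks = Counter(v for v in views if v is not None)
    no_clicks = sum(1 for v in views if v is None)
    # Smoothing keeps a link that was never clicked from getting probability 0.
    outcomes = len(link_ids) + 1  # every link, plus "no link selected"
    denominator = len(views) + smoothing * outcomes
    probs = {link: (clicks[link] + smoothing) / denominator for link in link_ids}
    p_no_click = (no_clicks + smoothing) / denominator
    return probs, p_no_click


# Hypothetical log: ten views of one page with three links on it.
link_ids = ["main_story", "sidebar_ad", "terms_of_service"]
page_views = ["main_story"] * 6 + ["sidebar_ad"] + [None] * 3
probs, p_no_click = click_probabilities(link_ids, page_views)
```

With this made-up log, the main story link gets the largest share, and the "Terms of Service" link, which nobody clicked, still keeps a small probability because of the smoothing.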

How Features May Influence the Weight of a Link

This model based upon features is intended to determine how likely it is that a link on a page will be selected, based upon positive and negative aspects of those features.

For example, a link with anchor text that is bigger than a certain size may have a higher probability of being selected than links with anchor text of a smaller size. Links positioned closer to the top of a page may also be more likely to be clicked upon. If the topic of the document being pointed to is related to the topic of the page the link appears upon, it may also have a higher probability of being selected by a visitor to the page. So, a link in a larger font, near the top of a page, leading to a page covering a similar topic as the page it appears upon may have a much higher probability of being chosen by a visitor than a link using smaller text, appearing at the bottom of a page, pointing to a page on an unrelated topic.
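The example in the paragraph above can be sketched as a toy scoring function. The three feature names, the weights, and the logistic formula are entirely hypothetical. They were chosen only to show the direction of each effect, not to suggest how Google actually combines features.

```python
import math

# Hypothetical weights: a positive value makes a click more likely.
WEIGHTS = {"large_font": 1.0, "near_top": 1.5, "topic_match": 2.0}
BIAS = -3.0  # hypothetical baseline for a link with none of the features


def click_score(features):
    """Map a dict of boolean features to a score between 0 and 1."""
    z = BIAS + sum(WEIGHTS[name] for name, present in features.items() if present)
    return 1.0 / (1.0 + math.exp(-z))


def link_weights(links):
    """Normalize the scores of all links on a page so that they sum to 1."""
    scores = {name: click_score(features) for name, features in links.items()}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}


# Two hypothetical links on the same page.
page_links = {
    "prominent": {"large_font": True, "near_top": True, "topic_match": True},
    "footer": {"large_font": False, "near_top": False, "topic_match": False},
}
weights = link_weights(page_links)
```

Under these made-up numbers, the large, well-placed, on-topic link takes most of the page's weight, and the small off-topic footer link gets only a sliver.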

The patent provides a number of other examples of rules that might be applied to different features to determine how likely a visitor is to select each link on a page. Those probabilities are used to determine a dynamic weight for each link, which can influence how highly the pages it points to might rank. The different weights might determine how much PageRank each link passes along to other pages.

Or, as the patent filing tells us:

The rank of a document may be interpreted as the probability that a reasonable surfer will access the document after following a number of forward links.
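One way to picture that statement is a PageRank-style calculation in which each link passes along rank in proportion to its weight rather than in equal shares. The sketch below is only an illustration of that idea. The function name, the toy pages, the damping value of 0.85, and the link weights are assumptions, not values from the patent.

```python
def reasonable_surfer_rank(weighted_links, damping=0.85, iterations=50):
    """Rank pages given {page: {target: weight}} where weight ~ click likelihood."""
    pages = sorted(set(weighted_links)
                   | {t for out in weighted_links.values() for t in out})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            out = weighted_links.get(page, {})
            total = sum(out.values())
            if total > 0:
                # Each link passes rank in proportion to its share of the weight.
                for target, weight in out.items():
                    new_rank[target] += damping * rank[page] * weight / total
            else:
                # Assumption: a page with no weighted links spreads rank evenly.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical hub page with a prominent article link and a Terms of Service link.
weighted = reasonable_surfer_rank({
    "hub": {"article": 0.9, "terms": 0.1},
    "article": {"hub": 1.0},
    "terms": {"hub": 1.0},
})
# The same web with equal weights, for comparison with the random surfer.
uniform = reasonable_surfer_rank({
    "hub": {"article": 1.0, "terms": 1.0},
    "article": {"hub": 1.0},
    "terms": {"hub": 1.0},
})
```

With equal weights, the article page and the terms page end up with the same score. With the hypothetical 0.9 and 0.1 weights, the article page receives most of what the hub has to give.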

Conclusion

How much value might a link on a page pass along in a link-based ranking system like PageRank?

Under the patent filing granted today, the value of a link may be different based upon a large number of factors, such as where the link is located on a page, whether the link is a different color or font style than other links, how many words are used in the anchor text for the link, whether the link text used is commercial or not, what the topic of the page is that the link appears upon and the topic of the page pointed to by the link, and many others.

It’s likely that in the early days of Google, the search engine quickly moved past the 1999 description of PageRank in The PageRank Citation Ranking: Bringing Order to the Web, where the weight of links was shown as split equally amongst links pointing out from a page. This patent describes a number of approaches that Google may have used to weight the value of links differently, though it’s likely that the lists above provide more value as possible examples of how links might be weighed than as definitive guidelines.

It does offer one broad rule of thumb that might be helpful. The links on a page that a reasonable surfer is most likely to select are probably the links that carry the most weight.


124 comments to Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data

  • Great stuff Bill and thanks for the heads up – this is something I’ve heard about and thought about before – identifying those important links is always high on the agenda :)

    Be in touch very shortly ;)

  • Jen

    Great Information here. Thanks for the article. Granted, a lot of this has most likely been happening for some time now, this one really stuck out to me: “How commercial the anchor text associated with a link might be”. If that’s not an argument for long-tail and variation, I don’t know what is.

  • Ever wonder how smart Google is? Now we know how smart they were in 2004. This is one of the best search posts I have read in a long time.

  • Great find Bill – another great overview. There’s a lot of pretheorised concepts in there but it’s always good perhaps seeing a bit more method to the madness

  • This is possibly one of the most important Google Patents in quite a while, great piece.

  • Thanks for the breakdown Bill and for pointing out this patent. Especially liked the link weight examples – font-size, placement etc. Good to know this was 2004… who knows what the G-men are up to now? ;~)
    ~ Jim

  • Great article Bill, there’s been plenty of talk about what could happen but you really brought it down to the ground in this post. Time to start taking some of these points on board. Won’t do any harm to start getting used to implementing some of it.

  • [...] – and would probably think Google is getting better at crawling deeper and faster too, and identifying better links, which is could well be the reason if you are experiencing traffic [...]

  • Bill, your explanation is likely the best I’ve seen so far. I tweeted it just now, so hopefully you’ll get a few more visits. Great job explaining the often misunderstood PageRank of Google and how links are weighted. Very comprehensive.

  • Hi Shaun.

    Thanks. I think we’ve seen hints of some of the features listed in this patent elsewhere, and referenced by other search engines as well, such as links within different parts of a page carrying different weights, which was mentioned in Microsoft’s paper on Block Level Link Analysis of web pages.

    I remember an interview with one of the search engineers from Yahoo last year also revealed that Yahoo weighs links differently based upon a number of different signals. I recall a similar statement from Matt Cutts within the last year, but haven’t been able to locate it, though I will keep on looking.

  • Hi Jen,

    There are some interesting “features” that were listed in the patent, like the one you point out about commercial anchor text. I also thought the one about how related the anchor text is to the topic of the page it appears upon was pretty interesting as well.

  • Hi Gyi,

    I was pretty excited when I started reading this, in spite of the 2004 filing date, because of how well it provided some potential answers to questions I’ve seen over the past few years about links, and raised new questions that I haven’t seen anyone asking.

    Thank you.

  • Hi Peter,

    Thanks. I did like seeing an intelligent framework around how different features from links, the pages they are on, and the pages they point to might be factored together, which is what this patent provides. The specific examples in the patent may or may not be in use, and if the process described in the patent is being used presently, new features may have been developed since 2004, but the basic concept is pretty simple – some links on a page are more likely to be clicked on than others, and are more meaningful than others.

  • Thank you, Roy.

    I think it is one of the more important patents from Google as well.

  • Hi Jim,

    You’re welcome. I really enjoyed going through the different features that they listed within the patent and thinking about how much value each might have and what they might mean to how pages are ranked. Some of the issues that I’ve seen discussed in many different ways and forms on the Web, such as “do sitewide links matter,” or “how might search engines identify and address paid links,” or “how much value do footer links have,” might be addressed slightly differently within the framework described in this patent.

  • Hi Darragh,

    If not implementing, at least testing. I think the patent raises a lot of issues worth exploring.

  • Hi Lou,

    Thank you for your kind words, and your tweet, and thanks to everyone else who tweeted or referred to this post in some manner.

    We know that Google most likely started tweaking PageRank from day 1, but specifics on how they might have made changes to it have been fairly elusive. This patent presents some possibilities that perhaps haven’t been addressed too deeply.

  • I’ve wondered about the effect of Google’s jihad encouraging placement of rel=”nofollow” tags into links (see point man Matt Cutts) upon PageRank and importance of links. If the majority of new links in the web universe now come with “nofollow” tags and google really does not count these links in its ranking algorithm anymore, how does a website owner move up in rank? It seems like the tools for doing so are now less about links and more about the other factors in the google algorithm. It’s not that links aren’t important. It seems legitimate ways of acquiring links are drying up due to the explicit policies of google. Is their goal to freeze all the pagerank juice in the large, established websites?

  • This was very interesting. I also briefly read the filing which was also quite detailed. What strikes me is that a very broad method of analyzing data can be patented. The question I would have is that if Bing were to analyze a text link and the five words before and after, would they be in violation of Google’s patent and liable for damages?

  • So this may mean that footer links are devalued – i.e. if you want to use footer links you will need to use way more of them!

  • [...] entire process. Although the dozens of link building tools out there definitely can be useful, and many advanced factors can come into play, the essence of link building all comes down to answering one single [...]

  • I’ve always wondered how much of what google puts into their patents actually ends up getting used. They just seem to patent things so their competitors don’t use them. For example, I’m no engineer but do they actually use latent semantic indexing? I’ve never really found any hard facts to say they do. And as for this one… font styling affecting links… yeah maybe but I don’t think it’s worth digging into. I say just keep getting more good, varied links :)

    With that said, still a good post and very insightful.

  • Thank you for this post. It explains clearly what page rank really is. It’s a three thumbs up!!!!

  • This is totally different from what we know about a page rank transfer algorithm. This has cleared up my confusion. Thanks Bill.

  • Great things to consider. This adds more fuel to the debate on whether topical links make any difference or not: YES – “what the topic of the page is that the link appears upon and the topic of the page pointed to by the link”.

  • Interesting stuff, thanks for the post. While most of this stuff is taken for granted in SEO circles, this may help to prove to the powers that be, that stuffing home pages with banners for promotions is, in fact, stifling SEO, and would be better served with optimized content and text-links.

  • Pete Gronland

    Thanks Bill,

    An excellent (& detailed) post; copies saved for future reference. Thanks for your time on this, it is appreciated.

    It’s been a while since I have read through Google patents but I remember how time consuming this is.

    It is true that this can be soul destroying research, especially as a large proportion will not come into effect; however, the skill is to identify which patents are practical & would improve the search experience.

    Thanks again

    Pete

  • Hi jjray,

    I really haven’t been a big fan of the rel=nofollow attribute and value either. Initially conceived as a way to try to cut down on blog comment spam, Google expanded its use so that people could tag paid links with it, and some site owners and SEOs tried to use it to redistribute the way PageRank flowed through the pages of their sites, not realizing that the search engines were quite possibly doing that all along anyway – see what the patent has to say about passing along less link value to links on a site like “terms of service.”

    Ultimately, the best way to attract links to your pages is to do something remarkable, something unique, something noteworthy that people will take notice of, and bookmark your pages, refer friends to your site, and link to you. Of course, that’s hard to do, but it’s worth the effort.

    Google is looking at signals other than links as well, which isn’t necessarily a bad thing. Note that some of the signals mentioned in this patent involve user behavior data, and some even focus upon areas where customizations and personalization may play a role in which links are shown to which viewers. A smaller site that understands its audience well, and focuses upon them might have the ability to rank higher than larger sites, especially ones that are slow to change and to adapt to new interests from its audiences.

  • Hi Bill,

    Good questions.

    I write about a lot of pending and granted patents here, and arguments can be made against many of them that the areas that they cover are fairly broad, or that the particular processes they involve may not be new or useful or nonobvious. Rather than focusing upon those arguments, I’d much rather see what it is we can learn about the search engines, and the underlying assumptions that they may make about search and searchers and the Web.

    Chances are that a lot of the patent applications that get filed won’t become granted patents. Since the major search engines all have similar end goals or targets, such as delivering meaningful and relevant search results to searchers, there’s often a fair amount of similarity in the patents that they file for patent protection. If two patents include some methods or processes, such as looking at a number of words before and after a link, but the overall process described in each of the patents are different enough in significant ways, then the granting of one of those patents likely wouldn’t invalidate the possible granting of the other, or the use of such a process by someone who doesn’t have a patent but uses that particular part within a larger process.

    The idea of looking at text associated with a link to understand more about the page being pointed towards is quite possibly within the public domain at this point, and is something that search engines have likely been doing for more than a decade, perhaps even back to the use of hypertext analysis by the World Wide Web Worm in the mid-90s.

    I have seen Google, Yahoo, and Microsoft cover some very similar search-related problems using approaches that are worded very differently, including steps that are unique to each. I’ve been keeping my eyes open to the possibility of lawsuits with some of them, but really haven’t seen that happening.

  • Hi James,

    It’s quite likely that links that appear in the footers of pages carry less weight than links appearing within the main content areas of pages. We’ve heard something to that effect from both Matt Cutts, and from one of the chief search researchers at Yahoo, and Microsoft’s paper on Block Level Link Analysis points to that approach as well.

    That’s not to say that footer links don’t have any value independent of search engines. They can be useful for visitors to a page, especially if they make it more likely that someone who has scrolled down to the bottom of a page might click on a link to another page of the site without having to bother to scroll back up to the top of the page. I’m not sure that I would focus upon adding more footer links to pages mainly to try to influence the amount of PageRank going to those pages, however.

  • Hi David,

    Thank you.

    Sometimes it’s easy to see when Google has developed something described in a patent, and sometimes it’s almost impossible to determine whether what a patent describes is in use. Some patents seem like they are intended to protect an idea that may or may not be economically feasible, but they want to keep the possibility open. Others seem like they cover areas that Google may not go into, and may have been filed to try to stop others from entering that field.

    Google has published a number of patents that describe the possible use of probabilistic latent semantic indexing (PLSI), which is different from latent semantic indexing in some very meaningful and significant ways, and it’s possible that they use PLSI.

    There are a number of possible features listed above that are significantly more impactful than whether or not the font size, font color, and font styling of a link may be different, and those are worth considering.

  • Hi Andrew,

    Thanks for the thumbs up.

    The truth of the matter about PageRank is that there have been a lot of possible variations of how PageRank could work published by Yahoo, Microsoft, IBM, and academic researchers over the years since it was first introduced. It likely started evolving from the moment it began being used by Google, and has probably been transformed significantly since then.

    This patent provides some possible hints as to some parts of that evolution, but it was filed 6 years ago itself. I think it’s helpful, but what Google is using now is probably even more complex.

  • Hi Alok,

    I was actually hoping that publishing a post about this patent would raise more questions than it answered. :)

  • Hi Tom,

    The topical aspect raised in the patent had me start asking myself questions about how Google might be identifying topics, and how much weight that particular feature might have, too.

    The section in the patent on user behavior, and its mention of user groups and user interests seemed aimed at a personalized PageRank as well. The patent only really brushed the surface of the significance of that feature in its description.

  • Hi Jeremy,

    I have seen some of these factors mentioned in forums and blogs, as well as variations of them, often with limited support or evidence to back them up, which made it hard for me to take them for granted. It was good seeing them listed in an actual patent from the search engine, within an intelligent framework for their use. In spite of that, I’d still approach them with a healthy amount of skepticism and actually do some testing on my own.

    It is good to have something more than SEO folklore to go on, though. :)

  • Thank you, Pete

    I do look through a lot of patents each week, and only write about a few of them. It is time consuming, but it can be very educational and worth the effort.

  • Very lengthy and informative post on seo and page rank. In the end a lot of it is common sense: if you get a link from a page that is relevant and has plenty of links, then that link will be worth more to you than a link from a page or site that is less popular and/or less relevant to your keyword.

    great post on seo, please feel free to take a look at my online marketing blog also

  • Thanks again Bill for spending that much time to sum up patents for us. Man, I managed to read an entire patent once, and once only… you’re a true genius (or maybe an alien, I don’t know ;)

    We can say that we kind of knew Google was using most of these features already (although the patent was filed in 2004, no one can tell if they already used some of them or all of them). I think what will catch readers’ attention the most is that traffic received by each link can have an influence on its value. I believe this factor isn’t used yet (or not often…), as we’ve seen many websites ranking well for keywords whose external links were mainly on footers. So one can just hope that Google uses click data more often, this could truly improve search results. Of course spammers could build lots of bots to click on their links, but that’s another story, and we know that Google’s very good at catching click fraud activity.

    And I agree with you, haven’t been a fan of nofollow either. Lots of webmasters seem to use them systematically for outgoing links, even for links they choose to add on their content themselves! (one example I saw today was Kosmix http://www.kosmix.com/topic/Cell_%28microprocessor%29)

    In that case, does Google really not count a nofollow link if it’s within a text and clicked on by many visitors? If they do, then seriously, it’s time to find another way to fight link spam.

  • This is great information, Bill. Where do you get information like this? Anyway, is it better to get a link from a relevant, PR0 site than from an unrelated site with a PR of 4?

  • This is an outstanding find Bill – thanks very much for bringing this to the community’s attention.

    This seems to very much marry in with page segmentation, but is distinctly different from it too.

    I’d really love to see how the search engines today really go about measuring things like font size/text colour/text location in practice. DOM parsing alone can give you lots of information (e.g. figuring out which elements are site-wide) but not the full picture. Headless browsers are quite intensive to run compared to more simple spiders, but would give the most accurate results. I’m guessing they’ve found a “good enough” compromise somewhere.

  • Hi Tahire,

    Thank you. There are a lot of potential directions that a search engine could go in when trying to determine how to rank web pages and to decide which features or signals should matter, and many of those are likely common sense as well. For instance, decisions like attempting to incorporate user behavior data into giving links on the same page different weights could be considered common sense, but the search engine would also need to have some kind of technology in place to calculate that user behavior.

  • Hi Dominic,

    Thanks. I find information like this from spending a lot of time searching for and reading through patents and white papers and blog posts and RSS feeds. :)

    If all other things about the PR0 and the PR4 page were equal, and the other features involved with those links were the same, I’d probably prefer the link from the PR4 site over the PR0 site, even though the topic was unrelated. There’s probably a better chance that the PR4 site would be visited by more people, and the odds that I might get more visitors from that link would likely be greater.

  • Hi Nadir,

    Thanks. After spending a good number of years reading through patents, I’m finally starting to get used to them. :)

    We’ve guessed and surmised and likely experimented with many of the concepts described in this patent, and probably missed a few as well. Traffic through a link can be a signal on its own, regardless of the location on a page or other features involved, which is definitely why the patent looks at those features together as a whole rather than independently of one another.

    As for links in footers, it’s possible on some sites that footer links are clicked upon frequently as reasonable navigational aids, while on other sites they are large keyword stuffed lists of links that tend to be ignored. It’s possible that some footer links have more value than others when it comes to SEO.

    I’m not sure if Google would ignore a nofollow value on a link if the link were in the main content area of a page, and clicked upon by a large number of visitors to a page. Perhaps they wouldn’t pass along PageRank and relevance through hypertext, but may pass along some kind of authority or quality value because of the prominence of the link and the traffic that it receives. We’ve heard that Google considers more than 200 different signals in determining the rankings of pages in search results, and that PageRank and hypertext relevance are only a couple of them.

  • Hi Ian,

    You’re welcome.

    This does fit in nicely with both Page Segmentation and with the idea that a search engine might ignore some sections of a page if they might be considered boilerplate.

    It probably is fairly cost intensive to try to crawl sites and analyze their structures from both a DOM approach and a visual analysis. We do have to keep in mind that Google does collect information about web pages which they display as cached versions of pages. They can spend a lot of time going through those cached copies on their own servers and analyze the structures of those pages.

  • [...] SEO for PRs as well as SEO techniques in general? Here’s the link for an article about the Google Patent – basically it’s more reason to put the most important link in the first paragraph, and [...]

  • [...] can read through the examples over at Bill´s post (or in the patent itself) – I just want to point out one more really interesting thing [...]

  • I am thinking of trying to utilize the font size aspect of links as a design element. Using CSS I think one could make some interesting visual effects and maybe get a bit more link juice. I have seen some nice use of CSS on some sites that really bump the text size up. That could be used on a link.

  • [...] eine gute Zusammenfassung und ein paar wirklich tolle Kommentare gibt es im Post von Bill Slawski, Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Fe…. Examples of features associated with a link might [...]

  • I think this is my favorite post you have ever done. Probably the best indication of a good post is the number of questions it raises in your mind. This is inspiring me to wonder all kinds of things and I’m dying to try some tests on some of these metrics you’ve pointed out.

    The thing that makes me the happiest is the idea that all the factors point towards what a reasonable surfer would do. I wish I could make an intelligent comment right now but too many ideas are swirling around.

  • [...] 11 reflects that. An excellent post by Bill Slawski explores the recent patent awarded to Google, Google's Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features…. The patent revises how Google regards a random surfer – you may recall that the original formula [...]

  • Great find! It’s nice to see that some clarification on how a link is weighted is finally being uncovered. Common sense seems to meet these criteria exactly, but it is interesting how something as seemingly insignificant, such as colour of text for the link, could actually impact its value. Thanks for the resource.

  • Excellent article, Bill. A fantastic breakdown of factors we can expect to play a huge part in Google’s ranking algorithm. As these factors play a larger part, it will be interesting to see how many of the companies with deeper pockets that rank well take a hit from purchasing thousands of links, mainly in other sites’ sidebars and footers… Especially those in the SEO industry.

  • [...] started following Bill Slawski of SEO By The Sea. I was very impressed by the analysis of the new PageRank patent awarded to Google last week (a must read if you care about PR), as well as Google Defines Semantic Closeness as a [...]

  • [...] but has got a lot more to share – good stuff! He mentions Bill Slawski’s post about the ‘reasonable surfer’ patent granted recently, and stresses the importance of reading that.  Whereas SEOs have been using the [...]

  • First off, this is a great post.
    My question is as follows.
    For features associated with a link, what if there are two links which go to the same page? Let’s just say they have the same anchor text for now (to keep things simple). Would the one that shows up first on the page (in the code) pass more link juice? Or would they be equal? Or could they actually be penalized?

  • Thanks for the fascinating analysis. One can’t help but wonder how long it will be before we’re all trying to figure out the relative significance of sky blue links vs. aqua. This will definitely give me a lot to think about.

  • Hi Superstar,

    If the processes described in this patent are in use by Google, it is possible that the design of different elements of a page might now impact how much PageRank flows through a link. Colors, font sizes and styles, and more could have an impact. What isn’t quite clear is how Google might interpret presentation through the use of cascading style sheets. It’s an issue that has been worth thinking about for a while, and perhaps even more so now.

  • Hi Marjory,

    Thank you for your kind words.

    The patent has raised a large number of questions for me as well, along with a number of ideas on how to explore some of them. I agree with you. I do like that they introduced the concept of a “reasonable surfer” as a way of thinking about this patent, as opposed to the “random surfer” often used in conjunction with PageRank.

  • Hi Kennedy,

    Thanks. If we were told by Google that different links would be weighed differently, which we have been, but not given some examples like we are here in the patent, I think we might have been able to guess at a number of these link and page features. If Google is doing something like this, which is probably a good bet, we can’t be sure whether or not they are using all of the factors that they listed, and whether they might be considering others as well.

    If a link stands out because it uses a different color text, and that makes it more likely to be clicked upon, perhaps that is something a search engine should consider. The question is, would it truly be more likely that someone would click upon it?

  • Hi Geoff,

    Thank you.

    One of the early paragraphs from the description of the patent reads as follows:

    Search engines attempt to return hyperlinks to web documents in which a user is interested. The goal of the search engine is to provide links to high quality documents to the user. Identifying high quality documents can be a tricky problem and is made more difficult by spamming techniques.

    Chances are that if Google were to pursue an approach like this, that they’ve already begun to do so in some fashion. Many of the features that they list in the patent could indeed be considered “reasonable.” What kind of impact might they have upon paid links? I would guess that they might make people think more about how much value there is to buying links, and about how useful those links might be. If a Google Toolbar PageRank indicator shows that a page has a PageRank of 5, it may not mean that a page being linked to from that page gains the value of a link from a PageRank 5 page.

    Chances are that was true in the past; hopefully after this patent has been published, that is more clear to people pursuing paid links.

  • Hi SoFlaWeb,

    This patent doesn’t explicitly describe the situation of more than one link from the same page pointing to another specific page, and whether or not PageRank value flows from both links.

    It is possible that a search engine might merge together links that are on the same page pointing to the same page. See my post How a Search Engine Might Analyze the Linking Structure of a Web Site, which describes how Microsoft may have approached that issue. It’s also possible that Google might be doing something similar.

  • Hi Jody,

    I’m hoping the decision on whether to use links that are sky blue vs. aqua remains in the hands of a designer trying to communicate ideas with visuals rather than someone trying to pass along PageRank. Definitely some things to think about here, though.

  • Thanks for the post, Bill. Great info. In your opinion, do you think it would be better to have a banner with a followed link at the top of a page or a text link on the right hand side of a page? From what I gather, the text link would have more value?

  • Great article and thanks for getting this info up so quickly. It is nice to see some public evidence for a theory I and others had been talking about for a while – that not all links are equal.

  • [...] Google Patents is normally something I leave to the rain-man SEO types but Google’s Reasonable Surfer Patent will confirm a lot of the ‘gut feelings’ you’ve had about onsite link optimisation. And it’s [...]

  • Hi Bill,

    This is an excellent summary post. It was referenced recently by Rand at SMX London so I thought I would swing by and catch up. Didn’t disappoint.

    Regards,
    Ben

  • Hi Dave,

    While the patent filing gives us a large list of features that it might view in determining which links might carry the most weight, I don’t think it’s helpful to view each of those features as isolated ones. A dynamic model might be created for each page that looks at both positive and negative aspects of features that attempt to determine which link would more likely be followed by a reasonable surfer.

    The search engine might look at the words used as anchor text for the link and alt text for the image to determine whether that text was relevant to the topic discussed upon the page they appear upon and the target page of the links. It might look at how many times in the recent past the banner and the text-based link were clicked upon. It may check to see if either of those links point to a page on the same domain as the page the links appear, or on a different domain.

    The features that I listed above were examples from the patent filing, and Google could possibly look at other things as well, but it would likely look at all of the features together to come up with a score for each link.

    It’s possible that a text-based link might be considered more valuable than an image based link, and that a link in the main content area of a page at the top of that content section might carry more weight than one in a sidebar, but the other characteristics might be considered as well. We also don’t know from the patent filing how much weight each of the features might carry.

    Because of that, it’s possible that a link further down on a page might have a higher link value than one higher on the page based upon features of those links, the documents they appear upon, the documents they point towards, and user behavior data that might be associated with them.
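
    A minimal sketch of that idea of scoring all of a link's features together, rather than one at a time, might look like the following. The feature names, the weights, and the example URLs are all hypothetical and chosen only for illustration; the patent filing does not say which features are used or how much weight each might carry.

```python
from dataclasses import dataclass

# Hypothetical feature weights, for illustration only. The patent filing
# does not disclose how much weight any feature might carry.
FEATURE_WEIGHTS = {
    "in_main_content": 2.0,
    "is_text_link": 0.5,
    "anchor_matches_topic": 1.5,
    "same_domain": -0.5,
    "recent_click_share": 3.0,
}


@dataclass
class Link:
    url: str
    features: dict  # feature name -> value (0/1 flag or a 0..1 fraction)


def link_score(link: Link) -> float:
    """Combine all of a link's features into a single positive score."""
    raw = sum(FEATURE_WEIGHTS.get(name, 0.0) * value
              for name, value in link.features.items())
    # Keep a small floor so every link retains some chance of being followed.
    return max(raw, 0.1)


def follow_probabilities(links: list) -> dict:
    """Normalize the scores so the links on one page sum to 1."""
    scores = {link.url: link_score(link) for link in links}
    total = sum(scores.values())
    return {url: score / total for url, score in scores.items()}


# A banner image at the top of the page vs. a relevant text link in a sidebar.
banner = Link("https://example.com/banner-target",
              {"in_main_content": 0, "is_text_link": 0,
               "anchor_matches_topic": 0, "recent_click_share": 0.05})
sidebar_text = Link("https://example.com/sidebar-target",
                    {"in_main_content": 0, "is_text_link": 1,
                     "anchor_matches_topic": 1, "recent_click_share": 0.20})

probabilities = follow_probabilities([banner, sidebar_text])
```

    With these made-up numbers the sidebar text link ends up with the larger share, but a different set of weights or click data could just as easily favor the banner, which is why no single feature settles the question.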

  • Hi Adrian,

    You’re welcome. It is nice to see something like this patent come out and provide some confirmation of ideas. Of course, we don’t know how much of this patent may have been applied to what Google does, but it does seem reasonable that they would follow through with a lot of the ideas considered within the patent.

  • Hi Ben,

    Thanks. I was happy to hear that it was one of the topics of conversation at SMX. Hopefully it will lead to some discussions about the topic elsewhere as well, and exploration of the ideas within the patent.

  • [...] Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Featur… – Bill Slawski [...]

  • [...] How the Value of a Link May Differ Based upon Link, Document Features, & User Data [...]

  • [...] a fundamental shift in how value is passed through links on a page, apart from the excellent post at SEO By The Sea, I don’t seem to see too many SEOs discussing this in more [...]

  • Thanks for the informative post!
    First a random surfer model, then a reasonable surfer model.
    I wonder what’s next – a sophisticated surfer model? :-)

  • Hi Udi,

    You’re welcome.

    A couple of years ago, some researchers from Lehigh University proposed a Cautious Surfer model, which suggested that an element of trust be incorporated into calculating PageRank.

    I wrote about it in a blog post titled Link Analysis, Web Spam, and the Cautious Surfer.

    I also took a closer look at one of the original PageRank patents in that post, and what I found was that a number of the elements from this “Reasonable Surfer” approach were already discussed in the Random Surfer model of PageRank. A snippet from that post:

    For instance, the following factors are noted as possibilities that could weigh in on the value of a link from a source page, to increase the probability that someone would end up at an important page if they were surfing the Web and followed the link:

    Whether the links are from different domains
    If the links are located at different servers
    Which institutions and authors maintain the links
    Where those institutions and authors are located geographically
    Whether the links are highly visible near the top of a document
    If the links are found in web locations such as the root page of a domain
    If the links are in large fonts, or are emphasized in other ways
    If the pages that the links are upon have been modified recently

    The original random surfer may have been more of a reasonable surfer than we give him credit for.

  • Nice information. I wasn’t aware that Google keeps track of that much.

  • great to read and follow. a yellow brick road in itself. cheers

    k

  • [...] have influenced the development of search engine rankings. SEOMoz’s Rand Fishkin built his talk around the random vs. reasonable surfer concept. According to him, Google’s algorithm has changed over the spring more and more [...]

  • [...] would not be inconsistent with the reasonable surfer patent that was recently granted to Google (which they may or may not be using in that form these days – but the direction of travel seems [...]

  • Hi Bill, thanks for yet another masterpiece. As of now I’ve stopped examining the relevance of links ‘too deeply’. I guess I just need to visit here more often to keep up with the latest developments. Along with SEOmoz, StomperNet and of course Google themselves, you provide all the information about SEO I could ever hope to digest.

    This article along with most others you produce emphasises that quality content still figures high in the list of key ingredients for online success.

  • [...] links are not equal – some are much more important than others. In 2004 Google created the Reasonable Surfer Model which attempts to algorithmically determine which links are more important than others. Systems and [...]

  • Brilliant essay! Thanks a lot! Sometimes other people just succeed in bringing to paper what oneself has had in mind for ages… ;-)
    It is indeed one of the most successful strategies to identify these valuable links and keep away from all the crappy stuff that you can get very easily! And that makes the difference between wannabe and professional SEOs.

    Regards
    Nico

  • Hi Bit Doze,

    As a Google search engineer once remarked at a conference I attended when asked about how much information they track, “We have lots and lots of computers.”

  • Hi twocans,

    Thank you.

  • Hi Andy,

    Thanks for your kind words.

  • Thank you, Nico

    Many, but not all, of the signals described in the patent have been discussed as possible influences upon PageRank for a few years. Regardless of that, many people discussing PageRank have been writing about it as if each link on a page passed along an equal amount of PageRank. I agree with you – being able to distinguish between valuable links and less valuable links can really make a difference.

  • Jesse

    After reading the first illustration of http://seomoz.org/blog/10-illustrations-on-search-engines-valuation-of-links, I thought of this article and your analysis on “first link wins” at http://www.seobythesea.com/?p=2929#comment-208633.
    Now I’m very confused about whether Google looks at relevance or just passes more power to the first link, the one higher up than the second, when it encounters two links with the same anchor text.
    Could you explain it for me?
    Regards,
    Jesse

  • Hi Jesse,

    There are at least two different kinds of ranking signals associated with links, and that’s partially where some of the confusion comes from.

    One of those signals is a query dependent ranking signal, based upon the words used in anchor text pointing to a page. For instance, a number of links pointing to this page that use the term “reasonable surfer” may make this page appear more relevant for the term “reasonable surfer” when the search engine shows search results if someone uses that term as a query.

    The other kind of ranking signal is query independent, and is supposed to be a measure of how important a page is rather than how relevant it might be to a specific query. One of the most well known query independent ranking signals that Google uses is PageRank. The reasonable surfer model helps describe how important a link is on a page, and how much PageRank that link might pass along to the page being linked to. If a number of pages point to this page, and they have a number of positive features associated with them under the reasonable surfer model, this page may be seen by the search engine as more important.

    The combination of query dependent signals like anchor text and query independent signals like PageRank determines how highly this page might rank within search results. Google tries to return the most relevant and important pages in response to a search.

    My post that you pointed towards, Search Engines Applying Different Anchor Text Relevance from the Same Site and Related Site Links, involves how much weight the anchor text pointing to a link might have in determining how relevant a page is for the text in links pointing to the page.

    Instead of anchor text, the Reasonable Surfer model is about PageRank, and how much PageRank might be passed along to a page.
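
    To make the difference from the random surfer concrete, here is a rough sketch of a PageRank calculation in which each link carries its own weight instead of an equal share. It is a standard power iteration with weighted links, not Google's actual implementation; the function name, the damping value of 0.85, the page names, and the link weights are all assumptions made for the example.

```python
def weighted_pagerank(link_weights, damping=0.85, iterations=50):
    """Power-iteration PageRank where each link has its own weight.

    link_weights maps source -> {target: weight}. The weights on a page
    do not need to sum to 1; they are normalized per source page.
    """
    pages = set(link_weights)
    for targets in link_weights.values():
        pages.update(targets)
    rank = {page: 1.0 / len(pages) for page in pages}

    for _ in range(iterations):
        # Every page starts each round with the "bored surfer" share.
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for source in pages:
            targets = link_weights.get(source, {})
            total = sum(targets.values())
            if total == 0:
                # A page with no outgoing links spreads its rank evenly.
                for page in pages:
                    new_rank[page] += damping * rank[source] / len(pages)
                continue
            for target, weight in targets.items():
                # A link passes rank in proportion to its weight on the page.
                new_rank[target] += damping * rank[source] * weight / total
        rank = new_rank
    return rank


# Page A links to B prominently (weight 3) and to C from a footer (weight 1).
graph = {
    "A": {"B": 3.0, "C": 1.0},
    "B": {"A": 1.0},
    "C": {"A": 1.0},
}
ranks = weighted_pagerank(graph)
```

    Under the random surfer, B and C would receive equal amounts from A. With the weights above, B receives three times as much of A's PageRank as C does, even though both are linked from the same page.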

    I think Rand’s assessment of the power of the first link might be a little misleading the way it’s presented within his post.

  • Max

    Hi Bill,

    Thanks for the great article. I have learned a lot from your articles and your replies to my comments.

    I think Google gives more priority to links from relevant websites for search ranking. If you have a backlink from a relevant website with a specific keyword, it will help in search rankings. Therefore, I think it’s better to have backlinks from relevant pages, and it doesn’t matter if they’re PR0, because these links will help you gain a better search rank.

    Good PR links from irrelevant sites, on the other hand, may help you gain PageRank, but they won’t help with search rank, which matters the most.

    I also think that two-way backlinks are completely useless and Google doesn’t count them at all.

  • Excellent information, it is crazy to think just to what level Google is looking at things such as size or color of the anchor text. Most webmasters are happy to get a link but so much more matters.

  • [...] Bill Slawski/SEO by the Sea: Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Featur… [...]

  • [...] forget that links are still an important part of SEO and recent research indicates that links within the text likely carry more trust and authority. And while backlinks are [...]

  • [...] and/or feature data. If poring over the patent is tedious, this analysis might be useful: http://www.seobythesea.com/?p=3806. [...]

  • Hi Mike,

    Thank you. We are given much more to think about when it comes to links, but I do like that it all boils down to the fact that a link that is most likely to be clicked upon is the one that probably carries the most weight.

  • Hi Max,

    You’re welcome.

    I think it can help if a link is from a relevant site, but I’m not convinced that a link from a site that isn’t related is stripped of all value. There are often good reasons for providing links to other pages that visitors might find useful, even if they are about other topics. For instance, a page about naming a business might benefit from a link to a tax attorney – unrelated in terms of themes, but perfectly helpful to someone who needs small business advice.

    I’m not convinced either that links back and forth between two different sites are completely useless and not counted by Google. I think the search engine is more concerned about excessive reciprocal linking, done primarily to increase the rankings of pages rather than for some other legitimate purpose.

  • This, IMO, has been in place since 2006 (at least). We have experimented with customized footers and navigation and achieved some remarkable results, although things are more sophisticated now in terms of evaluating a link on the page, especially if it’s an internal link. The idea of structuring a webpage is not new. We have read some Google patents on how they can analyze and structure a web page and then assign some value to different areas depending upon various signals like CTR, prominence, freshness, etc.

    The new thing is that we don’t have sticky rules now. For instance, customized footers have started to lose their value compared to the value they had in the past. So what link weighs what will be the next Best SEO Tip you can have :)

  • [...] Surfer model. As always ’search engine patent spotter’ Bill Slawski found it and wrote a great post about [...]

  • Sandra

    What I would like to know is: If there are two links on page A, both pointing to page B, and they have different anchor text and are in different places on page A, how does Google know which one of the two is clicked?

    I think their analysis with regards to user behaviour and link placement can only be done based on links that appear only once on the page.

  • [...] The reasonable surfer – Bill Slawski (but please read this one, too) [...]

  • Hi Haseeb,

    It’s possible that Google has been giving different weights to links in different locations on a page since sometime shortly after the search engine launched. I think the best take-away from this patent filing is a confirmation that not all links on a page may pass along the same amount of PageRank, and that the analysis that a search engine does in determining the amount of PageRank that might be passed along is much more complex than just a simple statement like “the higher the link on a page, the more PageRank it passes along.”

  • Hi Sandra,

    One way for Google to know which link was clicked upon would be for the search engine to identify the next URL that someone visits through the Toolbar, and match it up with the URLs that are known about on the page the person is on presently. The toolbar may also be able to identify how far down a page that the person visiting might scroll, which could help tell them which link may have been clicked upon.
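
    As a rough illustration of that kind of matching, the sketch below takes the links known to be on a page, the next URL visited, and how far down the page the visitor scrolled, and returns the links that could plausibly have been clicked. The data structure, field names, and pixel positions are all hypothetical; nothing here reflects an actual Toolbar data format.

```python
from dataclasses import dataclass


@dataclass
class PageLink:
    link_id: str
    target_url: str
    vertical_position: int  # pixels from the top of the page (assumed)


def attribute_click(links, next_url, max_scroll_depth):
    """Guess which links on the page could have led to next_url.

    Returns the ids of links pointing to next_url that the visitor could
    have seen, given how far down the page they scrolled.
    """
    matching = [link for link in links if link.target_url == next_url]
    visible = [link for link in matching
               if link.vertical_position <= max_scroll_depth]
    # If the scroll data rules out every match, fall back to all matches.
    return [link.link_id for link in (visible or matching)]


page_links = [
    PageLink("top-nav", "https://example.com/b", 120),
    PageLink("footer", "https://example.com/b", 4200),
    PageLink("sidebar", "https://example.com/c", 600),
]

# The visitor scrolled 900 pixels and then landed on page B.
candidates = attribute_click(page_links, "https://example.com/b", 900)
```

    Here only the link near the top of the page is a candidate, since the footer link was never scrolled into view. When both links were visible, the two would remain ambiguous without some further signal.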

  • This is a debate that I have to endure every Monday. How Google indexes a link and what value a certain backlink carries depend on a large mix of factors. Reputable links with PR 7+ will go a long way in promoting your PR rankings.

  • Hi Ron,

    I’ve seen links from pages with fairly high PageRank have substantial impacts on the rankings of pages that they link to. It is helpful, though, to look at something like this patent discussing a “reasonable” surfer model to get a sense of how much impact a link can have based upon the link itself, the page that it appears upon, and the page that it is pointing towards. Finding something like the patent also gives you something at your fingertips that you can point people towards when you’re asked about the values of links.

  • [...] On May 11, 2010, Google was granted a new patent that basically states that all links on a page do not have to carry (or pass) the same weight. The concept is that the value a link should pass to a target page will be largely based on the probability that a user would click on it (hat tip to Bill Slawski who wrote a great post deconstructing Google’s reasonable surfer patent). [...]

  • I think it stands to reason that the new(er) paradigm should be renamed to “Reasonable Webmaster”. All of the link parameters that are becoming important are directly affected by the webmaster of the linking site. There is nothing “reasonable” about a surfer’s response to a larger font or a contrasting color of the link. Likewise, a surfer clicking on the link that’s above the fold (ATF) vs. the one below the fold gives no useful indication that the ATF link is better (however you define “better”); it’s just simply visible vs. not visible.

    It seems rather strange and somehow backwards that Google would give the webmaster more control over the link’s weight – hence ability to directly affect Google rank of the linked page. You’d think they’d be looking for more independent ways to verify this important parameter.

    Could this not be a “smoke screen” type of patent?

  • [...] of its potential to cultivate a sense of protectionism the Reasonable Surfer Patent means that link building could become a much harder sell. SEO strategists may be wary of adding outbound [...]

  • [...] a user would click on it. If that doesn’t make sense to you, check out Bill Slawski’s Google’s Reasonable surfer post and Eric Enge’s SEO Implications of Google’s Reasonable [...]

  • Bill,

    Great work. This is one of the best search posts I have read in recent years. Thank you.

  • Hi Scriptster,

    I’m still leaning towards this being a reasonable surfer approach, rather than a reasonable webmaster method. The patent focuses upon how a surfer might react to different links on a page, in a way that attempts to understand how they might interact with each, rather than giving us a clear idea on how to best set up links on our pages.

    There are a few articles that have now been written on this patent that make some broad generalizations such as the “top link” carries more weight than links lower down on a page, and that just simplifies things too much. The patent doesn’t pick out just one thing, such as the color or size of a link, or its placement on a page, but rather looks at a score associated with all of the features that the search engine might be using, in addition to user-behavior data to attempt to determine which link should be given how much weight.

  • Hi Bill,

    Well, that’s just the thing: I understand that font size and color are not the only things that matter. I presume they know enough (too much IMHO) about the visitor’s behavior from tool bar and Analytics, AdSense etc. and can see that if a lot of clicks come from a certain XY(top left) – XY(bottom right) rectangle on the page to count it as navigation and discount it or something like that.

    I’m just saying that the link is where it is on the page and looks like it does because the webmaster put it there. We also have to assume that Google already has to rely on the webmaster’s vote for the linked page for the simple reason that he/she decided to put the link there in the first place. But location/appearance of the link is such a weak and potentially misleading signal (not to mention, prone to abuse) that I find it strange Google actually needed it given that they can measure other factors – like the bounce rate for example.

    As far as being a misleading signal: say I list a number of important links in alphabetical order (let’s say it’s in English for simplicity to avoid opening another can of worms). If the list is long enough, the last (but not least!) links are guaranteed to be pushed below the fold. They are guaranteed to be clicked less. Does it make the destination page any less important?

    You can say that listing those links in such an order is the webmaster’s mistake. But how many of us attended web design schools and know/follow the best practices? There are guaranteed to be simple human errors (if we can call them that) with link positioning / appearance.

    Anyways, thanks again for the write-up. It’s all the stuff that webmasters need to keep in the back of their minds when designing pages. I sometimes wonder how many of us can actually still afford designing for visitors rather than Google … (but I’ll leave that rant for another time)

    Cheers!

  • Hi Scriptster,

    Thanks, you raise some great points.

    One of those that I’ve been thinking about a lot is the example that you provide with lists – I was wondering whether Google might weigh each link within a list of links differently or the same, even though the order of those links might be random or alphabetical or chronological (with newer articles listed first or last) or in other ways. Thinking about that, and following the logic describing how Google might treat list items in this recent post – Google Defines Semantic Closeness as a Ranking Signal, ideally Google would treat each link as if it were the same distance away from the title of the list.

    I was actually thinking about a followup post, possibly with the title “I am an Unreasonable Surfer,” but I think I’d rather address some potential issues involving the reasonable surfer model here, especially since you’ve brought the topic up so well.

    There are a few sites that I like to visit because they have blogrolls or sidebar links that go to some incredible sites. I often visit those pages regardless of whether they have new content or not so that I can visit the pages listed in those sidebars. Google also provides a sidebar widget where people can list blogs and have links to the latest posts from those blogs appear within the widget – it’s evident that they know that a curated list like that can have a fair amount of value as a resource that people will use. The types of sidebar links that I describe possibly should have less weight than a link in the main content area of a page, but how much less when they are important features on those pages?

    The location and appearance of those links is only one of the signals that the search engine is looking at though. I’m not a big believer in the search engines using “bounce rate” as a ranking signal because it’s one of the weakest signals that a search engine could use. But there are other user-behavior signals that are stronger. Bounce rate can be a stronger signal for pages that attempt to have visitors visit another page on a site such as a checkout page, but much weaker when the page itself answers an information need or fulfills a transactional need on its own.

    Thanks again, for asking some interesting questions.

  • [...] How the strength of a link can change according to the text it is included in [...]

  • [...] of Google’s, filed back in 2005, was approved at the beginning of July. The patent makes it possible to track the position of the surfer’s mouse on pages [...]

  • Extremely insightful post and one to which I continue to refer for refreshed nuggets of perspective on the valuation of links. I think the factors presented here can be ranked, analyzed, and speculated on ad infinitum, but from a practical standpoint what this boils down to for me is the need for broad domain variance in a linking profile.

    When one is out and about slogging through the tedium of manual link building, the insights presented in this article are highly valuable in helping one gain the most value from any particular vote a webmaster is willing to cast. But at the end of the day, one isn’t going to be able to convince every webmaster out there to do exactly what would be ideal–from the standpoint of this very detailed analysis of what Google considers to be the most valuable link qualities for the most reasonable of surfers–when placing a link.

    Since people are people–even if we can dissect Google’s patent, we still have to interact with each other–I think the best means to achieve the ideal factors here is to simply build more quality links. When you do the work to identify and build more quality links, more of these factors are present in a link profile. It boils down to hard, smart work, IMO. When that gets done, many of these criteria are met in a very natural way.

    At any rate, excellent piece. I’ll be back to reference this again soon, I’m sure.

  • Hi Ramsay,

    Thank you.

    This patent does give us some hints at what Google might be looking for when deciding how much weight to give to links, and I agree with you about the best approach being to build quality links which will likely fulfill some of the approaches described within the patent.

  • [...] Rand Fishkin recommends you should read Bill Slawski’s Reasonable Surfer Model – his presentation can be found [...]

  • Bill, thanks for the scientific take on the matter. Yay, there’s plenty of great posts here. I’m glad I stumbled. Alex.

  • [...] Affiliated Page Link patent was filed the same month in 2004 as Google’s Reasonable Surfer patent, which told us that the weight or contribution of a link on a page to the ranking score of a page [...]

  • [...] In any case, if it isn't a factor now, we can be sure it will be. See: Google's Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data. And whether it's a ranking factor or not, it's a usability factor of the highest order, and that [...]

  • I think the biggest issue is that once you start feeling like you understand how Google is ranking sites, they change things, and this is why I still use a mix of old and new SEO techniques, because you just never know. And like always, what was once old becomes new again.

  • Hi Jason,

    We do know that Google is constantly looking to improve the quality of their search results, and they’ve been mentioning in a number of places that they average at least one change to their core ranking algorithms every day.

    Some of those changes may be identified in places like their patents or papers or blog posts, but chances are that most of the changes happen without being written about.

    At its core though, the ultimate goal of most of those changes is to improve the experiences of searchers, and to make it easier to help searchers find what they are looking for. Many older SEO techniques can continue to be effective because they also pursue those goals, making a site more relevant for the people interested in what it has to offer, who are also the people most likely to be searching for it.
