Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data

Not every link on a page in a link-based ranking system is equal, and a search engine might look at a wide range of factors to determine how much weight each link on a page may pass along.

A diagram showing different values for links passing amongst three different web pages.

One of the signals used by Google to rank web pages looks at the links to and from those pages, to see which pages are linked to by others. Links from “important” pages carry more weight than links from less important pages. An important page under this system is one that is linked to by other important pages, or by a large number of less important pages, or a combination of the two. This signal is known as PageRank, and it is only one of a large number of Google ranking signals used to rank web pages and determine how highly those pages show up in search results in response to a query from a searcher.

An early paper by the founders of Google, The Anatomy of a Large-Scale Hypertextual Web Search Engine, tells us:

PageRank can be thought of as a model of user behavior. We assume there is a “random surfer” who is given a web page at random and keeps clicking on links, never hitting “back” but eventually gets bored and starts on another random page. The probability that the random surfer visits a page is its PageRank.

Under that approach, any link from the same page might carry the same amount of weight, or importance, when pointed to another page.
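
To make the random surfer concrete, here is a minimal sketch of that equal-split model in Python. The three-page graph and the function name are made up for illustration, and the damping factor of 0.85 is the value suggested in the original paper; none of this is taken from Google's actual implementation.

```python
def uniform_pagerank(links, damping=0.85, iterations=50):
    """Random-surfer PageRank: each outgoing link on a page gets an equal share.

    `links` maps every page to the list of pages it links to. Every target
    is assumed to also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # The "bored" surfer jumps to a random page with probability 1 - damping.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if not targets:
                # A page with no links: spread its rank evenly over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
                continue
            share = damping * rank[page] / len(targets)  # equal split per link
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: A links to B and C, B links to C, C links to A.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"]}
ranks = uniform_pagerank(toy_web)
```

In this toy graph, page C ends up with more PageRank than page B, because C receives all of B's share plus half of A's, while B only receives half of A's.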

A Google patent filed in 2004 and granted today takes a somewhat different approach to the value that links might have when they appear on the same page:

Systems and methods consistent with the principles of the invention may provide a reasonable surfer model that indicates that when a surfer accesses a document with a set of links, the surfer will follow some of the links with higher probability than others.

This reasonable surfer model reflects the fact that not all of the links associated with a document are equally likely to be followed. Examples of unlikely followed links may include “Terms of Service” links, banner advertisements, and links unrelated to the document.

The patent is:

Ranking documents based on user behavior and/or feature data
Invented by Jeffrey A. Dean, Corin Anderson and Alexis Battle
Assigned to Google Inc.
United States Patent 7,716,225
Granted May 11, 2010
Filed: June 17, 2004

Abstract

A system generates a model based on feature data relating to different features of a link from a linking document to a linked document and user behavior data relating to navigational actions associated with the link. The system also assigns a rank to a document based on the model.

In this “reasonable surfer” model, not every link that appears upon a page is equal in value. Different features associated with links, and the pages they appear upon and point to, may determine how much value those links pass on to the pages to which they link.

Features of Links and Documents

Under this patent, when a search engine crawls and indexes pages on the Web, it may create a model to help rank those pages. That model looks at features associated with the source pages that links appear upon, the target pages that links point to, and the links themselves. The search engine may also collect data about how visitors to pages use those pages, such as which links they click upon, what query terms they use to find pages, and other information that could be collected from a web browser or an add-on to a browser, such as a toolbar.

The following lists provide examples of features. Not all of the features listed may be used, and other features could be considered as well.

Examples of features associated with a link might include:

  1. Font size of anchor text associated with the link;
  2. The position of the link (measured, for example, in a HTML list, in running text, above or below the first screenful viewed on an 800 X 600 browser display, side (top, bottom, left, right) of document, in a footer, in a sidebar, etc.);
  3. If the link is in a list, the position of the link in the list;
  4. Font color and/or other attributes of the link (e.g., italics, gray, same color as background, etc.);
  5. Number of words in anchor text of a link;
  6. Actual words in the anchor text of a link;
  7. How commercial the anchor text associated with a link might be;
  8. Type of link (e.g., text link, image link);
  9. If the link is an image link, what the aspect ratio of the image might be;
  10. The context of a few words before and/or after the link;
  11. A topical cluster with which the anchor text of the link is associated;
  12. Whether the link leads somewhere on the same host or domain;
  13. If the link leads to somewhere on the same domain,
    • Whether the link URL is shorter than the referring URL; and/or
    • Whether the link URL embeds another URL (e.g., for server-side redirection)

Examples of features associated with a source document might include:

  1. The URL of the source document (or a portion of the URL of the source document);
  2. A web site associated with the source document;
  3. A number of links in the source document;
  4. The presence of other words in the source document;
  5. The presence of other words in a heading of the source document;
  6. A topical cluster with which the source document is associated; and/or
  7. A degree to which a topical cluster associated with the source document matches a topical cluster associated with anchor text of a link.

Examples of features associated with a target document might include:

  1. The URL of the target document (or a portion of the URL of the target document);
  2. A web site associated with the target document;
  3. Whether the URL of the target document is on the same host as the URL of the source document;
  4. Whether the URL of the target document is associated with the same domain as the URL of the source document;
  5. Words in the URL of the target document; and/or
  6. The length of the URL of the target document.

User behavior data associated with documents and links may also be considered, such as:

  1. Information about how people access and interact with documents, such as navigational actions (e.g., links selected, web addresses entered, forms completed, etc.),
  2. The language of the users,
  3. Interests of the users,
  4. Query terms entered,
  5. How often a link is selected,
  6. How often links aren’t selected when one link is chosen,
  7. How often no links are selected on a page,
  8. etc.

This user behavior data could be obtained from a web browser or a browser assistant program such as Google’s Toolbar.

How Features May Influence the Weight of a Link

This feature-based model is intended to determine how likely it is that a link on a page might be selected, based upon positive and negative aspects of those features.

For example, a link with anchor text that is bigger than a certain size may have a higher probability of being selected than links with anchor text of a smaller size. Links positioned closer to the top of a page may also be more likely to be clicked upon. If the topic of the document being pointed to is related to the topic of the page the link appears upon, it may also have a higher probability of being selected by a visitor to the page. So, a link in a larger font, near the top of a page, leading to a page covering a similar topic as the page it appears upon may have a much higher probability of being chosen by a visitor than a link using smaller text, appearing at the bottom of a page, pointing to a page on an unrelated topic.

The patent provides a number of other examples of rules that might be applied to different features to determine how likely it is that each link on a page will be selected and clicked upon by a visitor. Those probabilities are used to determine a dynamic weight for each of the links that can influence how highly the pages they point to might rank in Google. The different weights for the links might determine how much PageRank each link passes along to other pages.
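
As a rough illustration of how rules like these could turn into per-link weights, the sketch below scores each link on a page using a few of the features listed earlier, then normalizes the scores so that they sum to 1. The feature names, the multipliers, and the `link_weights` helper are all invented for this example. The patent does not publish actual values, and it describes a model generated from feature data and user behavior data rather than hand-set numbers like these.

```python
def link_weights(page_links):
    """Return {url: estimated probability of being clicked} for one page.

    Every multiplier below is a made-up placeholder, not a value from the patent.
    """
    scores = {}
    for link in page_links:
        score = 1.0
        if link["font_size_px"] >= 14:      # larger anchor text
            score *= 1.5
        if link["position"] == "top":       # near the top of the page
            score *= 1.4
        elif link["position"] == "footer":  # footer links are rarely followed
            score *= 0.3
        if link["same_topic"]:              # target topic matches source topic
            score *= 2.0
        scores[link["url"]] = score
    total = sum(scores.values())
    # Normalize so the weights on a page form a probability distribution.
    return {url: score / total for url, score in scores.items()}


# Two hypothetical links on the same page.
example_links = [
    {"url": "/related-guide", "font_size_px": 16, "position": "top", "same_topic": True},
    {"url": "/terms-of-service", "font_size_px": 10, "position": "footer", "same_topic": False},
]
weights = link_weights(example_links)
```

With these placeholder numbers, the large, on-topic link near the top of the page receives most of the weight, and the small "Terms of Service" link in the footer receives very little.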

Or, as the patent filing tells us:

The rank of a document may be interpreted as the probability that a reasonable surfer will access the document after following a number of forward links.
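
One way to read that sentence is as PageRank with unequal link probabilities. The sketch below assumes each page already has a selection probability for each of its links. The page names, the probabilities, and the damping factor are made up for illustration, and the patent does not spell out this exact computation.

```python
def reasonable_surfer_rank(weighted_links, damping=0.85, iterations=50):
    """PageRank-style iteration where each link carries its own weight.

    `weighted_links` maps each page to {target: weight}. Weights on a page
    are normalized here, and every target is assumed to also be a key.
    """
    pages = list(weighted_links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in weighted_links.items():
            total = sum(targets.values())
            if total == 0:
                # No usable links: spread this page's rank evenly.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
                continue
            for target, weight in targets.items():
                # Rank flows in proportion to how likely the link is followed.
                new_rank[target] += damping * rank[page] * weight / total
        rank = new_rank
    return rank


# Hypothetical site: the home page links prominently to an article and,
# in its footer, to a terms page.
toy_site = {
    "home": {"article": 0.9, "terms": 0.1},
    "article": {"home": 1.0},
    "terms": {"home": 1.0},
}
site_ranks = reasonable_surfer_rank(toy_site)
```

Under an equal split, the article and the terms page would receive the same amount from the home page. Here the article receives nine times as much, because the surfer is assumed to be nine times as likely to follow that link.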

Conclusion

How much value might a link on a page pass along in a link-based ranking system like PageRank?

Under the patent filing granted today, the value of a link may be different based upon a large number of factors, such as where the link is located on a page, whether the link is a different color or font style than other links, how many words are used in the anchor text for the link, whether the link text used is commercial or not, what the topic of the page is that the link appears upon and the topic of the page pointed to by the link, and many others.

It’s likely that in the early days of Google, the search engine quickly moved past the 1999 description of PageRank in The PageRank Citation Ranking: Bringing Order to the Web, where the weight of links was shown as split equally amongst links pointing out from a page. This patent describes a number of approaches that Google may have used to weight the value of links differently, though it’s likely that the lists above provide more value as possible examples of how links might be weighed than as definitive guidelines.

It does offer one broad rule of thumb that might be helpful: the links on a page that a reasonable surfer is most likely to select are probably the links that carry the most weight.

248 thoughts on “Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data”

  1. Great stuff Bill and thanks for the heads up – this is something i’ve heard about and thought about before – identifying those important links is always high on the agenda :)

    Be in touch very shortly ;)

  2. Great Information here. Thanks for the article. Granted, a lot of this has most likely been happening for some time now, this one really stuck out to me: “How commercial the anchor text associated with a link might be”. If that’s not an argument for long-tail and variation, I don’t know what is.

  3. Great find Bill – another great overview. There’s a lot of pretheorised concepts in there but it’s always good to see a bit more method to the madness

  4. Thanks for the breakdown Bill and for pointing out this patent. Especially liked the link weight examples – font-size, placement etc. Good to know this was 2004… who knows what the G-men are up to now? ;~)
    ~ Jim

  5. Great article Bill, there’s been plenty of talk about what could happen but you really brought it down to the ground in this post. Time to start taking some of these points on board. Won’t do any harm to start getting used to implementing some of it.

  6. Bill, your explanation is likely the best I’ve seen so far. I tweeted it just now, so hopefully you’ll get a few more visits. Great job explaining the often misunderstood PageRank of Google and how links are weighted. Very comprehensive.

  7. Hi Jen,

    There are some interesting “features” that were listed in the patent, like the one you point out about commercial anchor text. I also thought the one about how related the anchor text is to the topic of the page it appears upon was pretty interesting as well.

  8. Hi Gyi,

    I was pretty excited when I started reading this, in spite of the 2004 filing date, because of how well it provided some potential answers to questions I’ve seen over the past few years about links, and raised new questions that I haven’t seen anyone asking.

    Thank you.

  9. Hi Peter,

    Thanks. I did like seeing an intelligent framework around how different features from links, the pages they are on, and the pages they point to might be factored together, which is what this patent provides. The specific examples in the patent may or may not be in use, and if the process described in the patent is being used presently, new features may have been developed since 2004, but the basic concept is pretty simple – some links on a page are more likely to be clicked on than others, and are more meaningful than others.

  10. Hi Jim,

    You’re welcome. I really enjoyed going through the different features that they listed within the patent and thinking about how much value each might have and what they might mean to how pages are ranked. Some of the issues that I’ve seen discussed in many different ways and forms on the Web, such as “do sitewide links matter,” or “how might search engines identify and address paid links,” or “how much value do footer links have,” might be addressed slightly differently within the framework described in this patent.

  11. Hi Lou,

    Thank you for your kind words, and your tweet, and thanks to everyone else who tweeted or referred to this post in some manner.

    We know that Google most likely started tweaking PageRank from day 1, but specifics on how they might have made changes to it have been fairly elusive. This patent presents some possibilities that perhaps haven’t been addressed too deeply.

  12. Hi Shaun.

    Thanks. I think we’ve seen hints of some of the features listed in this patent elsewhere, and referenced by other search engines as well, such as links within different parts of a page carrying different weights, which was mentioned in Microsoft’s paper on Block Level Link Analysis of web pages.

    I remember an interview with one of the search engineers from Yahoo last year also revealed that Yahoo weighs links differently based upon a number of different signals. I recall a similar statement from Matt Cutts within the last year, but haven’t been able to locate it, though I will keep on looking.

  13. I’ve wondered about the effect of Google’s jihad encouraging placement of rel=”nofollow” tags into links (see point man Matt Cutts) upon PageRank and importance of links. If the majority of new links in the web universe now come with “nofollow” tags and Google really does not count these links in its ranking algorithm anymore, how does a website owner move up in rank? It seems like the tools for doing so are now less about links and more about the other factors in the Google algorithm. It’s not that links aren’t important. It seems legitimate ways of acquiring links are drying up due to the explicit policies of Google. Is their goal to freeze all the PageRank juice in the large, established websites?

  14. This was very interesting. I also briefly read the filing which was also quite detailed. What strikes me is that a very broad method of analyzing data can be patented. The question I would have is that if Bing were to analyze a text link and the five words before and after, would they be in violation of Google’s patent and liable for damages?

  15. I’ve always wondered how much of what Google puts into their patents actually ends up getting used. They just seem to patent things so their competitors don’t use them. For example, I’m no engineer but do they actually use latent semantic indexing? I’ve never really found any hard facts to say they do. And as for this one… font styling affecting links… yeah maybe but I don’t think it’s worth digging into. I say just keep adding more good varied links :)

    With that said, still a good post and very insightful.

  16. This is totally different from what we know about a PageRank transfer algorithm. This has cleared up my confusion. Thanks Bill.

  17. Great things to consider. This adds more fuel to the debate on whether topical links make any difference or not: YES – “what the topic of the page is that the link appears upon and the topic of the page pointed to by the link”.

  18. Thanks Bill,

    An excellent (& detailed) post. Copies saved for future reference. Thanks for your time on this, it is appreciated.

    It’s been a while since I have read through Google patents but I remember how time consuming this is.

    It is true that this can be soul destroying research, especially as a large proportion will not come into effect. However, the skill is to identify which patents are practical & would improve the search experience.

    Thanks again

    Pete

  19. Hi jjray,

    I really haven’t been a big fan of the rel=nofollow attribute and value either. Initially conceived as a way to try to cut down on blog comment spam, Google expanded its use so that people could tag paid links with it, and some site owners and SEOs tried to use it to redistribute the way PageRank flowed through the pages of their sites, not realizing that the search engines were quite possibly doing that all along anyway – see what the patent has to say about passing along less link value to links on a site like “terms of service.”

    Ultimately, the best way to attract links to your pages is to do something remarkable, something unique, something noteworthy that people will take notice of, and bookmark your pages, refer friends to your site, and link to you. Of course, that’s hard to do, but it’s worth the effort.

    Google is looking at signals other than links as well, which isn’t necessarily a bad thing. Note that some of the signals mentioned in this patent involve user behavior data, and some even focus upon areas where customizations and personalization may play a role in which links are shown to which viewers. A smaller site that understands its audience well, and focuses upon them might have the ability to rank higher than larger sites, especially ones that are slow to change and to adapt to new interests from its audiences.

  20. Hi Bill,

    Good questions.

    I write about a lot of pending and granted patents here, and arguments can be made against many of them that the areas that they cover are fairly broad, or that the particular processes they involve may not be new or useful or nonobvious. Rather than focusing upon those arguments, I’d much rather see what it is we can learn about the search engines, and the underlying assumptions that they may make about search and searchers and the Web.

    Chances are that a lot of the patent applications that get filed won’t become granted patents. Since the major search engines all have similar end goals or targets, such as delivering meaningful and relevant search results to searchers, there’s often a fair amount of similarity in the patents that they file. If two patents include some of the same methods or processes, such as looking at a number of words before and after a link, but the overall processes described in the two patents are different enough in significant ways, then the granting of one of those patents likely wouldn’t invalidate the possible granting of the other, or the use of such a process by someone who doesn’t have a patent but uses that particular part within a larger process.

    The idea of looking at text associated with a link to understand more about the page being pointed towards is quite possibly within the public domain at this point, and is something that search engines have likely been doing for more than a decade, perhaps even back to the use of hypertext analysis by the World Wide Web Worm in the mid-90s.

    I have seen Google, Yahoo, and Microsoft cover some very similar search-related problems using approaches that are worded very differently, including steps that are unique to each. I’ve been keeping my eyes open to the possibility of lawsuits with some of them, but really haven’t seen that happening.

  21. Interesting stuff, thanks for the post. While most of this stuff is taken for granted in SEO circles, this may help to prove to the powers that be, that stuffing home pages with banners for promotions is, in fact, stifling SEO, and would be better served with optimized content and text-links.

  22. Hi James,

    It’s quite likely that links that appear in the footers of pages carry less weight than links appearing within the main content areas of pages. We’ve heard something to that effect from both Matt Cutts, and from one of the chief search researchers at Yahoo, and Microsoft’s paper on Block Level Link Analysis points to that approach as well.

    That’s not to say that footer links don’t have any value independent of search engines. They can be useful for visitors to a page, especially if they make it more likely that someone who has scrolled down to the bottom of a page might click on a link to another page of the site without having to bother to scroll back up to the top of the page. I’m not sure that I would focus upon adding more footer links to pages mainly to try to influence the amount of PageRank going to those pages, however.

  23. Hi David,

    Thank you.

    Sometimes it’s easy to see when Google has developed something described in a patent, and sometimes it may be almost impossible to determine whether or not what one of their patents describes is in use. Some patents seem like they are intended to protect an idea that may or may not be economically feasible, but they want to keep the possibility open. Others seem like they cover areas that Google may not go into, and may have been filed to try to stop others from entering that field.

    Google has published a number of patents that describe the possible use of probabilistic latent semantic indexing (PLSI), which is different from latent semantic indexing in some very meaningful and significant ways, and it’s possible that they use PLSI.

    There are a number of possible features listed above that are significantly more impactful than whether or not the font size, font color, and font styling of a link may be different, and those are worth considering.

  24. Hi Andrew,

    Thanks for the thumbs up.

    The truth of the matter about PageRank is that there have been a lot of possible variations of how PageRank could work published by Yahoo, Microsoft, IBM, and academic researchers over the years since it was first introduced. It likely started evolving from the moment it began being used by Google, and has probably been transformed significantly since then.

    This patent provides some possible hints as to some parts of that evolution, but it was filed 6 years ago itself. I think it’s helpful, but what Google is using now is probably even more complex.

  25. Hi Tom,

    The topical aspect raised in the patent had me start asking myself questions about how Google might be identifying topics, and how much weight that particular feature might have, too.

    The section in the patent on user behavior, and its mention of user groups and user interests seemed aimed at a personalized PageRank as well. The patent only really brushed the surface of the significance of that feature in its description.

  26. Hi Jeremy,

    I have seen some of these factors mentioned in forums and blogs, as well as variations of them, often with limited support or evidence to back them up, which made it hard for me to take them for granted. It was good seeing them listed in an actual patent from the search engine, within an intelligent framework for their use. In spite of that, I’d still approach them with a healthy amount of skepticism and actually do some testing on my own.

    It is good to have something more than SEO folklore to go on, though. :)

  27. Thank you, Pete

    I do look through a lot of patents each week, and only write about a few of them. It is time consuming, but it can be very educational and worth the effort.

  28. Very lengthy and informative post on SEO and PageRank. In the end a lot of it is common sense: if you get a link from a page that is relevant and has plenty of links, then that link will be worth more to you than a link from a page or site that is less popular and/or less relevant to your keyword.

    great post on seo, please feel free to take a look at my online marketing blog also

  29. Thanks again Bill for spending that much time to sum up patents for us. Man, I managed to read an entire patent once, and once only… you’re a true genius (or maybe an alien, I don’t know ;)

    We can say that we kind of knew Google was using most of these features already (although the patent was filed in 2004, no one can tell if they already used some of them or all of them). I think what will catch readers’ attention the most is that traffic received by each link can have an influence on its value. I believe this factor isn’t used yet (or not often…), as we’ve seen many websites ranking well for keywords whose external links were mainly in footers. So one can just hope that Google uses click data more often; this could truly improve search results. Of course spammers could build lots of bots to click on their links, but that’s another story, and we know that Google’s very good at catching click fraud activity.

    And I agree with you, haven’t been a fan of nofollow either. Lots of webmasters seem to use them systematically for outgoing links, even for links they choose to add on their content themselves! (one example I saw today was Kosmix http://www.kosmix.com/topic/Cell_%28microprocessor%29)

    In that case, does Google really not count a nofollow link if it’s within a text and clicked on by many visitors? If so, then seriously, it’s time to find another way to fight link spam.

  30. This is great information, Bill. Where do you get information like this? Anyway, is it better to get a link from a relevant, PR0 site than from an unrelated site with a PR of 4?

  31. This is an outstanding find Bill – thanks very much for bringing this to the community’s attention.

    This seems to very much marry in with page segmentation, but is distinctly different from it too.

    I’d really love to see how the search engines today really go about measuring things like font size/text colour/text location in practice. DOM parsing alone can give you lots of information (e.g. figuring out which elements are site-wide) but not the full picture. Headless browsers are quite intensive to run compared to more simple spiders, but would give the most accurate results. I’m guessing they’ve found a “good enough” compromise somewhere.

  32. Hi Tahire,

    Thank you. There are a lot of potential directions that a search engine could go in when trying to determine how to rank web pages and to decide which features or signals should matter, and many of those are likely common sense as well. For instance, decisions like attempting to incorporate user behavior data into giving links on the same page different weights could be considered common sense, but the search engine would also need to have some kind of technology in place to calculate that user behavior.

  33. Hi Dominic,

    Thanks. I find information like this from spending a lot of time searching for and reading through patents and white papers and blog posts and RSS feeds. :)

    If all other things about the PR0 and the PR4 page were equal, and the other features involved with those links were the same, I’d probably prefer the link from the PR4 site over the link from the PR0 site, even though the topic was unrelated. There’s probably a better chance that the PR4 site would be visited by more people, and the odds that I might get more visitors from that link would likely be greater.

  34. Hi Nadir,

    Thanks. After spending a good number of years reading through patents, I’m finally starting to get used to them. :)

    We’ve guessed and surmised and likely experimented with many of the concepts described in this patent, and probably missed a few as well. Traffic through a link can be a signal on its own, regardless of the location on a page or other features involved, which is definitely why the patent looks at those features together as a whole rather than independently of one another.

    As for links in footers, it’s possible on some sites that footer links are clicked upon frequently as reasonable navigational aids, while on other sites they are large keyword stuffed lists of links that tend to be ignored. It’s possible that some footer links have more value than others when it comes to SEO.

    I’m not sure if Google would ignore a nofollow value on a link if the link were in the main content area of a page, and clicked upon by a large number of visitors to a page. Perhaps they wouldn’t pass along PageRank and relevance through hypertext, but may pass along some kind of authority or quality value because of the prominence of the link and the traffic that it receives. We’ve heard that Google considers more than 200 different signals in determining the rankings of pages in search results, and that PageRank and hypertext relevance are only a couple of them.

  35. Hi Ian,

    You’re welcome.

    This does fit in nicely with both Page Segmentation and with the idea that a search engine might ignore some sections of a page if they might be considered boilerplate.

    It probably is fairly cost intensive to try to crawl sites and analyze their structures from both a DOM approach and a visual analysis. We do have to keep in mind that Google does collect information about web pages which they display as cached versions of pages. They can spend a lot of time going through those cached copies on their own servers and analyze the structures of those pages.

  36. I am thinking of trying to utilize the font size aspect of links as a design element. Using CSS I think one could make some interesting visual effects and maybe get a bit more link juice. I have seen some nice use of CSS on some sites that really bump the text size up. That could be used on a link.

  37. I think this is my favorite post you have ever done. Probably the best indication of a good post is the number of questions it raises in your mind. This is inspiring me to wonder all kinds of things and I’m dying to try some tests on some of these metrics you’ve pointed out.

    The thing that makes me the happiest is the idea that all the factors point towards what a reasonable surfer would do. I wish I could make an intelligent comment right now but too many ideas are swirling around.

  38. Great find! It’s nice to see that some clarification on how a link is weighted is finally being uncovered. Common sense seems to meet these criteria exactly, but it is interesting how something as seemingly insignificant, such as colour of text for the link, could actually impact its value. Thanks for the resource.

  39. Excellent article Bill, a fantastic breakdown of factors we can expect to play a huge part in Google’s ranking algorithm. As these factors play a larger part, it will be interesting to see how many of the companies with larger pockets that rank well take a hit from purchasing thousands of links, mainly in other sites’ sidebars and footers… Especially those in the SEO industry.

  40. First off, this is a great post.
    My question is as follows.
    For features associated with a link, what about if there are two links which go to the same page? Let’s just say they have the same anchor text for now (to keep things simple). Would the one that shows up first on the page (in the code) pass more link juice? Or would they be equal? Or could they actually be penalized?

  41. Thanks for the fascinating analysis. One can’t help but wonder how long it will be before we’re all trying to figure out the relative significance of sky blue links vs. aqua. This will definitely give me a lot to think about.

  42. Hi Superstar,

    If the processes described in this patent are in use by Google, it is possible that the design of different elements of a page might now impact how much PageRank may flow through a link. Colors, font sizes and styles, and more could have an impact. What isn’t quite clear is how Google might interpret presentation through the use of cascading style sheets. It’s an issue that has been worth thinking about for a while, and perhaps even more so now.

  43. Hi Marjory,

    Thank you for your kind words.

    The patent has raised a large number of questions for me as well, along with a number of ideas on how to explore some of them. I agree with you. I do like that they introduced the concept of a “reasonable surfer” as a way of thinking about this patent, as opposed to the “random surfer” often used in conjunction with PageRank.

  44. Hi Kennedy,

    Thanks. If we were told by Google that different links would be weighed differently, which we have been, but not given some examples like we are here in the patent, I think we might have been able to guess at a number of these link and page features. If Google is doing something like this, which is probably a good bet, we can’t be sure whether or not they are using all of the factors that they listed, and whether they might be considering others as well.

    If a link stands out because it uses a different color text, and that makes it more likely to be clicked upon, perhaps that is something a search engine should consider. The question is, would it truly be more likely that someone would click upon it?

  45. Hi Geoff,

    Thank you.

    One of the early paragraphs from the description of the patent reads as follows:

    Search engines attempt to return hyperlinks to web documents in which a user is interested. The goal of the search engine is to provide links to high quality documents to the user. Identifying high quality documents can be a tricky problem and is made more difficult by spamming techniques.

    Chances are that if Google were to pursue an approach like this, they’ve already begun to do so in some fashion. Many of the features that they list in the patent could indeed be considered “reasonable.” What kind of impact might they have upon paid links? I would guess that they might make people think more about how much value there is to buying links, and about how useful those links might be. If a Google Toolbar PageRank indicator shows that a page has a PageRank of 5, it may not mean that a page being linked to from that page gains the value of a link from a PageRank 5 page.

    Chances are that was true in the past; hopefully after this patent has been published, that is more clear to people pursuing paid links.

  46. Hi Jody,

    I’m hoping the decision on whether to use links that are sky blue vs. aqua remains in the hands of a designer trying to communicate ideas with visuals rather than someone trying to pass along PageRank. Definitely some things to think about here, though.

  47. Hi SoFlaWeb,

    This patent doesn’t explicitly describe the situation of more than one link from the same page pointing to another specific page, and whether or not PageRank value flows from both links.

    It is possible that a search engine might merge together links that are on the same page pointing to the same page. See my post How a Search Engine Might Analyze the Linking Structure of a Web Site, which describes how Microsoft may have approached that issue. It’s also possible that Google might be doing something similar.

  48. Thanks for the post Bill great info. In your opinion do you think it would be better to have a banner w/ a followed link at the top of a page or a text link on the right hand side of a page? From what I gather the text link would have more value?

  49. Great article and thanks for getting this info up so quickly. It is nice to see some public evidence for a theory I and others had been talking about for a while – that not all links are equal.

  50. Hi Bill,

    This is an excellent summary post. It was referenced recently by Rand at SMX London so I thought I would swing by and catch up. Didn’t disappoint.

    Regards,
    Ben

  51. Hi Dave,

    While the patent filing gives us a large list of features that it might view in determining which links might carry the most weight, I don’t think it’s helpful to view each of those features as isolated ones. A dynamic model might be created for each page that looks at both positive and negative aspects of features to determine which link would more likely be followed by a reasonable surfer.

    The search engine might look at the words used as anchor text for the link and alt text for the image to determine whether that text was relevant to the topic discussed upon the page they appear upon and the target page of the links. It might look at how many times in the recent past the banner and the text-based link were clicked upon. It may check to see if either of those links point to a page on the same domain as the page the links appear, or on a different domain.

    The features that I listed above were examples from the patent filing, and Google could possibly look at other things as well, but it would likely look at all of the features together to come up with a score for each link.

    It’s possible that a text-based link might be considered more valuable than an image based link, and that a link in the main content area of a page at the top of that content section might carry more weight than one in a sidebar, but the other characteristics might be considered as well. We also don’t know from the patent filing how much weight each of the features might carry.

    Because of that, it’s possible that a link further down on a page might have a higher link value than one higher on the page based upon features of those links, the documents they appear upon, the documents they point towards, and user behavior data that might be associated with them.
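
    To make the idea of scoring all of the features together a little more concrete, here is a minimal sketch. The patent filing does not tell us which features are actually used or how much weight each carries, so the feature names, the weights, and the simple weighted sum below are all assumptions for illustration only, not a description of what Google does.

```python
from dataclasses import dataclass, field


@dataclass
class Link:
    target: str
    # Feature name -> value between 0 and 1 (all names are hypothetical).
    features: dict = field(default_factory=dict)


# Assumed, non-negative weights; the patent does not disclose any.
HYPOTHETICAL_WEIGHTS = {
    "in_main_content": 2.0,
    "is_text_link": 1.0,
    "topical_match": 1.5,
    "recent_click_share": 3.0,
}


def follow_probabilities(links, weights=HYPOTHETICAL_WEIGHTS):
    """Return one probability per link, in the same order as `links`.

    Every link starts with a baseline score of 1.0 so that no link is
    reduced to zero, then gains score for each weighted feature. Scores
    are normalized so the probabilities on a page sum to 1.
    """
    if not links:
        return []
    raw = [
        1.0 + sum(weights.get(name, 0.0) * value
                  for name, value in link.features.items())
        for link in links
    ]
    total = sum(raw)
    return [score / total for score in raw]


# A banner at the top of the page versus a text link further down.
banner = Link("https://example.com/offer",
              {"topical_match": 0.2, "recent_click_share": 0.1})
text_link = Link("https://example.com/guide",
                 {"in_main_content": 1.0, "is_text_link": 1.0,
                  "topical_match": 0.9, "recent_click_share": 0.4})

probs = follow_probabilities([banner, text_link])
```

    With these made-up numbers the text link lower on the page ends up with the larger share, which is the point above: position alone does not decide the outcome once the other features and user behavior data are counted.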

  52. Hi Adrian,

    You’re welcome. It is nice to see something like this patent come out and provide some confirmation of ideas. Of course, we don’t know how much of this patent may have been applied to what Google does, but it does seem reasonable that they would follow through with a lot of the ideas considered within the patent.

  53. Hi Ben,

    Thanks. I was happy to hear that it was one of the topics of conversation at SMX. Hopefully it will lead to some discussions about the topic elsewhere as well, and exploration of the ideas within the patent.

  54. Thanks for the informative post!
    First a random surfer model, then a reasonable surfer model.
    I wonder what’s next – a sophisticated surfer model? :-)

  55. Hi Udi,

    You’re welcome.

    A couple of years ago, some researchers from Lehigh University proposed a Cautious Surfer model, which suggested that an element of trust be incorporated into calculating PageRank.

    I wrote about it in a blog post titled Link Analysis, Web Spam, and the Cautious Surfer.

    I also took a closer look at one of the original PageRank patents in that post, and what I found was that a number of the elements from this “Reasonable Surfer” approach were already discussed in the Random Surfer model of PageRank. A snippet from that post:

    For instance, the following factors are noted as possibilities that could weigh in on the value of a link from a source page, to increase the probability that someone would end up at an important page if they were surfing the Web and followed the link:

    Whether the links are from different domains
    If the links are located at different servers
    Which institutions and authors maintain the links
    Where those institutions and authors are located geographically
    Whether the links are highly visible near the top of a document
    If the links are found in web locations such as the root page of a domain
    If the links are in large fonts, or are emphasized in other ways
    If the pages that the links are upon have been modified recently

    The original random surfer may have been more of a reasonable surfer than we give him credit for.

  56. Hi Bill, thanks for yet another masterpiece. As of now I’ve stopped examining the relevance of links ‘too deeply’. I guess I just need to visit here more often to keep up with the latest developments. Along with SEOmoz, StomperNet and of course Google themselves, you provide all the information about SEO I could ever hope to digest.

    This article along with most others you produce emphasises that quality content still figures high in the list of key ingredients for online success.

  57. Brilliant essay! Thanks a lot! Sometimes other people just succeed in bringing to paper what oneself has had in mind for ages… ;-)
    It is indeed one of the most successful strategies to identify these valuable links and keep away from all the crappy stuff that you can get very easily! And that makes the difference between wannabe and professional SEOs.

    Regards
    Nico

  58. Hi Bit Doze,

    As a Google search engineer once remarked at a conference I attended when asked about how much information they track, “We have lots and lots of computers.”

  59. Thank you, Nico

    Many, but not all, of the signals described in the patent have been discussed as possible influences upon PageRank for a few years. Regardless of that, many people discussing PageRank have been writing about it as if each link on a page contained an equal amount of PageRank. I agree with you – being able to distinguish between valuable links and less valuable links can really make a difference.

  60. Hi Jesse,

    There are at least two different kinds of ranking signals associated with links, and that’s partially where some of the confusion comes from.

    One of those signals is a query dependent ranking signal, based upon the words used in anchor text pointing to a page. For instance, a number of links pointing to this page that use the term “reasonable surfer” may make this page appear more relevant for the term “reasonable surfer” when the search engine shows search results if someone uses that term as a query.

    The other kind of ranking signal is query independent, and is supposed to be a measure of how important a page is rather than how relevant it might be to a specific query. One of the most well known query independent ranking signals that Google uses is PageRank. The reasonable surfer model helps describe how important a link is on a page, and how much PageRank that link might pass along to the page being linked to. If a number of pages point to this page, and they have a number of positive features associated with them under the reasonable surfer model, this page may be seen by the search engine as more important.

    The combination of query dependent signals like anchor text and query independent signals like PageRank determines how highly this page might rank within search results. Google tries to return the most relevant and important pages in response to a search.

    My post that you pointed towards, Search Engines Applying Different Anchor Text Relevance from the Same Site and Related Site Links, involves how much weight the anchor text pointing to a link might have in determining how relevant a page is for the text in links pointing to the page.

    Instead of anchor text, the Reasonable Surfer model is about PageRank, and how much PageRank might be passed along to a page.

    I think Rand’s assessment of the power of the first link might be a little misleading the way it’s presented within his post.

  61. After reading the first illustration of http://www.seomoz.org/blog/10-illustrations-on-search-engines-valuation-of-links, I thought of this article and your analysis on “first link wins” at http://www.seobythesea.com/2009/09/search-engines-applying-different-anchor-text-relevance-from-the-same-site-and-related-site-links/#comment-208633

    Now I’m very confused about whether Google looks at relevance, or just passes more power to the first link (the one higher up than the second) when it encounters two links with the same anchor text.
    Could you explain it for me?
    Regards,
    Jesse

  62. Hi Bill,

    Thanks for the great article. I have learned a lot from your articles and reply on my comments.

    I think Google gives more priority to links from relevant websites for search ranking. If you have a backlink from a relevant website with a specific keyword, it will help in search rankings. Therefore I think it’s better to have backlinks from relevant pages, and it doesn’t matter if the page is PR0, because these links will help you in gaining a better search rank.

    Whereas good PR links from irrelevant sites may help you gain PageRank, they won’t help in search rank, which matters the most.

    I also think that 2 way backlinks are completely useless and Google doesn’t count them at all.

  63. Excellent information, it is crazy to think just to what level Google is looking at things such as size or color of the anchor text. Most webmasters are happy to get a link but so much more matters.

  64. Hi Mike,

    Thank you. We are given much more to think about when it comes to links, but I do like that it all boils down to the fact that a link that is most likely to be clicked upon is the one that probably carries the most weight.

  65. Hi Max,

    You’re welcome.

    I think it can help if a link is from a relevant site, but I’m not convinced that a link from a site that isn’t related is stripped of all value. There are often good reasons for providing links to other pages that visitors might find useful, even if they are about other topics. For instance, a page about naming a business might benefit from a link to a tax attorney – unrelated in terms of themes, but perfectly helpful to someone who needs small business advice.

    I’m not convinced either that links back and forth between two different sites are completely useless and not counted by Google. I think the search engine is more concerned about excessive reciprocal linking, done primarily to increase the rankings of pages rather than for some other legitimate purpose.

  66. This, IMO, has been in place since 2006 (at least). We have experimented with customized footers and navigation and did achieve some remarkable results, although things are more sophisticated now in terms of evaluating a link on the page, especially if it’s an internal link. The idea of structuring a webpage is not new. We have read some Google patents on how they can analyze and structure a web page and then assign some value to different areas depending upon various signals like CTR, prominence, freshness, etc.

    The new thing is, we don’t have fixed rules now. For instance, customized footers have started to lose their value compared to the value they had in the past. So what link weighs what will be the next Best SEO Tip you can have :)

  67. What I would like to know is: If there are two links on page A, both pointing to page B, and they have different anchor text and are in different places on page A, how does Google know which one of the two is clicked?

    I think their analysis with regards to user behaviour and link placement can only be done based on links that appear only once on the page.

  68. Hi Haseeb,

    It’s possible that Google has been giving different weights to links in different locations on a page since sometime shortly after the search engine launched. I think the best take-away from this patent filing is a confirmation that not all links on a page may pass along the same amount of PageRank, and that the analysis that a search engine does in determining the amount of PageRank that might be passed along is much more complex than just a simple statement like “the higher the link on a page, the more PageRank it passes along.”
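
    As a rough illustration of links on the same page passing along different amounts of PageRank, here is a small sketch of a PageRank iteration in which each page divides its rank among its links in proportion to a per-link weight rather than equally. The link weights, the damping factor of 0.85, and the handling of pages without links are assumptions chosen for the example; the patent filing does not give us the actual calculation.

```python
def weighted_pagerank(graph, damping=0.85, iterations=50):
    """Iterate PageRank over `graph`, a dict of {page: {target: weight}}.

    A page's rank is split among its outgoing links in proportion to
    each link's weight (the assumed "reasonable surfer" probability),
    instead of being split evenly as under the random surfer.
    """
    pages = set(graph) | {t for outs in graph.values() for t in outs}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives the same small share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            outs = graph.get(page, {})
            total = sum(outs.values())
            if total == 0:
                # No usable links: spread this page's rank over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
                continue
            for target, weight in outs.items():
                new_rank[target] += damping * rank[page] * weight / total
        rank = new_rank
    return rank


# Page A links to B prominently (weight 3) and to C in a footer (weight 1).
example_graph = {
    "A": {"B": 3.0, "C": 1.0},
    "B": {"A": 1.0},
    "C": {"A": 1.0},
}
ranks = weighted_pagerank(example_graph)
```

    In this toy graph B ends up with more rank than C even though both are linked from the same page, simply because the link to B was given the larger weight.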

  69. Hi Sandra,

    One way for Google to know which link was clicked upon would be for the search engine to identify the next URL that someone visits through the Toolbar, and match it up with the URLs that are known about on the page the person is on presently. The toolbar may also be able to identify how far down a page that the person visiting might scroll, which could help tell them which link may have been clicked upon.
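
    A minimal sketch of that kind of matching might look like the following. Everything in it is hypothetical: the `PageLink` fields, the idea that a scroll depth is reported in pixels, and the rule of keeping only links the visitor could have scrolled to are assumptions made to illustrate the reasoning, not anything described in the patent or known about the Toolbar.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PageLink:
    href: str
    anchor: str
    y_position: int  # assumed distance in pixels from the top of the page


def candidate_clicked_links(links: List[PageLink],
                            next_url: str,
                            max_scroll_y: Optional[int] = None) -> List[PageLink]:
    """Return the links on a page that could explain a visit to `next_url`.

    First keep links whose target matches the next URL visited. If more
    than one link matches and a scroll depth is known, keep only those
    the visitor could have reached by scrolling.
    """
    candidates = [link for link in links if link.href == next_url]
    if len(candidates) > 1 and max_scroll_y is not None:
        reachable = [l for l in candidates if l.y_position <= max_scroll_y]
        if reachable:
            candidates = reachable
    return candidates


# Two links on page A point to page B from different places.
page_a_links = [
    PageLink("https://example.com/b", "top link", 100),
    PageLink("https://example.com/b", "footer link", 2000),
    PageLink("https://example.com/c", "other page", 400),
]

with_scroll = candidate_clicked_links(page_a_links, "https://example.com/b", 800)
without_scroll = candidate_clicked_links(page_a_links, "https://example.com/b")
```

    Without scroll information both links to page B remain candidates, so the click could not be attributed to one of them with certainty. With a scroll depth that never reached the footer, only the top link is left.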

  70. This is a debate that I have to endure every Monday. How Google indexes a link and what value a certain backlink carries depend on a large mix of factors. Reputable links with PR 7+ will go a long way in promoting your PR rankings.

  71. Hi Ron,

    I’ve seen links from pages with fairly high PageRank have substantial impacts on the rankings of pages that they link to. It is helpful, though, to look at something like this patent discussing a “reasonable” surfer model to get a sense of how much impact a link can have based upon the link itself, the page that it appears upon, and the page that it is pointing towards. Finding something like the patent also gives you something at your fingertips that you can point people towards when you’re asked about the values of links.

  72. I think it stands to reason that the new(er) paradigm should be renamed to “Reasonable Webmaster”. All of the link parameters that are becoming important are directly affected by the webmaster of the linking site. There is nothing “reasonable” about surfer’s response to a larger font or a contrasting color of the link. Likewise, a surfer clicking on the link that’s above the fold (ATF) vs. the one below the fold gives no useful indication that the ATF link is better (however you define “better”)- it’s just simply visible vs. not visible.

    It seems rather strange and somehow backwards that Google would give the webmaster more control over the link’s weight – hence ability to directly affect Google rank of the linked page. You’d think they’d be looking for more independent ways to verify this important parameter.

    Could this not be a “smoke screen” type of patent?

  73. Pingback: Google Reasonable Surfer Patent: Interlinking, Link Building and SEO Strategy Implications | Ask Enquiro
  74. Pingback: Two Great Posts On Google’s Reasonable Surfer | Law Firm SEO
  75. Hi Scriptster,

    I’m still leaning towards this being a reasonable surfer approach, rather than a reasonable webmaster method. The patent focuses upon how a surfer might react to different links on a page, in a way that attempts to understand how they might interact with each, rather than giving us a clear idea on how to best set up links on our pages.

    There are a few articles that have now been written on this patent that make some broad generalizations such as the “top link” carries more weight than links lower down on a page, and that just simplifies things too much. The patent doesn’t pick out just one thing, such as the color or size of a link, or its placement on a page, but rather looks at a score associated with all of the features that the search engine might be using, in addition to user-behavior data to attempt to determine which link should be given how much weight.

  76. Hi Bill,

    Well, that’s just the thing: I understand that font size and color are not the only things that matter. I presume they know enough (too much IMHO) about the visitor’s behavior from tool bar and Analytics, AdSense etc. and can see that if a lot of clicks come from a certain XY(top left) – XY(bottom right) rectangle on the page to count it as navigation and discount it or something like that.

    I’m just saying that the link is where it is on the page and looks like it does because the webmaster put it there. We also have to assume that Google already has to rely on the webmaster’s vote for the linked page for the simple reason that he/she decided to put the link there in the first place. But location/appearance of the link is such a weak and potentially misleading signal (not to mention, prone to abuse) that I find it strange Google actually needed it given that they can measure other factors – like the bounce rate for example.

    As far as being misleading signal: say, I list a number of important links in the alphabetic order (let’s say it’s in English for simplicity to avoid opening another can of worms). If the list is long enough, the last (but not least!) links are guaranteed to be pushed below the fold. They are guaranteed to be clicked less. Does it make the destination page any less important?

    You can say that listing those links in such order is the webmaster’s mistake. But how many of us attended web design schools and know/follow the best practices? There are guaranteed to be simple human errors (if we can call them that) with link positioning / appearance.

    Anyways, thanks again for the write-up. It’s all the stuff that webmasters need to keep in the back of their minds when designing pages. I sometimes wonder how many of us can actually still afford designing for visitors rather than Google … (but I’ll leave that rant for another time)

    Cheers!

  77. Hi Scriptster,

    Thanks, you raise some great points.

    One of those that I’ve been thinking about a lot is the example that you provide with lists – I was wondering whether Google might weigh each link within a list of links differently or the same, even though the order of those links might be random or alphabetical or chronological (with newer articles listed first or last) or in other ways. Thinking about that, and following the logic describing how Google might treat list items in this recent post – Google Defines Semantic Closeness as a Ranking Signal, ideally Google would treat each link as if it were the same distance away from the title of the list.

    I was actually thinking about a followup post, possibly with the title “I am an Unreasonable Surfer,” but I think I’d rather address some potential issues involving the reasonable surfer model here, especially since you’ve brought the topic up so well.

    There are a few sites that I like to visit because they have blogrolls or sidebar links that go to some incredible sites. I often visit those pages regardless of whether they have new content or not so that I can visit the pages listed in those sidebars. Google also provides a sidebar widget where people can list blogs and have links to the latest posts from those blogs appear within the widget – it’s evident that they know that a curated list like that can have a fair amount of value as a resource that people will use. The types of sidebar links that I describe possibly should have less weight than a link in the main content area of a page, but how much less when they are important features on those pages?

    The location and appearance of those links is only one of the signals that the search engines are looking at, though. I’m not a big believer in the search engines using “bounce rate” as a ranking signal because it’s one of the weakest signals that a search engine could use. But there are other user-behavior signals that are stronger. Bounce rate can be a stronger signal for pages that attempt to have visitors visit another page on a site such as a checkout page, but much weaker when the page itself answers an information need or fulfills a transactional need on its own.

    Thanks again, for asking some interesting questions.

  78. Extremely insightful post and one to which I continue to refer for refreshed nuggets of perspective on the valuation of links. I think the factors presented here can be ranked, analyzed, and speculated on ad infinitum, but from a practical standpoint what this boils down to for me is the need for broad domain variance in a linking profile.

    When one is out and about slogging through the tedium of manual link building, the insights presented in this article are highly valuable in helping one gain the most value from any particular vote a webmaster is willing to cast. But at the end of the day, one isn’t going to be able to convince every webmaster out there to do exactly what would be ideal–from the standpoint of this very detailed analysis of what Google considers to be the most valuable link qualities for the most reasonable of surfers–when placing a link.

    Since people are people–even if we can dissect Google’s patent, we still have to interact with each other–I think the best means to achieve the ideal factors here is to simply build more quality links. When you do the work to identify and build more quality links, more of these factors are present in a link profile. It boils down to hard, smart work, IMO. When that gets done, many of these criteria are met in a very natural way.

    At any rate, excellent piece. I’ll be back to reference this again soon, I’m sure.

  79. Hi Ramsay,

    Thank you.

    This patent does give us some hints at what Google might be looking for when deciding how much weight to give to links, and I agree with you about the best approach being to build quality links which will likely fulfill some of the approaches described within the patent.

  80. Hi Jason,

    We do know that Google is constantly looking to improve the quality of their search results, and they’ve been mentioning in a number of places that they average at least one change to their core ranking algorithms every day.

    Some of those changes may be identified in places like their patents or papers or blog posts, but chances are that most of the changes happen without being written about.

    At its core though, the ultimate goal of most of those changes is to improve the experiences of searchers, and to make it easier to help searchers find what they are looking for. Many older SEO techniques can continue to be effective because they also pursue those goals – making a site more relevant for the people interested in what it has to offer and most likely to be searching for it.

  81. I think the biggest issue is that once you start feeling like you understand how Google is ranking sites, they change things, and this is why I still use a mix of old and new SEO techniques, because you just never know. And like always, what was once old becomes new again.

  82. Pingback: SEO Quiz
  83. Very interesting, thank you! In general most link aesthetics (e.g. font size, colour etc.) are defaulted by CSS instructions i.e. by default all links may be underlined and in blue. This patent shows the need to override the default link CSS properties for different links to ‘mark’ the link as more important to Google. For example, if an important link appears lower in the text (assuming that Google sees lower links as less important by default), it may be beneficial to ‘bold’ the link to stress its importance. Just a thought …

  84. Hi Gary,

    Good point on the use of CSS. There are still a good number of sites online that aren’t using CSS for the presentation of links, and a large percentage of sites use variations of CSS styling based upon where links appear on pages, such as main content, sidebars, blockquotes, headings and footers, etc.

    It’s also possible that if a link appears in a list of links, all of those links might be seen as equally relevant for the heading of that list semantically as well – see: Google Defines Semantic Closeness as a Ranking Signal

    But yes, using bold on a link or italics, in an area of a page where the other links don’t might cue Google that the link might be more important than others.

  85. Thanks for the post, it’s very informative. Following up on your answer to Gary, isn’t it better to use the ‘strong’ tag instead of ‘bold’?
    Another thing, about anchor text length: I read in some SEO article (I think on the Hobo blog) that the max length of anchor text should be ~55 chars, and anything beyond that is ignored by Google.

    Thanks again.

  86. Hi Gilad,

    You’re welcome.

    Semantically, under the W3C guidelines, the use of strong and em elements is more meaningful than the use of bold and italic elements. The patent itself doesn’t give us a list of approved HTML elements to use to mark up links, and even uses the word “italics” to describe a possible feature of a link that might influence how much weight that link might carry:

    font color and/or attributes of the link (e.g., italics, gray, same color as background, etc.);

    If the anchor text of a link is bolded, whether through the use of a strong element or a bold element, does that make a difference to the search engines? I’m not sure that it does.

    As for the length of anchor text, I’m not sure that I’ve ever seen anything directly from Google or one of the other search engines about a maximum length for anchor text, and I’m willing to bet that I’ve linked to articles in the past using their titles which have been longer than 55 characters and not been too concerned about it. It’s probably reasonable not to use too many words or characters in a link, and it’s even possible that if you have too much anchor text that a search engine might ignore some of it. Just wondering how someone came up with a limit of 55 characters.

  87. Hi Gilad,

    Opinions are great, but ultimately, when it comes down to it, the search engine’s opinion is the one that matters when it comes to how they might use different factors in their algorithm.

    The hobo-web experiment is interesting, but I would recommend that you conduct your own experiments. I’m concerned about nonsense words used within anchor text in links in the experiment – there may be parts of Google’s approach to meaningless or unrelated words or phrases in anchor text that would cause the search engine to ignore those types of terms. For example, if Google is using a phrase-based indexing approach, it might ignore some or all of those terms.

  88. I have always tried to place my links on a high PR page, but once again, Bill, you have taught me to think in new ways! Thanks a lot, mate!

  89. Bill this is an excellent breakdown on the differing factors which can affect the value and quality of a link. You have made me rethink things like placement and appearance as important factors.

  90. Thank you, Bruce.

    I was hoping that this post might get people to think a little differently about links, what they look like, and where they appear on a page.

  91. Hi Chris,

    Thanks.

    One place where you can see that Google is collecting user data that goes beyond click information on search results pages is Google’s Web History, which shows places that you’ve visited and browsed. Search engines also look at their log files, and information about the queries that people perform, and may collect session information from those query logs – looking at how people might refine their searches when they seem to be looking for information about a specific topic.

    Those log files can contain a lot of other information, such as the language a searcher is using, their location, their screen size and resolution, the browser type they are using, and more. If you use Google Analytics or Google Webmaster Tools, take a look at some of the information displayed in those services, and imagine that the information Google is showing us there only scratches the surface of what kind of information they might be collecting about people who use Google.

  92. Hunh, I thought their “user data” only referred to how they observed clicks going **on the search results page**. Glad you’re posting the academic stuff, keeping it technical.

  93. First class post – many thanks. It’s kinda terrifying how complex SEO is becoming but this helped make things clearer to me.

  94. Hi Dave,

    Thank you. It’s good to hear that you’ve gotten some value out of this post.

    It makes sense that Google would consider more than just whether or not a link appears on a page to determine how much weight to give that link. I actually like the added complexity when it comes to something like the reasonable surfer model – it’s more realistic.

  95. Hi Rob,

    Thanks. The growing complexity of search means that we may have to keep vigilant for new approaches and ideas, but I think it’s something that we can handle.

  96. Hey Bill! Thanks for this insane post. It taught me (another) thing or two about building links to my websites.

    But I just can’t resist myself from asking you: how important is the relevancy of the link that you get? Really.

    I’ve been outranked several times by competitors using automatic software placing links to their websites from low-quality blogs and such. Some even go so far that they actually spam their links on to forum profiles… and from what I’ve learned in SEO – that’s not good.

    So, to my question — how important is the relevancy of the links you’re getting? And I’m not talking about the anchor text. I’m talking about the page relevancy.

  97. Hi Nabil,

    Thanks for your kind words.

    The impact of links on ranking looks at both the quantity and quality of those links, as well as the anchor text used in them. So a prominent link from a page that has a very high PageRank could have more impact than many thousands of less prominent links from low quality blogs, forum posts, and many other sites. It’s also possible that in many instances, a good number of links from places like forum profiles may have little to no impact at all.

    One of the original papers on PageRank (The PageRank Citation Ranking: Bringing Order to the Web) didn’t distinguish between whether or not the pages linking to each other were on the same topic, or related in some manner, and that really wasn’t something that would impact how much PageRank might be passed along by a link. But, there is a section of that paper which discusses “personalized” PageRank where a relationship between pages might exist that could influence the amount of PageRank that is passed along for searchers.

    The HITS algorithm, which was developed around the same time as PageRank (and only a few miles away from the Stanford Campus), involves an approach where the topics of pages are much more important.

    Under Google’s phrase-based indexing approach, whether the anchor text in links pointing to a page is related to the topic of that page may also influence how much PageRank and hypertext relevance is passed along.

    This “reasonable surfer” patent provides a number of other factors (or features) that Google might look at when determining the weight of links.
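To make the contrast between the random surfer and a reasonable surfer a little more concrete, here is a minimal Python sketch of a PageRank-style calculation in which each link carries its own weight. The function name, the damping value, the three-page graph, and the link weights are all assumptions made up for illustration; neither the PageRank paper nor the patent spells out this exact calculation.

```python
def weighted_pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank where each link has its own weight.

    `links` maps a page to a dict of {target_page: link_weight}.
    With equal weights this reduces to the classic "random surfer";
    unequal weights stand in for a "reasonable surfer" who follows
    some links more often than others. All weights are hypothetical.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline from the "bored surfer" jump.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, {})
            total = sum(targets.values())
            if total == 0:
                # A page with no usable links spreads its rank evenly.
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
                continue
            for target, weight in targets.items():
                # Each link passes rank in proportion to its weight.
                new_rank[target] += damping * rank[page] * weight / total
        rank = new_rank
    return rank


# Hypothetical three-page web: page A links prominently to B
# and much less prominently to C.
links = {
    "A": {"B": 5.0, "C": 1.0},
    "B": {"A": 1.0},
    "C": {"A": 1.0},
}
ranks = weighted_pagerank(links)
```

With equal weights on A's two links, B and C would end up with the same score; with the made-up weights above, B ends up ahead of C even though both receive exactly one link from A.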

  98. Truly appreciate the insight, Bill.
    Being completely new to SEO & SMM, this post is positively invaluable.
    I am slightly perplexed by what I see to be examples of what may (or may not) affect the value of my links. Pardon my frankness, but do I really need to be worried about the font size of the keyword anchor text in my link? I exclusively post thoughtful, relevant, and well composed blog comments, but now I’m slightly concerned about several possible factors that may (or may not).
    I realize that (based on Google’s infamous smoke and mirrors) you certainly can’t provide empirical data, but I’m very curious as to what you think may (or may not) be important.
    Thanks again.

  99. Hi Jordan,

    You’re welcome.

    The patent provides a mix of features related to links that are all taken into consideration together, rather than any one by itself. Font size is one of those features, and there may be a few different ways that font size could make it more or less likely that someone might click upon a link. A link in a slightly larger font size than the text that surrounds it might be said to be slightly more likely to be clicked upon, for instance. A link in a font size so small that you can barely see it might be said to be very unlikely to be clicked upon.

    But overall, I think it’s important to think about all of these features together, and whether or not they make it more likely or less likely that someone will click upon a link found on a page.

  100. Thanks for this Bill, brilliant dismantling of what is otherwise an intimidating body of considerations to take on board. I agree with you when you sum up by saying that mimicking reasonable surfer (that is human) behavior is the acid test – something spammers are all too likely to overlook….

  101. Hi Matt,

    Thank you. It’s possible that people who intend to spam webpages will take an approach like this into consideration. I suspect that Google is engaging in something called raising “the cost of attack” on their algorithm, by making it more difficult for spammers to gain any value out of placing links on pages that someone might reasonably not click upon.

  102. Bill on the issue of spammers – do you think the 100 links/100 external links figure as the acid test for a link worthy/link attainment worthy page for domains with say 100 external links on a given page from a mediocre to low value domain?

  103. Hi Matthew,

    You may have lost me a little with your question. I think you’re asking that when you are looking for a page where you might want to ask for or acquire a link, should you be concerned about the number of links that appear upon that page, and if there are more than 100.

    Google’s webmaster guidelines used to warn webmasters against including more than 100 links on the page (not sure if they distinguished between internal or external links) and now tell us “Keep the links on a given page to a reasonable number.” Matt Cutts has also addressed the issue, and his answer was that the 100 figure was more of a rule of thumb than an absolute, and didn’t really have much to do with rankings.

    The reasonable surfer model doesn’t explicitly address the number of links on a page, but it does give us a list of features that, considered together, tell us that a link on a page that someone would reasonably click upon is one that is likely to carry more weight. So, if there are two links in a main content area of a page that match many of these features, and 100 or more in a sidebar that don’t, even though the number of links is over 100, the two main content links may pass along considerably more weight than the others.
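As a rough illustration of that last point, the arithmetic might look something like the Python sketch below. The per-link weights (10.0 for a main content link, 0.1 for a sidebar link) are invented for the example and are not taken from the patent; the only idea borrowed from it is that links on the same page can receive different shares.

```python
def link_shares(weights):
    """Turn raw per-link weights into each link's share of the page's total."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical page: two links in the main content area that match many
# "likely to be clicked" features, and 100 sidebar links that don't.
weights = {"main_1": 10.0, "main_2": 10.0}
weights.update({f"sidebar_{i}": 0.1 for i in range(100)})

shares = link_shares(weights)
main_share = shares["main_1"] + shares["main_2"]

# If every link counted equally, the two main links would split 2/102.
uniform_share = 2 / len(weights)
```

Under these made-up weights the two main content links together account for about two thirds of what the page passes along, compared with roughly 2% if all 102 links were treated the same.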

  104. Wow, really great investigation. I had no idea that all those things impacted the link value. Thanks again.

  105. “You may have lost me a little with your question. I think you’re asking that when you are looking for a page where you might want to ask for or acquire a link, should you be concerned about the number of links that appear upon that page, and if there are more than 100.”

    Yes, that is the first part of what I was asking more or less. But what I also wanted to know or have your thoughts on rather was is there a big/definitive line between authoritative sites in a certain field. So: is a link for a safari lodge site on a National Geographic blog with 250 other links preferable to a link for the same site on zyxblog.com with say 30 links? At what point does the premium on the name brand wear thin?

  106. Hi AJ,

    Thanks. Keep in mind that while the patent I’m writing about may be what Google is using, chances are that if it is, they’ve probably made any number of changes. It may be a good starting point in thinking about how they may value links differently, and just knowing that they’ve developed an approach to do that is important on its own.

  107. Hi Matthew,

    The difficulty with answering your question is defining the concept of an “authoritative” page. Just what is an authoritative page? Is it an “expert” page, as in Jon Kleinberg’s HITS approach with Hub pages and Authority pages? Is it a page that might be considered the “perfect” page in response to a navigational query? Is it the “authoritative document” associated with a business at a specific location in Local search? Is it a page associated with a specific Named Entity by Google, so that when a query includes that named entity, the search engine may return that page as the top result? Those are some of the “authority” pages I’ve seen in academic and search industry patents and papers.

    And if you start reading SEO blog posts and articles and forum posts, you’ll see many other definitions of what an “authority” page might be, and many of those definitions are folklore or myth or somewhat educated guesses, but guesses nonetheless.

    If we reduce the question down to if a link might carry more weight if it’s on a well known site than if it’s on a little known site, I think we need to define that more. PageRank is an attempt to identify “quality” pages, so that a page with a higher pagerank is considered to be higher quality than one with a lower pagerank. But there are many other ways to identify the quality of a page.

    Is there a “premium” on a name brand?

    When it comes to determining how much weight a link might pass along, I’m not sure that “authority” matters.

  108. So basically, the bottom line is…if you want high quality contextual links that don’t trigger link exchange alarms, submit high quality guest blog posts. At least that’s what I get from it.

    Thanks for the research, Bill…:)

    Mark

  109. Hi Mark,

    Guest posts may be one way to get some high quality contextual links, but they aren’t the only way. Do something newsworthy, write something that gets other people to write about what you’ve written, offer a product or service that people go crazy over.

    You’re welcome.

  110. See this is why I like blog commenting. I end up finding good content from new places. You posted these in May 2010 and I didn’t hear info like that from others until later. Maybe they’re reading your stuff. I think the link above the fold definitely is a factor but you don’t hear much about that.

  111. Hi Eric,

    I’ve seen a few articles/blog posts recently that reference this post. It’s nice to see people returning to this topic, and discussing the Reasonable Surfer patent. I’m hoping that they take away the general idea that search engines are capable of giving links different weights based upon a number of factors rather than trying to use the actual listed features as an exact roadmap. Chances are good that whatever Google is doing now has evolved beyond what I’ve described in this post.

  112. I can’t get over how you respond to most all of the comments! That’s amazing…
    I can barely find enough time in the day to post on my blog let alone reply to everyone =)

  113. Hi RoxyB,

    I like responding to comments, and they often help me think about what I’ve written in different ways, and come up with new questions and ideas as I’m responding to them. I may not write a new blog post everyday, but I do try to respond to comments within a few days.

  114. This is interesting. I noticed a change in ranking for some of my websites just a couple of weeks after this patent was granted. Looking at some highly ranked competitor websites, I noticed they had many links from crazy, non-sense pages with incredibly high PR (like 5-6). So, those were probably considered “authority” sites, hence the good position in SERPs for low quality sites, made just for advertising.

  115. Hi Violeta,

    It can be really hard to attribute changes in rankings to any one thing. Chances are that if Google implemented some or all of the things described in this patent, that they did it a few years ago, rather than around the time that the patent was granted. We had already heard from people like Google’s Matt Cutts that Google wasn’t giving every link they found on a page the same amount of weight.

    The concept of an “authority” site is also frequently thrown around by people, without any real definition of what “authority” actually is.

  116. Hi Bill, thank you for this awesome post. I actually had no idea about the random surfer clicking and that was how Google intended PageRank to be. I learned quite a few new things about building links to my site.

    Thanks again

    /Alexander

  117. Bill, I don’t want to sound silly, but how and where did you find the patent info? This is a brilliant write up and really gives you the inside picture on how Google operates. I know there’s nothing set in stone when it comes to them but it’s still great to know where their head’s at.

  118. Hi Renda,

    I search for information about patents a few times a week at the US Patent and Trademark Office website, and sometimes at the WIPO website as well.

  119. Bill,
    The more research I do on links and link-building strategies the more I realize how little I know. I feel like I’m back in an Engineering class again after reading your post. One thing is clear, that all links are not equal. More important the location, font highlighting and context are all important. It was all so easy when I didn’t know this stuff.

    Thanks for the enlightenment.

  120. Hi Derek,

    As long as you keep on learning something new everyday, you’re probably on the right path. There does seem like there is always something else to learn. You’re welcome.

  121. Excellent info, Bill. I know that things have changed slightly since you posted this, but I am currently learning SEO and I find older information very useful in understanding SEO.

  122. Hi Craig,

    Not sure how much has changed since last May. Some of the things that we see in patents from the search engines, such as interface changes are easy to spot and track for changes. Other things, like how much weight a link might pass along, and what features might influence that are a little more difficult to track and understand.

  123. Hi Violeta, I also noticed the blog I had suddenly jumped within 1 month from n/a to 1 but I am unsure how it happened or how to increase it further. Bill, does this mean that a website about SEO linking to another website about SEO will have more credibility?

  124. Hi Scott,

    Some more PageRank in a Google Toolbar means mostly that you may have a few more links to your site than you did in the past. It’s not a measure of credibility, but rather popularity.

    The original PageRank algorithm really didn’t care much about whether the topic of a site linking to you matched the topic of your site. Another link-based algorithm that was developed around the same time, known as HITS, did, and it was used in Teoma and Ask for a number of years. Chances are, if the genre or topic of sites linking to each other is a factor now, it’s probably only a small part of the overall picture.

  125. Bill, all I can say is wow. I’ve never been to this level of understanding of search engines. The depth of your knowledge on this subject is greater than I’ve ever seen around the web. I consider you a great resource for SEO. I learned something today! Thank you. Awesome.

    Kev

  126. Bill, thanks so much for this informative post! Love your stuff. Now let me see if I can regurgitate this into a simple application.

    Bottom line, I think it’s important to never forget why Google has implemented all these markers in the first place. They simply are trying to evaluate the value of an inbound link. How they do that with the information you provide here can actually help point us in the right direction in our attempt to acquire those links.

    Bottom line, though, we should not be caught up in the technical details, but rather on how to achieve these results –
    – How do you get a link from a related site?
    – How do you attract links from sites considered to be valuable in your industry?
    – How do you get links in the body of the text?
    – How do you get a link on a topic-related page?
    – etc…

    It occurs to me that many are caught up in engineering it all when really they should just be spending their time writing great copy and then putting some effort into building relationships with the right people so their copy is discovered and they get a following because of that great copy — probably like you have done here.

    That, of course, takes talent and expertise which not all have, and thus we are brought back again to why Google is doing this in the first place. I’ll say it again. Google is simply trying to return the most valuable, insightful, popular and relevant results for any search query. And thus they did for the keywords that brought me to your blog. Your post here is a perfect example in how to acquire links, and my PR 5 website is getting a strong urge to link to you. :)

  127. Hi Kathy,

    Thank you. I think you’ve captured the essence of my post really well. The features included in the patent are examples from Google, but they may not be all of the signals that they are presently looking at, or they may have decided that some are better than others. There’s also a user-behavior element to the process which may mean that some links on one page might be treated differently than other very similar links on other pages.

  128. Hi Bill, I was just wondering if this has changed now due to the Panda updates and if so in what way has it changed?

  129. Hi Natalie,

    Chances are that Google is still giving different links different amounts of weight based upon looking at a range of features associated with those links.

    The Panda update seems to be one where Google takes their rankings of pages (based upon relevance signals and importance signals like PageRank) and then adjusts those rankings some more based upon quality signals to move some up and some down.

  130. Wow Bill, this is an excellent in depth analysis and I appreciate you going through it. One question I have based upon some of the comments is, how would google determine if a link is more or less likely to be clicked upon? Is it purely the number of links on a page? I could see a nicely categorized page of resources on a page having a higher probability of gaining a visitors click than a couple links in a sidebar for example. Because websurfers are becoming more blind to links in sidebars etc. So I could see a link among 15 well categorized links getting more clicks than two in the sidebar.

  131. Hi Bruce,

    Thank you.

    How would Google decide which links were more or less likely to be clicked upon?

    It’s not just one factor in isolation, but a combination of different features.

    For example, a link that is in the footer of a page might be a little less likely to be clicked upon than one in the main content area of that page. If in addition, the font size of that link was smaller than other fonts on the page, if it is the same color as the other text upon the page so that it doesn’t stand out too much as a link, if it’s on a subject that’s very different from the content of the page so that people who might have come to the page in search of information about hamsters might be less likely to click on a link about sub prime rate mortgages (for instance), the combination of those features might make it carry a lot less weight than links with more positive features.
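One way to picture how several features might be combined is the small Python sketch below. Everything in it is an assumption for illustration: the feature names, the position factors, and the multipliers are invented, and the patent describes building a model from feature data and user behavior rather than a fixed formula like this one.

```python
from dataclasses import dataclass


@dataclass
class LinkFeatures:
    position: str      # "main", "sidebar", or "footer"
    font_ratio: float  # link font size divided by surrounding text size
    stands_out: bool   # True if the link color differs from nearby text
    on_topic: bool     # True if the link's subject matches the page


# Hypothetical multipliers; a real system would learn these from data.
POSITION_FACTOR = {"main": 1.0, "sidebar": 0.5, "footer": 0.2}


def click_score(features):
    """Combine several link features into one rough likelihood score."""
    score = POSITION_FACTOR.get(features.position, 0.5)
    # Clamp the font ratio so a huge font can't dominate the score.
    score *= min(max(features.font_ratio, 0.1), 1.5)
    score *= 1.0 if features.stands_out else 0.5
    score *= 1.0 if features.on_topic else 0.3
    return score


# A link about hamster care in the main content of a page about hamsters.
hamster_link = LinkFeatures("main", 1.0, True, True)

# A small, same-colored footer link about sub prime mortgages on that page.
mortgage_link = LinkFeatures("footer", 0.7, False, False)

hamster_score = click_score(hamster_link)
mortgage_score = click_score(mortgage_link)
```

No single feature decides the outcome here. The footer link ends up with a much lower score because several small penalties multiply together, which is the general idea behind looking at the features in combination.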

  132. Great info, Bill. All these Google updates are running in my head and I sometimes get lost. Thanks for this overview and for confirming some information that I missed. Really good read!

  133. Hi Mike,

    You’re welcome. This patent filing confirmed a lot of things that people at Google, and even Yahoo, were saying about how different links at different locations on a page might carry more or less link equity or PageRank. Not completely sure that everything in the patent has been incorporated into what Google is doing, but at least the patent presents a thoughtfully designed framework for how they could get that to work.

  134. Pingback: Weekend SEO Roundup 13/14th August | All Things Web Blog
  135. Thanks for the info. It’s somewhat technical in nature but it’s always good to know the theory behind practical applications. After three years of learning about SEO there’s still more and more to keep learning about. I’ll check back here again for sure.

  136. Hi Tom,

    There always seems to be something new to learn when it comes to SEO, but I think that’s one of the things I enjoy the most about it.

  137. This helps in adjusting one’s strategies. For me, the big take away is that as I add links from outside sources that I will get the best bang for my buck when the link is closest to the top of the page.

  138. I have always wondered whether Google valued outbound links depending on where they linked to. For example, we know that the better the inbound link, the better the value of the page, but what if I were to link to something which was extremely relevant to the people reading it?

  139. Hi Craig,

    It’s possible that if you linked to a page that was relevant to the anchor text that you use, and if the anchor text was relevant to your page, then those features might help the link itself carry more weight. But the relevance of the page you’re linking from to the page you are linking to doesn’t seem to be a direct feature according to this patent.

  140. Natalie, regarding Panda: too many high quality links may also damage your relevance since Panda 2.0. If you have an unnatural count of anchor links compared to URL links then you will have noticed a huge drop of backlinks (counted) in Yahoo Site Explorer this year.
    Google will compare your website’s backlinks with a website that can be trusted to be 100% natural, e.g. your local kindergarten website’s backlink “quality” compared to yours – and if your percentage of anchor links is 80% or so you will get a Panda Slap!

  141. Hi Ron,

    It can be hard to distinguish between the different algorithmic changes that Google makes, since they make so many of them.

    I’m not quite sure that I understand your statement concerning high quality links, and anchor links vs URL links, and how they might hurt someone.

  142. Bill, sorry for the late reply – I missed the email notification for some reason..
    However, in my 12 years of SEP experience I know that Google will use (per country) the usual footprint of “natural looking” backlinks – e.g. the local school or post office website will NOT have 100% of all backlinks being “Sydney school” or “school in Sydney”. Most of the time these “naturally ranking” established websites will have a higher count of low quality links (URLs or just ‘click here’ links), therefore Google can use this to measure who is “grey hatting” and who is not. This is all based on theory and experience (and some lucky tweets that came along the way lol), so don’t take my word for it. My successes and experience have proven it right for me =)

    One case study was so crazy, let me tell you what happened: this client had almost 100% of all his backlinks as “high grade” keywords (highly competitive anchor text) and all I did was build low grade links in posts to his site, and his backlink count in Yahoo increased twice as fast as others running in the same SEO channel – wtf!! It was almost like Yahoo was counting every link I was making as two! So this again has proven to me that the BIG formula must contain at least a bit of this theory. :)

  143. Bill I am always amazed at the depth at which you cover SEO. I have to say I learn something new almost every time I read one of your articles.

  144. Great resource! This is one of the best analyses I’ve seen on the value of links. Thank you, Bill, for putting this together.

  145. Thank you, Marcel

    This was definitely one of the most interesting patents that I’ve seen from Google over the past few years, and it feels like it still has a lot of value today.

  146. Thanks, Bill, for this report. I didn’t know that user data actually was relevant to the strength of a link.

    One question, still in my mind:
    Does Google crawl all documents it can find? pdf, doc, docx, txt etc.?

  147. Who’d have thought that a post on a specific SEO subject would still be relevant a couple of years later! Gives me hope…

  148. Bill, great post. I just came across this from a tweet today. I know its a couple of years old, but still great information. You’ve really broken down the ‘google talk’ explanation of PR and made it digestible. We’ve always heard that different pages have different PR and therefore give more ‘juice’ but to actually see the drawn out patent of it is a great insight! Well done.

  149. Hi Eric,

    Thank you. I was pretty excited when I ran across this patent because it presented a system that Google might be following which described how they might be allocating different amounts of PageRank to different pages. They might not be following what they presented in the description to the patent 100%, but it definitely provided some ideas on how they might incorporate such a system into what they do.

  150. This post just got mentioned at Webmasterworld where it was cited as a ‘classic post on the characteristics of links’. I agree.

    ‘Which links on a page are most likely to be selected by a reasonable surfer – those are the links that probably carry the most weight.’ – This explains it well and reiterates to me the importance of links in the main body of a page.

  151. Hi Brian,

    Thanks for the heads up, and for your kind words. Under this approach, links in the main content area of a page might not always be the most important links on a page, but it seems like a factor that carries some significant weight.

  152. Dear Bill,

    Let me tell you, I use the internet to enhance my knowledge and at times I find a few good sources of knowledge.
    I must say, today I found one: http://www.seobythesea.com.

    These tips about SEO, SEM, online marketing and Google PageRank secrets are rational.

    I appreciate it Bill.

    Wish you all the luck for your Future Projects

    Sandra

  153. Loads of details, thanks. When you look at all the factors that influence the weighting of a link, it is amazing to consider how far search engine algorithms have progressed in a relatively short period of time. Imagine the situation in another 10 years? That’s one of the things that makes web design and SEO so fascinating.

  154. I’m fairly new when it comes to most SEO topics so this post has been a massive help! Thanks, Bill!

    Really interesting to see what kind of links perform better than others, and what I can be doing to help get my ranking up a bit.

    One thing I have come across a few times is internal linking, I will always link keywords from one page to the other but never know if it is best to link all the keywords displayed on one page or just one? I have seen arguments for and against.

    Also you have shown that when a keyword appears first in a title tag this will perform better but is it best to have just the keyword or a short description? Again I’ve heard many arguments for both sides!

    Would be nice to know some other peoples opinions.

    Thanks again!

  155. Hi Matt,

    You’re welcome. No, I haven’t shown that when a keyword appears first in a title tag that it will perform better. I don’t think I’ve seen a reasonable or believable argument from anyone concerning multiple links on the same page, and whether one should rank better over another. I’ve seen many unscientific studies, but I’m not sure that most people understand the variables in action and even simple things like how phrase-based-indexing might impact such a study.

  156. Hi Bill,

    Sorry to continue a rather outdated thread but do you believe that this still applies in 2012, post Panda and Penguin?

    Thanks,

    Edward

Comments are closed.