Search Engines Applying Different Anchor Text Relevance from the Same Site and Related Site Links

Anchor text in a link pointing to a page is often used by search engines to determine what a page being linked to is about, and to determine what words and phrases that page is relevant for.

But a number of issues arise when search engines use anchor text in that way. Here are a few of them:

  • If a page points two links to the same destination page using the same anchor text in both links (for example, in the navigation and the footer of the page), should the relevancy of that link text be weighted twice as much as if there were only a single link from the source page?
  • If there is a link on every page of a site to a single page of that site (a site wide link) using the same anchor text, should each of those links accumulate in weight to determine how relevant that page might be for the text used in those links?
  • If there are multiple links on a page to another page, or sitewide links to that other page, and the anchor text is different in each link, should the text in both links carry the same amount of weight in determining what the page being linked to is about?

  • If a site is substantially a mirror of another site, how much weight should anchor text from the first site to its mirror be given, and vice versa?
  • If a site is considered to be “related” to another site by common ownership or some other kind of cooperative relationship, should the anchor text in links from a site to a related site be given the same amount of relevance weight as anchor text in links to an unrelated site?
  • If a link appears to have been created to boost the rankings of a destination page in search results, how much weight should the anchor text of that link be given to the destination page?

There have been a large number of papers, patent filings, articles, and blog posts that describe PageRank and how it might be used in ranking pages at a search engine. There has been much less written on a closely related topic – how anchor text in links might influence how relevant a page might be considered for the words and phrases used in those links.

In the very early days of Google, the relevancy of anchor text was seen by its founders as a major part of how pages on the Web should be indexed. We are told the following about anchor text in the 1998 paper by Sergey Brin and Lawrence Page, The Anatomy of a Large-Scale Hypertextual Web Search Engine:

2.2 Anchor Text

The text of links is treated in a special way in our search engine. Most search engines associate the text of a link with the page that the link is on. In addition, we associate it with the page the link points to. This has several advantages. First, anchors often provide more accurate descriptions of web pages than the pages themselves. Second, anchors may exist for documents which cannot be indexed by a text-based search engine, such as images, programs, and databases. This makes it possible to return web pages which have not actually been crawled. Note that pages that have not been crawled can cause problems, since they are never checked for validity before being returned to the user. In this case, the search engine can even return a page that never actually existed, but had hyperlinks pointing to it. However, it is possible to sort the results, so that this particular problem rarely happens.

This idea of propagating anchor text to the page it refers to was implemented in the World Wide Web Worm [McBryan 94] especially because it helps search non-text information, and expands the search coverage with fewer downloaded documents. We use anchor propagation mostly because anchor text can help provide better quality results. Using anchor text efficiently is technically difficult because of the large amounts of data which must be processed. In our current crawl of 24 million pages, we had over 259 million anchors which we indexed.

But that statement doesn’t even hint at some of the issues that I raise above. Almost eleven years later, we’re starting to see some of those issues being considered in published research from the search engines, though if I had to guess, I would say that a number of those issues have been hashed through and possibly addressed at places like Google.

A paper published earlier this year and presented at the SIGIR’09 conference in July, with authors from Microsoft Research Asia and the University of Montreal, explores the relationships between links when determining how much weight the text from those links should be given in deciding what the page they point to is about.

The paper is Using Anchor Texts with Their Hyperlink Structure for Web Search, and the abstract from the paper provides a nice overview of its exploration:

As a good complement to page content, anchor texts have been extensively used, and proven to be useful, in commercial search engines. However, anchor texts have been assumed to be independent, whether they come from the same Web site or not.

Intuitively, an anchor text from unrelated Web sites should be considered as stronger evidence than that from the same site.

This paper proposes two new methods to take into account the possible relationships between anchor texts. We consider two relationships in this paper: links from the same site and links from related sites. The importance assigned to the anchor texts in these two situations is discounted. Experimental results show that these two new models outperform the baseline model which assumes independence between hyperlinks.

The paper presents a number of different models for how much weight to give anchor text from links located on the same page, on the same site, or on “related” sites, and describes a number of experiments in which those different weights are applied.

The authors looked at a dataset of 3,000 randomly sampled queries, and about 140 returned documents for each query which were graded by human editors as to their relevance (on a scale of 1 to 5 or bad to perfect). They then separated the queries into informational type queries, and navigational type queries to examine how well the results turn out when they apply different weighted relevance amounts from their different anchor text models.

Their research appears to indicate that if they count multiple links from the same domain as a single link, and give different weights to links from other sites based upon whether or not there is a relationship between those sites, the relevance of anchor text pointing to pages increases, especially for navigational queries.
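
To make that idea a little more concrete, here is a minimal Python sketch of site-aware anchor text weighting. It is not the model from the paper: the function name, the discount value, and the use of a URL's hostname as a stand-in for "site" are all assumptions made for illustration. It only shows repeated links from one site collapsing into a single vote, and links from the same or "related" sites counting for less than links from unrelated sites.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical discount for links from the target's own site or a "related" site.
RELATED_SITE_WEIGHT = 0.5


def anchor_text_weights(links, target_site, related_sites=frozenset()):
    """Score the anchor texts pointing at one target page.

    links: iterable of (source_url, anchor_text) pairs.
    Each source site contributes at most one vote per anchor text.
    """
    seen = set()                   # (site, anchor) pairs already counted
    weights = defaultdict(float)
    for source_url, anchor in links:
        site = urlparse(source_url).netloc.lower()
        anchor = anchor.strip().lower()
        if (site, anchor) in seen:
            continue               # repeated link from the same site adds nothing
        seen.add((site, anchor))
        if site == target_site or site in related_sites:
            weights[anchor] += RELATED_SITE_WEIGHT   # discounted vote
        else:
            weights[anchor] += 1.0                   # full vote from an unrelated site
    return dict(weights)


links = [
    ("http://example.com/a", "Blue Widgets"),
    ("http://example.com/b", "blue widgets"),         # same site, same text: ignored
    ("http://partner.example.net/", "blue widgets"),  # treated as a related site
    ("http://other.example.org/", "blue widgets"),
]
scores = anchor_text_weights(links, "target.example", {"partner.example.net"})
```

How a real search engine decides that two sites are "related" is the hard part, and this sketch simply takes that set as given.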

The paper doesn’t answer all of the questions that I asked at the start of this post, but it provides some hints at how a search engine might handle some of those situations either now, or in the future. It’s definitely worth spending some time with if you’re concerned about how a search engine might treat anchor text in links from the same site, or from related sites. It’s also worth keeping an eye open for further research on the topic from the Microsoft Research Asia team.

One question that I need to ask, and intend to explore in the very near future, is what might a search engine do when there just aren’t many links (with associated anchor text) pointing to a page? Can the relevance of hypertext still be used somehow to tell a search engine what a page is about in that situation?

32 thoughts on “Search Engines Applying Different Anchor Text Relevance from the Same Site and Related Site Links”

  1. Excellent breakdown. I believe that more than one link from a website is usually better than one; however, I would have to assume that if there are site wide links, they are going to devalue the majority of the links, because it wouldn’t make sense to give 100% credibility to each link, especially if they are duplicates.

  2. Another issue with link text is alt text. Particularly from SMs and forms; where alt text might be used with words like ‘screenshot’, ‘profile’ and so on. The decision to use alt text is a bit odd IMO as it’s not necessarily used to describe the target website, but the image.

    Out of interest, have you ever come across anything arguing for marginalising link text’s influence on search results? I believe that link equity is given more weight on the whole as sites with very poor links aren’t able to rank above sites that have a lot in most scenarios. However, once a page has gained decent equity, it is able to outrank pages with much higher equity from their link text. This results in some SERPs being populated with sites that can get enough equity from decent-good sites and then getting keyword-rich links from anywhere, which results in sites with high equity not being able to rank. Since it is much easier to get 1000s of low quality links than a handful of high quality links, I don’t know why a search engine would place so much weight on link text.

    I’d be interested to see the search results if link text was treated as an extension of the text on the page as opposed to the cumulative weight they currently provide. This would change the focus of gaining loads of links for the sake of the link text to making people have to work hard to provide content of merit to get links from real sites. Although the issue is that it would make it difficult to determine which site is right to bring up for navigational queries. I guess it is difficult to find a balance between link text and equity.

  3. What happens in case you have different alt text from title tag?

  4. “Intuitively, an anchor text from unrelated Web sites should be considered as stronger evidence than that from the same site.”

    It never fails to amaze me just how stupid, naive, and clueless people in the search engine industry and information retrieval science industry can be.

  5. If they were crawling 24 million pages in 1998 I wouldn’t be surprised if they are now crawling billions or even trillions! That is a lot of data to process!

  6. Hi David,

    Thank you David. You raise some thoughtful points.

    I don’t usually associate alt text and anchor text, but I think it’s right to use the alt text to focus upon the image, but only if the image is actually meaningful in some content adding manner. For images that are primarily decoration, or bullets, or spacers, I like to see an empty alt value (alt=""). Since alt text is intended to be an “alternative” rather than a description, for images such as logos (which are often links as well), I wouldn’t describe the image as a “logo of organization X,” but rather put the company name (and possibly a tagline) as if it were an actual alternative for the logo. I think it’s right for the alt to be used as an alternative for the image rather than as a description of the destination if the image is also a link.

    The paper that I’ve written about in this post does explore a few ways of limiting the impact of anchor text relevancy in a few areas, such as from the same page or site, or from “related” sites, though the authors admit that they have a lot more to look at in the conclusion, involving relationships between links that echo what happens in the world outside the test lab.

    For pages that are link poor (not many links pointing to them), I have seen at least one other paper that addresses the problems that those pages might have, and that should be the subject of a post from me in the very near future.

    “I’d be interested to see the search results if link text was treated as an extension of the text on the page as opposed to the cumulative weight they currently provide.”

    I wonder sometimes about the choices that people make when they come up with anchor text to accompany a link, regardless of whether it’s navigational links on their own site, or links embedded within the main content of a page (and I’m probably sometimes guilty of that myself). The use of text like “click here” or “read more” doesn’t help matters much either, for navigational queries or informational, or any at all.

  7. Hi Chris,

    “What happens in case you have different alt text from title tag?”

    There really isn’t any discussion of the value of alt text or title attributes in the paper. It’s quite possible for alt text to be included within an image that also acts as a link – a good example might be a link to the home page using the site’s logo. Chances are that there may also be a link “home” in the main navigation for that site, and in a footer link as well, so you could easily have three links on every page of a site pointing back to the home page. Should those count as three different links on every page, with anchor text, alt text, and possibly even title attributes influencing what the page being pointed to might be seen to be about?

    Title attributes can be used with just about every HTML element, including links. I wouldn’t suggest using them that aggressively, but it’s possible to use a title attribute in a link. And it’s possible that a search engine might consider the contents of that attribute as text associated with a link and meaningful to the page being pointed towards. If you have multiple links on the same page with title attributes for each, should the weight of those be discounted because there is more than one link? As I said, the paper doesn’t address that.

    It’s not a bad idea to use different text in a title attribute and an alt attribute when both are being used for the same image link, especially since someone using a screen reader or some other kind of assistive technology to visit a page is going to hear (or feel if using a braille reader) both of those, and using the same text might become too redundant very quickly.

  8. Hi Michael,

    “Intuitively, an anchor text from unrelated Web sites should be considered as stronger evidence than that from the same site.”

    I tripped over that statement too, and the first thought that came to mind was that search engineers would be well served by having significant previous experience as site owners.

    Who should know better what a page is about than the person who owns a site and gets to write the anchor text pointed to a page? And how often have I seen anchor text in a link used on one site pointed to a page on another site that wasn’t very meaningful at all, from text such as “read more,” or “click here,” or the name of the site or the author instead of something meaningful about the page being pointed towards. But it’s easy to spot anchor text in links within the same site that aren’t very informative too.

    I’m not sure which is a better source (or a worse source) of information about pages being pointed towards by links – unrelated sites or the same site, but I think every time someone writing a paper like this one finds themself starting a sentence with the word “Intuitively,” they may want to question their intuitions.

    Google’s Dan Russell gave a few presentations a couple of years back, where he brought up the use of intuition in a manner like this. Here’s roughly what he thought about trusting your intuition when doing research:

    1. Intuitions are terrible when trying to figure out what people are searching for.

    2. In particular, your intuitions are terrible.

    3. That’s why Google does studies.

    4. Fallacy 1: “I do it this way” so others do, too.

    5. Fallacy 2: “My Mom does it this way” so others do, too.

    6. Deep truth: You are statistically insignificant.

    7. Deeper truth: (As a computer scientist/student/audience member) you are a couple of sigma from the norm

    8. So are your friends

  9. Hi Dan,

    It is a lot of data :)

    Last summer, the Official Google Blog made an announcement that they had crawled 1 trillion URLs (see: We knew the web was big… ).

    From what I’ve read, the amount of information that Google has collected about the web is dwarfed by the information that they have collected on how people use the web, whether it’s the search query log files they keep, or the browsing information gathered from sources like their toolbar or personalized search, or many of the other services they offer such as bookmarks, alerts, etc. It’s really a lot of data.

  10. Hi Lee,

    Sometimes there are good reasons that don’t have as much to do with navigation or search engines as they do with providing a good user experience.

    Having the right link at the right place at the right time may mean that someone will visit a page that they otherwise might not go to. A site might have links to their main category pages in their navigation bar and in their footer, and then also within the main content area of their site. While the nav bar and footer links are purely navigation, the links within the main content area may be the links that are more likely to get someone to visit those pages, if they are presented persuasively within the context of that content.

  11. Hi Joel,

    Thank you. There is a lot to the paper, and my post is pretty much more of a summary and introduction to it rather than a detailed overview of all of the issues that they raise. :)

    I would agree with you that it’s reasonable that if there is more than one link to the same page on a site, whether from a few links on a page or site wide rather than a single link, some of the value of the weight of all of the links may be discounted. It’s something that people have been suggesting in many places on the Web, but I haven’t seen too much specifically from the search engines on the topic.

    I did write a post at the end of August that also discussed how blocks of links found on pages might be merged together (How a Search Engine Might Analyze the Linking Structure of a Web Site), which could have a similar effect in some instances, such as when a link appears in the main navigation for a site and also in a shortened list of links to main pages on a site. That merging possibly reduces both link equity (or PageRank) and hypertext relevance, when a link to the same page appears in both link blocks being merged.

  12. I have found that having more than one link on a page to the same page somewhere else does not give extra weight. I would most definitely advise against this. I use the example of: “my name is lee; my name is lee; my name is lee”. Repetition is a waste of time. There is no logical reason apart from assisting in navigation to have more than one link on a page to the same page.

  13. I agree. I was talking more about extra links to the same page from the same page (repetitive linking). Linking within the content of a page allows association of the text, so this would be the way I would generally do it.

  14. I have found that anchor text is more important than content, and now I have read it in the white paper “Using Anchor Texts with Their Hyperlink Structure for Web Search.”

    I put some links with anchor text on an empty Website “A” (no content at all, just links) pointing to another Website “B.” Then I did the same on a Website “C,” with content and links pointing to another Website “D.”

    I got the same results for Websites “B” and “D.”

  15. Hi Lee,

    I agree with you as well. Repetitive links from the same page that don’t aid in navigation or in providing a good user experience, but that might be there solely to attempt to boost rankings, are something that search engines should be looking at closely.

  16. Hi Rafael,

    Anchor text in a link pointing to a page can help a search engine determine how relevant that page might be for certain words or phrases, and that may carry a certain amount of weight in search rankings. As my quote above from the Google paper suggests, that can be helpful for links to things such as images or applications that don’t have text associated with them to index. It’s possible that a combination of seemingly related text within anchor text and content found upon a page being pointed towards can help even more.

  17. Bill, excellent post! As an experiment, there is anchor text in one of the links pointing to my site (theme: iPhone) from a relevant site (theme: iPhone technology blog) which went like this: “Read iPhone reviews, specifications, features and find the latest models, iphone accessories…” and something like this. It paid rich dividends in the SERP ranking of my site. I tried this at a time when most SEOs were adhering to the 3-4 phrase anchor text link structure. If so, does it carry the same weight? Here the point is to experiment and observe how things go. We as SEOs never have to stick to a particular point or a theory, but rather experiment and find out what went wrong and what went right at a particular point of time. But again your post is a primer to all those who want to have a go with Anchor Text Linking structure!

  18. Hi Shameer,

    Thank you. There are no hard-and-fast rules regarding anchor text, such as how many words should be included, and it doesn’t hurt to test and try new things, for both visitors and search engines (just another visitor in my book). It helps to pay attention to sources like the patents and whitepapers that I look at, and sometimes they provide some questions that can lead to experiments like the one that you’ve performed on lengthier anchor text usage. Another experiment to perform for longer anchor text like you suggested might be to see if using longer anchor text might mean more click throughs on a link from actual visitors.

  19. Hi Steven,

    An old cliche that I think about every so often is “when all you have is a hammer, most of your problems look like they can be solved with nails.”

    It doesn’t hurt to have more than just a hammer. SEO has always been complex, it’s just that a lot of people have tried to address it with only a hammer in their hands.

  20. This is a bit complex, yes.

    Bill, I still didn’t figure out the difference between links from related sites and from unrelated sites. I guess that other factors are involved here in the ‘weight’ of a link from a site (no matter related or unrelated), like the overall site authority etc.

  21. Hi Finder Mind,

    Not sure about the concept of “overall site authority.” There are a lot of people who throw around terms like that, and come up with elaborate conceptualizations about the “authority” that a site might have without considering much simpler and more likely reasons for the high rankings of pages of a site. Case in point, Wikipedia.

    Anchor text can often be a helpful and informative resource to describe what a page is about. Looking at anchor text in links from sites that don’t have a “relationship” with another site it links to might be more objective in nature than a link from a site that definitely has a relationship, such as links from sites that share a common owner. Is that assumption true? I’m not completely convinced. I’d rather see a statement made like that based upon actual research instead of an intuition, as I’ve stated in the comments above.

  22. I have tested links from these non-related sites and realised that they are good sources of quality back links, too, as long as the PR of those sites is high. The back links come from library-related sites that point to a business site. After the closing of the quarter, the PR changed from 0 to PR 4. Three months after the PR changed, the site still enjoys continuous growth in traffic with PR 4.

    I started to exchange links even with non-related sites. It proved its worth for us as I watched the traffic grow. Non-related back links from high PR pages are better than back links from related but low PR pages like those from link.asp or partner.asp.

    I have a theory that a home page to home page exchange link is also beneficial to each site.

  23. Hi Bob,

    By “related” sites, it seems you’re reading my post and the paper as if it means that the sites are related by topic. That’s not what I meant, or what the authors of the paper mean. When they use the word “related,” they mean that the sites might be owned by the same person, or there appears to be some kind of relationship between the owners of the sites based upon how they link to each other and interact with one another.

    I would caution that too much exchanging of links, or what the search engines would call excessive reciprocal links, can have a negative impact on rankings, including the possibility of penalties or banning of a site by the search engines.

  24. Is it true that if we have a page on a related site containing two contextual links with different anchor text, both linking to the same page, Google will only follow the first link to the destination (only giving one link credit)? Surely if this is the case it becomes pointless to place two links to the same destination on one page; it would only help navigation instead of ranking.

  25. Hi SEO Manchester,

    There have been at least a couple of well publicized experiments reported on SEO blogs on how Google might treat different anchor text from two different links located on the same page, and not a lot of agreement over the results. Matt Cutts responded to that question by saying that the answer was more complex than a simple yes or no, and that if the anchor text for both links was the same, Google might ignore the second link.

    Neither of those experiments considered that there might be a difference if the page with the links appeared on the same site, on a related site, or on an unrelated site.

    Chances are good that when Google crawled the page where the links appeared, it created a list of URLs appearing upon the page, and likely even included the anchor text associated with those URLs, and possibly even some text surrounding the link, including possibly alt text if one link included an image, and even text from title attributes.

    We’re pretty sure at this point that Google will look at links in different segments of a page differently, so if there’s a link to a page in the main content of the page, it might be treated differently than if it is in a sidebar. We also know that it’s quite possible that if a search engine sees a block of links on one part of a page, such as in the main navigation, and sees the same links or a subset of those links in a different part of a page, such as a list of links in a footer, that it might merge that block of links together even if the links contain different anchor text. But here, you asked about contextual links, so I’m going to assume that you mean individual links that appear by themselves within the body of text in the main content area of a page.

    Should a search engine look at the first instance of a link and ignore the second link to the same page, just because they are in that order? What if the anchor text in the first link is “read more,” or “click here” and the second link actually contained words that were relevant to the content of the page? Since the search engine is likely listing both appearances of the link, and associated text and anchor text, it might do something more than a simple “first link wins” analysis. For example, it might:

    1. Use something like what is described in Google’s Phrase Based Indexing (Anna Patterson version), to see how related the anchor text in the link is to the content of the page being pointed towards. If anchor text from one link appears to be related, Google might use that anchor text. If anchor text from both appears to be related, the search engine might count both.

    2. Regardless of whether the anchor text used in the link is related to the content on the page, the search engine might look at other places where anchor text is used to point to the page in question, to see what anchor text other sites are using. If the same or substantially similar (or possibly related, in a phrase-based way) anchor text is used in a certain percentage of those links or above a certain number of times or both, it’s possible that the search engine might consider that text associated with the page, even if that text might be “click here.” This threshold percentage and/or number or both might be the type of thing that keeps Google Bombs from being effective, while still letting some pages rank well for terms like “click here,” even though those words don’t appear on a page being pointed towards.
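
    The threshold idea in the second possibility can be sketched in a few lines of Python. This is only an illustration of the reasoning, not anything Google has described: the function name and both threshold values are hypothetical placeholders for "a certain percentage" and "a certain number of times."

```python
from collections import Counter

# Hypothetical thresholds; real values, if any exist, are not public.
MIN_SHARE = 0.2   # anchor text must appear in at least 20% of inbound links
MIN_COUNT = 3     # and in at least 3 inbound links


def associated_anchors(anchors, min_share=MIN_SHARE, min_count=MIN_COUNT):
    """Return the anchor texts used often enough to be associated with a page.

    anchors: list of anchor text strings from links pointing at one page.
    """
    normalized = [a.strip().lower() for a in anchors]
    total = len(normalized)
    counts = Counter(normalized)
    return {
        text
        for text, n in counts.items()
        if n >= min_count and n / total >= min_share
    }


inbound = (
    ["click here"] * 4
    + ["Click Here "]            # same text after normalization
    + ["miserable failure"]      # a one-off phrase stays below both thresholds
    + ["acme widgets"] * 4
)
kept = associated_anchors(inbound)
```

    With thresholds like these, a phrase used by only a handful of links would not stick to a page, while widely used text such as “click here” still could.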

    Other possibilities exist as well, but an “automatic first link wins” approach doesn’t make sense if a search engine’s purpose of looking at anchor text is to learn something about the page being linked to, regardless of how many links to it appear upon the same page – one or two.

    Beyond navigation, another purpose for including a second link from a page, regardless of whether or not search engines were paying attention, would be to increase conversions. Imagine, for example, that your links pointed to a page on a site where people could get information about new and used cars, and you had rich sections of information, forums, user-tools, etc., for both audiences, but required only one registration for both groups. The page that you link to is the registration page, and you use two links to point to the same page. One link reads “register to learn about new cars,” and the other link within the body of the main content of the page reads “register to learn about used cars.” It’s possible that using two links instead of one might make it more likely that a visitor to the page where the links appear might click on one of them and register for the site. If one of the main goals of that site is having people register, then two links may lead to more conversions of that goal.

  26. I found that a few sites I built with sitewide navigational text links seemed to perform well and the pages the links pointed to did start to get “weight” for the term in the link quite quickly. However, there were a couple of other sites done in a similar way which I just couldn’t make work using the same method.

    I decided to spend some further time looking more closely at the reasons why, and it seems the ones that perform all had backlinks popping up randomly on forums and bookmark sites from people who were suggesting the internal pages to others, mainly using anchor text similar to that in the site’s navigation. So I guess although it helps make the page relevant on site, to make it truly perform, the service or product the page provides needs to become useful to others externally before it gets truly accepted. Damn obvious I know! But it wasn’t until I used the linkdiagnosis.com tool (which by the way is awesome, and no I am not affiliated!) that I discovered the backlinks which were giving the most strength. I hope this makes sense lol.. it’s late here :)

  27. Hi Lee,

    It’s interesting how search engines might determine how much impact site wide navigational text links might have on a site. Chances are that they will have some, but probably not as much as if those links were each from independent pages. I recently wrote a post about a patent from Google which seems to cover this to a degree – Google’s Affiliated Page Link Patent.

    It seems that internal links from a site might cap off at some point on the amount of link weight they pass along to a page, and that links from sites that don’t seem to be “affiliated” with a site may not be limited in quite the same way.
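
    As a rough illustration of what a cap like that could look like, here is a small Python sketch. It is not taken from the patent: the function name and the cap value are assumptions, and the only point is that internal link weight stops growing after a limit while weight from unaffiliated sites keeps adding up.

```python
def combined_link_weight(internal_weights, external_weights, internal_cap=1.0):
    """Sum link weight for a page, capping the total from internal links.

    internal_cap is a hypothetical limit on what a site's own links can pass.
    """
    internal_total = min(sum(internal_weights), internal_cap)
    return internal_total + sum(external_weights)


# Ten sitewide links worth 0.5 each are capped at 1.0 in total;
# two links from unaffiliated sites count in full.
total = combined_link_weight([0.5] * 10, [1.0, 1.0])
```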

    Regardless, it can be helpful to get links to a page from a variety of sources, both on your own site, related sites, and unrelated sites.

Comments are closed.