Which Google Link Analysis Approach May Have Changed?
In the Google Inside Search blog, Google’s Amit Singhal published a post titled Search quality highlights: 40 changes for February that told us about many changes to how Google ranks pages, including the following:
Link evaluation. We often use characteristics of links to help us figure out the topic of a linked page. We have changed how we evaluate links; in particular, we are turning off a method of link analysis that we used for several years. We often rearchitect or turn off parts of our scoring to keep our system maintainable, clean and understandable.
Curious about which link analysis method Google may have stopped using, I decided to look back at the different link analysis methods I have seen Google use in the past, to see if I could identify the one they might have retired. I couldn't settle on which one it was, but it was interesting seeing all of these link analysis approaches in one place.
A lot of people were guessing which method of link analysis might have been changed, from PageRank being turned off, to anchor text being devalued, to Google ignoring rel="nofollow" attributes in links, to others. A few people asked for my opinion, and I mentioned that there were several potential link analysis approaches that Google might have stopped using.
I've made a list of a dozen possibilities, along with granted Google patents that describe them, but Google uses link analysis in a lot of ways, and what Google turned off might involve something else entirely, and/or something that might not even be described in a patent.
Here’s my list of Link Analysis Methods:
1. Link Analysis Using Local Inter-connectivity
Search results are ranked normally in response to a query, and then, before they are displayed to searchers, the links between the pages in that smaller set of top results are explored, and some results may be boosted based upon the links between them.
The book In the Plex mentions that the inventor behind this patent, Krishna Bharat, developed an algorithm similar to the HITS algorithm that was incorporated into what Google does in 2003. The patent was also granted in 2003, and it's similar in many ways to HITS.
This process might be somewhat unnecessary these days, especially if Google is reranking search results based on something like the co-occurrence of terms in a result set based upon phrase-based indexing. – Ranking search results by reranking the results based on local inter-connectivity
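To make the idea a little more concrete, here is a minimal sketch of how a reranking step based upon local inter-connectivity could work. The function names, the simple count of links within the result set, and the boost factor are my own illustration, and not anything spelled out in the patent:

```python
# Hypothetical sketch: boost results that are linked to by other pages
# in the same result set. The scoring here is illustrative only.

def rerank_by_local_interconnectivity(results, outlinks, top_n=100, boost=1.0):
    """Re-order an initial ranking using links among the top results.

    results  -- list of page IDs in their original ranked order
    outlinks -- dict mapping page ID -> set of page IDs it links to
    """
    subset = set(results[:top_n])
    local_score = {page: 0 for page in results}

    # Count how many other pages in the subset link to each page.
    for source in subset:
        for target in outlinks.get(source, set()):
            if target in subset and target != source:
                local_score[target] += 1

    # Combine the original position with the local link count
    # (a lower key means a better position).
    def combined_key(page):
        return results.index(page) - boost * local_score[page]

    return sorted(results, key=combined_key)


if __name__ == "__main__":
    results = ["a", "b", "c", "d"]
    outlinks = {"a": {"c"}, "b": {"c", "d"}, "d": {"c"}}
    # "c" is linked to by three other results, so it moves up.
    print(rerank_by_local_interconnectivity(results, outlinks, top_n=4))
```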
2. Link Analysis to Find Related Sites
If you perform a search that appears to be for a specific site, you might see a list of other pages at the bottom of the search results, under a heading (that's also a link) that reads "Pages similar to www.example.com". If you click on it, you'll see search results for [related:www.example.com]. The method that determined which pages were related was based upon the links pointing at those pages, using a link-based analysis.
A paper about this type of link analysis method, written by a couple of researchers who would go on to become Google employees, is Finding Related Pages in the World Wide Web.
Could Google have found a better way of finding related pages? It's possible, but the pages being shown don't seem to have changed. – Techniques for finding related hyperlinked documents using link-based analysis
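The paper's approach is more involved, but a stripped-down, hypothetical version of finding related pages through co-citation (pages frequently linked from the same parent pages) might look like this:

```python
from collections import Counter

def related_by_cocitation(target, inlinks, outlinks, max_related=5):
    """Toy co-citation: pages that share 'parent' pages with the target.

    inlinks  -- dict: page -> set of pages linking to it
    outlinks -- dict: page -> set of pages it links to
    """
    cocitation_counts = Counter()
    for parent in inlinks.get(target, set()):
        for sibling in outlinks.get(parent, set()):
            if sibling != target:
                cocitation_counts[sibling] += 1
    return [page for page, _ in cocitation_counts.most_common(max_related)]
```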
3. Link Analysis Using Adaptive Page Rank
This patent describes a faster approach to calculating PageRank, taking some shortcuts. It can take a while to calculate PageRank, and a method like the one described here could speed that up.
Google has a lot more pages indexed now than it did when the patent behind this approach was filed, so it may still need a shortcut like this one. Then again, Google has also advanced technologically, and may not.
– Adaptive computation of ranking
There is a whitepaper written by the inventors of this link analysis approach, intended to speed up how PageRank is computed and make ranking at Google faster: Adaptive Methods for the Computation of PageRank, by Sepandar Kamvar, Taher Haveliwala, and Gene Golub.
Added 26, 2019 – We were told recently, in Former Google Engineer: Google Hasn't Used PageRank Since 2006, that Google stopped using the original PageRank in 2006 and replaced it with something that had a very similar name but worked more quickly and efficiently. Based upon this news, I'd guess that Adaptive PageRank, which is supposedly around 30% quicker at calculating PageRank scores for pages, is the most likely replacement. (We don't know if PageRank was the link analysis method that Amit Singhal was referring to in the post I mentioned at the start of this post, but it could have been.)
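The heart of the adaptive idea in the Kamvar, Haveliwala, and Golub paper is that most pages converge to their final PageRank scores after only a few iterations, so their scores can be frozen while the rest keep iterating. Here is a rough, simplified sketch of that idea; dangling-node handling and other details from the paper are glossed over, and the code is my own illustration rather than the paper's algorithm:

```python
def adaptive_pagerank(outlinks, pages, damping=0.85, tol=1e-6, max_iter=100):
    """Power-iteration PageRank that stops recomputing pages once they converge."""
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    converged = set()

    # Precompute inbound links once.
    inlinks = {p: set() for p in pages}
    for src, targets in outlinks.items():
        for t in targets:
            if t in inlinks:
                inlinks[t].add(src)

    for _ in range(max_iter):
        new_rank = dict(rank)
        for page in pages:
            if page in converged:
                continue  # the adaptive shortcut: skip pages that already converged
            incoming = sum(
                rank[src] / max(len(outlinks.get(src, ())), 1)
                for src in inlinks[page]
            )
            new_rank[page] = (1 - damping) / n + damping * incoming
            if abs(new_rank[page] - rank[page]) < tol:
                converged.add(page)
        rank = new_rank
        if len(converged) == n:
            break
    return rank


pages = ["a", "b", "c"]
outlinks = {"a": {"b"}, "b": {"c"}, "c": {"a"}}
print(adaptive_pagerank(outlinks, pages))  # a simple cycle converges to ~1/3 each
```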
4. Link Analysis for Cross Language Information Retrieval
It might be possible to use the anchor text of a link on a page in one language to understand what the page it points to, written in another language, is about.
Google has done a lot of work in building statistical machine translation models over the past 5-7 years and that technology might serve them better than an approach like this one. – Systems and methods for using anchor text as parallel corpora for cross-language information retrieval
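As a toy illustration of the anchor-text-as-parallel-corpora idea, anchors written in one language that point at a page in another language can be collected and used as hints about that page's topic. Everything below, including the example URLs, is hypothetical:

```python
from collections import defaultdict

def anchor_text_by_language(links):
    """Group anchor text for each target URL by the language of the linking page.

    links -- iterable of (source_language, anchor_text, target_url) tuples
    """
    anchors = defaultdict(lambda: defaultdict(list))
    for source_language, anchor_text, target_url in links:
        anchors[target_url][source_language].append(anchor_text)
    return anchors


# English anchors pointing at a German page hint at what that page is about.
links = [
    ("en", "german federal railway timetable", "http://example.de/fahrplan"),
    ("en", "train schedules germany", "http://example.de/fahrplan"),
]
print(anchor_text_by_language(links)["http://example.de/fahrplan"]["en"])
```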
5. Link Analysis Using Link Based Clustering
Google has probably clustered similar web pages by looking at other pages that link to pages appearing in search results, and seeing what other pages they link to.
I wrote about this link analysis method in the post How Link Based Clustering Could Allow Google to Group Search Results
Google might have replaced this clustering approach with one that focuses instead more upon the content and/or the concepts contained on those pages. – Link based clustering of hyperlinked documents
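A very rough way to picture link-based clustering is to group pages whose sets of inbound links overlap heavily. This greedy, single-pass sketch is my own simplification, not the method from the patent:

```python
def jaccard_link_similarity(inlinks_a, inlinks_b):
    """Similarity of two pages based on overlap of the pages linking to them."""
    if not inlinks_a and not inlinks_b:
        return 0.0
    return len(inlinks_a & inlinks_b) / len(inlinks_a | inlinks_b)


def cluster_by_shared_inlinks(inlinks, threshold=0.3):
    """Greedily group pages whose inbound link sets look alike.

    inlinks -- dict: page -> set of pages linking to it
    """
    clusters = []
    for page, links in inlinks.items():
        for cluster in clusters:
            representative = cluster[0]
            if jaccard_link_similarity(links, inlinks[representative]) >= threshold:
                cluster.append(page)
                break
        else:
            clusters.append([page])
    return clusters
```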
6. Link Analysis with Personalized PageRank Scoring
Determining personalized page scores for web pages based upon links pointing to pages that appear for specific queries in search results and whether the anchor text in those links is related to those query terms.
Google might use a different approach, such as one that may look at large amounts of data about searchers, pages, and queries to calculate a personalized page score for pages. – Personalizing anchor text scores in a search engine
I dug deeper into this patent in my post On Personalized PageRank and Personalized Anchor Text Scores
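In the spirit of that patent, a toy personalized score might ask how often the anchor text of a page's inbound links overlaps with terms from a particular searcher's queries. The scoring below is my own illustration and not the patent's formula:

```python
def personalized_anchor_score(page_inlink_anchors, user_query_terms):
    """Fraction of a page's inbound anchors sharing a term with the user's queries.

    page_inlink_anchors -- list of anchor-text strings pointing at the page
    user_query_terms    -- set of lowercase terms from the user's past queries
    """
    if not page_inlink_anchors:
        return 0.0
    matches = sum(
        1 for anchor in page_inlink_anchors
        if set(anchor.lower().split()) & user_query_terms
    )
    return matches / len(page_inlink_anchors)


print(personalized_anchor_score(
    ["best trail running shoes", "shoe reviews", "home page"],
    {"trail", "running"},
))  # ~0.33 for this toy example
```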
7. Link Analysis Based on Anchor Text Indexing
Using anchor text for links to determine the relevance of the pages they point towards. It’s quite likely that Google continues to use an approach like this, but in a modified manner that might be influenced by things like phrase-based indexing – Anchor tag indexing in a web crawler system
For more details about how this link analysis approach works, I wrote a post about this patent: Google Patent on Anchor Text Indexing and Crawl Rates.
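The core of anchor text indexing is simply that the words in a link are credited to the page the link points at, rather than (or in addition to) the page the link appears on. A bare-bones sketch, with names and example URLs of my own choosing:

```python
from collections import defaultdict

def index_anchor_text(crawled_pages):
    """Build an inverted index where anchor terms are credited to link targets.

    crawled_pages -- dict: source_url -> list of (anchor_text, target_url) tuples
    """
    index = defaultdict(set)
    for source_url, links in crawled_pages.items():
        for anchor_text, target_url in links:
            for term in anchor_text.lower().split():
                index[term].add(target_url)
    return index


crawled = {"http://blog.example": [("adobe reader download", "http://get.example/reader")]}
print(index_anchor_text(crawled)["reader"])
# The target page can now match "reader" even if the word isn't on the page itself.
```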
8. Link Analysis using Historical Data
In 2005, Google published a patent application that describes a wide range of temporal-based factors related to links, such as the appearance and disappearance of links, increases and decreases in backlinks to documents, weights given to links based upon freshness, weights based upon the authoritativeness of the linking documents, the age of links, spikes in link growth, and the relatedness of anchor text to the page being pointed to over time.
Google may have used some of the factors described in this patent and may continue to use them, may have replaced them with something else, and may have ignored others. – Information retrieval based on historical data
I’ve written a few posts about this patent, and the many continuation patents that updated aspects of the link analysis methods that it covers. I also found an earlier version of the patent (a provisional version) and wrote about it, and a continuation patent that focused upon just some of the claims in the original. If this patent has caught your attention, you might find my post about those interesting. It is at: Revisiting Google’s Information Retrieval Based Upon Historical Data
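One of the simpler temporal signals described, a spike in the rate of new backlinks, is easy to sketch. The window size and spike ratio below are made-up illustrative values, not anything from the patent:

```python
def link_growth_spikes(monthly_new_links, window=3, spike_ratio=3.0):
    """Return indexes of months where new-link counts spike versus the trailing average.

    monthly_new_links -- list of counts of new inbound links discovered each month
    """
    spikes = []
    for i in range(window, len(monthly_new_links)):
        baseline = sum(monthly_new_links[i - window:i]) / window
        if baseline > 0 and monthly_new_links[i] / baseline >= spike_ratio:
            spikes.append(i)
    return spikes


print(link_growth_spikes([10, 12, 11, 13, 90, 14]))  # -> [4], the month with 90 new links
```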
9. Link Analysis Looking at Link Weights based upon Page Segmentation
We've known for a few years that Google gives different weights to links based upon the segment of a page where a link is located. It's quite likely that something like this continues to be used today, but it might have been modified in some manner, such as limiting the amount of value a link can pass along if, for instance, it appears in the footer of multiple pages of a site.
Then again, Google has probably already been doing that. – Document segmentation based on visual gaps
Google filed a much more detailed patent focused upon the segmentation of pages in general, and not just local pages. This patent can be found at: Determining semantically distinct regions of a document
While both of these patents go beyond link analysis, the location of a link on a page can make a difference in how much weight that link might carry. I wrote a more detailed post about the second patent at: Google's Page Segmentation Patent Granted
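In practice, segment-based weighting can be as simple as scaling a link's value by where on the page it sits. The segments and weights here are purely illustrative; no such numbers have been published:

```python
# Illustrative segment weights only; any real values Google uses are not public.
SEGMENT_WEIGHTS = {
    "main_content": 1.0,
    "sidebar": 0.4,
    "header": 0.3,
    "footer": 0.1,
}

def segment_weighted_link_value(base_value, segment):
    """Scale a link's value by the page segment it appears in."""
    return base_value * SEGMENT_WEIGHTS.get(segment, 0.5)


print(segment_weighted_link_value(1.0, "footer"))        # 0.1
print(segment_weighted_link_value(1.0, "main_content"))  # 1.0
```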
10. Link Analysis Based on Reasonable Surfer Model Link Features
Google's Reasonable Surfer model describes a good number of features that might be taken together to determine how much value a link might pass along from a page, relative to the other links on that page, and one or more of those features may no longer be considered the way they were in the past. – Ranking documents based on user behavior and/or feature data
I’ve written a couple of posts about the reasonable surfer model link analysis approach, because it is an interesting one, and because it was updated at least once. Those posts are:
- Google’s Reasonable Surfer: How the Value of a Link May Differ Based upon Link and Document Features and User Data
- Google’s Reasonable Surfer Model Updated
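The gist of the reasonable surfer model is that a page's PageRank isn't split evenly among its links; links a surfer is more likely to follow get a larger share. Here's a toy sketch where a handful of made-up features feed a click likelihood; the features and weights are my own illustration, not the patent's:

```python
# Toy feature weights; the real features and weights are not public.
FEATURE_WEIGHTS = {
    "in_main_content": 2.0,
    "large_font": 0.5,
    "above_the_fold": 1.0,
    "same_topic_as_page": 1.5,
}

def reasonable_surfer_split(page_rank, links):
    """Split a page's PageRank across its links in proportion to a
    feature-based likelihood that a surfer would follow each link.

    links -- list of (target_url, set_of_feature_names) tuples
    """
    raw = [
        (target, 1.0 + sum(FEATURE_WEIGHTS.get(f, 0.0) for f in features))
        for target, features in links
    ]
    total = sum(score for _, score in raw)
    return {target: page_rank * score / total for target, score in raw}


print(reasonable_surfer_split(1.0, [
    ("http://a.example", {"in_main_content", "above_the_fold"}),
    ("http://b.example", set()),  # e.g. a boilerplate footer link
]))  # the main-content link receives the larger share
```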
11. Link Analysis Looking at Links between Affiliated Sites
Some sites may be deemed to be related or affiliated with others in some manner, such as being owned by the same person or people. The value of links between them might be diminished because of that relationship, in comparison to other "editorially determined links."
How that affiliation is calculated might have changed. – Determining quality of linked documents
I wrote about this patent in much more detail in the post: Google’s Affiliated Page Link Patent
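A crude way to picture this is a discount applied to any link where the source and target domains appear to share an owner, however that affiliation happens to be detected. The discount factor below is invented purely for illustration:

```python
def discount_affiliated_links(links, domain_owner, affiliated_factor=0.1):
    """Reduce the value of links between domains that appear to share an owner.

    links        -- list of (source_domain, target_domain, value) tuples
    domain_owner -- dict: domain -> owner identifier (however affiliation is inferred)
    """
    adjusted = []
    for source, target, value in links:
        same_owner = (
            domain_owner.get(source) is not None
            and domain_owner.get(source) == domain_owner.get(target)
        )
        adjusted.append((source, target, value * affiliated_factor if same_owner else value))
    return adjusted


links = [("siteA.example", "siteB.example", 1.0), ("siteC.example", "siteB.example", 1.0)]
owners = {"siteA.example": "owner-1", "siteB.example": "owner-1"}
print(discount_affiliated_links(links, owners))  # the affiliated link drops to 0.1
```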
12. Link Analysis with Propagation of Relevance between Linked Pages
Assigning the relevance of one web page to other web pages could be based upon the click distance between the pages and/or certain features in the content of anchor text or URLs. For example, if one page links to another with the word "contact" or the word "about", and the page being linked to includes an address, that address location might be considered relevant to the page doing the linking.
There are a few different parts to this method of having the relevance of one page on a site propagated to other pages on the same site, and one or more of those could have changed if it is in use. – Propagating useful information among related web pages, such as web pages of a website
I wrote a post about this patent at Google Determining Search Authority Pages and Propagating Authority to Related Pages
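A stripped-down way to picture the propagation idea: find a page that contains an address (a "contact"-style page), then pass that address back to pages within a few clicks of it, with confidence decaying at each hop. The decay value and hop limit are my own illustrative choices, not the patent's:

```python
def propagate_contact_location(pages, links, decay=0.5, max_hops=2):
    """Propagate an address found on one page to nearby pages on the same site.

    pages -- dict: url -> {"address": str or None}
    links -- dict: url -> set of urls it links to (within the site)
    Returns dict: url -> (address, confidence).
    """
    propagated = {}
    sources = {url: info["address"] for url, info in pages.items() if info.get("address")}

    for source_url, address in sources.items():
        frontier = {source_url}
        confidence = 1.0
        for _ in range(max_hops):
            # Walk backwards to pages that link (directly or indirectly) to the source.
            frontier = {url for url, targets in links.items() if targets & frontier}
            confidence *= decay
            for url in frontier:
                if confidence > propagated.get(url, ("", 0.0))[1]:
                    propagated[url] = (address, confidence)
    return propagated


pages = {
    "/": {"address": None},
    "/about": {"address": None},
    "/contact": {"address": "123 Main St, Springfield"},
}
links = {"/": {"/about", "/contact"}, "/about": {"/contact"}}
print(propagate_contact_location(pages, links))
```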
What “method of link analysis” do you think Google turned off?
Updated August 3, 2019
Thanks for the thoughts Bill, but I guess there are going to be a lot of ifs and buts, difficult to determine exactly what has changed. Heard a talk by Graham Hansell of CIM yesterday and he said that it was likely the link valuation change was Google turning off anchor text indexing! Not helpful when speaking to a room full of potential clients.
It's pretty useful information for webmasters and SEO professionals to keep an eye on the kinds of links they are acquiring and how Google recognizes them as good or bad. Thanks for sharing. I hope you will share further details in the future if you identify any fixed pattern in link evaluation.
Thank you Bill for this great overview. It seems that we are up to guessing. 🙁
But I don't see a lot of fluctuation, just in secondary markets.
Paul, does Graham have any proof for his theory? How would they then figure out topic relevancy?
Google is giving less importance to PR. Even if you have a link on a reputable site, Google may not follow that link if it "thinks" that it will not provide a good experience for the user.
With the integration of G+ on SERPs, the results will be as personalized as they can get. The target for Google is "a good experience". Content farms and link building will be banned more and more.
If you put yourself in Google's shoes for a moment, then in my opinion (after reading Wil Reynolds' recent article on link spam) the top target is surely the personalization "of PageRank scoring for web pages based upon links pointing to pages that appear for specific queries in search results and whether the anchor text in those links is related to those query terms."
This has to be the deepest consideration, because ultimately the top relevant web content has provided the best in expert advice, providing a self-policed network of authority.
@Matdwright
Thanks Bill! As usual it is a pleasure to read your articles. I did not find the clear answer to the question, but I can see that until this moment the anchor text value is intact.
Obviously, we’re all just guessing (unless Amit would like to leave a comment :-P), but given the importance of SPYW, it is extremely unlikely that Google “turned off” Personalized PageRank Scoring (this also has a lot of spam fighting uses) or Link Analysis using Historical Data (especially since the new unified “privacy” policy goes into effect today).
If I had to pick one or two from the above list, my guess would be Local Interconnectivity and/or Cross Language Information Retrieval. The first approach actually amplifies the effects of linkages between pages in a SERP, and if you’re looking for a way to devalue linkages (i.e., “turn off” a method of link analysis), this would be an excellent candidate. As Bill mentions, the second approach seems extremely outdated now that Google has such a powerful language translation system. In fact, I would probably trust Google’s translation of a page significantly more than I would trust the accuracy of anchor text in a different language.
@webbstuff
Bill, thank you for presenting your thoughts here. I was following the thread over at Dejan Seo on G+ yesterday, so this was a nice follow up. I’m thinking #7 Anchor Text Indexing has been changed, in an effort to move away from spammy link building tactics.
I think they will devalue anchor text and focus on the website theme, surrounding text, and other factors that will give Google a picture of the link instead of the anchor text.
Great article Bill. Lots of possibilities to consider. Even though I didn’t attend SMX this week, I submitted the “is it anchor text” question for the “Ask the search engines” panel this afternoon and Danny asked it. The reply was “it wasn’t anchor text – it was another link evaluation signal”. Not totally helpful, and at this point way too early to know if that was a truthful statement or not.
Going to be fun in the coming weeks as SEOs attempt to figure it out through testing.
This content has been an excellent read Bill, thank you for your insight.
I think that G will attempt to significantly lower the value of anchor text links based on quantity analysis. G already has the data set for a given domain, so this would be relatively straightforward to implement.
Thanks Again
Lee
My guess is that Google is reducing the importance of anchor text…there are a lot of great tools that SEOs can use to determine the “right” percentage of links that can have keyword anchors without triggering any sort of red flag. As a result, I’m guessing this particular signal has been devalued a bit (only I have no proof).
Besides, it’s not as if Google can’t infer anchor text from the context of the link…I doubt keyword anchor text is essential to determining relevance.
If Google is really turning off anchor text evaluation, none of the known Google bombs should work anymore. That's the ultimate proof, and the bombs are still armed 😉
Very interesting. I upgraded the software on my website around the end of February and since then I’ve been getting very poor Google search rankings. I had thought it was something to do with the upgrade but now it seems more likely to be the link evaluation changes by Google… Quite annoying really!!
Anchor text will die. I have read a few articles saying Google will consider links from iframes as well.
In my view, Google has shifted importance to other factors which we are unaware of, or only have a rough concept of. Recently I read a post by Wil on SEOmoz and got to know some of the big lies of Google, e.g. the valuation of links while determining SERP position. I don't think PR can keep being given more and more value, as we all know that nowadays Google is giving more weight to freshness and to the social media connections around any fresh content. Thanks Bill!!
Sounds to me like they turned off something they had set up and were using for several years but that perhaps wasn’t having any impact (or a slightly negative one).
It can’t be something as significant as anchor text otherwise we would have noticed it already in the results.
By "characteristics of links" I would look more to something like the title= attribute in links, or something else that might get spammed. Perhaps the text near a link, as it perhaps wasn't relevant enough most of the time?
With Amit being vague about this one and so clear about the other points it makes you think it might be more important than he’s letting on.
I agree Nayan – anchor text as a ranking factor will soon be a thing of the past. IMO it can’t happen soon enough
I'm quite sure there has been a change in exact-match anchor text. For many years we have been choosing a certain keyword and trying to get as many backlinks as possible with that specific keyword, and that has brought good results. I read an article on SEOmoz about an experiment that showed the best results came from partial match: so if you are positioning "orange mega sweets", instead of getting 10 links with that anchor text you should get 2 with "orange sweets", 2 with "sweets", 2 with "mega sweets" and so on. The experiment showed that partial match is the better long-term strategy.
I tried to apply this in our linkbuilding strategy with significant results. It makes sense as it isn’t natural to always get linked with the same anchor text…
That's exactly right, Rafael – and I actually read the same article :). Variation in anchor text is very important. However, in my ideal world, anchor text would have very little impact as a ranking factor, as it simply encourages spam.
If it isn't #11, links between affiliated sites, it will be at some point. The practice of creating a bunch of bogus blogs with hollow content as a personal link farm is bound to be a strategy that won't have long-term value.
Bill, if I am to take a pick from your list it would have to be #3. Adaptive PageRank was made redundant with the introduction of a new link graph processing framework at Google.
References:
http://googleresearch.blogspot.com/2009/06/large-scale-graph-computing-at-google.html
http://web.eecs.utk.edu/~dongarra/ccgsc2010/slides/talk28-konerding.pdf
Pregel framework was used at Google even prior to 2009 (when they decided to share it outside Google). So the pattern fits.
I used to think that calculating QB ratings was difficult, the average per completion, etc. Then I began reading about SEO, which makes QB ratings seem simple… I did see a comment above talking about anchor text, and I have just never been able to put together why that should be a factor in SEO.
Hi Paul,
Google has used link text as a signal for determining the relevance of pages since the day that they started, though it's likely that they've reined it in over time in different situations to keep it from being abused. The clearest example has been in the case of Google Bombs, where people might try to make a political or social statement about a particular page, or even as part of a joke (which is presumably what the first Google Bomb was intended to be). Google supposedly took some steps to limit Google Bombs in the past. Google's patents on phrase-based indexing also expressly state that they can be used to limit Google Bombs.
Chances are also that something like the reasonable surfer approach could be used not just to limit PageRank, but also hypertext relevancy. There’s also been a lot of discussion on the web about exact match domains, and how much of a possible role anchor text relevancy might play in the boost those seem to get, in some cases. Is it the appearance of keywords within a domain that causes that boost, or is the fact that when someone links to those pages with the domain name that they are getting the benefit of anchor text within them? Google has been granted a patent that might curtail the benefit of exact match domains, at least in cases where Google might identify a commercial query involved.
I’m not sure that I would state that Google probably stopped paying attention to anchor text in front of a room full of potential clients, or at least not with some mention of the kinds of steps that they appear to have taken in response to Google Bombs and exact match domains.
Also, while phrase-based indexing could potentially limit things like Google Bombing, some sites have so much anchor text of a certain type pointing to them that there might be a certain threshold where even something like that wouldn't have an impact. For example, search for "click here" and you still get the Adobe Reader download page as the top result, even though the words don't appear on the page. And in the instance of the Google Bomb that was pointed at the White House biography of George W. Bush using the anchor text "miserable failure," the algorithmic effort to stop it worked effectively up until the point where someone added the word "failure" to the content of the page. Under a phrase-based indexing approach, anchor text pointed to a page that relates to the content of that page could potentially make the page relevant for that term.
Hi Alex,
Thank you. I might post another post or two about other link evaluation methods that Google might be using. The ones I listed are some of the ones that they may have been using in the past, and may still be using, but there are others as well.
Hi Nedim,
Educated guesses aren’t a bad thing as long as we recognize that there are likely many things that the search engines are doing that we don’t know about, and may never find out about. Things that might be trade secrets, or described in unpublished patents, or in whitepapers that are distributed internally within Google only.
We weren’t given any indication of the scale of the impact that discontinuing the use of that specific ranking algorithm might have like we were with a couple of the Panda updates, or the Google Freshness update (now 36% fresher results). It’s possible that one of the reasons why it was being retired is because it might not have been having much of an impact, or because it was doing something that was made largely redundant by some other approach.
There are other ways that Google could discover topic relevancy, from a quick and dirty categorization by looking at words within URLs (there’s a Google whitepaper on the topic, but it’s probably unlikely to be used in that manner), to a phrase-based indexing co-occurrence approach, to clustering based upon content and/or similar links, to looking at query terms that pages are ranking for, as well as others.
Hi Mat
By Wil's post, I'm guessing you mean How Google Makes Liars Out of the Good Guys in SEO, which does point out how anchor text can be abused in a number of ways that might not necessarily lead to the best, or at least high quality, results outranking other results. I'm not sure that has changed since Wil's post, but it's probably something that Google should look into more.
Hi Alf,
What you’re suggesting sounds like something that Google has been doing for at least a few years already – not having links pass along an equal amount of PageRank on a page under an approach that might be similar to what is described in the reasonable surfer patent. Definitely a good change, but I don’t think it’s the one that they just made recently. I believe Matt Cutts has been saying that Google hasn’t been weighing links evenly for a while.
Hi Thomas,
You’re welcome. I’m ruling out that Google stopped looking at anchor text as well. It still seems to be having an impact. If it didn’t, I suspect that we would be hearing a lot more people screaming about the change.
Hi Steve,
I don’t think Amit is going to stop by and fill us in either. It would be nice, but I suspect it’s not going to happen.
The personalized PageRank scoring would be something that would introduce web pages in search results without those necessarily being social search results. I’m not sure that the SPYW results would necessarily be an adequate replacement, and I suspect that boosted pages because of personalization tend to be somewhat better if they aren’t just pages that you’ve already visited before. I could see how that method might introduce some spam pages into personalized results, by showing you pages that link to pages within search results you’ve seen before. I’m wondering if some content filtering might help stop that problem better than just turning that approach off though.
There are a few different methods under the historical data approach that we might not notice as gone, especially if Google made some other tweaks involving things like freshness.
But I’m leaning toward the local inter-connectivity and the cross language approaches at this point as well, and I’m not sure if we would notice too much of a change if they were turned off.
Hi Darren,
Thank you. Happy to see you come over here from that post. Anchor text relevance does seem to be one of the methods of link analysis that people catch on to easily, and attempt to abuse, and it appears to often have an impact. Still, I'm not seeing any signs that it has gone away, and it might be one of the link analysis methods that, if Google stopped using it, we would see the biggest impact from.
Hi Jan-Willem,
Wouldn’t the text surrounding anchor text be as easy, or almost as easy for people to try to spam as anchor text might be? I think if Google still looks at it, and looks at some of the other factors that you mention, that might help them determine how much weight to give to anchor text. If the theme of a page, and the text around a link is about cars, and the anchor text within the link is about ice cream, then maybe the anchor text relevance shouldn’t count as much in that instance, for example.
Hi Lee,
I’m guessing that might help in some instances, but maybe not as much in others if the analysis is about the whole domain itself. For example, on a site that is an online newspaper or magazine or group blog that focuses upon a variety of topics, if the page or post or article itself is about that topic and the anchor text is related to the page itself, should it count less because of the variety of topics on the domain?
HI Jason,
I don’t know that there’s a way to determine a “right percentage” of links that have certain anchor text, but I would definitely love to see the explanation provided by those tool makers as to how they came to such exact numbers. It sounds a little like someone has taken the old perfect page SEO tools approach that ignored off page elements of SEO and tried to enhance them by adding in off page factors somehow.
I have seen many instances where the context of a link wasn't necessarily all that helpful in standing in for the actual anchor text used. But if it were, that could possibly be a fairly computationally expensive approach in comparison to just using the anchor text, or comparing the anchor text used against the other factors within that context, wouldn't it? I do think there's still considerable value in anchor text, if the cost of attacking it as a ranking signal can be increased significantly.
Hi Kai,
I’m still seeing the Adobe download page on a search for [click here], even though technically that isn’t a Google Bomb. It’s just linked to by a lot of pages using that phrase.
Hi Alan,
Thank you. Good to hear that you were able to get that question in. I don't think we'll ever get an answer directly from Google, and they might end up regretting that if it means that all of us spend more time testing the many different link analysis methods that they might be using.
Hi philbydevil,
When you see changes in rankings, it could be due to something you’ve done, something your competitors have done, changes made at a search engine, or even changes involving what people are searching for. Since you noticed lowered rankings after the changes you’ve made, it’s probably a good idea to check over those and what their potential impact might have been pretty carefully, even with this announcement of multiple changes from Google.
Is Google crawling all of the pages that it should be? Did you introduce the possibility of more than one URL for the same pages? Did things like your titles, your meta descriptions, anchor text in your links, and so on change? Is your site slower? Might you have added elements to pages that make them less user friendly in some manner?
I’ve seen people do things like leave a “Disallow: /” in the robots.txt file that was on their site on a development server before, telling search engines to not crawl any of their pages. It’s worth checking for things like that.
Hi Nayan,
Search Engines have been using anchor text for a long time, but it is something that can be abused by people wanting to manipulate search rankings. It doesn’t appear to be the link analysis method that Google stopped using though. I did see some discussion at a few places on how Google is treating content from iframes a little differently now, and may be working to understand them better, especially since they are prone to abuse, such as potentially delivering malicious content to web pages (trojan horses, viruses, etc.), but I don’t think that has to do with a link analysis method being turned off by Google.
Hi Ajay,
The list of 40 algorithm changes did seem to include a good number involving freshness, and if we want to follow along with that theme, it’s possible that Google might have changed how it determines whether a page is “fresh” or not based upon a link analysis method.
Google also does seem to be looking at a lot more social signals, but it might just be using those in social search at this point, and possibly doing a lot of testing to see how they could potentially be used for web search rankings in different ways, and if that helps provide better web search results.
Hi Marcos,
I'm tempted to agree with you that it might not have been something so significant, otherwise there would be more people talking about the impact. Then again, as you note on the other hand, the announcement did seem purposefully pretty vague.
Hi Matt,
I do often see many instances every day where people are attempting to use anchor text to manipulate rankings, usually in my blog comment spam bin. But I think it still has some significant positive value as well, and that search results would be worse without hypertext relevancy being a ranking factor.
Hi Rafael and Matt,
The approach that you describe would fit in well with phrase-based indexing where having related terms or phrases linking to pages in anchor text could have a positive impact. That wouldn’t necessarily be something I would see Google’s head of search announcing as a link analysis method that Google was discontinuing, though.
Hi Tad,
I think we’ve already seen Google implement an approach where they might limit the amount of PageRank passing between links on sites that they might think are affiliated in some manner, whether for spam purposes, or even legitimate ones like common ownership of those sites.
Hi Dan,
I would call that a good guess, based upon the speed and scalability of Pregel and upon the way the announcement was worded. Do you think we would notice much of a difference in the rankings of pages if Google replaced the Adaptive approach with the use of Pregel? Is that something that they may have already done a while back, since Pregel has been in use for a while? Would they make the somewhat ambiguous announcement that they did if this was the change they made?
Hi Mike,
You might actually enjoy a paper about football statistics that I ran across in a search engine patent last November:
A Penalized Maximum Likelihood Approach for the Ranking of College Football Teams Independent of Victory Margins
It’s funny how many of us get our first introduction to some level of statistics through sports, especially baseball.
Here’s what Page and Brin wrote about PageRank and Anchor text in The Anatomy of a Large Scale Hypertextual Web Search Engine:
Guys, I don’t know about anchor text relevance but Google sure turned things up a notch in terms of their anchor text density thresholds. Ever since 3.3 came out, I’ve noticed a lot of sites with a lot less links beating the crap out of more authority sites (with tons more links). These authority sites however have tons of the same anchors going towards their pages but they haven’t been penalized necessarily, but more like given less preference. Something is going on here.
Another great article. Every time I come here, I learn something new. I don't have the time to do my own analysis, so this is very useful for me. Thanks again. And you are the one and only SEO blogger who answers all commenters 🙂 Btw, phrase-based indexing is something that I think will get more and more complex in the near future.
Partial match in the anchor text will surely become more powerful than tons of exact matches. But the death of anchor text does seem something to study carefully before announcing it to a room full of clients.
I feel that Personalized PageRank scoring may be even more relevant now because of the emphasis by G on personalizing a visit to what the searcher types in. This will continue to become a very interesting subject as Google SPYW continues to target personal searches and results become more relevant to what you are searching for. IMO.
As a few other people have mentioned above, I have also read some other supporting articles about the change in anchor text indexing. However, everything we speculate about is just an educated assumption (as mentioned by Bill), but I'm sure educated guesses are much better than nothing! Only with further time and research will we be able to determine some of these changes.
Do you think we would notice much of a difference in the rankings of pages if Google replaced the Adaptive approach with the use of Pregel?
No, it was probably sitting redundant and overwritten by the newer algo so that’s why they did the cleanup which frees up resources and makes maintenance easier.
Is that something that they already have done a while back since Pregel has been in use for a while?
Yes it would have happened with Caffeine.
Would they make the somewhat ambiguous announcement that they did if this was the change they made?
It doesn’t match 100% as the dropped analysis method refers to discovery of meaning of the linked page. Pregel framework update fits in all other aspects than that. We’re working on thin information here and will probably never know.
I think it has to do with exact match anchor text being used to "vote" for pages. Google has more than 500 ranking signals, and they have been testing this for over a year now.
Great article, how do you have the time to scour through these patents to find this stuff? Am I missing a valuable resource here?
Educated guesses are very helpful and allow me to brainstorm for the future. It also looks like I will be reading more about partial keywords in anchor text.
I am having a hard time seeing that Google is giving up on PR. I have done a couple of searches for keywords related to my wedding photographer site. Every time I find a site whose high ranking I can't otherwise explain, it has a high PR value.
I think the G+ influence has yet to be felt. There may have been tweaking to link importance in February, but I cannot see how Google can make any sort of large scale changes until they see how G+ is panning out. And as it – and local search – are not completely world-wide (no local search in France where I am), there is no way Google will/can introduce a major shake-up in their algorithm. Maybe in 6 months however!
Extremely complicated. I think I need an intermediate course. But as always, thanks for the tips.
Hey, great write up Bill. I myself was wondering what link stuff has recently changed. My guess was that many directory and blog links would be devalued, so I am on board with the idea that links on related sites have more value.
Great article there Bill; I do however agree with "Thomas R's" response. After analysing a number of internal websites and client websites, it seems that there has been a huge deflation in the value of directory and blog links… I stand to be corrected as they're only one factor of our link building strategies, but it seems to be that way…
Hi Mech,
It’s not necessarily the amount of links that matter, and it never really has been. One link from the right place can be worth many thousands from others. That’s at the heart of the PageRank process itself.
How do you define an “authority” page? If it’s by the number of links pointed to it, I would try to find a different definition, because that doesn’t seem to be what the search engines use.
Hi Dimitar,
Thank you. Phrase-based indexing is definitely worth spending some serious time with. I’m also seeing an increasing number of signs that a concept-based indexing, of a type possibly similar to that developed by the people who came to Google via the acquisition of Applied Semantics may play a stronger role in how Google indexes content.
Hi Eliseo,
Under something like phrase-based indexing, partial matches of anchor text, or even the use of related terms (words or phrases that tend to co-occur in pages that might rank well for a particular query), could potentially play a larger role in how much hypertext relevance or even PageRank might be passed along by links. I wouldn't be too quick to announce the death of anchor text.
Hi Vinny T,
I definitely believe that Google sees significant value in personalization, and in seeing how a personalized link analysis might help them deliver better personalized results. I suspect that we might see more emphasis on social sharing signals to help Google determine what to show as results during personalization, like in the search plus your world approach from Google, and maybe a little less reliance on the kind of link analysis described in the particular patent that I linked to.
Of course, like most of the patents above, we don’t know for certain how much of what they describe Google has ended up using, but it’s likely that even if Google was using some type of similar link analysis for personalization, the social signals that Google seems to be increasingly using may play a larger role than ever before.
Hi Larry,
Thank you. One of the things I like best about looking at patents like I have in this post isn’t so much the answers that they might provide, but rather the questions that they can potentially lead us to. We may never know exactly what it was that Google changed when it came to the specific link analysis method involved, but if the inquiry can help us understand a little better how Google and the other search engines might be using links in different ways to help them index and present search results, I think we all benefit.
Hi Dan,
I’m guessing that Pregel might have been introduced around the same time as caffeine as well, based upon the dates of the links providing information about it.
But it’s definitely the kind of information that I was hoping might come up in the comments to this post, so thanks for adding it to the discussion. We probably will never know exactly what Google dropped, but if we can get a better idea of the possibilities, I think we all benefit.
Hi Joseph,
I’ve been hearing “more than 200 signals” for a number of years now from Google representatives like Matt Cutts, but I’d guess that the number is probably much higher as well. One of the Microsoft papers from around 2005 on their ranknet approach mentioned the use of over 500 signals in their machine learning approach to ranking pages in search results. Google seems to be on a path towards machine learning as well, with the Panda upgrades, and the number of signals involved in that is likely more than a couple of hundred.
I do think it’s possible that Google has experimented with different weights and approaches to exact anchor text matching, including the use of stuff from phrase-based indexing, but I don’t think they’ve dropped it as a link analysis method.
Hi Chris,
I’ve been scouring through search-patents for at least 7 years now, and learned a lot of things to look for on the way that can help speed things up a little.
Hi Jonathan,
If educated guesses can help you come up with good questions to ask as you’re exploring search results, optimizing pages, and looking through web analytics, they can become pretty invaluable. 🙂
Thanks for the great thoughts. I'm thinking more of personalized PageRank scoring or something like that. It's hard to tell and we can only speculate, while Google sits on the truth and plays with our minds, bastards ;-).
Hi Curtis,
I don’t think Google is discontinuing the use of PageRank either, and it’s not something that I suggested in the post, though I did include one item about adaptive PageRank, which may have been an approach that Google could have used in the past to speed up the calculation of PageRank for pages. Google has been increasingly including a number of new ranking signals into how they rank pages since their earliest days, but it doesn’t look at this point that they’ve stopped using PageRank.
Hi Simon,
Thanks for adding your perspective. Google Plus has been introduced into the social search results that we see, but I expect Google to take some time to consider how they might best use it to impact Web search results outside of social search. I suspect that the authorship markup that they introduced might play a leading role in that at some point. For example, if someone publishes content associated with a Google Plus account, and Google comes across the same content elsewhere, they might potentially give more weight to the content associated with the Google Plus account if they don't have reason to believe that it is a fake profile.
I wasn't aware that local search wasn't available in France, or at least some parts of France, at this point. I didn't include local search results in my list above, though there is likely some aspect of link analysis involved in ranking local search results. But that link analysis doesn't seem to have the same amount of impact for local search as it does with Web search. I don't think that Google would have announced this as the discontinuation of a link analysis method that they had been using for several years if the change primarily impacted local search. They did also mention that they had made a change to local search that would result in more "local" results showing up in organic web results.
Hi Thomas,
You’re welcome.
I guess there’s just no getting past the fact that search engines are somewhat complicated. 🙂
Google has had years to get that way, but I’d rather see the complexity in what they do than the much simpler search engines that we had in the 90s that just didn’t produce very relevant results.
Hi Thomas R,
I suspect that Google's discontinuation of whatever specific link analysis approach they might have stopped using may continue to be a mystery for a long time, but maybe the announcement signals a good time for us to dig deeper into some of the different approaches that they might have used in the past.
I'm not sure that Google would diminish the value of links from blogs or directories across the board, but rather might change the way that they calculate the value of links from those sources.
Hi Guy E,
I just can’t see Google saying, “Hey, we’re not going to count PageRank for links from blogs or directories anymore, or discount those types of links considerably.”
I do suspect that Google might devalue links from blogs or directories that they think might exist primarily for purposes of boosting PageRank, like blogs that might have been purchased to be included in private blog networks, and I've seen a fair amount of discussion lately in forums that some of that is happening with some well known private blog networks.
Google might also devalue the amount of PageRank that is passed along from comments in blogs, or from comments that they think are comment spam. Rather than relying upon a rel=”nofollow” in blog comments, they might have given up upon that, and decided upon a different way of weighing the value of comment links, or not giving them weight. Given what seems to be an increase in the amount of forum posts and blog posts about “do follow” sites, I could see Google considering that, and I could also see many bloggers turning off any “dofollow” plugins that they might be using, to try to ward off comment spammers.
There are also a lot of directories out there that seem to focus more upon boosting the rankings of the sites included within them than upon helping people find the information they might be looking for. That's nothing new, but maybe Google decided to try a new tack with those. It's also possible that many directories have made internal changes of their own to nofollow listings, or to send them through redirects that don't pass along any PageRank value.
Google discontinued their own “directory” not too long ago, which was pretty much DMOZ but with the listings organized in a manner determined by Google. Maybe they have devalued directory listings as a whole.
If you think Google might be passing along less weight for links from blogs and directories, I would suggest exploring whether those changes came from something those sites did themselves, rather than because of something that Google might have done.
Hi Fredrik,
You’re welcome. We can only speculate at this point, but maybe we can all learn something while trying to investigate what changed.
Incredible work by the way. Thanks for the great thoughts.
In my opinion, I'm thinking more of personalized PageRank or, depending on the search terms (universal search), Google focusing on images, videos, or text. The importance of social "pings" will probably also increase. That's only a matter of time.
It makes sense that Google would diminish the value of links between affiliated sites that are not “editorially determined”. I wonder how common it is for webmasters/SEO people to create their own hub of websites to bolster their or their clients’ linking profile.
I kind of think it might be anchor text, since Google announced it will have less of an influence with regard to backlinks in the future and has already modified it to prevent things like Google bombs. Who really knows, though, besides Google? They're such a mystery, and I think they want to stay that way.
Thanks for these thoughts, Bill. There are others as well, but your list covers the most important ones, which have changed many times in the last two years.
Hi Jason,
Thank you.
Is it possible that Google is using a different approach than the one described above involving personalized PageRank? I'm not fully convinced that approach is one that Google has been using. It is possible that social activities on Google Plus could also play an increased role in personalization.
Hi Joel,
I agree about lessening the value of links between sites that are affiliated by things like common ownership, and I suspect that is something that Google has been doing. I don’t think Google would have stopped that, and I can’t think of a different approach they might use. I do think some site owners do create and/or use the kinds of hubs that you mention.
Hi Rebekah,
I’m not sure that I’ve seen a clear statement from Google that anchor text would have less influence in the future, but something like phrase-based indexing can lessen the impact of anchor text that might not be very relevant or appropriate, and provides a way to stop Google Bombs as well.
I agree with you that Google wants as much of what they do to be as much of a mystery as possible.
Hi Eduard,
You’re welcome. Not sure how much any of the analysis methods I’ve listed have changed, and I know there are a number of others as well. Google provided enough information with their post to make this a mystery, and not enough to give us a clear idea of what they might have changed. Is it tied to the devaluation of link networks, since that seems to be something Google appears to have done a lot of recently? Maybe.
From a strategic perspective, Google has moved into dangerous ground by devaluing third party links. Granted this has been a gradual process over the past several years.
Google's principal value add as a search engine – its most defensible position – was that it was merely "rolling up the data" about what the denizens of the web thought was important / useful. One metric of this is links (one link = one vote); search behavior (what clicks, sticks, bounces) was another key indicator. Both are independent, fairly objective, and relatively hard to fake. Google wasn't directly "ranking" the quality of content – just reporting what users / webmasters thought.
What are the alternatives?
– Social Media? Google’s out of position there, Social Media buzz is also relatively easy to fake.
– Semantic Analysis???
The latter is a very scary place – if Google’s “automated review” of what it thinks a good website should have becomes the basis of determining search rank. Both for webmasters and Google. Imagine a political candidate whose website ranked on the second page – because they didn’t follow Google’s rules… or a Google competitor in a similar situation? Once Google starts directly rendering an opinion about what is good / bad, the objectivity of their search results comes into question…
I enjoyed reading your article. Every single method listed here makes sense… I would add traffic as a good parameter, as I'm pretty sure that a link that receives 0 clicks has less value than a link that generates thousands of visits.
Hi John,
We are moving into some uncertain areas, with Google considering more signals, including some that are likely social and others that may rely more upon semantic analysis. In addition, Google has been expanding their definition of what "relevance" means, beyond the simple matching of keywords from a query to keywords on documents. Google is trying to understand the situational or informational needs behind many queries, and the concepts and categories that they might best match. It's also likely that more user behavior data is playing a role in what ranks where, or is the basis of statistical models that try to best predict user behavior.
Hi Simone,
When I wrote this post, my motivation was to show that Google uses a lot of different link analysis methods, and has described a good many of them in patents or papers. I don’t know if any of the 12 I listed were the one that Google stopped using. I probably could have listed over 100 if I had a week or two to do it. 🙂
Google is creating a great deal of confusion on our part due to changes in its search algorithm. Some web administrators have even received e-mail notifications from Google warning them about unnatural links to their site. Yet the manner by which Google determines "unnatural" is still a big question. Google advises us to create quality content, but can hardly protect us from the possibility of poor links being directed at our site by our competitors.