A Replacement for PageRank?

Representatives from Google announced recently that they would no longer be updating their PageRank toolbar annotations for web pages. Google had been updating those 3-4 times a year for over a decade.

Does this news indicate that Google is no longer using PageRank, or that PageRank has changed in some significant way? (The ranking signal isn’t the toolbar annotation itself, which was too infrequently updated to be an accurate reflection of what PageRank might have been for a page.)

We've been told that Google will no longer be updating this annotation or proxy for PageRank.

Could it be a sign that Google has found something different?

A Google patent granted this past September has aspects that involve a global score for web pages (or web sites) that in some ways are similar to PageRank.

It also has a way of ranking individual pages within a site that could lessen the global score if the pages that are relevant to a query aren’t very important to the site.

A flow chart from the patent giving an overall synopsis of how it works.

Some of the ranking signals used are independent of the topic of a query, such as how many links point to a site’s pages and how many links point externally from the pages of a site.

Parts of it reminded me of Jon Kleinberg’s Hubs and Authorities scores. That was developed around the time that PageRank was, but looked at a couple of different metrics to determine how authoritative pages might be.

There also seems to be an element that looks at the themes of pages and sites and the links from them to pages on other sites, and finds value in common topics between them, like a Topic Sensitive PageRank.

A slide from a presentation from the inventor of Topic Sensitive Pagerank, who has long been a Google Search Engineer

The patent starts simply, by telling us that search engines rank web pages and other objects (images, videos, books, businesses, and others) to show search results responsive to a query. The order of those results may be based on various factors.

One type of information used in rankings can come from sources that are external to the web pages, and can reflect both the quality of a page and information about the content of the page that reflects its relevance to a query.

Global Rankings

When a query is received by a search engine, objects (pages, images, media files, etc.) are ranked to generate a global ranking. The ranking is based at least in part on how relevant they are to a query and on the relative authority of each object compared to other objects, including objects on other sites and on the same site.

Onsite Rankings

A number of pages from each site can be ranked based on a set of onsite criteria to create an onsite ranking.

A combined ranking is created for each page based on a combination of the global rank of the site, and the onsite ranking of the page.

In response to a query, a list of pages may be presented based on combined rankings.
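
To make those steps concrete, here is a minimal Python sketch of a combined ranking. The patent doesn’t say how the global and onsite rankings are combined, so multiplying the two scores is purely my assumption, and the Page class, the score values, and the hostnames are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    site: str
    onsite_score: float  # hypothetical: importance within its own site, 0 to 1


def combined_ranking(pages, site_global_score):
    """Order pages by (global score of the site) * (onsite score of the page).

    Multiplication is an assumption; the patent only says "a combination".
    """
    scored = [(site_global_score[p.site] * p.onsite_score, p.url) for p in pages]
    return [url for _, url in sorted(scored, reverse=True)]


# Hypothetical data: a large clothing site and a smaller watch specialist.
pages = [
    Page("clothes.example/suits", "clothes.example", 0.9),
    Page("clothes.example/watches", "clothes.example", 0.2),
    Page("watches.example/mens", "watches.example", 0.8),
]
site_global_score = {"clothes.example": 0.9, "watches.example": 0.6}

ranked = combined_ranking(pages, site_global_score)
print(ranked)
```

With these made-up numbers, the watch page on the stronger clothing site ends up below the page from the smaller watch specialist, because its onsite score pulls the combined score down.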

Possible Features in this Process

So, the global ranking of a page is modified by the internal structure of the website it is within, or some threshold level of relative authority of the page compared to other pages, possibly based in part on a relative placement of the page within the internal structure. This could be done in part by an analysis of links to the page from other pages on the site. Sort of like a PageRank analysis limited to a single site.
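
A “PageRank analysis limited to a single site” could look something like the sketch below. It is the standard power-iteration form of PageRank run only over a site’s internal links. The patent doesn’t give a formula, so the damping factor, the iteration count, and the example link graph are my assumptions.

```python
def onsite_pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a single site's internal links.

    links maps each page to the list of same-site pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for src in pages:
            targets = links.get(src, [])
            if targets:
                share = damping * rank[src] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no internal links spreads its score evenly.
                share = damping * rank[src] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical internal link structure for a small clothing site.
internal_links = {
    "/": ["/suits", "/shirts", "/watches"],
    "/suits": ["/", "/shirts"],
    "/shirts": ["/", "/suits"],
    "/watches": ["/"],
}
scores = onsite_pagerank(internal_links)
```

In this toy graph the home page collects the most internal links and scores highest, while the watches page, linked only from the home page, scores lowest.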

This ranking might involve looking at:

  • The type of the search query
  • A type of the corresponding website
  • A relative age of the particular resource with respect to other pages from the site
  • A type of content associated with the particular resource

The onsite ranking criteria used to rank web pages within a site are different from the global ranking criteria used to rank the web pages.

A Representative Onsite Ranking for a Page

Onsite ranking scores are computed for pages of a site compared to other pages of the site. A representative page may be determined for the site based on the onsite ranking scores for the pages of the site. The global ranking score for the site may be adjusted based on the onsite ranking score for the representative page.
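
Here is a small sketch of how a representative page might adjust a site’s global score. Picking the best-scoring page among those that match the query, and scaling the global score by that page’s onsite score, are both my assumptions. The patent only says the global score “may be adjusted” based on the representative page.

```python
def adjust_global_score(global_score, onsite_scores, matching_urls):
    """Pick a representative page and scale the site's global score by it.

    onsite_scores maps URL -> onsite score (0 to 1) for pages of one site.
    matching_urls lists the pages of that site that respond to the query.
    Returns (representative_url, adjusted_score).
    """
    candidates = {url: onsite_scores[url] for url in matching_urls}
    if not candidates:
        return None, 0.0
    representative = max(candidates, key=candidates.get)
    return representative, global_score * candidates[representative]


# Hypothetical men's clothing site with only two watch pages.
onsite_scores = {"/suits": 0.9, "/watch-a": 0.2, "/watch-b": 0.15}
representative, adjusted = adjust_global_score(
    global_score=0.9,
    onsite_scores=onsite_scores,
    matching_urls=["/watch-a", "/watch-b"],
)
```

For a watch query, the representative page here is /watch-a, and the site’s strong global score of 0.9 is cut to roughly 0.18 because that page matters little within the site.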

Additional aspects of ranking features:

  • The global ranking score for a site may be computed based on data that is identified without reading information from the particular site. In other words, that score is query independent and the topic of the query doesn’t matter.
  • The global ranking score for a site may be based, at least in part, on a level of trust in a domain associated with the particular site. This sounds like it anticipates the possibility that multiple sites might be contained on one domain, like a WordPress.com.

Interestingly, the link above using the anchor “query independent” is from an article co-written by one of the inventors of this patent, titled “Query-Independent Evidence in Home Page Finding.” I suspect that I’ll be looking for and finding more aspects of this patent reflected in other papers and patents and so on.

Possible Advantages

The patent points out the following “advantages.”

(1) Search result relevance might be increased by incorporating local signals into their rankings.

(2) Local signals may include information relevant to a particular site and may also provide additional information for ranking a site relative to other sites in search results.

(3) The possibility of reading inaccurate or unreliable data from local signals may be balanced by looking at the structure of a site or the relative authority of the site when using local signals to rank search results.

Onsite and offsite search ranking results
Invented by Sundeep Tirumalareddy and Trystan G. Upstill
Assigned to Google
US Patent 8,843,477
Granted September 23, 2014
Filed: October 31, 2011

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for ranking search results.

One method includes:

  • Ranking web objects in response to a search query to generate a global ranking based on a relevance of each web object to the search query and a relative authority of each web object compared to other web objects in the plurality of web objects, each web object including a web page in a corresponding website that includes a plurality of web pages;
  • Ranking the plurality of web pages corresponding to each website based on onsite ranking criteria to generate an onsite ranking;
  • Generating a combined ranking for each web page based on a combination of the global ranking of the web object that includes the web page and the onsite ranking of the web page; and
  • Presenting web pages responsive to the search query based on the combined rankings.

Offsite Data

The global ranking may involve offsite data, including signals that can be identified without reading information from the web object, such as:

  • Number of links to a page or site from other unrelated sites
  • Number of times the page or site has been selected in search results for a particular query
  • Other statistical data about the relevance or authority of a site associated with the page

Onsite Data

Onsite data may also be used with offsite data to compute the global ranking score for a page in the search results.

Onsite data can include data based on information obtained from the page or site, such as:

  • Number of keywords on the web page or website responsive to the search query
  • Location of the responsive keywords
  • Number of links to the web page from other pages on the same domain, and/or
  • Placement of the web page in the structure of the website – a homepage may be regarded as more important than another page requiring navigation through several hyperlinks before it can be viewed.
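
The last item, placement within the site’s structure, can be measured as click depth from the home page. The sketch below uses a breadth-first search over internal links. The patent doesn’t name a method, so this is just one plausible way to compute it, and the link graph is hypothetical.

```python
from collections import deque


def click_depth(links, home="/"):
    """Return the minimum number of clicks from the home page to each page."""
    depth = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depth:
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


# Hypothetical internal links: the watch model page sits three clicks deep.
site_links = {
    "/": ["/mens", "/about"],
    "/mens": ["/mens/watches"],
    "/mens/watches": ["/mens/watches/model-a"],
}
depths = click_depth(site_links)
print(depths["/mens/watches/model-a"])  # prints 3
```

A page at depth 0 (the home page) could then be treated as more important than one that takes several clicks to reach.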

Other Issues Regarding Ranking Web Pages

More than one page on a site might be relevant to a query, and these “related” pages might be compared by looking at the “onsite data” listed above, such as keywords on a web page related to a query.

For example, the global ranking score may be based on both offsite data and onsite data, but more weight may be given to offsite data when calculating the global ranking score. Similarly, the onsite ranking score may be based on both onsite data and offsite data but with more weight given to the onsite data.
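
That weighting can be sketched as a simple weighted average. The patent doesn’t give weights or a formula, so the 0.7 and 0.3 values and the blend function below are only illustrative assumptions.

```python
def blend(offsite, onsite, offsite_weight):
    """Weighted average of an offsite and an onsite signal (each 0 to 1)."""
    return offsite_weight * offsite + (1.0 - offsite_weight) * onsite


# Hypothetical signals for one page: strong offsite data, weak onsite data.
offsite_signal, onsite_signal = 0.8, 0.3

# Global score leans on offsite data; onsite score leans on onsite data.
global_score = blend(offsite_signal, onsite_signal, offsite_weight=0.7)
onsite_score = blend(offsite_signal, onsite_signal, offsite_weight=0.3)
```

The same two inputs produce a global score of about 0.65 and an onsite score of about 0.45, simply because each score trusts a different source more.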

Some other signals may include:

(1) How many times a word is used – a page with more than one instance of the word “laptop” could indicate the page is relevant to “laptop computers.”

(2) How prominent a word is on a page – the word “laptop” appearing in the header of the page could further confirm its importance in the content of the web page.

(3) How important a page is compared to other pages on the site, such as how often it is linked to from other pages on the same domain compared to others.
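
The first two signals are easy to sketch in Python. The function below counts whole-word matches in the body text and checks whether the word appears in the header. The function name and the sample text are hypothetical, and a real search engine would certainly do far more than this.

```python
import re


def keyword_signals(header, body, keyword):
    """Count whole-word keyword matches in the body and check the header."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return {
        "count": len(pattern.findall(body)),
        "in_header": bool(pattern.search(header)),
    }


signals = keyword_signals(
    header="Laptop Buying Guide",
    body="A laptop is portable. Choose a laptop with enough memory.",
    keyword="laptop",
)
print(signals)  # prints {'count': 2, 'in_header': True}
```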

Offsite Data

This kind of data may be seen as being outside the webmaster’s control, and may indicate relevance, authority, popularity, or importance when it comes to certain subject matter or a specific query.

Topical Relevance to other sites – This offsite data may relate to how important or authoritative a site might be compared to other sites as well, or to the importance of a particular page on a site compared to pages on other sites. So, pages that are associated with well-respected domains, are relevant to laptop computers, and all link to a particular site can indicate the relative importance of the site being linked to.

Authoritative Relevance to other sites – The number of links to the site from others may indicate a higher authority associated with the particular site. The offsite data associated with a website may include information reflecting the site’s authority. A site with a high level of authority relating to laptop computers could be trusted to display pages with reliable content relating to laptop computers.

Off site factors that indicate authority on site – Further, a site may be trusted to place pages in appropriate locations within a site according to the relative importance of each page, accurately defining the relevance of pages on the site.

Reliability Signals – These may include:

  • The number of external links to pages from a site as well as the authority of external websites that have links to the website. Sites associated with a large number of external links may be assigned a higher authority value than sites having fewer external links. This sounds a lot like Jon Kleinberg’s hub scores, where some sites are seen as very reliable pages that connect other important pages.
  • A site linked to by an external site with higher authority (such as a more reputable website) may be assigned a higher authority value than a site linked to by an external website with less authority.
  • The patent tells us that generally, the larger the number of external links to a web page, the higher the authority of the site hosting the web page.
  • The relevance of a site to a query about laptop computers may also be measured by how often a specific search result has been chosen in response to the query.
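
The first three reliability signals can be sketched together: a site gains authority from each external site linking to it, and a link from a more authoritative site counts for more. Summing the linkers’ authority values is my assumption, not the patent’s formula, and the site names and numbers are made up.

```python
def site_authority(inbound, linker_authority, default=0.1):
    """Sum the authority of every external site linking to each site.

    inbound maps a site to the external sites that link to it.
    Linkers missing from linker_authority get a small default value.
    """
    return {
        site: sum(linker_authority.get(source, default) for source in sources)
        for site, sources in inbound.items()
    }


# Hypothetical data: a.example has fewer links, but from a stronger site.
inbound = {
    "a.example": ["news.example", "blog.example"],
    "b.example": ["blog.example", "forum.example", "tiny.example"],
}
linker_authority = {"news.example": 0.9, "blog.example": 0.3, "forum.example": 0.2}

authority = site_authority(inbound, linker_authority)
```

Here a.example scores about 1.2 with two links and b.example about 0.6 with three, so both the number of links and the authority of the linking sites affect the result.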

Other Factors

The onsite ranking criteria may also be used to determine the importance of a particular resource within the website based on other factors, such as the type of query or the type of site.

A web page with information related to the cheapest product of the same brand found on a site may receive a high onsite ranking.

An onsite ranking for a forum site may assign higher onsite rankings to pages containing newer forum posts.
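
Those two examples suggest onsite criteria that switch depending on the type of site. A rough sketch follows, with hypothetical site types, field names, and data; the patent doesn’t describe how a site’s type would be detected.

```python
def onsite_order(pages, site_type):
    """Order a site's pages by a criterion chosen from the site type."""
    if site_type == "forum":
        # Newer posts first; ISO dates sort correctly as plain strings.
        return sorted(pages, key=lambda p: p["posted"], reverse=True)
    if site_type == "shop":
        # Cheapest product first.
        return sorted(pages, key=lambda p: p["price"])
    return list(pages)


forum_pages = [
    {"url": "/thread/old", "posted": "2013-02-01"},
    {"url": "/thread/new", "posted": "2014-10-20"},
]
shop_pages = [
    {"url": "/brand-x/deluxe", "price": 900.0},
    {"url": "/brand-x/basic", "price": 400.0},
]

top_forum = onsite_order(forum_pages, "forum")[0]["url"]
top_shop = onsite_order(shop_pages, "shop")[0]["url"]
```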

This mix of onsite ranking signals and off site ranking signals has implications of its own as well.

For instance, imagine a page that is ranked highly according to the off site signals, but is ranked lowly based upon the on site signals. The site overall might be an important one, but the page on that particular topic may be rare on the site, or of not much importance on the site.

A site that sells men’s clothes could be very important, but it might also sell watches, and might only sell 2 types of watches, which could make the 2 pages those watches appear on seem not that important.

Above, there was discussion of a “representative” onsite-scored web page. If the query involved men’s watches, the “representative” score of the watch pages might be pretty low, and that could negatively impact a combined score for all pages. That site, which doesn’t focus much on men’s watches (2 pages only, out of possibly a whole lot more), probably shouldn’t rank that well for the term.

Thanks to Erik Fantasia of http://www.aroundthisworld.com/ and about.com for asking me to blog about this patent. I missed it in late September. I’m glad and thankful that he brought it to my attention.

46 thoughts on “A Replacement for PageRank?”

  1. Changing the algorithm to evaluate page rank or even changing its name is allright but what will happen when semantic web search will actually come into being in a big way? All the search parameters that are so hot, will come crumbling down.

  2. Hi Bill,

    May be Google has found something different. Thanks for explaining the ranking factors.

    Best Regards
    Miraj Gazi

  3. Hi Miraj,

    Thank you. This was a different and interesting patent, and the mix of approaches it described were pretty interesting. I think I’ll be digging into it for a while.

  4. Hi Alex,

    We seem to be at some halfway point where we have applications fueled by the semantic web side-by-side with search focusing upon web pages. It’s our opportunity to explore and learn.

  5. I think page rank has played out its role. There are many better factors to determine what ranking positions a page should have.

  6. All I see is the same algo they use now, have more topic related site you will rank for a specific keyword or a range of keywords related to the same topic.

    You can name it Newsfeed PageRank !

  7. Hi Bill!

    Another great article. I had heard rumors and figured that Google, while removing the public updates to PageRank, would use an internal metric for ranking. However, it seems like this system is in a few parts, with each giving input into the others (global, sitewide, etc).

    I find it interesting and also surprising that Google is now, seemingly, using a website’s own internal page to do a version of co-citation for authority of the main URL as well as the other related pages as well.

    One question I have concerning these metric calculations is in the categorical selections that Google is making. Is this a static listing that the algos are placing pages and entities into based on content and citations? Or are these non-literal subjects, where their algos are creating general categories and then creating specifics after calculating the content, intent, links/citations, etc? What I mean is that with your example, would the assumed category be Men’s Fashion, Fashion, Clothing, Men’s Clothing, or where? Or, would the algos look at the main page, create a dynamic category of information and start ranking internal pages/entities based on the main page’s determined category? I’m thinking out loud (my apologies), but I’m wondering what method is used as the categorical selection to determine the intent of the entity (the overall site, homepage and internal pages/entities). I guess, in a major way, the category designated as owned by the website would also dictate whether or not an authority in that chosen industry would alter rankings and weight in Google (for instance, a site about Men’s Fashion getting linked to by a women’s fashion site might have a different level of authority weight compared to a site that sells men’s shoes or men’s suits).

    Whatever the case, it looks like links and content will be a constant for the next few years in ranking entities, brands and pages, no matter the amount of semantic that Google itself discusses.

    Love the site, and thanks!
    Jim

  8. What I always found annoying about topic-sensitive ranking factors and the long evolution of “themes” is the failure to realize that information architecture by topic is only 1 way that people organize and label web content.

    There are other, and even more preferred, ways of organizing and labeling content. Until Google (and other search engines) software recognize, acknowledge, and integrate information architecture properly into their search algorithms, they will not evolve effectively.

    My 2 cents.

  9. Hi Shari,

    What we honestly can’t tell are the details of how Google might implement the concepts of internal site structure, information architecture and themes into the algorithm being described in this patent.

    Like most other patents, it only provides enough information that someone who is “learned” in the field has an idea of what is being patented. A patent description never provides enough detailed information so that it could be used as a blueprint for others who might want to copy it. If Google writes in a patent that they might follow certain processes, it’s possible that they might do more than a perfunctory information architecture by topic.

  10. Bill,

    Yes indeed! I am exploring since past few days, as I got to know about semantic web very recently, after a long long time some thing got me this excited.

    I am searching for the ways to make my web pages semantic web friendly. My knowledge is at very initial stage, if you could write some thing about it, it would be great!

  11. Revealing post as always, Bill. I read it twice.

    Important point by Shari. Often times the best information design isn’t structured or labeled according to “inventory”, but according to the goals of people. Great sites that don’t meet conventional, topical, architecture shouldn’t fall by the way side in rankings as a result.

  12. Thank you for this great article! But Many SEOs still find domain with high PR to drive their websites as well and I’m quite sure that high quality contents will be important role more than focus on PR.

  13. Nice writeup as always and a great find by Erik. Seems like alot of care when doing page interlinking is the way forward for onsite. Topical links for outbound backlinks.

  14. Hey Bill

    I don’t think this would be a replacement, but an additional signal, perhaps #201? 🙂

    Even if PageRank is removed/replaced as a ranking signal, I still think it’d be maintained as a crawling signal.

  15. And what about the already acquired high PR of a webpage? How it is going to affect the overall ranking score comparing to a new website with no PR?

  16. Hi Bill,

    Interesting article and incredible research. PageRank has been long due for a change, anyway. Even if Google totally removes it from the search rankings or change the algorithms to have less value, Pagerank will stick as one of the factors in the search algorithms, perhaps on a lower level.

  17. Hi,

    I have a couple of questions. Since Google does not want you to have your links at the very bottom of the page anymore, for those of us who are website designers who add our links at the bottom of sites we have done…what should we do?

    The next question is, I have H1, H2 etc tags. Can I make the H2, H3, the same size as H1 since you should only have one H1 on the page? Does it matter and will I get penalized?

    Thanks,

    L.S.

  18. I could not get it Bill, does it mean that If a particular web page has been optimized for a particular keyword but the content does not do justice to that keyword then global score would be reduced? But isn’t this happening already?

  19. Phenomenal research! The additional On-Site factors you mentioned were very interesting and should give a user a better experience too. Specifically in your example about lowest price products. It is mind boggling how much smarter their engine is becoming.

  20. Thanks to Moz top 10 email for bring me to you Bill. Apologies for not catching this in Gplus.

    Question for you: What do you feel the future might look like for those of us ranking well across North America pre localization updates? I’ve seen a surge in local prospects knocking at my door via search, and more international prospects arriving by social. Is this the future?

  21. Hey Bill,

    Saw your site on the Moz top 10 and thought I would check it out. Google most definetely have their own system, but I think it’s made up a few factors that we already know of. Personally I am seeing Topical Trust Flow being quite important with a few of my sites.

    On-Page has always been important and linking is relatively the same as it has been for the past few years, but if the links coming to the site seem to be from all different topical sources, it doesn’t make sense to me. Using Majestic, I’ve noticed that most (no where near always) of the top 3 positions have links from other strong TTF sites or pages on those sites with specific relevance.

    Perhaps TTF plays a bigger part in this than we think?

  22. Thank you for discussing widely the page rank factors and new idea introduced by google. I think stick to the basics of the page rank and having quality content in a website is the main factor then there will be nothing to worry about changing the strategy of the page rank or its name.

  23. Bill,
    I think it always remains myth!! however, Google is planning something new and will definitely announce in few years. I think back links importance is decreasing now amd in future Google will be going to consider factors like on page ranking, bounce rate.

  24. Google does not want you to have your links at the very bottom of the page anymore, for those of us who are website designers who add our links at the bottom of sites we have done…what should we do? There are other, and even more preferred, ways of organizing and labeling content. Until Google (and other search engines) software recognize, acknowledge, and integrate information architecture properly into their search algorithms, they will not evolve effectively.

  25. Interesting read but doesn’t everybody already know that PageRank has kind of been dead for some time now?
    I honestly can’t remember the last time I even though about PageRank.

    Devin

  26. Hi Dev,

    Most people didn’t get that memo. It’s easy to say what you just wrote, but can you make an intelligent argument for PageRank having actually died? I just did by presenting a very reasonable replacement from Google. If you’re going to call it redundant, what is your argument? And “I can’t even remember the last time I thought about pagerank” doesn’t count. 🙂

  27. Great read Bill,
    Whenever I read these articles it always gets me thinking about where search will be in 10 years time, and which of today’s attributes will still be used?
    Neil

  28. Hi Neil

    I find myself wondering that as well. There are a lot of directions that a search engine could go in, but the biggest likely involves keeping their visitors happy.

  29. Hi Hammad

    Thanks. There’s really no way to tell with any accuracy whether or not Google is continuing to use PageRank at all these days. We do know with some certainty that Google will no longer be updating the toolbar PageRank, but that doesn’t tell us anything about their continued use of PageRank for ranking pages. The process described in the patent in this post still uses things like backlinks, but possibly differently than PageRank does.

  30. But isn’t it mean that Google has stopped updating toolbar update only? may be Big G is updating the PR on back-end? If backlinks and other search engines concepts are alive, I think PR is still alive but we can’t see it.

  31. Great write up Bill!

    Always enjoy reading your stuff. To me though overall it sounds like not much is REALLY changing here. For years PageRank has been very easy to manipulate and has been exploited through link building tactics like snatching up dropped domains and re-theming the site for the purpose of selling links.

    To me its always been important to focus on the relevancy and quality of a site as compared to strictly looking at the PR of a site. If I want to rank a site one would need to focus on optimizing content the PROPER way and then focusing on trying to build engaging, informative content and also getting backlinks from niche-relevant sites.

    Plus on top of that, we now have other metrics like PA/DA from MOZ as well as Cemper Trust, which to me is much more indicative then Google PR.

    So my question to you is… do you think anything has REALLY changed even if they stop updating PR?

    Paul

  32. Hi Paul,

    It’s unlikely that Google would ever use metrics from SEO Tools providers, such as Moz or Cemper. It’s also unlikely that today’s version of PageRank is too similar to the PageRank patented through Stanford back in the late 1990s. But, we have seen search quality related patents from Google on topics such as Panda that share some similarities with this patent. It’s possible that if Google is using this one, many of the changes it describes may have been the result of an evolution.

  33. Hi Gabe

    The idea behind PageRank is for a search engine to look at the links pointing to a page, and create a metric based upon the quality of the links pointed to that page. A link from a page that has a lot of high quality links pointed to it, like the home page of the New York Times, is going to likely pass along a lot of PageRank with it. A link from a page that has a few lower quality links pointed to it, likely won’t pass along much PageRank.

    It’s worth digging into the original whitepaper on PageRank to learn more:

    The PageRank Citation Ranking: Bringing Order to the Web
    http://ilpubs.stanford.edu:8090/422/

    Bill

  34. Good to see google moves forward with its ranking algorithm. I hope it is more accurate than current system. Google just screwed up the system in so many ways, but it all turned to money right now. To get first position, we got to pay a lot to google via adwords.

    Sad state for quality contents.

    Robin.

  35. Is google already using this method of ranking and do you think it might affect existing SEO ranking?
    Thanks

  36. Hi Kelsey,

    We can’t tell for certain if Google is actually using this method, but it sounds like it takes the parts from a number of approaches that Google has likely used in the past, so it is a possibility.

  37. Great blog.

    I’ve always used PR as a shacky signal, telling me if a domain has a good reputation in Google’s algorithmn or not. Nothing more, nothing less.

    Last time Matt Cutt’s said that there would be no PR update in a long time, an update occured few days/weeks later. 🙂

  38. Hi Jesper,

    Thanks. True about Matt Cutts saying that there wouldn’t be a PageRank Toolbar update, and then one happened shortly afterwards. The toolbar PageRank is only an indication of what PageRank might exist on a page at one point in time, and actually PageRank for a page might change over time to something different. But, the fact that they seem to be abandoning Toolbar PageRank doesn’t seem to bode well for it as a measure of the importance of pages. With that statement about not updating the toolbar, even using it as a measure of whether a page has a good reputation or not may not be possible anymore.

  39. Hello,
    So 1 thing is clear now that Google wont be updating PR of website. But does anyone got a clue what it will be the next thing that Google is trying to do to rank website, of course ranking is important other wise how you are going to distinguish between a legitimate content website and a copy pasted site ?

    Next question I have in mind, does this changes the way of SEO we have been doing till now, like link building, forum posting and to some extent paid advertising ? So how we are going to see if that actually paid us some profit. (Other than site traffic) ?

  40. Hi Dota,

    Chances are that many of the things you are doing for SEO for your site won’t be changing. What metrics do you use now to measure success with your site? I don’t know what Key Performance Indicators you may have set to determine success with your site, so I can’t tell you what potential changes you might see with your site.

  41. What a great post it is really. I know the term Page ranking and of course the meaning of it but i do not have such a deep knowledge on it but now after reading this post I know how it works. Well for replacing this I think Google will have better replacement.

  42. It is really nice article Bill; you have given an input on how Google might implement the concepts of internal site structure. To rank high in search results the onsite and offsite ideas shared by you is really helpful and drags interests to know more on the topic and implement the same on site. Good to know that Google moves ahead with its ranking algorithms.
