Does Google Use Reachability Scores?
Can the quality of the pages, videos, or other documents that your pages link to influence their rankings, based upon reachability scores? A newly granted patent from Google describes how the search engine might look at linked documents and other resources reachable from a page, video, or image to determine such reachability scores.
Search results for a query might be promoted (boosted) or demoted based upon that reachability score, which is calculated from a number of different factors.
Someone clicks on a search result, and while there they find links to other resources that they might click upon. Different user behaviors recorded by a search engine might be monitored to determine how people interact with the first, or primary, resource visited, and similar user behavior signals may also be looked at for pages, videos, or other resources linked to from that resource. Reachability scores might also be calculated for those secondary resources, looking at the tertiary pages and other resources linked to from the secondary resources.
Calculating reachability scores may follow a process like the following (a rough sketch in code appears after the list):
1) Google might begin by identifying secondary resources that are reachable through one or more links of a primary resource, where the secondary resources are within a number of hops (clicks, gestures, etc.) from the primary resource.
2) An aggregate score for the primary resource might be calculated based on the scores of the secondary resources, where those scores are calculated based on prior user interactions with the secondary resources.
3) That calculated reachability score can impact the ranking of the primary resource in search results.
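The patent doesn't spell out the aggregation math, so the following is only a minimal sketch of the idea in Python, assuming a breadth-first walk of links and a simple mean over the interaction scores of the reachable resources. The `Resource` type, the mean aggregation, and the default hop limit are illustrative assumptions, not details from the patent.

```python
from __future__ import annotations
from dataclasses import dataclass

@dataclass
class Resource:
    url: str
    interaction_score: float  # derived from prior user interactions (CTR, access times, ratings)
    links: list[Resource]

def reachability_score(primary: Resource, max_hops: int = 2) -> float:
    """Aggregate the interaction scores of resources reachable from
    `primary` within `max_hops` link hops (hop 1 = secondary resources,
    hop 2 = tertiary resources, and so on)."""
    seen = {primary.url}
    frontier = [primary]
    scores: list[float] = []
    for _ in range(max_hops):
        next_frontier = []
        for resource in frontier:
            for linked in resource.links:
                if linked.url not in seen:
                    seen.add(linked.url)
                    scores.append(linked.interaction_score)
                    next_frontier.append(linked)
        frontier = next_frontier
    # A simple mean; the patent leaves the aggregation function unspecified.
    return sum(scores) / len(scores) if scores else 0.0
```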
Prior user interactions for the secondary resources might:
1) Represent an aggregation of multiple users’ interactions with the secondary resource.
2) Include a median access time or a click-through-rate (long clicks, medium clicks, short clicks) associated with the secondary resources.
A resource’s reachability score is a prediction of the amount of time a searcher might spend accessing the primary resource as well as the additional (secondary) resources linked to from it.
Adding the influence of reachability scores to an initial ranking score calculated from relevance and importance (such as PageRank) may lead to search results that improve user experiences, and potentially improve an advertiser’s ability to reach users (for primary and secondary resources that carry advertising).
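As a rough illustration of how such a signal might be folded into an initial score, here is a hypothetical combination. The multiplicative form and the `weight` value are my assumptions; the patent only says the reachability score is provided as an input signal to the ranking process.

```python
def adjusted_ranking_score(initial_score: float,
                           reachability: float,
                           weight: float = 0.2) -> float:
    """Promote or demote an initial relevance/importance score
    (e.g., one influenced by PageRank) using a reachability score."""
    return initial_score * (1.0 + weight * reachability)
```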
The patent is:
Determining reachability
Invented by Hao He, Yu He, and David P. Stoutamire
Assigned to Google
US Patent 8,307,005
Granted November 6, 2012
Filed: June 30, 2010
Abstract
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a resource’s reachability score.
In one aspect, a method includes identifying one or more secondary resources reachable through one or more links of a primary resource wherein the secondary resources are within a number of hops from the primary resource; determining an aggregate score for the primary resource based on respective scores of the secondary resources wherein each one of the respective scores is calculated based on prior user interactions with a respective secondary resource, and providing the aggregate score as an input signal to a resource ranking process for the primary resource when the primary resource is represented as a search result responsive to a query.
The description of the patent provides examples related to video searches and search results, but stresses that it can apply to many other types of documents, such as HTML pages, word processing documents, PDFs, images, feeds, and more.
The patent uses the word “hops” instead of clicks because it contemplates user behaviors such as touch gestures, voice commands, or other input types other than just clicks.
User interactions for videos could include things such as:
Click through rates,
User ratings,
Median access times, etc.
User behavior signals are treated as considerably more trustworthy or reliable when there are more data points behind them. So, user data about a resource that has only been accessed a couple of times might not be seen as very reliable, while data from a resource accessed a thousand times or more would carry much more weight.
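One common way to model that kind of discounting is to shrink an observed score toward a global prior, so that a score backed by a handful of visits stays near the prior while one backed by thousands of visits converges to what was observed. To be clear, the patent doesn't describe its weighting method; this sketch, including the `prior_mean` and `prior_weight` values, is purely an assumption.

```python
def shrunk_interaction_score(observed_mean: float,
                             n_observations: int,
                             prior_mean: float = 0.5,
                             prior_weight: int = 100) -> float:
    """Blend an observed interaction score with a global prior,
    weighting the observed data by how many observations back it.
    With 2 accesses the result sits near the prior; with 1,000+
    it is dominated by the observed mean."""
    total = observed_mean * n_observations + prior_mean * prior_weight
    return total / (n_observations + prior_weight)
```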
The number of clicks or hops within which a resource might still be considered secondary may be based upon some preset amount, or upon the average number of clicks or hops people make after clicking on a result during a search session, or during a certain period of time (such as 24 hours).
The reachability scores might also be influenced by other signals that indicate trustworthiness, such as previous user interactions, whether clicks on those resources tend to be long clicks, or whether the resources have been deemed trustworthy in some other way.
A long click for something like a video might be based upon whether the video has been watched for at least 30 seconds or, if the video is shorter than that, whether the whole video has been watched. Interestingly, YouTube recently noted on its blog that the rankings of videos might be influenced by time watched. That new signal may or may not be based upon the reachability score described in this patent, but it does seem to be influenced by the concept of a “long click.”
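That video example is concrete enough to express directly. Here is a small sketch of the long-click test; the function name and parameterized threshold are mine, but the 30-second rule comes straight from the patent's example.

```python
def is_long_click(watch_seconds: float,
                  video_length_seconds: float,
                  threshold_seconds: float = 30.0) -> bool:
    """A video view counts as a "long click" if it was watched for at
    least 30 seconds, or watched in full when the video itself is
    shorter than 30 seconds (per the patent's example)."""
    if video_length_seconds < threshold_seconds:
        return watch_seconds >= video_length_seconds
    return watch_seconds >= threshold_seconds
```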
Are your pages or videos or other documents being promoted (or falling behind other search results that are being promoted) based upon reachability scores?
Nice one. But I would assume that to really implement it web wide, not just on Google-owned resources like YouTube, Google would need a 100% Google Analytics and/or Chrome market penetration. As a side note, I am guessing that bounce rate for secondary resources accessed via links from an analysed resource would also be a useful signal, right?
@Irish: Google can already determine the user path / flow very well without Chrome and Analytics. They don’t use Chrome and Analytics right now, and there is no reason to believe they will.
But currently they can determine the upstream and downstream of the users very well.
@Łukasz Rogala: heh, 10 minutes would actually be a pretty decent onsite time 🙂 I am talking more of traffic bouncing off site right away, as soon as they access the page – now that should already signify something.
@eyepaq: would hate to start a small war in Bill’s comments but what are you thinking? You think Google is some sort of a deity, don’t you?
Interesting that this is patented – It’s been a popular theory that Google has been doing this for a while, but I’ve still never been able to figure out how they’d track it.
Tracking Chrome to that level is a legal nightmare (not that that’d stop ’em, of course) and IE/Firefox/Safari still have too much penetration to make Chrome alone a reliable data source. G:Analytics doesn’t have the market penetration either to get reasonably accurate data. Can you imagine the fallout if there started being ranking reasons for using Analytics? Lawyers would make sweet bank yo’.
The only way I can think of that would work is tracking if a site links to authoritative, high PR sources. They can trace that with Googlebot alone easy enough, but if it became a major ranking factor it’d be pretty shockingly easy to abuse. Prioritising onsite factors just isn’t Google’s style.
Fantastic summary as always Bill! The concept of the click being replaced in favor of the hop has significant implications for linking strategies, most obviously in the social and mobile realm, where the hyperlink has been vanishing in favor of other more instantaneous forms of resource sharing that directly impact a site’s “hopibility”. I think I just invented a word:)
IrishWonder: imo bounce rate isn’t a good choice. What about cooking sites, where you get a recipe and then leave to cook? Time on the page about 10 minutes, and BR 100%.
It would definitely be a solid way to quash SEO-only links being built. Basically anything built just for a crawler would be identified this way if they do find a way to accurately measure it across the web.
Does Google use Google Analytics or Chrome to analyze your readability score? I don’t agree with Rogala that bounce rate does not affect your ranking. It does. If we are not using Google Analytics, does Google still measure anything or not? A simple question for everyone 🙂
Hi Michael,
It’s been a popular theory for a while? Have you ever seen anyone write about this anywhere? I haven’t.
Google toolbar could easily be one source of information – Google doesn’t need everyone to use Chrome for Google to collect information from it, and have a large amount of usable data. That’s not needed, and it’s very unlikely that Google is using Google Analytics data – Google Analytics really doesn’t collect the kind of information used in this process in a way that would make it easy to use.
@IrishWonder Don’t you think there is a possibility that Google is buying clickstream data from ISPs?
OK, I went ahead and started the process to trademark the name Hopibility™.
A thought provoking article Bill! More and more I’m getting deja vu though with SEO. Suddenly content is ‘de rigueur’. Us old timers know it always has been! If what you’re saying above is correct then again, seen it, done it. Linking out to valuable resources and adding value for your customers is something it would be great to see sites doing more of, now the days of PageRank hoarding are hopefully on the demise.
I think recent Google algo changes actually just give white hat SEO’s more ammunition to do what they’ve always done but sometimes struggled to persuade others is the right thing.
I love the idea of incorporating the CTR of outbound links on a given page into the ranking algo. It makes a lot of sense that a page that has outbound links which are actually being clicked through is going to be of higher quality than a link farm page where outbound links are rarely, if ever, clicked. Those inbound links are being looked at as carrying more weight in Google’s eyes because they are most likely editorial in nature. Why not give some of the credit to the linking page for creating good content with relevant outbound links?
IrishWonder made a good point. I have no idea how they’d track this other than analytics and/or chrome. Regardless it would make sense if they’re trying to up the ante on trying to verify quality of links or where they’re coming from and how they’re being interacted with.
Hi IrishWonder,
I don’t think that Google would need to collect every user behavior on the Web to implement something like this.
Of course, the more data they have concerning interactions with resources that might be connected to each other, the more trustworthy the system will probably be. I suspect that unless there’s a certain (sufficient) threshold of activity regarding the use of resources involving a primary resource for a query, reachability scores won’t be calculated to promote content in search results.
I also don’t think that it would be necessary for Google to look at Google Analytics information either. They have query and click logs of their own that they can mine for data. They collect a lot of information to track people’s web histories already.
Hi Eric,
When I first ran across the term “hops” within the patent, I wondered if it might become a term to replace “clicks” since it applies to alternative ways to go from one document to another on devices that don’t necessarily have mice, and don’t use mouse clicks to visit those pages or other resources. I like the term, and the hopibility term you’ve just coined. I wonder if its use will spread.
I looked to see if there are any pending or granted Google patents at the USPTO that use the word “hops” in that context. There are Google patent filings that use the term, but it doesn’t look like they use it to describe that particular meaning. For example, here’s one use:
“Moreover, note that communication in the network may occur via multiple hops between nodes 118. ISPs 120-1 and 120-2 may provide information about the relative location of a given node in their sub-network to a particular user in order to determine appropriate routing and/or billing.”
Hi eyepaq,
I’m prone to agree with you, but I love it when I dig up a patent that actually shows us part of the process involved in an approach like that. I’ll keep my eyes peeled to see if something in a patent might give us more details.
There have been quite a few patents from Google that tell us that Google collects a fair amount of user data to use in many different ways, but this one actually provides a more detailed look at how that user data might influence search results. I’m hoping more detailed looks like this one start surfacing.
Hi Łukasz,
I agree that bounce rate is a pretty noisy signal, and there are plenty of reasons why people might visit a page and leave quickly after getting information from the page visited – grabbing a phone number, verification of an address, answer to a question, etc.
The patent does talk about looking at long clicks, medium-length clicks, and short clicks as one way of describing user activity for a page. We also know now that Google may be trying to gather data about the reading speed of pages.
The example of how they might calculate a long click for a video was interesting in that context as well. If someone spends more than 30 seconds watching a video, for instance, they might consider that a long click. If the video is shorter than 30 seconds, but all of it is watched, that might also be considered a long click.
Hi Brent,
I agree. Links that are built solely to try to pass along PageRank probably won’t pay too much attention to the quality of the pages they link to, and their “reachability scores” likely won’t be all that high.
Hi Jam,
It’s a reachability score, rather than readability. Google collects web history information for people logged into Google when they search and browse the Web, which enables Google to personalize search results for people. That doesn’t use Google Analytics or rely solely upon Google Chrome. See:
Basics: Google Web History
Hi Eric,
Nice! I’m wondering how well that might catch on, and if we’re going to see the use of the word “hops” from Google more in the future.
Hi Caroline,
Good points. Back before PageRank and before Google, we did think about what we linked to based on the value that it would provide to our visitors. It was as if we were referring them to someone else, and that reference of ours could impact how those visitors felt about us. Give people links to really bad sites, and they might not think too much of us afterwards.
That’s always something I think seriously about when considering whether or not to add a link in a post, and never about how much PageRank the recipient of that link might receive.
Hi Tim,
The patent doesn’t state expressly that click-through rates are part of the reachability algorithm, but it does imply that they might be considered. The patent is definitely telling us that Google is considering what we link to, and that the value and quality of those resources may influence a reachability score. And it does look like it rewards sites that not only have good and relevant content, but also provide links to quality content as well.
Only having a website for about 6 months or so – articles like this help me form the way I think about SEO. I think “Who, What, Where, When, Why and How” will always be relevant, if not for SEO, then for your own peace of mind.
It is very interesting that Google has a patent for Determining Reachability. I guess we’ll find out? Perhaps that is why facebook changed over the last several months to highlight your reach and such? I’ve also heard Twitter is going to reach rather than follower count? Only time will tell….
This is a great post after the wake of carnage that Penguin left. I think the link-building side of SEO has basically been turned on its head this year, which is *hopefully* a good thing. Many may disagree now, but we all knew it was just a matter of time… things really were getting out of control. And who would argue against quality being the ultimate goal. Spammers.
This is a very interesting progression. However, it makes me wonder whether, instead of going out building inbound links, we will see the increasing tactic of building external links within our own sites to other high-authority sites. It is good, though, as it will make us stop and think about the content we are generating and research relevant external content that will benefit the reader.
To be honest we should go back to the days of ‘web rings’ in the late 90’s/early 2000’s where sites all in the same niche linked to each other as they were extremely relevant to the user and were easier to find than in a search engine.
An interesting patent that would explain some BH claims about getting out of a penalty by adding links to a trusted resource. Everyone is wondering how Google could collect this data; they may use user query data and history to make a guess about the hops a given user may have made.
For example, a user may make a search, click on a result, and not bounce back to the SERP. The result clicked has a link to Wikipedia. Later (hours, days, even months?), the user may search the same term and add “Wikipedia” to it immediately. The first resource may have led the user to the right content he was looking for (the Wikipedia page).
Killing it again with your ability to summarize Google’s patents, Bill. Thanks for passing on your detailed knowledge.
Reachability and time on page, or video length watched. This is all good information, but now is the time to test out the individual theories, like if I publish a series of 30-second videos and spread those across different networks and sites, how the value of those views and links comes into play.
Malcolm brings up another good point; however, I am not sure it is always feasible for larger content producers (pshhh, even the little guys too) to “stop and think about the content we are generating and researching relevant external content that will benefit the reader.”
Matt Cutts’ video on guest blogging:
“If your website links to sites that we consider low quality or spammy, that can affect your site’s reputation”
This spells bad news for the grey hats, black hats and other people wearing hats. If Google can look at how visitors interact with links, she will know that all those paid “guest posts” are pure linkspam because people don’t click. Ditch your monthly targets and build a handful of links that really matter!
I agree with eyepaq, there are more ways than one in which Google can track upstream and downstream without relying on Analytics and Chrome. Hitwise is just one that springs to mind.
Suppose this will become the “next big thing” with the “seo experts” so if you receive a weird mass-mail reading “Sir, I’m contacting you from an SEO company, I will build a resource page for you. Did you know that Google…blah blah blah” don’t be surprised 🙂
This patented use of reachability scores in ranking resources would greatly affect the rampant practice of black hat and grey hat SEO strategies. The concept of clicks, or the patented term ‘hops’, is a great determinant of the reachability score because it is influenced by whether clicks on those resources tend to be long clicks, or have been deemed trustworthy in some other way. However, it is still users’ behavior that is treated as more reliable, which means that user data about a resource that has been accessed less often will not be considered as reliable as data from resources accessed more frequently.
I play it safe with Google; I just write original quality content, try to make my website very user friendly for my visitors, and let my links grow organically. That is the best way to build links and rankings.
Hi Bill,
I have recently been using annotations within videos to strongly direct users to the parts of longer videos that they may most desire to view, a speed lane of relevant, timely links if you like.
As such, this style of link optimisation allows the video to be lengthy yet attract higher engagement throughout the video, 20 or 30 minutes in.
Making videos far longer and far more reachable from the beginning is, in my opinion, the full monty of YouTube’s ‘views not clicks’ mantra, if only YouTube themselves would further their testing of player/resource integrations such as “https://popcorn.webmaker.org/” and if video SEOs like me championed subtitling far more (to improve search inside videos with speech).
Whilst I cannot prove that annotated links, amongst the breadth of video optimisations I use, are more or less helpful in attaining higher ranks for videos, they have improved the ranking of videos and also the click-through rates from videos to product pages (more development work to do here too, namely improving the versatility of links in description fields), etc.
The bottom line is that if you are a video SEO like me, then you should be thinking long-term about what video formats constitute quality, and trying to create greater search experiences through optimisation of a journey via content.
Thanks Bill,
As ever,
Matt
@matdwright
Video SEO expert (in the UK)
What Google really wants is for backlinks to come naturally. This means that other people start linking to your site because of the great content it has. If you can achieve that, then you can actually rank your site very easily. It just takes good quality content building to achieve this.