Does Google Use Reachability Scores?
Can the quality of the pages, videos, or other documents that your own pages link to influence how your pages rank, based upon reachability scores? A newly granted patent from Google describes how the search engine might look at linked documents and other resources reachable from a page, video, or image to determine such reachability scores.
Search results might be promoted (boosted) or demoted for a query based upon a reachability score, which is calculated from a number of different factors.
Someone clicks on a search result, and while there they find links to other resources that they might click upon. Different user behaviors recorded by a search engine might be monitored to determine how people interact with the first, or primary resource visited, and similar user behavior signals may also be looked at for pages or videos or other resources linked to from that resource. Reachability scores might also be calculated for those secondary resources linked to from the first resource, looking at the third or tertiary pages and other resources linked to from the secondary resources.
Calculating reachability scores may follow a process like the following:
1) Google might begin by identifying secondary resources that are reachable through one or more links of a primary resource, where the secondary resources are within a number of hops (clicks, gestures, etc.) from the primary resource,
2) An aggregate score for the primary resource might be calculated based on the scores of the secondary resources, where each of those scores is calculated based on prior user interactions with that secondary resource,
3) That calculated reachability score can impact the ranking of the primary resource in search results.
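The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method in the patent: the `links` dictionary, the `interaction_scores` values, the function names, and the choice of a simple average as the "aggregate" are all hypothetical.

```python
from collections import deque

def secondary_resources(links, primary, max_hops):
    """Collect resources reachable from `primary` within `max_hops` hops."""
    seen = {primary}
    found = []
    queue = deque([(primary, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not follow links beyond the hop limit
        for target in links.get(node, []):
            if target not in seen:
                seen.add(target)
                found.append(target)
                queue.append((target, hops + 1))
    return found

def reachability_score(links, interaction_scores, primary, max_hops=2):
    """Aggregate the secondary resources' scores (assumption: a plain mean)."""
    secondaries = secondary_resources(links, primary, max_hops)
    scores = [interaction_scores[r] for r in secondaries if r in interaction_scores]
    if not scores:
        return 0.0
    return sum(scores) / len(scores)

# Hypothetical link graph and per-resource scores from prior user interactions.
links = {
    "video_a": ["video_b", "video_c"],
    "video_b": ["video_d"],
    "video_d": ["video_e"],
}
interaction_scores = {"video_b": 0.8, "video_c": 0.4, "video_d": 0.6, "video_e": 0.1}

score = reachability_score(links, interaction_scores, "video_a", max_hops=2)
print(score)
```

In this made-up example, video_e sits three hops from video_a, so it falls outside the two-hop limit and does not affect the score.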
Prior user interactions for the secondary resources might:
1) Represent an aggregation of multiple users’ interactions with the secondary resource.
2) Include a median access time or a click-through rate (long clicks, medium clicks, short clicks) associated with the secondary resources.
A resource’s reachability score is a prediction of the amount of time a searcher might spend accessing the primary resource as well as additional (secondary) resources linked to the resource.
Adding the influence of reachability scores to an initial ranking score, which might be calculated based upon relevance and importance (such as PageRank), may lead to search results that improve user experiences, and may also improve an advertiser’s ability to reach the user (for primary and secondary resources that use advertising).
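To make the idea of promoting or demoting a result concrete, here is a minimal sketch of one way an initial ranking score could be adjusted by a reachability score. The patent does not give a formula; the multiplicative form, the `baseline` and `weight` parameters, and the sample numbers are assumptions made purely for illustration.

```python
def adjusted_score(initial_score, reachability, baseline=0.5, weight=0.2):
    """Boost results whose reachability is above `baseline`, demote those below.

    The formula and default values are hypothetical, not taken from the patent.
    """
    return initial_score * (1.0 + weight * (reachability - baseline))

# (resource, initial ranking score, reachability score), all invented values.
results = [("page_x", 1.00, 0.2), ("page_y", 0.95, 0.9)]

ranked = sorted(results, key=lambda r: adjusted_score(r[1], r[2]), reverse=True)
for name, initial, reach in ranked:
    print(name, round(adjusted_score(initial, reach), 3))
```

Here page_y starts with a slightly lower initial score than page_x, but its higher reachability score is enough to move it ahead once the adjustment is applied.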
The patent is:
Invented by Hao He, Yu He, and David P. Stoutamire
Assigned to Google
US Patent 8,307,005
Granted November 6, 2012
Filed: June 30, 2010
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a resource’s reachability score.
In one aspect, a method includes identifying one or more secondary resources reachable through one or more links of a primary resource wherein the secondary resources are within a number of hops from the primary resource; determining an aggregate score for the primary resource based on respective scores of the secondary resources wherein each one of the respective scores is calculated based on prior user interactions with a respective secondary resource, and providing the aggregate score as an input signal to a resource ranking process for the primary resource when the primary resource is represented as a search result responsive to a query.
The description of the patent provides examples related to video searches and search results, but stresses that it can apply to many other types of documents, such as HTML pages, word processing documents, PDFs, images, feeds, and more.
The patent uses the word “hops” instead of clicks because it contemplates user behaviors such as touch gestures, voice commands, or other input types beyond clicks.
User interactions for videos could include things such as:
Click-through rates,
Median access times, etc.
Some user behavior signals are treated as more trustworthy or reliable when they are backed by more data points. User data about a resource that has only been accessed a couple of times might not be seen as very reliable, while data about a resource that has been accessed a thousand times or more would be considered much more reliable.
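One common way to express "more data points means more trust" is to pull a sparsely observed value toward a neutral prior. The sketch below uses that smoothing technique as a stand-in; the patent does not say this is how Google does it, and `prior` and `prior_weight` are hypothetical parameters.

```python
def smoothed_signal(observed, access_count, prior=0.5, prior_weight=100):
    """Blend an observed signal with a neutral prior.

    With few accesses the result stays near `prior`; with many accesses it
    approaches `observed`. Both defaults are illustrative assumptions.
    """
    return (access_count * observed + prior_weight * prior) / (access_count + prior_weight)

rarely_seen = smoothed_signal(0.9, access_count=2)
widely_seen = smoothed_signal(0.9, access_count=1000)
print(round(rarely_seen, 3), round(widely_seen, 3))
```

With the same observed value of 0.9, the resource accessed only twice ends up close to the neutral prior, while the one accessed a thousand times keeps most of its observed value.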
The number of clicks or hops within which a resource is still considered secondary may be based upon a preset amount, or upon the average number of clicks or hops people make after clicking on a result during a search session or during a certain period of time (such as 24 hours).
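The choice between a preset hop limit and one derived from average behavior might look something like the following sketch. The function name, the fallback of one hop, and the rounding rule are assumptions, and the session data is invented.

```python
from statistics import mean

def hop_limit(session_hop_counts, preset=None):
    """Pick how many hops away a resource may be and still count as secondary."""
    if preset is not None:
        return preset  # a fixed, preconfigured limit wins if one is supplied
    if not session_hop_counts:
        return 1  # assumed fallback when no behavior data is available
    # Otherwise use the average hops per session, rounded, but at least 1.
    return max(1, round(mean(session_hop_counts)))

# Hypothetical hop counts observed across five search sessions.
print(hop_limit([1, 3, 2, 2, 4]))
print(hop_limit([1, 3, 2, 2, 4], preset=3))
```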
The reachability scores might be influenced by other signals that indicate trustworthiness as well, such as previous user interactions, whether clicks on those resources tend to be long clicks, or whether the resources have been deemed trustworthy in some other way.
A long click for something like a video might be based upon whether the video has been watched for at least 30 seconds or, if the video is shorter than that, whether the whole video has been watched. Interestingly, YouTube recently noted on their blog that the rankings of videos might be influenced by time watched. That new signal may or may not be based upon the reachability score described in this patent, but it does seem to be influenced by the concept of a “long click.”
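The long-click rule for videos described above (at least 30 seconds watched, or the whole video if it is shorter than that) can be written as a small check. The function and constant names are hypothetical; only the 30-second threshold comes from the example discussed here.

```python
LONG_CLICK_SECONDS = 30.0  # threshold from the video example above

def is_long_click(watched_seconds, video_length_seconds):
    """Return True if a view counts as a long click under the 30-second rule."""
    # Short videos only need to be watched in full.
    required = min(LONG_CLICK_SECONDS, video_length_seconds)
    return watched_seconds >= required

print(is_long_click(45, 300))  # watched 45s of a 5-minute video
print(is_long_click(10, 300))  # watched 10s of a 5-minute video
print(is_long_click(12, 12))   # watched all of a 12-second video
```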
Are your pages or videos or other documents being promoted (or falling behind other search results that are being promoted) based upon reachability scores?