Google’s Agent Rank / Author Rank Patent Filing

I originally wrote the following article 6 years ago, and it was published on Search Engine Land on February 9th, 2007. At the time, I wasn’t sure if we would ever see Google find a way to meld together ranking signals from PageRank and Information Retrieval with relevance signals from authors and publishers and commentators and editors and advertisers.

There’s been a lot of discussion recently about something being referred to as Author Rank with the launch of Google Plus. The Agent Rank patent itself was granted by the USPTO on July 21, 2009. Two continuation versions of the patent were also filed by Google since then, with one stressing the portability of reputation scores for agents, and the other pointing out that not all endorsements from Agents are equal.

I hope that we never do see an “Author Rank,” but would prefer the Agent Rank described in the first patent, where the reputation scores of all of the people who put together the content of a page played a role in the ranking of that page.

If you’re interested in discussing Agent Rank today, I’m one of the moderators of the Google Plus community “Google Authorship & Author Rank.” Stop on by, join if you’d like, and become part of the community.

Here’s my post from 2007:

Google’s Agent Rank Patent Application

Google returns results based upon content appearing upon individual pages, or at specific URLs. But that content could come from different authors, who have different levels of control over it. For example, a blog page may have posts written by more than one author, comments penned by others, and advertisements showing ads that even the owner of the site has no direct control over. A forum might have many different authors responding to an initial post, and may also display advertisements.

Imagine a system that instead of ranking content on a page level, breaks those pages down and looks at smaller content items on those pages, which it associates with digital signatures. Content creators could be given reputation scores, which could influence the rankings of pages where their content appears, or which they own, edit, or endorse.

That’s a broad overview of a new patent application from Google:

Agent rank

Invented by David Minogue and Paul A. Tucker
US Patent Application 20070033168
Published February 8, 2007
Filed: August 8, 2005

Abstract

The present invention provides methods and apparatus, including computer program products, implementing techniques for searching and ranking linked information sources. The techniques include receiving multiple content items from a corpus of content items; receiving digital signatures each made by one of multiple agents, each digital signature associating one of the agents with one or more of the content items; and assigning a score to a first agent of the multiple agents, wherein the score is based upon the content items associated with the first agent by the digital signatures.

Agents and Authority

When we perform a search at Google, we receive responses to queries based upon how relevant those results might be to our search terms. The order of those results is based upon rankings influenced by both query-dependent and query-independent criteria.

Query-dependent criteria are signals that try to identify how semantically related a document is to a query, such as word frequency distributions.

Query-independent criteria are signals that attempt to identify how authoritative, or intelligible, or trustworthy a document might be, such as PageRank. PageRank tries not only to look at the number of references to a document, but also the quality of those references.

Can authority or trustworthiness be measured in a different way, based upon understanding who the author of content on pages might be, through the use of digital signatures associated with an author? Could query-independent signals be tied to that author, so that a score for content created or controlled or edited or reviewed by the author could be used to rank pages?

This patent application describes a system where that might be a possibility.

Agent Control of a Resource

The document begins by looking at how much control agents might have over specific resources.

When all content from a resource is under the control of a single agent, the reputation of the agent can be directly related to the content of that resource. But a page may have more than one hand involved, with each agent controlling different parts of it. In that case, if the different partitions of information can be identified, a reputation for each agent might be calculated at the partition level.

Difficulties with this approach include the fact that an agent may contribute content to many different resources, a single resource may be created or controlled by multiple agents, and the ownership and control of a resource may change over time.

Benefits of the Approach

The patent filing describes a number of features and approaches, and they are worth looking over, but I want to focus upon the benefits that the inventors say this will bring to us:

  1. Identifying individual agents responsible for content can be used to influence search ratings.
  2. The identity of agents can be reliably associated with content.
  3. The granularity of association can be smaller than an entire web page, so agents can disassociate themselves from information appearing near the information for which the agent is responsible.
  4. An agent can disclaim association with portions of content, such as advertising, that appear on the agent’s web site.
  5. The same agent identity can be attached to content at multiple locations.
  6. Multiple agents can make contributions to a single web page where each agent is only associated with the content that they provided.
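
To make the granularity idea in that list a little more concrete, here is a minimal sketch of how a page might be broken into content items that are each tied to the agents who signed them. The class names, fields, and URLs are my own assumptions for illustration; nothing here comes from the patent filing itself.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    item_id: str
    text: str
    # Hypothetical mapping of agent identifier -> claimed role (author, editor, ...).
    signed_by: dict = field(default_factory=dict)


@dataclass
class Page:
    url: str
    items: list = field(default_factory=list)


def agents_on_page(page):
    """Collect every agent that signed at least one item on the page."""
    return {agent for item in page.items for agent in item.signed_by}


def items_signed_by(pages, agent):
    """Find (url, item_id) pairs for one agent identity across many pages."""
    return [
        (page.url, item.item_id)
        for page in pages
        for item in page.items
        if agent in item.signed_by
    ]


# Example data (made up): a blog post with a comment and an unsigned ad,
# plus a forum thread where the same commenter appears again.
blog = Page("https://example.com/post", [
    ContentItem("post", "Main article", {"alice": "author", "bob": "editor"}),
    ContentItem("comment-1", "Nice post", {"carol": "author"}),
    ContentItem("ad-1", "Buy now"),  # nobody signs the ad, so nobody is tied to it
])
forum = Page("https://example.org/thread", [
    ContentItem("reply-3", "I agree", {"carol": "author"}),
])
```

In this sketch the advertisement simply carries no signature, which is one way an agent could avoid being associated with it, and the same agent identifier shows up on both pages.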

Digital Signatures for Content

Different content pieces on a page can be signed with a digital signature, either directly by the agent or indirectly on behalf of the agent. These signatures identify who actually created each content piece on a page. One example of a method for creating and validating digital signatures is the World Wide Web Consortium’s XML-Signature Syntax and Processing.

Content pieces can have multiple signatures based upon roles an agent may take involving the content, such as author, publisher, editor, or reviewer.

An agent would have exclusive access to the private key they use to sign the content piece, and the digital signature could also include metadata such as creation date, review score, or recommended keywords for search.

Agents could sign only a portion of a page, and exclude content over which they don’t claim any responsibility, such as ads served alongside the document.

That content can range from individual hyperlinks to entire documents, and can include text, images, audio, or video. The signature can also allow people to verify that the signed content hasn’t been materially altered since the signature was generated.

If you want to allow your content and signature to be portable, such as for a syndicated article, you could state that in the metadata associated with the content.
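
As a rough illustration of signing a content piece and later checking that it hasn’t been altered, here is a small sketch. It is a simplification: it uses a keyed hash (HMAC) from Python’s standard library as a stand-in, whereas the filing describes public-key signatures made with a private key, in the style of XML-Signature. The record layout, the function names, and the `portable` metadata flag are assumptions of mine.

```python
import hashlib
import hmac
import json


def sign_content(agent_id, secret_key, content, role="author", metadata=None):
    """Return a record binding an agent, a role, and metadata to a content piece.

    NOTE: HMAC is a symmetric stand-in here, not a real public-key signature.
    """
    record = {
        "agent": agent_id,
        "role": role,
        "metadata": metadata or {},
        "content_hash": hashlib.sha256(content.encode("utf-8")).hexdigest(),
    }
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    record["signature"] = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return record


def verify_content(record, secret_key, content):
    """Check that the content is unchanged and the record was made with the key."""
    unsigned = {k: v for k, v in record.items() if k != "signature"}
    current_hash = hashlib.sha256(content.encode("utf-8")).hexdigest()
    if unsigned["content_hash"] != current_hash:
        return False  # content was altered after signing
    payload = json.dumps(unsigned, sort_keys=True).encode("utf-8")
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])
```

A record made with `sign_content("alice", key, text, metadata={"portable": True})` would verify against the same text and key, and fail if either the text or the key changed.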

Ranking and Reputation Scores

Tying a page to an author can influence the ranking of that page. If the author has a high reputation, content created by him or her may be considered more authoritative than similar content on other pages. If the agent reviewed or edited content instead of authoring it, the score for the content might be ranked differently.

An agent may have a high reputation score for certain kinds of content, and not for others – so someone working on a site involving celebrity news might have a strong reputation score for that kind of content, but not such a high score for content involving professional medical advice.
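
One way to picture role-dependent and topic-dependent scoring is the sketch below. The role weights, the topic labels, and the simple weighted sum are all assumptions I made up for illustration; the filing doesn’t spell out a formula like this.

```python
# Hypothetical weights: authoring counts for more than editing or reviewing.
ROLE_WEIGHTS = {"author": 1.0, "editor": 0.6, "reviewer": 0.4}


def content_score(signatures, reputations, topic):
    """Combine the topic-specific reputations of everyone who signed a piece.

    signatures:  list of (agent, role) pairs attached to the content
    reputations: {agent: {topic: score}} with scores assumed to lie in [0, 1]
    """
    total = 0.0
    for agent, role in signatures:
        topic_score = reputations.get(agent, {}).get(topic, 0.0)
        total += ROLE_WEIGHTS.get(role, 0.0) * topic_score
    return total


# Made-up example: strong on celebrity news, weak on medical advice.
reputations = {"dana": {"celebrity news": 0.9, "medical advice": 0.1}}
signatures = [("dana", "author")]
```

With this data, the same signature contributes far more to a celebrity news piece than to a medical advice piece, which is the behavior the paragraph above describes.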

Reputation systems are often measured in terms of effectiveness by how difficult they might be to attack and manipulate. Here, there are at least two factors that may help keep manipulation from happening:

  1. Reputational scores may be set so that they are relatively difficult to increase and relatively easy to decrease, so that an agent may not want to place his or her reputation at risk by endorsing content inappropriately.
  2. Since signatures of reputable agents can promote ranking of signed content in search results, agents are provided a powerful incentive to establish and maintain good reputational scores.
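
The first of those two factors, scores that are hard to raise and easy to lower, could be sketched as an asymmetric update rule. The `gain` and `penalty` values and the [0, 1] score range are assumptions chosen only to show the asymmetry.

```python
def update_reputation(score, endorsement_was_good, gain=0.02, penalty=0.25):
    """Return a new score in [0, 1] that rises slowly and falls quickly.

    gain and penalty are hypothetical tuning values, not taken from the filing.
    """
    if endorsement_was_good:
        # Move a small fraction of the remaining distance toward 1.0.
        score += gain * (1.0 - score)
    else:
        # Lose a large fraction of the current score in a single step.
        score -= penalty * score
    return min(1.0, max(0.0, score))
```

Under these numbers, an agent starting at 0.5 who makes five good endorsements followed by one bad one ends up below where they started, which is the kind of risk that might discourage careless endorsements.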

The method of ranking based upon reputation scores is described in an analogy based upon PageRank. There’s also some discussion of an alternative possibility of using a seed group of trusted agents to endorse other content. Agents whose content receives consistently strong endorsements might gain reputation under that method. In either implementation, the agent’s reputation ultimately depends on the quality of the content which they sign.
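
Since the filing leans on an analogy with PageRank, here is a minimal power-iteration sketch over a graph of agents endorsing other agents’ content. The graph shape, the damping value, and the handling of agents who endorse nobody are my assumptions, borrowed from common PageRank descriptions rather than from the patent application.

```python
def agent_rank(endorsements, damping=0.85, iterations=50):
    """Score agents from an endorsement graph, PageRank-style.

    endorsements: {agent: [agents whose content that agent endorses]}
    Returns {agent: score}, with scores summing to roughly 1.
    """
    agents = set(endorsements)
    for targets in endorsements.values():
        agents.update(targets)
    agents = sorted(agents)
    n = len(agents)
    if n == 0:
        return {}

    rank = {a: 1.0 / n for a in agents}
    for _ in range(iterations):
        new_rank = {a: (1.0 - damping) / n for a in agents}
        for a in agents:
            targets = endorsements.get(a, [])
            if targets:
                # Split this agent's score among the agents it endorses.
                share = damping * rank[a] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # An agent that endorses nobody spreads its score evenly.
                for t in agents:
                    new_rank[t] += damping * rank[a] / n
        rank = new_rank
    return rank
```

For example, with `{"a": ["c"], "b": ["c"], "c": ["a"]}`, agent `c` ends up with the highest score because two agents endorse it, and `a` outranks `b` because it is endorsed by the well-regarded `c`.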

The use of digital signatures enables the reputation system to link reputations with individual agents, and adjust the relative rankings based on all of the content each agent chooses to associate himself or herself with, no matter where the content may be located. That could even include content that isn’t on the internet.

Conclusion

This is a very different way of providing rankings for pages, based upon the reputations of agents who may have interacted with, and digitally signed, content on those pages.

Ted Nelson, one of the early pioneers of hypertext, spoke at Google a couple of weeks ago (Transclusion: Fixing Electronic Literature – link to video). He described a very different kind of hypertext than what we are familiar with, which involved a system for connecting electronic documents with content from multiple sources appearing on the same pages together. The last question in the Q&A part of the presentation asked how his electronic documents might be connected so that they can be found easily. His answer, “I guess Google will do that.” This isn’t the system that Ted Nelson envisioned, but it shares some similarities.

I could see blogging systems building tools that allow for digital signatures like the ones described here, such as the Typekey feature in Typepad to authenticate the identity of commentators on multiple blogs.


15 thoughts on “Google’s Agent Rank / Author Rank Patent Filing”

  1. I think that this is a great thing for content where authorship is appropriate; however, I do wonder exactly how Google will control for authorship affecting rankings for keyword terms that do not require authorship in the first place.

    Will we start seeing Amazon products showing up with an “author” for the description? That seems absolutely silly to me – there’s a definite difference between an article like yours and author-agnostic prose for things like website descriptions, product descriptions, and data oriented queries, e.g., “used cars in San Diego”.

  2. @ Ted –interesting take–but I have to think Google will have to be able to differentiate between sales prose and blogging prose…there are tons of on-page elements that should delineate such different sectors, no?

  3. One of the biggest issues is that many people think that Google Authorship and Google Agent Rank or Author Rank is the same. And they think that if you verify authorship on your site then your site automatically has a lot of Author Rank. Google (or Matt Cutts) needs to come out publicly and explain some of this and how Author Rank–and Authorship has to do with actual search engine rankings.

  4. Thank you for this post Bill.

    @ted: as the post says, author and agent are not necessarily the same thing.

    While an agent describing a product is very different from an agent commenting on the same product, Google only needs to understand which kinds of agents to take into account for a given web page.
    On an Amazon product page, for example, it makes sense to give more value to the commentator than to the editor. For a blog post, the author could be the main agent to take into account, from a search rankings perspective.

  5. My view is that Google want to create “author rank” as a means of getting users to identify themselves more often and in the same way across the net, once this is done in a large way Agent rank can really kick in to play.

    In the mean time Author Rank effects mainly blogs and those using G+ – at least in my experience.

  6. @Bill
    Not sure that’s going to happen. Google has always liked to keep things pretty vague. We’ll see. I would definitely listen to what he has to say.

  7. Glad to see your original post being resurfaced so that it’s top-of-mind for the folks who are paying attention. It feels like the Agent Rank telephoned over time into the idea of “AuthorRank”, and people would benefit from revisiting the primary sources from time-to-time.

    Curious to hear more about why you hope “AuthorRank” never comes about. Is that because that idea, of attributing one page to one author, is too basic and too far from the flexibility of Agent Rank?

    Also, that’s some quality spamming from John Miller, posting an unaltered quote from one of my pieces as his “comment” (“Nothing much happened with Agent Rank after that because the idea of ranking “agents” is dependent on being able to identify them in the first place. No great system for claiming an online identify really existed back then; I wouldn’t call W3C’s XML-signature syntax or other digital signature protocol an ideal solution.”). Haha.

  8. i agree with @ted, authorship for blogging is too much important than ecommerce site, but no one can understand what the actually Google do with their algorithms, sounds like mix.

    thanks to @bill i am also one of them people who think Google authorship and Google agent rank is the same.

  9. Hi Olivia,

    Thanks – Authorship definitely looks like one of the pieces of Agent Rank. I guess we need to wait and see on that one. :)

    I like the idea of setting up authorship for a blog, and I also like the idea of most ecommerce sites having blogs, too. I think it provides some insights into the people who run the site, and helps humanize it.
