Is Google Getting Better at Understanding Topic Authority and Author Authority?
Last week, Google was granted patents about ranking pages on the Web based on the topics of those documents and the expertise and/or authority of the authors of those pages. The patents also describe how Google may use different methods to determine the authority of multiple authors who may have worked together to create the documents.
This sounds similar to statements Matt Cutts made in May in a video about What should we expect in the next few months in terms of SEO for Google?
The important statement there is:
We are doing a better job of detecting when someone is sort of an authority in a specific space. It could be medical, it could be travel, whatever. And trying to make sure that those rank a little more highly if you are some sort of authority or a site that according to the algorithms we think might be a little bit more appropriate for users.
There’s been a lot of discussion about something that people have been calling “Author Rank,” which I referred to as Agent Rank back in 2007, in a post on Search Engine Land – Google’s Agent Rank / Author Rank Patent Application (someone at Search Engine Land changed the title of the post earlier this year to include “Author Rank” in it).
The patents point at a couple of different aspects of how the processes within them might work.
The first of those involves classifying documents based upon the topics that they cover, and assigning each topic a weight for how strongly the document is associated with it.
The second involves receiving “authorship” information for the document, which would identify authors involved in the creation of the document and an authorship percentage for each author.
An “Authority Signature Value” for “a first author of a first topic may be generated based on a product of an authorship percentage for the first author of the first topic and the weight of the first topic in the document, where the first topic is included in the received topic information.”
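The arithmetic in that claim can be sketched in a few lines of Python. This is only an illustration of the product described above, not Google's implementation: the function name, the dictionary layout, and the sample authors, topics, and numbers are all made up for the example.

```python
from collections import defaultdict


def update_authority_signatures(signatures, topic_weights, authorship):
    """Add (authorship percentage x topic weight) to each author's score.

    signatures:    maps (author, topic) -> running authority signature value
    topic_weights: maps topic -> how strongly the document covers that topic
    authorship:    maps topic -> {author: share of that topic they wrote}
    All three structures are hypothetical; the patent does not fix a format.
    """
    for topic, authors in authorship.items():
        # Only topics present in the received topic information count.
        if topic not in topic_weights:
            continue
        weight = topic_weights[topic]
        for author, percentage in authors.items():
            signatures[(author, topic)] += percentage * weight
    return signatures


# One hypothetical document about two topics, written by two people.
signatures = defaultdict(float)
topic_weights = {"surf fishing": 0.75, "gardening": 0.25}
authorship = {
    "surf fishing": {"alice": 0.5, "bob": 0.5},
    "gardening": {"alice": 1.0},
}
update_authority_signatures(signatures, topic_weights, authorship)
```

Running the same update over many documents would let a score accumulate per author and per topic, which is why one person could end up with different values for different subjects.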
Google has an “authorship program”, which is somewhat described in an interview with Google’s Sagar Kamdar, who appears to be one of the heads of the program. The article tells us:
Kamdar explained to me that the Authorship program was based on the premise that content associated with a real identity is often of higher quality than content published anonymously.
The authorship program is related to the authorship markup that you can include on pages that you author. (I’m one of the moderators on Google Plus Community Google Authorship & Author Rank, and if you have questions or need help setting up Google Authorship, it’s a really helpful community).
The patents describe different aspects of how an “Authority Signature Value” might work, with much of the text (but not all) in the description sections of the patent being substantially similar.
Rather than summarize or itemize all of the information in these patents, I’m going to provide some highlights and let people interested in them drill down through the patents, though if you have questions, please bring them up in the comments below. Here are those highlights:
Social? – The patents don’t mention the word “social” or discuss Google Plus as a social network, but the “Authority Signature Value” language in the patents reminds me very much of the “digital signature” language from the Agent Rank patents.
Authority? – Google and other search engines have used the word “Authority” in the past to stand for other things. The HITS (hyperlink-induced topic search) algorithm was developed by Jon Kleinberg around the same time that PageRank was being worked on a few blocks away at Stanford. There have been more than a couple of SEO-related articles that discuss topics such as How to Find Authority Websites & Get Links From Them. I’ve seen more than a couple of forum discussions (or arguments) on whether .edu sites and .gov sites are “authority” sites on the basis of getting more value from links from those types of sites.
But the idea of having content that is digitally signed, and associated with a real person, and the fact that the content they’ve created on different topics has been analyzed by Google makes it more likely that an “authority value” associated with them is somehow authoritative.
Topic Authority and Topic Intensity – Documents may be broken down into topics, and the authority of those documents would be based upon how authoritative they are on those topics. The authority of authors and their signature scores would be based upon topics as well. So someone might be considered an “authority” on brain surgery or surf fishing or gardening and would have different “authority signature values” based upon topics. There’s some discussion on Topic intensity in the patents as well, and sometimes a topic might ebb and flow in value based upon trends and the burstiness of a topic. It’s good seeing this discussion on Topic Authority though.
Here are the patents:
System and method for determining similar topics
Invented by Michael Jeffrey Procopio
Assigned to Google
US Patent 8,458,197
Granted June 4, 2013
Filed: January 31, 2012
Abstract
A method and system for determining similar topics may include receiving user information for one or more users, the information including at least one topic and a user value for each topic, where the user value represents how strongly the user is associated with that topic. Topic information for a source topic may be generated based on the user information, the topic information including at least one user and a topic value for each user, where the topic value represents how strongly the topic is associated with that user.
Similarity scores may be generated based on a topic value for each user for the source topic and a topic value for the same user for each topic in a set of topics, where each topic in the set of topics is associated with a topic value for each user. Similar topics may be selected and output.
System and method for determining similar users
Invented by Michael Jeffrey Procopio
Assigned to Google
US Patent 8,458,195
Granted June 4, 2013
Filed: January 31, 2012
Abstract
A method and system for determining similar users may include receiving information for a source user, the information including at least one topic and a user value for each topic, where the value represents how strongly the user is associated with that topic.
Similarity scores may be generated based on a value for each topic for the source user and a value for the same topic for each user in a set of users, where each user in the set of users is associated with a value for each topic. One or more similar users may be selected based on the generated similarity scores, and one or more of the selected users may be output.
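The abstract does not say how the similarity scores are calculated. As a rough sketch, assuming cosine similarity over per-topic user values (my choice for illustration, not something the patent states), comparing a source user against a set of users might look like this. The function names and the sample users are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two {topic: user value} dictionaries."""
    shared_topics = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared_topics)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_users(source, candidates, top_n=1):
    """Score every candidate against the source user and keep the best."""
    scored = [
        (name, cosine_similarity(source, values))
        for name, values in candidates.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Made-up user values: how strongly each user is associated with a topic.
source_user = {"seo": 1.0, "finance": 0.5}
other_users = {
    "dana": {"seo": 0.9, "finance": 0.4},
    "eli": {"gardening": 1.0},
}
best = most_similar_users(source_user, other_users)
```

Here "dana" comes out on top because her topic values point in nearly the same direction as the source user's, while "eli" shares no topics at all.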
System and method for content-based document organization and filing
Invented by Michael Jeffrey Procopio
Assigned to Google
US Patent 8,458,194
Granted June 4, 2013
Filed: January 31, 2012
Abstract
A method for categorizing documents may include receiving topic information for a source document, the information including at least one topic and a weight for each topic, where the topic relates to the content of the source document, and the weight represents how strongly the topic is associated with the source document. Similarity scores may be generated based on a weight of each topic in the source document and the weight of the same topic in each document within one or more sets of documents, where each document in the one or more sets of documents comprises topic information.
A confidence score may be generated, based on the similarity scores, for each of the document sets. One or more document sets may be selected based on the confidence scores and may be output to a user.
System and method for determining active topics
Invented by Michael Jeffrey Procopio
Assigned to Google
US Patent 8,458,193
Granted June 4, 2013
Filed: January 31, 2012
Abstract
A method for determining active topics may include receiving topic information for a document, the information including at least one topic and a weight for each topic, where the topic relates to the content of the document, and the weight represents how strongly the topic is associated with the document. User activity information for the document, including a user activity value including at least one of a number of viewers and a number of editors of the document may be received.
A topic intensity for each topic may be generated and stored by multiplying the user activity value for the document by the weight of the topic in the document. The topic intensity may be monitored over time. An alert may be generated based on the topic intensity.
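A small sketch of that multiplication, with a simple alert rule on top. The abstract only says the user activity value includes at least one of viewers and editors and that an alert "may be generated based on the topic intensity"; adding viewers and editors together, and alerting when intensity doubles against its past average, are both assumptions of mine, as are all of the names and numbers.

```python
def topic_intensities(topic_weights, viewers, editors):
    """Multiply a document's user activity value by each topic weight."""
    # Assumption: the activity value is simply viewers plus editors.
    user_activity_value = viewers + editors
    return {
        topic: user_activity_value * weight
        for topic, weight in topic_weights.items()
    }


def topics_to_alert(history, latest, jump_factor=2.0):
    """Flag topics whose latest intensity jumped well above their average.

    history: maps topic -> list of earlier intensity readings
    latest:  maps topic -> newest intensity reading
    The jump_factor threshold is hypothetical, not taken from the patent.
    """
    flagged = []
    for topic, intensity in latest.items():
        past = history.get(topic, [])
        if not past:
            continue
        baseline = sum(past) / len(past)
        if baseline > 0 and intensity >= jump_factor * baseline:
            flagged.append(topic)
    return flagged


weights = {"surf fishing": 0.75, "gardening": 0.25}
latest = topic_intensities(weights, viewers=36, editors=4)
history = {"surf fishing": [6.0, 9.0], "gardening": [10.0, 10.0]}
alerts = topics_to_alert(history, latest)
```

In this made-up example, the surf fishing reading is four times its earlier average, so it is flagged, while gardening stays flat and is not.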
System and method for determining topic interest
Invented by Michael Jeffrey Procopio
Assigned to Google
US Patent 8,458,192
Granted June 4, 2013
Filed: January 31, 2012
Abstract
A method and system for determining topical interest may include receiving signal information for a user of a document, the information including at least one signal value representing the user’s activity with or relationship to the document. A document interest value based on the signal information for the user may be computed. Topic information for the document may be received, the information including at least one topic and a weight for each topic, where the topic relates to content of the document, and the weight represents how strongly the topic is associated with the document.
An interest signature value of a first topic for the user may be updated by adding the product of the computed document interest value for the user for the document and the weight of the first topic for the document.
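The update step in that abstract is an add-the-product operation, which can be sketched as follows. How the document interest value is computed from the signals is not spelled out in the abstract, so the weighted sum below is an assumption, and the signal names, weights, and topics are invented for the example.

```python
def document_interest(signals, signal_weights):
    """Combine a user's signals for one document into one interest value.

    Assumption: a weighted sum. The patent abstract only says the value
    is "based on the signal information".
    """
    return sum(
        signal_weights.get(name, 0.0) * value
        for name, value in signals.items()
    )


def update_interest_signature(signature, doc_interest, topic_weights):
    """Add (document interest x topic weight) to each topic's running total."""
    for topic, weight in topic_weights.items():
        signature[topic] = signature.get(topic, 0.0) + doc_interest * weight
    return signature


# A hypothetical user who viewed and commented on one document.
signals = {"viewed": 1.0, "commented": 1.0}
signal_weights = {"viewed": 0.5, "commented": 1.5}
interest = document_interest(signals, signal_weights)

signature = {"seo": 1.0}  # interest accumulated from earlier documents
topic_weights = {"seo": 0.75, "finance": 0.25}
update_interest_signature(signature, interest, topic_weights)
```

The structure mirrors the authority update: the same topic weights are reused, but the multiplier comes from what a reader did with the document rather than from how much of it an author wrote.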
System and method for determining topic authority
Invented by Michael Jeffrey Procopio
Assigned to Google
US Patent 8,458,196
Granted June 4, 2013
Filed: January 31, 2012
Abstract
A method and system for determining topical authority may include receiving topic information for a document, the information including at least one topic and a weight for each topic, where the topic relates to content of the document, and the weight represents how strongly the topic is associated with the document. Authorship information for the document may be received, the information including, for each topic in the document, at least one author and an authorship percentage for each author.
An update to an authority signature value for a first author of a first topic may be generated based on a product of an authorship percentage for the first author of the first topic and the weight of the first topic in the document, where the first topic is included in the received topic information.
Bill, what is a “user” as it is being used in these patents?
The patents use the words user, author, and viewer. I have the same question as +Mark; hopefully you can explain more about those terms.
I love the validity of Author Rank. What I’m less certain about is deciding “expertise” based on social signals. For example, I like to share a lot outside of my expertise… I save the day job for more of my expert advice.
It amazes me how clear this is about the various intentions they have. I am also amazed at the detail of the abstracts. This makes so much sense in light of the emphasis from everyone lately on authorship at Google Plus. Mark Traphagen’s question makes sense to me. If user is the reader and author is the author, this is not just about analyzing so much detail on each author but analysis of the reader and how they show their interest. It is getting so detailed about making sure we write about things and the audience is pleased. Just so fascinating. It pleases me to see such detail in this planning. Thank you for sharing.
Hi Bill,
I’m interested in what happens with the authority that is built up by a “user” when they move from one publisher (as an employee) to another?
E.g.
You sold seobythesea.com and stopped contributing posts to it.
Would all of the authority that this site now has, because of you, in some way follow you?
Another way to put it… would this site become less authoritative?
Duncan
Yodelay.com
Mark – To me the crucial patent that is related to all this is Delegated authority evaluation system – http://www.google.com/patents/US20100185626
That document refers to authorities as a public entity or private entity.
I know I have been banging on about it a long time, but I say a user/authority/author can be a public entity too.
I have always said there is a difference between author, which is used for delegation, and Author, which is a display system.
Nice as always…”Bill”
And it’s now more clear that the future of search is going to be more and more social. And writing good content or publishing on a big authoritative website will be weighted the same as the well-established Author profile associated with it.
Great Post, thanks.
So Authorship is going to be a ranking factor after all… I am still curious how individuals who create content for/on behalf of a corporation will be handled.
I knew authorship would be something that Google would consider now that they have their own platform with Google +
It will only be so long until everyone starts developing Google + profiles…
Thanks, Bill – great stuff, as always.
It spooks me a little to think that “authorship value” itself might become a (major?) ranking factor. If that happens, it seems to me that we might see people buying and selling that authorship authority. Might become the new backlink buy. But who knows. I guess it all depends on how cautiously Google introduces it – and on the extent to which Matt Cutts acknowledges it as a factor that matters. IMHO, it has a lot of potential, but that includes potential for abuse.
The question to me is: what if I am an expert on several different topics, which is not that unusual, and I connect the topically different sites to my Google+ account? Can I become an authority on different topics, or will Google decide on which topic? Or will this maybe even result in no authority on any topic? Especially when these topics differ a lot, let’s say I’m an expert in “SEO” and “Finance”?
I’d appreciate any comments, especially from Bill!
Thanks for this post, good stuff as usual!
As usual, great coverage here Bill….and thanks too for the depth which I’m currently researching myownself….
🙂
Jim
I for one, try to develop good content and utilize Google+ to leverage traffic. It seems to be working far better than expected, and as far as my visibility on G+ and the “ripple” effect go, it’s great. The end result is not affecting my website as of yet, though. It may be due to it being a .XXX domain (blake.xxx), but I am going to switch it up and test how the keyword and G+ shareability relate to my website keywords, content and TLD.
Thanks for sharing!
Excellent coverage. I’m so glad I’m already building up my Google+ user profile.
Thanks
I’ve been working on my google authorship for a couple of weeks but it still doesn’t show up on searches. Time to check to see if there are any other missing links and re-index my sitemaps! Great post :)! Looking forward to building some authority 🙂
Hi Mark,
A user is someone who interacts with documents and the topics within them, according to the patents.
We’re told in the patents that:
Hi Eric,
The things you create when you blog, or when you contribute a social post on different topics, are ways that you can show off your expertise. The things you share with others before they become popular are another way to show off your expertise. The threads that you become involved in on specific topics, with contributions and responses, are other ways that you can show off your expertise. A quick response in a forum of “first” isn’t a way of showing off your expertise. 🙂
Hi Pepper,
Thanks!
Under the patent, both readers and authors are users. The way that you might interact with others within social settings and when you publish something on the web, and the expertise that you show off when doing that does show your interest in different topics, and the quality of those contributions and interactions can show off both your interests and the weight of your authority.
Hi Andri,
A “user” of the systems described within the patents can be an author and/or a viewer – someone who expresses an interest in a topic, and who may show off some expertise. That expertise can be added to a user value signature, which can increase and improve the expertise that a user is perceived to have on specific topics (or decrease them).
Hi Duncan,
The authority (or author signature value) would be tied to me as a user or author. That authority can then be used to weigh the authority of the content created. I believe that a measure of authority is supposed to reflect what actually happens with authority of content created by someone with expertise on specific topics. So, if you sold your website, and stopped contributing to it, it might continue to accumulate authority based upon the authority of people contributing new content to it. New content might become more authoritative or less authoritative, but that would depend upon the new author or authors. Older content might become more or less authoritative based upon the contributions that you might provide in other places, based upon your “Author value signature.” 🙂
Hi Terry,
I’m not sure what role the patent you point out might have in relation to these patents and to author rank. I have been keeping it in mind and how it might fit, but I’m not sure that it is a good fit. It seems like it would be a better match for a review system or Q&A site rather than one that might impact rankings of web pages under a digital signature approach like agent rank or an author rank.
Thanks, Rajesh.
It is interesting that Matt Cutts recently told us that we should see the release of an algorithm that can boost search results from authors that have consistently exhibited some level of expertise on specific topics, in a way that seems to fit in well with these patents.
Hi Andreas,
It does seem like authorship is going to be tied to flesh and blood real people. In some ways, it’s like patents in that patents need to be filed by actual people, and then they can be assigned to the corporations those people work for. We do have “publisher” markup, and there’s definitely going to be some kind of credit for content that goes to the person or business that publishes content.
Hi Stuart,
I think once everyone has a Google + profile, and Google can then evaluate those profiles, things will get even better. 🙂
Hi Phil,
It’s possible that Google might take a few iterations to get things to work well when they introduce social signals like authorship into rankings. But Google seems to be broadening the signals that it will be ranking pages upon instead of replacing them. I think that’s going to help.
Matt Cutts did recently indicate that pages might be ranked higher in the very near future based upon the exhibited authority and expertise of authors of that content.
I think there’s less potential for abuse under a system that would also include social signals of expertise, especially if they are tied to digital signatures and real people.
Hi Nico,
Many people can develop expertise for a range of topics. The difficulty might be in exhibiting that expertise in different areas, but it’s not an impossibility. The choices that you make when you write a web page or blog post, or create posts on social sites can be on a wide range of topics, but the “authority” that you show off is going to depend upon the quality of those contributions. I haven’t seen anything that might cause one type of expertise to be diminished because of a display of a different expertise on a different topic. And under the patents, authority is based upon the topics associated with content authored rather than one overall authority value that might penalize someone for a wide range of expertise. Any limitations might be based more upon how widely you might spread yourself than anything else.
Thanks, Jim.
I like keeping up with a lot of different topics, but it can be a lot of fun digging really deeply into some. 🙂
Hi Blake,
At this point, there really isn’t an indication from Google that they are using authority or reputation scores for different topics, but it seems like we might be on the horizon of that happening. Fortunately, there’s a benefit to building authority and expertise that transcends Google’s use of algorithms.
Thanks, George.
It’s definitely a good time to get in on the ground floor in terms of building up that profile.
Hi Ben,
Definitely check using Google’s rich snippet markup tool as well. It might take a while for authorship markup to show up in search results – it’s possible that Google has some thresholds in place, like requiring a certain number of pages or blog posts, or something else that may be required.
Do you have any thoughts on where Publishership might be going? For example, you might have a team of highly skilled writers that all contribute to one blog, but you want that blog to remain independent of the writers. Do you think we will eventually use a Publisher account the same way we use an Authorship account? They’re still quite different at this time.
On a separate note, I’ve been an author for http://net66.co.uk for some time now, and with having another site (site 2) ranking lower, I decided to add my authorship to this second site. A week after site 2 was cached every ranking I was monitoring and traffic improved. There was also no change on Net66 so I think having one author write for two or more sites doesn’t really matter.
Looks like it is time to start working harder on Google+ accounts. Very interesting read, and it certainly looks as though they will hold quite a bit of weight.
Very interesting read. Great to see some new updates on Authorship and Author Rank. Highly anticipating the day Google releases an official announcement on this.
Thanks Bill!
Very good read as always. I am a regular reader, sometimes poster. I feel compelled to actually do something with my Google+ profile, but was curious: is authorship exclusive to Google+, or to what extent are other factors weighted (posts from past company emails, social accounts, or other)? In what way can a user consolidate these past activities to gain authorship from them without having to reach out to every webmaster and adjust a credit? Perhaps I am confusing authorship with backlinks to some extent.
For example, would I gain credit from past postings on http://www.seobythesea.com if I was to update my Google+ account to associate with this old company email address? If I did not have access to a past email address, would any previous articles and/or posts be lost to the author or is there a way to transfer their value moving forward?
If association to a specific email address or profile can be traded or attached in some way, I could see possibility for abuse. At the same time a user could gain authorship from others in their organization. Who should own the rights to this authorship? The company or the individual?
Definitely time to start building authorship with Google+, as I can only assume it’s going to become more and more integrated into the algorithm. You maintain so much info on this blog I sometimes wonder how you find the time to write all these fantastic posts. Thank you very much and don’t stop, best wishes.
Yes it had to come, Google+ has been missing that special ingredient to make it sticky, now it becomes the “must have” profile for anyone wanting to make their way via the web and once again Google has control. R John
Hi Adam,
Thanks. Building the “authority” behind what Google might try to measure algorithmically is something that we could do without Google+ even being around – the things they look at are supposed to be things that echo features that display authority and expertise. Regardless of whether Google uses them as signals in future rankings, they still have value outside of Google+.
The “information” on this website is part of my personal effort to build up that kind of authority. 🙂
Hi John,
While I think it’s going to be very important to maintain and improve upon a “reputation” based upon having a Google+ profile and building up reputation through it, I’d definitely urge caution. We’ve seen search engines die before, with AltaVista, Excite, Lycos, and so on. There don’t necessarily seem to be any real competitors to Google when it comes to search now, but we don’t know what the search landscape will be like in even five years. 🙂
Hello Bill,
I come to you since I consider you the expert in SEO & SEM who reads and writes in a more scientific way.
In the following patent “System and method for authority value obtained by defining ranking functions related to weight and confidence value”, it seems that Microsoft was also doing something quite similar, but obviously with a different approach. You can also find it as “Authority Ranking” pub. no.:US2011/0246484 A1. In that one it seems that Bing will start determining the authority of a source by checking how the source is interacting with social networks, real time networks and web server. The term for “source” is quite ambiguous anyhow.
What got my attention was the paragraph which says:
“[40] In some embodiments, the source data 202 refers to a source 110 that has consumed the content 106. Consumption of content 106 by a source 110 deemed to be authoritative can be understood by the authority server 116, or any other entity generating the authority index 116, as indicating that the content 106 is authoritative, and/or is more or less authoritative on the basis of the association with the source 110.”
I was wondering if that means that, depending on the source who consumes the content of our website (I understand “consume” as something like “read/download”), they will tweak the authority values assigned to our website.
Sorry for misspelling things and grammar.
Thanks Bill, As usual.
Excellent coverage.
It just makes sense for Google to give more weight to Google+. I believe anyone who is not capitalising on this is already missing out.
Thanks again