Is Hummingbird the key to understanding an author’s expertise for things like In-Depth articles, and a possible future Author Rank? If an author’s content is analyzed using a concept-based knowledge base, it’s quite possible.
The Google Hummingbird rewrite of Google’s search engine wasn’t just aimed at better understanding long and complex queries, like the type that someone might speak into their phone. It was also likely aimed at better understanding the concepts and topics written about and discussed on Web pages, and in social signals such as posts at Google+ and comments on those posts, in Tweets, in Status Updates, and in other short text-based messages where there might not be much additional context to go with those messages.
The following screenshot shows the concepts that might appear for Tweets when they are analyzed using the Probase Concept-Based knowledge base (from Short Text Conceptualization using a Probabilistic Knowledgebase):
Google has been presenting knowledge panels next to search results, especially when a query includes a named entity in it. For example, search for [Jerry Lewis] and Google shows off facts about Jerry Lewis that it has extracted from pages on the Web. These include facts from Wikipedia, upcoming events where the comedian is performing, movies and TV shows that he’s been in, and other people that are often searched for when someone searches for Jerry Lewis, such as Dean Martin, Bob Hope, Tony Curtis, Milton Berle, and others.
Search for [Kanye West], and you’ll see similar results that include facts, songs that he has written and performed, albums he’s released, and other people who are often searched for by people who search for Kanye West. Jerry Lewis isn’t included among those people.
In both cases, Google recognizes that there is a named entity in the query being performed, and it looks up what it has in its knowledge base to show those knowledge panels. It also might use that information for the web search results that it shows. But Google is likely doing more than just looking for entities. It can also look for the concepts and attributes of entities when it considers queries that people have searched for. A knowledge base that includes entities and their attributes, concepts, and keywords can be useful in expanding the queries that someone searches for, to show a wider range of relevant search results, as in the Probase example above.
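To make the idea of query expansion a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the tiny knowledge base, its entries, and the expand_query function are all hypothetical, and nothing here reflects how Google or Probase actually store or use this information.

```python
# Toy knowledge base. Every entry is made up for illustration only;
# a real knowledge base would hold millions of entities.
KNOWLEDGE_BASE = {
    "jerry lewis": {
        "concepts": ["comedian", "actor"],
        "attributes": ["movies", "tv shows", "upcoming events"],
    },
    "kanye west": {
        "concepts": ["rapper", "producer"],
        "attributes": ["songs", "albums"],
    },
}


def expand_query(query, kb=KNOWLEDGE_BASE):
    """Return the original query plus expansions for any known entity in it.

    Hypothetical helper: it pairs each recognized entity with its
    concepts and attributes to produce additional candidate queries.
    """
    normalized = query.lower()
    expansions = []
    for entity, facts in kb.items():
        if entity in normalized:
            for term in facts["concepts"] + facts["attributes"]:
                expansions.append(f"{entity} {term}")
    # The original query always comes first, unchanged.
    return [query] + expansions


if __name__ == "__main__":
    for candidate in expand_query("Jerry Lewis"):
        print(candidate)
```

A query with no recognized entity simply comes back unchanged, which mirrors the idea that a search engine would fall back to the keywords themselves when its knowledge base has nothing to add.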
Building a Concept-Based Knowledge Base at Google
To learn more about Hummingbird, I’ve been exploring Microsoft’s Concept-Based Knowledge Base Probase recently, in the posts Are You, Your Business, or Products in a Knowledge Base?, and in Concept-Based Web Search. Google has been granted patents within the last year on different ways to construct a concept-based knowledge base and better understand the content of queries. Another white paper on Probase, titled Short Text Conceptualization using a Probabilistic Knowledgebase, covers some of this same territory. From the abstract:
In this paper, we improve text understanding by using a probabilistic knowledge base that is as rich as our mental world regarding the concepts (of worldly facts) it contains. We then develop a Bayesian inference mechanism to conceptualize words and short text. We conducted comprehensive experiments on conceptualizing textual terms and clustering short pieces of text such as Twitter messages.
Our approach brings significant improvements in short text understanding, as reflected by the clustering accuracy, compared to purely statistical methods such as latent semantic topic modeling or methods that use existing knowledge bases (e.g., WordNet, Freebase, and Wikipedia).
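The abstract mentions a Bayesian inference mechanism for conceptualizing words and short text. The sketch below shows one simple way that kind of inference could look, using a naive Bayes score with add-one smoothing. This is a simplification I am assuming for illustration, not the paper's actual method, and the co-occurrence counts and the conceptualize function are invented.

```python
from collections import defaultdict

# Hypothetical (concept, term) co-occurrence counts, standing in for
# the statistics a probabilistic knowledge base might hold.
COUNTS = {
    ("company", "apple"): 60,
    ("company", "microsoft"): 80,
    ("company", "google"): 60,
    ("fruit", "apple"): 70,
    ("fruit", "pear"): 50,
    ("fruit", "banana"): 80,
}


def conceptualize(terms, counts=COUNTS, smoothing=1.0):
    """Return P(concept | terms) under a naive Bayes assumption.

    score(concept) = P(concept) * product of P(term | concept),
    with additive smoothing so unseen terms do not zero out a concept.
    """
    concept_totals = defaultdict(float)
    vocabulary = set()
    for (concept, term), n in counts.items():
        concept_totals[concept] += n
        vocabulary.add(term)
    grand_total = sum(concept_totals.values())

    scores = {}
    for concept, total in concept_totals.items():
        score = total / grand_total  # prior P(concept)
        for term in terms:
            n = counts.get((concept, term), 0)
            score *= (n + smoothing) / (total + smoothing * len(vocabulary))
        scores[concept] = score

    # Normalize so the scores sum to 1 across concepts.
    norm = sum(scores.values())
    return {concept: score / norm for concept, score in scores.items()}


if __name__ == "__main__":
    print(conceptualize(["apple", "microsoft"]))
    print(conceptualize(["apple", "pear"]))
```

The point of the example is that an ambiguous term such as "apple" is pulled toward a different concept depending on the other terms in the message, which is why even a few words of context in a Tweet can be enough to work out what it is about.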
I’m going to be digging into several patents recently granted to Google that describe how they may be building a concept-based knowledge base that can be used to understand short text messages better and to understand the topics that authors write about, and the topics and concepts discussed on pages on the Web.
Authorship and Determining Expertise in Topics
Google’s authorship program allows people to digitally sign the content that they create within Google Plus and in other places on the Web. Google is likely exploring ways to understand the messages, blog posts, and articles written by these authors, to gauge and score them on the topics that they write about. In How Google Might Rank User Generated Web Content in Google + and Other Social Networks, I wrote about a Google patent application that described how Google might generate user contribution (reputation) scores like that.
For Google to start using author reputation and expertise as ranking signals with different scores in different topics, Google needs to be able to understand the concepts that people write about, and how those might be related and fit into different topics. As Google explains on their page about Appearing in In-Depth Search Results:
Authorship markup helps our algorithms to find and present relevant authors and experts in Google search results
For Google to determine whether or not an author has expertise in a particular topic, Google needs to be able to understand what they write about and determine what their level of expertise might be compared to other authors who write about related topics. Here’s what Google’s Matt Cutts said in his Pubcon 2013 Keynote Presentation about author authority:
We’ve also been looking at detecting and boosting authority. So take medical, for instance. If you’re an authority in the medical space, we want to know that and to push you up higher whenever a medical query comes along. Now, this is not something that is done by hand. We don’t pick individual topic areas. It applies to thousands of different topic areas.
So, nothing that you have to do, but if you are a topical authority, keep writing about it, keep developing, keep deepening the amount of content that you have. You want to be a resource, you do want to be an authority, and if you turn out to be an authority, then you’re more likely to be boosted by that particular change.
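The idea of an author having different scores in different topic areas can be sketched very roughly in code. Google has not published how it would calculate anything like this, so everything below is an assumption: the concept-to-topic mapping, the sample posts, and the topic_scores function are hypothetical, and counting concept mentions is a stand-in for whatever signals a real system would use to judge expertise.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from concepts to broader topic areas.
CONCEPT_TOPICS = {
    "vaccine": "medical",
    "dosage": "medical",
    "backlink": "seo",
    "canonical tag": "seo",
}


def topic_scores(posts):
    """Return {author: {topic: share of that author's concept mentions}}.

    posts is an iterable of (author, text) pairs. Authors whose posts
    mention no known concept are left out of the result.
    """
    counts = defaultdict(Counter)
    for author, text in posts:
        lowered = text.lower()
        for concept, topic in CONCEPT_TOPICS.items():
            if concept in lowered:
                counts[author][topic] += 1

    scores = {}
    for author, topic_counts in counts.items():
        total = sum(topic_counts.values())
        scores[author] = {
            topic: n / total for topic, n in topic_counts.items()
        }
    return scores


if __name__ == "__main__":
    sample_posts = [
        ("alice", "New vaccine dosage guidance"),
        ("alice", "Why a backlink audit matters"),
        ("bob", "Using the canonical tag"),
    ]
    print(topic_scores(sample_posts))
```

Even in this toy form, the dependency is visible: the per-topic scores are only as good as the step that maps what someone wrote to concepts and topics, which is the job a concept-based knowledge base would do.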
Conclusion
In the Hummingbird rewrite of Google, the chances are good that a concept-based knowledge base will be used to better understand social signals, like threads and comments in Google+, for measuring authority on topics.
The screenshot from the Microsoft paper above on Probase shows how concepts might be mined from short text social messages using such a knowledge base. This would work well with named entities and attributes related to those entities, concepts identified in those short messages, and then keywords from the messages if no entity/attribute/concept associations are found in that knowledge base.
Keep in mind that Google is actively building its knowledge base. As it grows, more associations involving these different elements will be made.
We’re going to look at the Google patents that I’ve been talking about in recent posts next, to get an idea of how Google is going about building its concept-based knowledge base.
Bill,
I’ve always been fascinated at how you decipher patents and shoot the possible future of Google’s algorithm with it. We all saw that AuthorRank and being an authority in a niche with your Google Authorship is going to be leveraged (perhaps as a ranking factor) soon enough. Perhaps Hummingbird places us in such a position.
However, would you be so kind as to simplify what is the direct correlation between Hummingbird and AuthorRank? I was expecting to see that in this post – or perhaps I must’ve missed it?
Thanks!
Hi Sean,
There’s officially no such thing as author rank, and there are no guides to how it works from Google or anyone else. I made no promises to provide a direct correlation between Hummingbird and Author Rank, and I cannot. I can only do a lot of research and share what I learn.
There has been a lot of speculation about a potential author rank, and I’m trying to put together research from a lot of sources, including information that Microsoft researchers have published, like the article that I linked to in this post that describes how a concept-based knowledge base will make it a lot easier to interpret and understand the concepts in short text social messages.
I honestly cannot tell you the “direct” correlation between Hummingbird and Author Rank, because the only people who are aware of that direct correlation, if any, most likely work for Google, or have had it shared with them by someone at Google (I haven’t).
I can, and as I’ve been promising, will write about granted patents that describe how Google might put together a concept-based knowledge base, possibly similar in a lot of ways to Microsoft’s Probase, that can help Google parse short text social messages much better. If Author Rank does happen and it includes a number of different scores for people based upon the topics involved, that kind of knowledge base is going to be necessary to calculate different author rank scores for individuals for the different types of concepts that they write about.
Hi Bill,
I thought I’d come and say hi. I’ve been kind of shy lately with blog commenting and social, which does no good at all for my authority. I’m falling prey to not practicing what I preach as much as I would like to.
I’m also having particular difficulty trying to make some clients see how all the content/blog post creation ties in with authorship, author authority (possibly the infamous author rank), social engagement, and content curation.
“For Google to determine whether or not an author has an expertise in a particular topic, Google needs to be able to understand what they write about, and determine what their level of expertise might be”
I think this quote, and the video where Matt Cutts says it, illustrate it really well. Small business owners have real trouble seeing the value of writing content on their subject area and linking out to good resources, rather than writing just about their own business, products, or services.
I think I need to send one or two of them to this article to see if it helps.
Thanks for the great insight as always.
Hi Pedro,
Thank you very much for stopping by, and sharing your thoughts.
It isn’t always easy to get clients to see the connection between spending time on Google+ or Twitter or other social networks, and even their own blog, and how the search engines might perceive them, or how that could potentially lead to better rankings. But there’s also the benefit that comes from building stronger relationships with people who might potentially become clients. So there is that direct impact, which is a little more tangible than a possible Author Rank. Even if Google doesn’t quite sense a client’s authority, if their visitors do, that can potentially be very beneficial, too.
There is an article in Search Engine Journal, http://www.searchenginejournal.com/google-authorship-lethal-guest-bloggers/69733/, which says “Sooner or later, horrible authors will vanish off the SERPs along with all their low-quality contents thanks to Google Authorship. And if that isn’t enough, the sites they own will also get heavily penalized, thanks to the list of verified sites in Google Webmaster Tools.”
What they are basically saying is that an author’s reputation will also start affecting the SERPs in a big way in the near future.
Judging from the furious reaction from the YouTube community, Google really aren’t in many people’s good books right now. Regarding this author rank, Hummingbird just seems to help boost you if you’re not some spamming lunatic. I’ve sharpened up my Google+ account and always contribute good content. Am I an expert in anything? Probably not, but I can just be myself.
Back to YouTube comments briefly – I do hope they sort it out. It’s currently an utter disaster. What’s become apparent is the lack of anonymity won’t stop people from behaving like idiots online.
Hi Bill,
I’ve heard PR is not going to be updated anymore, and these days many blogs talk about author rank, domain authority, and page authority. If this is the case, then how is SEO going to affect blogs? This new algorithm, Hummingbird, is said to kill micro niche blogs, because when a user types buy xyz product, results start appearing on the right side of Google, so the chances of users visiting the site and clicking on products will be reduced. So what do you think of the effect of Hummingbird on micro niche blogs?
Hi Vijesh,
The Google announcement about toolbar PageRank was that it wasn’t going to be updated before the end of this year, but not that it wasn’t going to be updated ever again. Google did not announce that they were abandoning PageRank.
They also did not say that they were going to start using Author Rank anytime soon, and I’m guessing that’s something that they are still trying to figure out.
Domain Authority and Page Authority are metrics created by Moz, and are not ranking signals developed or used by any of the search engines. Moz created them as a convenient tool for people to use to try to gauge how easy or difficult it might be to rank a page. It doesn’t matter whether or not there is discussion on blogs about author rank, domain authority, or page authority. Those blogs don’t determine how Google will rank pages.
I don’t think that Hummingbird will kill micro niche blog sites, but I do think there are a lot of articles about Hummingbird that are only guesses at best.
Hi Alex,
Not really sure that the reaction to changes on YouTube means much when it comes to a possible author rank from Google.
This is excellent, dude. Semantic search via G+ profiles and pages all looks at authorship, authority, and engagement. I hate saying this, and hate even more the years I spent learning it, but even non-black-hat technical SEO really is dying. In a way, G+ has almost restarted the SERPs from scratch, allowing one to rank much faster than before based on their G+ activities.
Hi Bill,
Thank you for sharing this excellent article. It is very well written. Again, thank you!!