What are Named Entities?
Named entities are specific people, places, or things, and they are often what Google looks for when returning information in response to queries. Google got a lot smarter at answering questions about named entities after acquiring Metaweb, which had developed a better way of understanding named entities in searches, an approach Google appears to have adopted.
Here is an example of how Metaweb handled named entities, as described in one of its patent filings:
You may know him by many names or titles – Governor of California, Terminator, Governator, Conan the Barbarian, Kindergarten Cop, Mr. Universe, Mr. Olympia, Arnold Strong, Arnie, The Austrian Oak.
To Metaweb, Arnold Schwarzenegger is referred to as 9202a8c04000641f8000000000006567.
Who is Metaweb?
Metaweb is a company recently acquired by Google, and they’ve created a system of indexing named entities that allows you to search for information in a new way. The idea sounds a little like a library’s Dewey Decimal system, but for named entities.
Why is this important, and what are Named Entities?
A named entity is a specific person, place, or thing. For example, named entities can include Barack Obama, the Commonwealth of Virginia, or the Great American Ballpark in Cincinnati. Associating unique identification numbers with named entities can make it easier to index them and find information about those named entities when they might be referred to by different names, like my example above about Arnold Schwarzenegger. They can also help with local search by allowing specific places, businesses, or landmarks to have unique identification numbers.
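To make the idea concrete, here is a minimal Python sketch of how many different names could point to one identifier. The ID is the one quoted above for Arnold Schwarzenegger; the alias table, the `normalize` helper, and the `resolve` function are my own assumptions for illustration and are not taken from Metaweb's filings.

```python
from typing import Dict, List, Optional

# ID quoted in the post for Arnold Schwarzenegger.
ARNOLD_ID = "9202a8c04000641f8000000000006567"

# Hypothetical alias table: one entity ID, many surface names.
ALIASES: Dict[str, List[str]] = {
    ARNOLD_ID: [
        "Arnold Schwarzenegger",
        "Governator",
        "Arnold Strong",
        "Arnie",
        "The Austrian Oak",
    ],
}


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so minor variations still match."""
    return " ".join(name.lower().split())


def build_alias_index(aliases: Dict[str, List[str]]) -> Dict[str, str]:
    """Invert the alias table into a name -> entity ID lookup."""
    index: Dict[str, str] = {}
    for entity_id, names in aliases.items():
        for name in names:
            index[normalize(name)] = entity_id
    return index


ALIAS_INDEX = build_alias_index(ALIASES)


def resolve(name: str) -> Optional[str]:
    """Return the entity ID for a name, or None if the name is unknown."""
    return ALIAS_INDEX.get(normalize(name))


print(resolve("Arnie") == resolve("the austrian  oak"))
```

This sketch assumes each name maps to exactly one entity. In practice a name such as "Terminator" could refer to a film as well as a person, so a real system would need some way to disambiguate.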
How often do named entities appear in Web searches? A recent paper from Microsoft, Building Taxonomy of Web Search Intents for Name Entity Queries (pdf), tells us that they are pretty common:
According to an internal study of Microsoft, at least 20-30% of queries submitted to Bing search are named entities, and it is reported 71% of queries contain name entities.
Google announced their acquisition of Metaweb in an Official Google Blog post, Deeper understanding with Metaweb. Metaweb also announced the acquisition in their post, Metaweb joins Google.
Metaweb started a knowledgebase called Freebase, which had volunteer editors and contributors who added entity information. It became one of the significant sources of information behind Google’s Knowledge Graph.
Metaweb has several patent applications at the United States Patent and Trademark Office. They are worth diving into if you want to learn a little about some of the technology behind the company.
I’ve just started looking at them myself, beginning with the one below on “Query Optimization,” where I found the Metaweb ID number of Arnold Schwarzenegger. The patent filing describes how an ID number can collect and store data about named entities and information associated with them and how queries can be performed based on that collected information.
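As a rough illustration of storing facts under an ID and then querying them, here is a toy Python sketch. It is not the method described in the patent filing; the `EntityStore` class, its property names, and the placeholder ID for George Washington are all assumptions made up for this example. Only the Schwarzenegger ID comes from the filing.

```python
from collections import defaultdict
from typing import Dict, Set

# ID quoted in the post for Arnold Schwarzenegger.
ARNOLD_ID = "9202a8c04000641f8000000000006567"
# Placeholder ID invented for this example; not a real Metaweb ID.
WASHINGTON_ID = "placeholder-id-george-washington"


class EntityStore:
    """Toy store mapping entity ID -> property -> set of values."""

    def __init__(self) -> None:
        self._facts: Dict[str, Dict[str, Set[str]]] = defaultdict(
            lambda: defaultdict(set)
        )

    def add(self, entity_id: str, prop: str, value: str) -> None:
        """Record one fact about an entity."""
        self._facts[entity_id][prop].add(value)

    def get(self, entity_id: str, prop: str) -> Set[str]:
        """Return all stored values of a property for one entity."""
        facts = self._facts.get(entity_id)
        if facts is None or prop not in facts:
            return set()
        return set(facts[prop])

    def find(self, prop: str, value: str) -> Set[str]:
        """Return the IDs of every entity that has this property value."""
        return {
            entity_id
            for entity_id, props in self._facts.items()
            if value in props.get(prop, ())
        }


store = EntityStore()
store.add(ARNOLD_ID, "alias", "Governator")
store.add(ARNOLD_ID, "alias", "Arnie")
store.add(ARNOLD_ID, "title", "Governor of California")
store.add(WASHINGTON_ID, "title", "President of the United States")

# Query by fact rather than by name.
print(store.find("title", "Governor of California"))
```

The point of the sketch is that the query runs against collected facts keyed by ID, so it does not matter which of the many names a page happened to use.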
Here are the patent filings assigned to Metaweb:
Automated online purchasing system
Invented by W. Daniel Hillis, Bran Ferren
US Patent Application 20030195834
Published October 16, 2003
Filed: September 18, 2002
Meta-Web
Invented by W. Daniel Hillis, Bran Ferren
US Patent Application 20040210602
Published October 21, 2004
Filed: December 15, 2003
Personalized profile for evaluating content
Invented by W. Daniel Hillis and Bran Ferren
US Patent Application 20050131918
Published June 16, 2005
Filed: May 24, 2004
Delegated authority evaluation system
Invented by W. Daniel Hillis and Bran Ferren
US Patent Application 20050131722
Published June 16, 2005
Filed: May 25, 2004
System and method to facilitate importation of user profile data over a network
Invented by W. Daniel Hillis and Bran Ferren
US Patent Application 20060095780
Published May 4, 2006
Filed: October 28, 2004
User Contributed Knowledge Database
Invented by Timothy Sturge, Kurt Bollacker, Robert Cook, John Giannandrea, Nicholas Thompson, Edwin Taylor
US Patent Application 20090024590
Published January 22, 2009
Filed: April 22, 2008
Graph Store
Invented by Scott Meyer, Jutta Degener, Barak Michener, John Giannandrea
US Patent Application 20100174692
Published July 8, 2010
Filed: January 20, 2010
Database Replication
Invented by Scott Meyer, Jutta Degener, Barak Michener, John Giannandrea
US Patent Application 20100121817
Published May 13, 2010
Filed: January 20, 2010
Query Optimization
Invented by Scott Meyer, Jutta Degener, Barak Michener, John Giannandrea
US Patent Application 20100121839
Published May 13, 2010
Filed: January 20, 2010
Knowledge Web
Invented by W. Daniel Hillis and Bran Ferren
Assigned to Metaweb Technologies, Inc.
US Patent 7,502,770
Granted March 10, 2009
Filed April 10, 2002
Metaweb Conclusion
Metaweb operates Freebase, a community-based source of data about different people, places, and things. For a great example of how they collect and display data, see their page on George Washington.
What will Metaweb bring to Google?
That remains to be seen, but Metaweb’s technology might make it easier for Google to associate information with named entities. As the Microsoft paper I mentioned above noted, searches for named entities make up a good percentage of searches on Bing. The chances are that searches for named entities are pretty popular on Google as well. So the impact of the Metaweb acquisition could potentially be a large one.
I’ve written a few posts about named entities. These are some that I wanted to share:
- Do You Have a Named Entity Strategy for Marketing Your Web Site?
- How I Came to Love Entities and Start Doing Entity Optimization
- How Google Uses Named Entity Disambiguation for Entities with the Same Names
- How Named Entities Connected to Trending Topics can be used to Address Real Time Search Results
- Not Brands but Entities: The Influence of Named Entities on Google and Yahoo Search Results
- How Knowledge Base Entities can be Used in Searches
- Finding Entity Names in Google’s Knowledge Graph
- Google Gets Smarter with Named Entities: Acquires MetaWeb
- Entity Associations with Websites and Related Entities
- How Google Might Identify Entity Synonyms Using Anchor Text
- Extracting Facts for Entities from Sources such as Wikipedia Titles and Infoboxes
- Extracting Semantic Classes and Corresponding Instances from Web Pages and Query Logs
- How Google May Identify Main Entities
- How Google’s Knowledge Graph Updates Itself by Answering Questions
Last Updated June 26, 2019.
Never came across Metaweb before – but it seems Google is getting smarter by the day and is acquiring anything that might pose a question mark on its supremacy, even before those companies/properties become serious players.
I hope that this move of Google is really for the better and also, that Metaweb will really help in improving Google. I am hoping for the best.
Hi John,
I hadn’t heard of Metaweb before this acquisition either. Spending some time reading through some of their patent filings, I think they have some pretty interesting ideas. It’s hard to tell if Google acquired the company to use its technology, to “hire” the people working there, or both.
Hi Andrew,
I think the potential is there for the acquisition to help improve what Google is doing. It sounds like Google isn’t going to make any changes to the Freebase site that Metaweb runs, so whatever happens with the acquisition is more likely to impact Google’s search results. We may have to wait a while to see the impact of this purchase.
Hi Bill,
Well, I agree with John (the first commenter). Perhaps Google just sees Metaweb as a threat to their dominance. This is what these monster companies do, right? Buy up the competition the moment they pose any kind of threat? Or am I just being a little cynical? 😉
Greetings from Spain.
Rob
Hi Rob,
Thanks. It’s nice to meet you.
There’s a possibility of that, though Google does have a significant head start over Metaweb in many aspects of search, and I’m not sure that they really could have been perceived as a threat to Google at this point in their life cycle.
I would suspect that the chance to work with Metaweb, and use the technology they developed had to be pretty attractive to Google, however.
Another possibility is that someone like Microsoft may have targeted Metaweb if Google didn’t. 🙂
Having never come across Freebase before, I paid a visit there, and the one page I went to (the Boston Red Sox) was 12 months out of date in places. Unless you’ve got the critical mass of visitors to self-edit a site, like Wikipedia for all its faults does, then even the backing of Google will in no way guarantee success.
Hi Steve,
I hadn’t seen Freebase before, either. I noticed some areas that were light on data as well – I’m wondering if they will have more people getting involved in adding to that data now that Google has acquired Metaweb. I’m not sure that acquiring Freebase was Google’s main objective in acquiring the company, however.
Google is making the right move in purchasing Metaweb. The semantic web is the next generation of the internet, where search engines stop looking for words and start to truly understand what we are looking for.
So far I haven’t seen any semantic database as serious as Freebase. It looks as if, while the Microsoft and Bing search alliance is coming up, Google is still taking a step ahead into the semantic web.
Thanks for the brilliant post.
Looks like Named Entities might be another thing that will become a ranking factor in search results. This was definitely a valuable acquisition for Google.
Actually, my friend, more than you could imagine. I have researched the latest patent that Google released in May of this year, after 4 years of waiting. Some of the features in the algorithm are able to calculate a person’s quality and expertise level in his area.
This means that if many SEO people tend to visit your site and quote you and your articles, it will mean much more than a bunch of bogus bookmark accounts with no clear entities.
I wrote an article in Hebrew: What Google Really Knows About Surfer Behavior.
I have tried to use Google Translate so that people can read it. It is a bit weird, but the message can be understood well.
Bill, if you see this and feel it isn’t of a quality that will help anyone here, don’t think I am trying to earn a link!
You can just remove it, though I think translating this piece of information into English could be important to anyone.
It took me 6 hours to read all the related documents and experiments to extract this information.
Hi Duran,
It’s funny, but reading the first sentence of your comment, I think you could have made the same statement back when Google purchased Applied Semantics. In many ways, what they offer in the area of organic search is an approach that looks less at keywords and more at the meaning behind those words.
Microsoft has been working on an object-level (pdf) search approach for a few years, and you can see it in action at Microsoft Academic Search. They have a few other papers on this kind of object-level ranking, including how it can be used in other kinds of vertical searches, such as for products.
Google has also allocated a lot of time and effort to fact extraction on web sites about specific people, places, and things, and it’s possible that they may take what they’ve acquired from Metaweb to build something that goes beyond what Freebase offers presently.
Hi Alex,
I’m not sure if this approach to named entities might translate into another ranking factor as much as it might signify a different approach to collecting and indexing information found on the web. Rather than helping to rank web pages presently, it focuses upon how “facts” found on pages about a specific person, place, or thing might be collected, organized, and presented to searchers.
Combine the technology that Google acquired from Metaweb with the technology that they gained in their acquisition of Transformic, and this could potentially be a very valuable purchase.
Hi seo academic,
Yes, I mentioned Microsoft’s Academic search a couple of comments above yours as an example of how Microsoft isn’t sitting still on getting smarter about named entities as well.
Yesterday I read an article on Business Insider, and it mentioned that Google acquired about 50+ companies this year. It’s a gigantic growth rate. Hats off to Google.
Hi Geek Revealed.
I read that article too. I wish that there was a way to find out more about many of those acquisitions – most of the details about them really weren’t made public, and the names of most of the companies involved are unknown. It is a gigantic growth rate.
I’m a big fan of Arnold. Just wondering why he quit showbiz and chose a political career. I mean not totally quit, but you know what I mean.
Hi RJ,
Arnold has seen some tough times lately in the public eye, but his story is pretty interesting, and if he came out with an autobiography, I’d read it.
If you read the Wikipedia article about Arnold, there seems to be a little contradiction in how he came to get involved in politics. One biographer states that Arnold planned on getting involved in politics by using bodybuilding and then a career in movies as building blocks for gaining political office. Another section of the entry implies that Arnold wasn’t even seriously considering running for governor of California until he stated he would during an appearance on the Jay Leno show.
There are rumors out there that he is considering getting back into movies.