I just returned from a few days in Las Vegas and the Pubcon Conference.
I had the chance to see some great presentations and talk to a number of interesting folks, and Go Fish Digital, the company where I am the Director of Search Marketing, won a US Search Award for Best Use of Search for Travel/Leisure for a campaign we did for Reston Limo.
I wanted to share my presentation from the conference here as well.
Named Entity Disambiguation is important in a Knowledge Graph Search System
Last year I wrote a post titled Google on Finding Entities: A Tale of Two Michael Jacksons. The post was about a Google patent that described how Google might perform named entity disambiguation when different entities share the same name. That patent was filed in 2012 and granted in 2014. This week, Google was granted a new patent, originally filed in 2006, on how it might perform named entity disambiguation. It is worth looking at this second one, given how important understanding entities is to Google. A search for things instead of strings makes named entity disambiguation essential.
It contains a pretty thoughtful approach to named entity disambiguation.
The Web is filled with factual information, and search on the Web has been going through changes to try to take advantage of all of the data found there. Mainstream search engines, such as Google, Bing, and Yahoo, traditionally haven’t given us short, direct answers to our queries. Instead, they have shown us a list of Web pages (often referred to as 10 blue links) where that data might be found, and then forced us to sort through that list to find an answer.
Google announced direct answers to questions on the Google Blog in April 2005, in a post titled Just the Facts, Fast. These direct answers appear above the 10 blue links leading to pages about the query.
That may have been in response to Tim Berners-Lee writing about the Semantic Web back in 2001, where he alerted us to the possibilities that freeing data otherwise locked into documents might bring. As search engines find ways to crawl the Web and collect information about objects and the data associated with them, we begin approaching the possibilities he mentioned. And we get direct answers that we otherwise couldn’t find as easily.
A patent granted to Google this week describes how Google might identify similarities between different types of entities when it finds information about them on the Web. It refers to these similarities as commonalities, as in things the entities may have in common. Google may use these similarities in a number of ways, such as supplementing search results with related information, based upon results that might be in the same category or possibly located in the same region.
The things identified as common may be features that are moderately unique, but not completely rare.
The patent says “entities,” but it seems to focus upon different businesses that might share some similarities. For example, it refers a few times to a food critic writing about restaurants, and tells us that the things such a critic might write about different restaurants might be used to find similarities between those places.
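To make the idea of commonalities a little more concrete, here is a minimal sketch in Python. It is not the method from the patent; it only illustrates the general notion of keeping things that more than one entity shares, while dropping things that nearly every entity shares. The function name `find_commonalities`, the `min_share` and `max_fraction` thresholds, and the restaurant names and terms are all made up for illustration.

```python
from collections import defaultdict


def find_commonalities(entity_terms, min_share=2, max_fraction=0.5):
    """Return {term: sorted entity names} for terms that are shared by at
    least `min_share` entities, but by no more than `max_fraction` of all
    entities. Both thresholds are illustrative assumptions."""
    # Invert the mapping: which entities mention each term?
    term_to_entities = defaultdict(set)
    for entity, terms in entity_terms.items():
        for term in terms:
            term_to_entities[term].add(entity)

    total = len(entity_terms)
    return {
        term: sorted(entities)
        for term, entities in term_to_entities.items()
        if len(entities) >= min_share and len(entities) / total <= max_fraction
    }


# Hypothetical terms a food critic might use about four restaurants.
reviews = {
    "Luigi's": {"restaurant", "wood-fired oven", "patio"},
    "Casa Roma": {"restaurant", "wood-fired oven"},
    "Blue Door": {"restaurant", "patio", "live jazz"},
    "Noodle Bar": {"restaurant", "open kitchen"},
}

commonalities = find_commonalities(reviews)
for term, places in commonalities.items():
    print(term, "->", places)
```

In this toy data, "restaurant" is dropped because every place has it, and "live jazz" and "open kitchen" are dropped because only one place has each. What remains are the in-between terms, which is roughly the "moderately unique, but not completely rare" idea described above.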
Added 11:48 AM (PST) May 3, 2015: H/t to Natzir Turrado. Incoming news is that Google+ is introducing a new feature called Collections, and the announcement from The Windows Club features the word “curation” prominently, as do the two Google patent applications I write about in this post. Here’s how Susannah Lindsay uses the concept in The Windows Club article:
Google Plus users will get an opportunity to curate pieces of content into their collection, with others holding the permission of viewing, sharing, and following those collections as they please.