A number of Google patents, both granted patents and pending applications, describe ways that Google might learn about entities and the facts associated with them by extracting that information from the Web itself, instead of relying upon people submitting information to knowledge bases such as Freebase.
We learned from Google’s recent announcement that they would be replacing the Google Knowledge Base with their Knowledge Vault, which supposedly brings with it a whole new set of extraction approaches that are said to carry high levels of confidence in their accuracy.
It’s hard to tell exactly which approaches Google might be relying upon, and which ones Google might have introduced in something like a patent but no longer uses. But it doesn’t hurt to learn some of the history and some of the approaches that might have been used in the past.
Today I’m blogging about a patent that describes an approach many of us have assumed Google has been using for years to identify Objects or Entities, the attributes of those entities, and the values that fit those attributes.