Extracting Facts for Entities from Sources such as Wikipedia Titles and Infoboxes

A number of Google patents, both granted and pending, describe ways that Google might learn about entities and the facts associated with them by extracting information from the Web itself, instead of relying upon people submitting information to knowledge bases such as Freebase.

We learned from Google’s recent announcement that they would be replacing the Google Knowledge Base with their Knowledge Vault, which supposedly brings with it a whole new set of extraction approaches that come with high levels of confidence in their accuracy.

It’s hard to tell exactly which approaches Google might be relying upon, and which ones Google might have introduced through something like a patent but no longer uses. But it doesn’t hurt to learn some of the history and some of the approaches that might have been used in the past.

I’m blogging today about a patent that describes an approach many of us have assumed Google has been using for years to identify objects or entities, their attributes, and the values that fit those attributes.

Did Google acquire Game Maker and Distributor CiiNow?

CiiNow Logo - cloud gaming defined

Google was officially assigned the pending patent applications from CiiNow last Wednesday (August 27, 2014) in a transaction that was reported as being executed at the end of July.

USPTO Assignment of patents from CiiNow to Google

From searching through the USPTO, I don’t see any other patents assigned to CiiNow, so that appears to have been all they owned. The USPTO assignments don’t include financial details, so that information is unavailable.

The Ciinow.com website appears to be completely unresponsive to visits. According to his LinkedIn profile, CiiNow Co-Founder and VP of Engineering Devendra (Deven) Raut left CiiNow in 2014 and joined Google in a Tech Biz Dev role. It looks to me as though Google acquired CiiNow, Inc.

Google Patent Attacks Reverse Engineering of Local Search Listings

The title from a Google patent reached out and grabbed me as I was skimming through Google’s patents. It has the kind of title that captures your attention, as a weapon in the war that Google wages against people who might try to spam the search engine.

The title for the patent is Reverse engineering circumvention of spam detection algorithms. The context is local search, where some business owners might be striving to show up in results in places where they don’t actually have a business location, or where heavy competition might convince them that having additional or better entries in Google Maps is going to help their business.

The result of such efforts might be for their local listings to disappear completely from Google Maps results. The category Google seems to have placed such listings under is “Fake Business Spam.”

Google spam score flow chart from patent

Identifying Entity Types and the Transfiguration of Search @Google

The World Wide Web is a vast resource for information. At the same time it is extremely distributed.

A particular type of data such as restaurant lists may be scattered across thousands of independent information sources in many different formats. In this paper, we consider the problem of extracting a relation for such a data type from all of these sources automatically.

We present a technique which exploits the duality between sets of patterns and relations to grow the target relation starting from a small sample. To test our technique we use it to extract a relation of (author, title) pairs from the World Wide Web.

Sergey Brin, Extracting Patterns and Relations from the World Wide Web (pdf), Stanford University, 1999
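
The pattern/relation duality described in that abstract can be illustrated with a small sketch. This is a deliberately simplified toy and not the paper's actual implementation: it assumes each source is a single line that starts with a title and ends with an author, and it treats only the text between the two as the "pattern." The corpus, the seed pair, and the function names (`induce_patterns`, `apply_patterns`, `grow_relation`) are all hypothetical, made up for illustration.

```python
import re

# Hypothetical mini-corpus; each line stands in for a snippet found on the Web.
CORPUS = [
    "Foundation by Isaac Asimov",
    "Dune by Frank Herbert",
    "Dune, written by Frank Herbert",
    "Neuromancer by William Gibson",
    "Emma, written by Jane Austen",
    "Persuasion, written by Jane Austen",
]

# A small starting sample of (author, title) pairs.
SEEDS = {("Isaac Asimov", "Foundation")}


def induce_patterns(corpus, pairs):
    """Relations -> patterns: record the text between a known title and author."""
    patterns = set()
    for author, title in pairs:
        for line in corpus:
            fits = len(title) + len(author) < len(line)
            if fits and line.startswith(title) and line.endswith(author):
                patterns.add(line[len(title):len(line) - len(author)])
    return patterns


def apply_patterns(corpus, patterns):
    """Patterns -> relations: find new (author, title) pairs matching a pattern."""
    pairs = set()
    for middle in patterns:
        regex = re.compile(r"([\w' ]+)" + re.escape(middle) + r"([\w'. ]+)")
        for line in corpus:
            match = regex.fullmatch(line)
            if match:
                pairs.add((match.group(2), match.group(1)))
    return pairs


def grow_relation(corpus, seeds, max_rounds=5):
    """Alternate between the two steps until no new pairs turn up."""
    pairs = set(seeds)
    for _ in range(max_rounds):
        patterns = induce_patterns(corpus, pairs)
        expanded = pairs | apply_patterns(corpus, patterns)
        if expanded == pairs:
            break
        pairs = expanded
    return pairs


result = grow_relation(CORPUS, SEEDS)
for author, title in sorted(result):
    print(f"{title} / {author}")
```

Starting from one seed, the first round learns the " by " pattern and picks up two more books; one of those books also appears in a ", written by " line, which yields a second pattern and the two Jane Austen titles. A real system would also need to score patterns so that overly general ones do not flood the relation with bad pairs.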

Torpedo as Art, in the Torpedo Factory in Alexandria
Entities Change – Torpedoes become art and Search Engines become Knowledge Repositories.

Semantic SEO or Semantic Search?

A few years ago, I presented at SES San Jose and someone asked me what they should be keeping an eye upon in SEO. I told them “named entities.” I was reminded of that conversation as I gave a talk today about named entities and other semantics.

I presented this morning at the San Jose McEnery Convention Center at the Semantic Technology and Business Conference (#SemTechBiz2014).

Barbara Starr and I gave a 3-hour tutorial on Semantic Search to an enthusiastic and engaged audience. We also discussed which might be a better name for the tutorial, “Semantic Search” (the name it had) or “Semantic SEO” (what do you think?).

Here’s Barbara’s presentation, which is the first half of the tutorial. Thanks, Barbara, totally brilliant stuff:

Google on Finding Entities: A Tale of Two Michael Jacksons

I’ve been saying for at least a couple of years that Google’s local search is a proof of concept for how the search giant finds and understands entities.

With local search, Google goes out and looks for mentions of a business on the Web, especially when they are accompanied by geographic location information. It gathers facts related to businesses (entities are people, places, and things) and then clusters the information about the objects it finds, to make sure that those mentions across the Web all refer to the same places.

If you start reading about local search, you’ll see people referring to the importance of consistency in how you present address information for a business, and the same thing is true for entities.
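
To make the consistency point concrete, here is a minimal sketch of how mentions might be grouped when their name and address normalize to the same key. It is only an illustration under stated assumptions, not a description of how Google does it: the mention records, the abbreviation table, and the `normalize` and `cluster_mentions` helpers are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; a real system would need far more rules.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "suite": "ste"}


def normalize(text):
    """Lowercase, drop apostrophes and punctuation, and abbreviate common words."""
    cleaned = text.lower().replace("'", "")
    tokens = re.findall(r"[a-z0-9]+", cleaned)
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def cluster_mentions(mentions):
    """Group sources whose normalized (name, address) keys are identical."""
    clusters = defaultdict(list)
    for mention in mentions:
        key = (normalize(mention["name"]), normalize(mention["address"]))
        clusters[key].append(mention["source"])
    return dict(clusters)


# Made-up mentions of businesses found on three different sites.
MENTIONS = [
    {"name": "Joe's Pizza", "address": "12 Main Street", "source": "directory-a"},
    {"name": "Joes Pizza", "address": "12 Main St.", "source": "blog-b"},
    {"name": "Joe's Pizza", "address": "98 Oak Avenue", "source": "review-c"},
]

clusters = cluster_mentions(MENTIONS)
for (name, address), sources in clusters.items():
    print(name, "|", address, "|", sources)
```

The first two mentions collapse into one cluster despite the spelling differences, while the third stays separate because its address differs. The more inconsistently a business (or any entity) is described, the harder that kind of matching becomes.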

Two different Michael Jacksons

Entity Mentions are Good: Brand Mentions are not the New Link Building

A couple of months ago, I wrote a post about a new patent from Google that was the first Google patent granted to Navneet Panda as an inventor. The patent described a complicated way for Google to judge the quality of websites, and my post was titled Is this Really the Panda Patent?. Simon Penson wrote a followup post at Moz titled The Panda Patent: Brand Mentions Are the Future of Link Building which looked at some other aspects of the patent.

On August 1st, Jayson Demers published a post to Forbes titled Implied Links, Brand Mentions And The Future Of SEO Link Building which covers a lot of the same ground as Simon’s post. I contacted an editor at Forbes and stated that the post plagiarized Simon’s post. Jayson didn’t give me any credit for my post about the patent either, but Simon did.

The patent office in Washington DC, prior to 1940

Getting Information about Search, SEO, and the Semantic Web Directly from the Search Engines