Category Archives: Fact Extraction and Knowledge Graphs

Techniques and approaches that search engines might use to extract facts and information from the Web, as uncovered in search-related patents and whitepapers.

Rich Snippets and Patterned Queries

Revisiting the Subscribed Links Patent Five Years Later and Finding the Rich Snippets Patent

I first looked at this patent five years ago, but called it the Subscribed Links Patent.

At the time, Google had a Subscribed Links program, where site owners could create specialized search results that were triggered by certain patterns of queries and showed additional content to a searcher. For some of those, you had to log into your Google Account and subscribe to certain links to be shown special content.

Oddly, some of those specialized search results didn’t require subscriptions, and didn’t require logging in. They looked much like these NFL scores from this weekend:

A Football Score Rich Snippet

Continue reading

How Google May Answer Fact Questions Using Entity References in Unstructured Data

A Google patent application explores how Google may answer factual questions from unstructured Web pages and results rather than from more structured sources such as Freebase or Wikipedia. The processes described in the patent are pretty interesting, and they might be more familiar to an SEO-trained audience than to a Semantic Web one, such as a result that ranks well because of a “query deserves freshness” approach.

They also avoid a problem for the search engines that I’ve been thinking about for weeks.

The problem was one that came to me when I attended The Semantic Web Business and Technology 2014 conference around a month or so ago. In a presentation by Yahoo!’s Nicolas Torzec, he discussed Yahoo!’s relatively new Knowledge Graph, and was asked a question by someone from the audience about

Continue reading

At Pubcon, Presenting on a Semantic Timeline at Google

Tomorrow morning, I’m presenting on the Semantic Web at Google at Pubcon in Las Vegas. I’ve included my presentation deck here to use as a kicking off point for further discussion.

Changes to what Google shows in search results have been difficult to miss, from many different types of rich snippets to recent additions of search boxes in search results and Google showing snippets from pages that contain both query answering and question answering results mixed together.

Thanks to Barbara Starr for taking a look at the presentation, and for suggesting that I look for a Google patent for rich snippets, which I hadn’t included. I went searching for the patent at the US Patent Office and found a good candidate for it, and will probably post a more detailed look at that one in the near future. It’s titled “Generating specialized search results in response to patterned queries.”

Here’s my presentation:

Continue reading

At SMX East; Presenting on Google and the Semantic Web

The Semantic Web is making an even stronger appearance recently at Google than it has in the past. With knowledge panels, carousels listing all kinds of things (and people and places), structured snippets merging query answers with question answers into a single snippet, OneBoxes of many different kinds, and even Hummingbird responding better to longer and more complex queries, it’s the future of Google.

I’m presenting on it this morning at the Javits Center in Manhattan at SMX (Search Marketing Expo) East, in a session titled “Hummingbird and the Entity Revolution.”

Continue reading

Extracting Semantic Classes and Corresponding Instances from Web Pages and Query Logs

In creating a knowledge base, there seem to be a number of approaches that can be used to supply entities and facts from sources like web pages and query logs.

In my last post, I wrote about how search queries might be used, along with linguistic patterns, to extract attributes about facts from those search queries, as described in a patent titled Inferring attributes from search queries.
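To make the idea of pairing search queries with linguistic patterns more concrete, here is a minimal sketch in Python. The two patterns and the function names are my own assumptions for illustration; they are not taken from the patent, which likely covers many more patterns and filtering steps.

```python
import re
from collections import Counter

# Hypothetical linguistic patterns (assumptions, not the patent's own list).
# Each one captures an entity and an attribute from a query.
PATTERNS = [
    # e.g. "what is the capital of france?"
    re.compile(
        r"^(?:what|who) (?:is|are|was|were) the "
        r"(?P<attribute>.+?) of (?P<entity>.+?)\??$"
    ),
    # e.g. "france's population"
    re.compile(r"^(?P<entity>.+?)'s (?P<attribute>.+)$"),
]


def extract_attribute(query):
    """Return an (entity, attribute) pair if the query fits a pattern, else None."""
    normalized = query.strip().lower()
    for pattern in PATTERNS:
        match = pattern.match(normalized)
        if match:
            return match.group("entity"), match.group("attribute")
    return None


def count_attributes(queries):
    """Count how often each (entity, attribute) pair shows up in a list of queries."""
    counts = Counter()
    for query in queries:
        pair = extract_attribute(query)
        if pair is not None:
            counts[pair] += 1
    return counts
```

Counting the pairs across many queries is one simple way to separate attributes that many searchers ask about from one-off noise, though a real system would need far more than two patterns.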

A Microsoft paper from 2009, Named Entity Recognition in Query, describes a manual analysis of 1,000 queries, and reports that 70% of those queries contained named entities.

So entities do appear in queries, and Google receives a lot of queries a day (as do Microsoft and Yahoo).

Continue reading

How Google May Add to its Knowledge Base with Entities and Attributes from Search Queries

Millions of searches stream into Google every day as people try to meet their informational and situational needs. But those searches don’t disappear once they have been answered. They provide Google with some very interesting and useful information in return. For instance, they tell Google what people are interested in real time – right at this moment.

Those queries can help Google populate its knowledge base with more information as well.

When Google collects information about entities (people, places, and things, including products and brands), it might also collect information about the attributes associated with those entities.

A couple of days ago, the Google Research Blog told us about how it might include that kind of factual information in search results, in what they called Structured Snippets. In that post, Google gave us the news that it finds information like this in tables across the web.

Continue reading

How Google Might Fill in Missing and Incorrect Data in its Knowledge Graph

Entities change all the time, and facts about them do as well. Imagine that when Derek Jeter retires from playing baseball, he decides to become a coach. Or that Tom Cruise acts in a new movie, and decides to try directing and producing it as well. Or that Scotland decides whether or not it should be independent of the UK after 300 years.

What we think of entities can change over time, when it comes to the type of entity they are, and the facts associated with them. When populations of places change, and they do on a regular basis, how does that information get updated? And unfortunately, sometimes some information never quite makes it to Google’s knowledge base.

A patent application published last week looks at some ways that a knowledge base might be updated when a question answering query is asked of it, and the search system notices that some information is missing.

Poster for using a library for greater knowledge.

Continue reading

Images in Question Answers, Carousels, and Knowledge Panels at Google

When Google introduced us to the knowledge graph, it also introduced us to pictures and the possibility of other kinds of rich content (video, audio, etc.) in those knowledge panels, and pictorial lists displayed in carousels at the top of pages in response to a query, such as “What is the tallest building in the World?”

A carousel in response to a question of 'what is the tallest building in the world?'

A Google patent granted a couple of weeks ago describes how Google processes search system queries, and might display knowledge graph answers to questions that include images. Here’s where they introduced carousels, in their page on the Knowledge Graph:

Google's Intro to carousels on the Google Knowledge Graph page.

Continue reading