Google on Crawling the Web of Data

A patent granted to Google this past fall explores how the search engine looks for patterns on Web pages and uses them to find facts to fill Google’s data repository (Knowledge Base).

An image from a local park in Carlsbad symbolizing the Sun.

I recently wrote a series of posts about Google collecting data to enable it to provide direct answers, starting with one titled Direct Answers – Natural Language Search Results for Intent Queries.

In one of those posts, I wrote about a paper (pdf) that the inventors of that patent co-authored, which describes ways that Google was finding and extracting facts from pages to include in a repository of facts.

Continue reading Google on Crawling the Web of Data

Google Media Consumption History Patent Filed

Google published a foreign patent application at WIPO today that takes an interesting perspective. When someone performs a search that involves a specific entity, their search results may be influenced by the search engine’s knowledge of their past interactions with content involving that entity.

For example, when someone searches for “Justin Timberlake,” the search system may have collected information about the searcher’s past consumption of content related to that entity, such as having attended a concert featuring him or watched a movie that he was in:

In some applications, the server-based system additionally receives and stores information describing the user’s consumption of the content. For example, the system can determine that the user viewed the movie “The Social Network” featuring “Justin Timberlake” on a particular date and at a particular location. The system can store the information at the media consumption history that identifies the particular date and the particular location where the user viewed the movie “The Social Network,” and can subsequently receive a request that identifies the user and “Justin Timberlake.” The system can provide a response to the request that includes information about “Justin Timberlake” and can also indicate that the user viewed the movie “The Social Network” that features “Justin Timberlake” on the particular date and at the particular location.
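
The flow described in that passage (store what a user consumed, then attach it to a later request about a related entity) can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the class and function names, the record fields, and the sample date and location are all made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumptionRecord:
    """One item a user consumed (hypothetical structure, not from the patent)."""
    title: str
    entities: tuple  # entities associated with the content, e.g. cast members
    date: str
    location: str


class MediaConsumptionHistory:
    """In-memory stand-in for the server-side history store."""

    def __init__(self):
        self._by_user = defaultdict(list)

    def record(self, user_id, rec):
        self._by_user[user_id].append(rec)

    def lookup(self, user_id, entity):
        # Return every stored record for this user that features the entity.
        return [r for r in self._by_user.get(user_id, []) if entity in r.entities]


def answer_entity_request(history, user_id, entity):
    """Build a response about an entity, annotated with the user's own history."""
    past = history.lookup(user_id, entity)
    return {
        "entity": entity,
        "past_consumption": [
            {"title": r.title, "date": r.date, "location": r.location} for r in past
        ],
    }


history = MediaConsumptionHistory()
# The date and location below are invented sample values.
history.record(
    "user-1",
    ConsumptionRecord("The Social Network", ("Justin Timberlake",), "2014-06-01", "Carlsbad, CA"),
)
response = answer_entity_request(history, "user-1", "Justin Timberlake")
```

A request for an entity the user has no history with simply comes back with an empty `past_consumption` list, so the same response shape works for both cases.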

The patent application is:

Continue reading Google Media Consumption History Patent Filed

How Google was Corroborating Facts for Direct Answers

When someone searches the web and asks a question such as “what is the capital of Poland” or “what is the birth date of George Washington,” a web search engine such as Google may not be very helpful if it returns a list of web pages that might answer that query instead of an actual answer. People in the SEO community have been referring to such answers as “direct answers.”

Google answering a direct question with a factual answer.

A patent granted to Google this week describes how Google indexes data across the web, and may look to a large collection of facts (in a fact repository such as a knowledge graph) to check and verify such answers, so that it can deliver them with more confidence, as in the answer to the question about George Washington’s birthday shown above.

The patent tells us that some efforts to build a search engine that can “provide quick answers to factual questions have their own shortcomings.” One of these is that the answers may come from a single source, such as “a particular encyclopedia.” This is perceived as a shortcoming because it is:

Continue reading How Google was Corroborating Facts for Direct Answers

Some Patents Behind Microsoft’s Personal Assistant Cortana

In January, Microsoft introduced a new build of Windows 10, which it will be giving away for free to non-enterprise users running Windows 7 and Windows 8.1. One of the features in this update is a personal digital assistant that goes by the name Cortana.

It’s one of the most anticipated features of the new Windows 10, and I’ve started digging through patents at the USPTO to get some hints of what this might mean for us. A recently published article, Here’s how to make the most of Cortana, the Windows 10 digital assistant, got me started.

You’ve likely seen Apple’s personal assistant Siri, which was featured in a number of celebrity-enhanced advertisements, and you may have seen people writing about Google Now, which feeds you cards with information that it predicts you might need or want when that information becomes available. Cortana is Microsoft’s entry into the personal assistant field.

Cortana is supposedly “powered by Bing” and “developed for Windows Phone 8.1”, and it looks like an important feature in Windows 10. I’ve been having difficulty defining what “powered by Bing” actually means, except that it seems to imply that all of the questions asked of Cortana are answered by the Bing search engine.

Continue reading Some Patents Behind Microsoft’s Personal Assistant Cortana

All Hands on the Microsoft Holodeck: A Look at Some of the Hololens Patents

I’ve written about some of the patents involved in Google’s Project Glass in the past, and very recently about the Google Ventures-funded Magic Leap. Project Glass still exists, but it now appears to have new leadership and a new direction.

A heads-up display from Microsoft
From “Exercising applications for personal audio/visual system” US8847988 B2

And then, seemingly out of nowhere, Microsoft announced a pair of goggles that they’ve been developing secretly, named the Hololens. And they’ve been feeding news sources some interesting information about them, like the article at Wired titled “Project HoloLens: Our Exclusive Hands-On With Microsoft’s Holographic Goggles.”

Continue reading All Hands on the Microsoft Holodeck: A Look at Some of the Hololens Patents

Magic Leap and Their Augmented Reality Semantic Robots

The temptation was to write this blog post mostly in pictures, since it’s about visual representations of things, based sometimes on a combination of objects that were understood using object recognition, and virtual semantic images superimposed on those, learned from a knowledge base.

Google Ventures and a couple of partners funded the company Magic Leap with a substantial amount of money ($542 million), and Magic Leap responded with a new 180-page patent application that shows how it might create a “Cinematic Reality” in the world around us.

Here’s a view of the glasses, and a belt pouch that goes with them.

With a 180-page patent, there are a lot of images that go with it, so I’m going to mostly use pictures from the patent, Planar Waveguide Apparatus With Diffraction Element(s) and System Employing Same, for the rest of this post. Note that at least one of the pictures has a semantic element to it, which is pretty interesting, and there are mentions of the Semantic Web, like this one:

Continue reading Magic Leap and Their Augmented Reality Semantic Robots

Direct Answers: Extracting Text from Pages Citations

This is the last post in a series about Google’s international patent application Natural Language Search Results for Intent Queries.

This section was inspired by the citations list at the end of a paper that the listed inventors filed as a provisional patent application, which preceded that patent. The paper was Scalable Attribute-Value Extraction from Semi-Structured Text (pdf).

I sometimes like to look through the documents I see listed as citations or footnotes in a paper I find interesting. As I started looking at the documents cited in that paper, I found many of them to be very interesting.

And then an idea struck me.

Continue reading Direct Answers: Extracting Text from Pages Citations

Direct Answers: How Answers are Extracted from Web Pages

I’ve been writing recently about a patent from Google on Direct Answers, and how Google might take those from authoritative sources, using an intent template process (“what are the symptoms for [measles, flu, athlete’s foot, Ebola]”) to include many direct answer responses to natural language queries, while also showing keyword-based search results.

The patent doesn’t tell us how such natural language direct answers are chosen by the search engine, but the following document, whose authors are the inventors of the patent, and which they filed as a provisional patent application, does give us some ideas on how those are found on the web.

We know that Google is looking for responses from pages that they consider to be “authoritative” pages.
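
To make the intent template idea a little more concrete, here is a rough sketch of how a query like “what are the symptoms for measles” might be matched against a template and reduced to an intent plus an entity. It is only an illustration under my own assumptions: the regular expression, the `INTENT_TEMPLATES` table, and the `match_intent` function are hypothetical, and the patent does not describe its matching in these terms.

```python
import re

# Hypothetical template table: one regex per intent, with the variable
# part of the question captured as the "entity" group.
INTENT_TEMPLATES = {
    "symptoms": re.compile(
        r"^what are the symptoms (?:for|of) (?P<entity>.+?)\??$",
        re.IGNORECASE,
    ),
}


def match_intent(query):
    """Return (intent, entity) if the query fits a known template, else None."""
    cleaned = query.strip()
    for intent, pattern in INTENT_TEMPLATES.items():
        match = pattern.match(cleaned)
        if match:
            return intent, match.group("entity").lower()
    return None


matched = match_intent("What are the symptoms of measles?")
unmatched = match_intent("best pizza near me")
```

A query that fits no template returns `None`, which corresponds to the case where only the ordinary keyword-based results would be shown.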

Continue reading Direct Answers: How Answers are Extracted from Web Pages

Getting Information about Search, SEO, and the Semantic Web Directly from the Search Engines