Tim Berners-Lee, the inventor of the Web, revised his original vision of the Web when he started writing about the Semantic Web. He published an article in Scientific American titled The Semantic Web, which is recommended reading.
Search on the Web has been evolving to return results about things, not strings: instead of only matching keywords in queries to keywords in documents on the web, search engines try to identify the entities being asked about. We are in an age of semantic search.
This is seen in Google’s original fact repository, followed by its Knowledge Graph, and in Microsoft’s Concept Graph. Google, Bing, Yahoo, and Yandex all support schema.org structured data markup, which can be used to show rich results in search listings.
We see semantic search at work in featured snippets, rich results, structured results, knowledge panels, and the special query processing of entities.
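To make the schema.org markup mentioned above a little more concrete, here is a minimal sketch in Python that builds a JSON-LD description of a person. The function name `build_person_jsonld` and the sample values are illustrative assumptions; `Person`, `name`, `jobTitle`, and `sameAs` are existing schema.org terms. Publishing markup like this does not guarantee that any search engine will show a rich result for the page.

```python
import json


def build_person_jsonld(name, job_title, same_as):
    """Return a schema.org Person description as a JSON-LD string.

    The helper name and its parameters are hypothetical; only the
    schema.org vocabulary terms come from the real specification.
    """
    data = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
        "sameAs": same_as,  # URLs that identify the same entity elsewhere
    }
    return json.dumps(data, indent=2)


# Sample values chosen for illustration only.
markup = build_person_jsonld(
    "Tim Berners-Lee",
    "Computer scientist",
    ["https://en.wikipedia.org/wiki/Tim_Berners-Lee"],
)

# JSON-LD is usually embedded in a page inside a script element.
print('<script type="application/ld+json">')
print(markup)
print("</script>")
```

The `sameAs` links are what tie the page to a specific entity rather than to a string of keywords, which is the distinction semantic search depends on.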
When you optimize a site for the HTML Web and for the Semantic Web, you’re performing two different tasks that can complement each other, and both of them can be very helpful. But not if you forget whom you’re doing it for.
I had an opportunity to watch a Webinar a couple of weeks ago about software that looks at the messages on your pages, and at the words you use on landing pages and in advertisements, and then suggests semantically related terms to include in those landing pages and advertisements.
During the Webinar, we had the chance to ask questions, and I noticed that the word “audience” hadn’t been mentioned once.
In 2003, a paper titled Semantic Search was published by Ramanathan Guha of IBM, Rob McCool of Knowledge Systems Lab at Stanford University, and Eric Miller of W3C and MIT. Their stated goal in the paper was to take technologies such as web services and the Semantic Web and use them to “improve traditional web searching.”
This was written before Ramanathan Guha joined Google, started Google Custom Search Engines, created Google’s version of Trust Rank, and introduced Schema.org.
I’m working on putting together a history of the Semantic Web at Google, and this early look at the Semantic Web provides some insights from at least one person who played a major role in how the Semantic Web is becoming part of search at Google.
At Google’s 15th anniversary celebration last summer, shortly after Hummingbird was introduced, Tamar Yehoshua, Google VP of Search, showed us conversational search at Google by first demonstrating a query asking for “pictures of the Eiffel Tower”, and then following up with the query “How tall is it?”
In that second query, Google had not only to remember that the Eiffel Tower was being asked about, but also to recognize the Eiffel Tower when it was referred to as “it.” That is part of the new “conversational search” that Google is now engaging in, using something known by linguists as a “coreference.” I wanted to write about coreferences to clear up confusion that people might have had about them.
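As a toy illustration of the idea, and not a description of how Google does it, the Python sketch below remembers the last entity seen in a conversation and substitutes it for “it” in a follow-up query. The `KNOWN_ENTITIES` list, the `find_entity` helper, and the `Conversation` class are all hypothetical names invented for this example; real coreference resolution is far more involved.

```python
import re

# Hypothetical, hard-coded entity list standing in for a real entity index.
KNOWN_ENTITIES = ["Eiffel Tower", "Statue of Liberty"]


def find_entity(query):
    """Return the first known entity mentioned in the query, or None."""
    lowered = query.lower()
    for entity in KNOWN_ENTITIES:
        if entity.lower() in lowered:
            return entity
    return None


class Conversation:
    """Tracks the most recently mentioned entity across queries."""

    def __init__(self):
        self.last_entity = None

    def rewrite(self, query):
        """Replace the pronoun 'it' with the remembered entity, if any."""
        entity = find_entity(query)
        if entity is not None:
            # The query names an entity outright: remember it for later.
            self.last_entity = entity
            return query
        if self.last_entity is None:
            # Nothing to refer back to, so leave the query untouched.
            return query
        return re.sub(
            r"\bit\b",
            lambda match: "the " + self.last_entity,
            query,
            flags=re.IGNORECASE,
        )


session = Conversation()
print(session.rewrite("pictures of the Eiffel Tower"))
print(session.rewrite("How tall is it?"))
```

The second call prints the follow-up query with “it” replaced by “the Eiffel Tower”, which is the link between the two queries that a coreference expresses.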
Google recently started showing “How to” lists in search results, which tend to show the first few steps of some task and then let you click through to a page to see more, like the recipes above for things like guacamole:
They have also published an interesting paper that describes some of the steps that need to take place for one of these snippets to be created, which is titled Cooking with Semantics (pdf).
At Google, when you asked a question, you could sometimes get a response providing answers to questions such as:
“When was George W. Bush’s birth-date?”.
We knew that Google could answer some questions like that, even if it might have been challenging, but we didn’t have much of a clue regarding the existence of something like Google’s Knowledge Graph until 2011. The answers we would see would sometimes be regular snippets where a word such as “birth-date” might be bolded.
Our set of 17 “related patents,” which I first saw mentioned in a patent that I wrote about this past Tuesday and that was granted on August 19th, appears to have been created by a team under Andrew Hogue, who was tasked with creating “an annotation framework” to index more objects on the web and the facts associated with them. He discusses this more deeply in the presentation The Structured Search Engine, which is highly recommended.
He also oversaw Google’s acquisition of Metaweb and the move of 25 former Metaweb staff members into Google.
Yesterday, I wrote about how Google might present facts extracted from pages in timelines or maps, according to a patent application filed last week.
It wasn’t the only piece of intellectual property coming out of the US Patent and Trademark Office for Google on the extraction and visualization of facts. Another, which may be even more interesting, describes how a user might extract facts found in a query of the fact database and choose to present those facts in a number of ways.
Designating data objects for analysis
Invented by Andrew W. Hogue, David J. Vespe, Alexander Kehlenbeck, Michael Gordon, Jeffrey C. Reynar, and David B. Alpert
US Patent Application 20070179965
Published August 2, 2007
Filed January 27, 2006