Google triggered a transformation in search when it announced the Knowledge Graph in the Official Google Blog post, Introducing the Knowledge Graph: things, not strings. That transformation made search less concerned with matching keywords, and more concerned with matching concepts, understanding entities, and bringing knowledge about entities to searchers in knowledge panels next to search results.
Google published a patent application last week that describes the knowledge panels that appear next to search results as part of the new knowledge graph. Here’s the video that accompanied the post (note the reference to a “panel” in the presentation):
When we talk about indexing and crawling content on the Web, it’s usually in the context of ranking pages in response to queries, based on a number of signals found on those pages. Google has told us that the future of search involves Knowledge Bases, and the indexing of Things, Not Strings. Gianluca Fiorelli explored Google’s ideas of Search in the Knowledge Graph Era earlier this week.
A few years back, I wrote some posts about Google patents that explored how Google might be extracting and visualizing facts, and using Data Janitors to process, clean up, and sort that information. Google was granted another patent this week that’s very much related, looking at how Google might understand locations for places collected from Web pages. One of the inventors, Andrew Hogue, gave this Google Tech Talk presentation last year:
When you walk into the lobby of Building 42 at the Googleplex, you can see a display that shows you queries entered into the search engine at any one time. It’s a mesmerizing sight, and I found myself wondering about the people and motivations behind some of the search terms I saw flowing down the screen.
Imagine that, instead of showing one query at a time, the display analyzed that search information and bundled queries together, perhaps to provide us with more meaning.
Can search engines be used to tell us what the world is thinking at any one time? Would looking at the most popular keywords or queries that people type into a search engine provide us with some insights?
A tool from Google that was often overlooked is Google Sets (no longer available), which allowed you to “automatically create sets of items from a few examples.”
Google Sets was one of the first applications in the Google Labs (no longer available) pages.
Those pages are “Google’s Technology Playground,” and contain a number of programs that may or may not be tomorrow’s useful applications from the search engine. As Google tells us,
Google labs showcases a few of our favorite ideas that aren’t quite ready for prime time. Your feedback can help us improve them. Please play with these prototypes and send your comments directly to the Googlers who developed them.
Google was granted a patent this week on the process behind Google Sets, and the patent document provides some details on how the program finds additional words based on “items from a set of things” that you enter.
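To make the idea of growing a set from a few examples concrete, here is a minimal sketch in Python. It is not the method from the patent, which isn't detailed here. It assumes a hypothetical collection of item lists (such as lists that might be found on web pages) and scores each candidate by how strongly the lists it appears in overlap with the examples you entered. The function name `expand_set` and the sample data are made up for illustration.

```python
from collections import Counter


def expand_set(seeds, known_lists, limit=5):
    """Suggest items that tend to appear in the same lists as the seeds.

    Illustrative assumption only: a candidate gains one point per seed
    for every list it shares with those seeds.
    """
    seed_set = {seed.lower() for seed in seeds}
    scores = Counter()
    for items in known_lists:
        lowered = {item.lower() for item in items}
        overlap = len(seed_set & lowered)
        if overlap == 0:
            continue  # this list says nothing about the seeds
        for item in lowered - seed_set:
            scores[item] += overlap
    # Highest score first; ties are broken alphabetically so output is stable.
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:limit]


# Made-up lists standing in for lists collected from pages.
example_lists = [
    ["red", "green", "blue"],
    ["red", "blue", "yellow"],
    ["apple", "pear"],
]
suggestions = expand_set(["red", "green"], example_lists)
```

In this toy data, "blue" ranks ahead of "yellow" because it shares lists with both examples, while the fruit list is ignored because it contains neither of them.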
In Google’s search results, depending upon your query, when and where you are searching, and what your browser and search engine settings might be, you may receive a different set of search results than other folks performing a search using the same query terms.
And those results may include a mix of links and images from different data sources including Web results, images, advertisements, local businesses, books, products, and others.
Google’s Universal Search blends results from a number of different data repositories into a single set of search results.
Does a search engine work better if it can figure out whether or not a search query is a name?
The folks at Ask.com appear to think so, and even want to know if the name is that of someone famous. I’m not sure how they measure fame, but they have a method for flagging names of the famous, as well as names that look like names, and names that really aren’t names (Brandy Alexander, anyone?).
The process is described in a patent application from Ask, and details how they might go about figuring out whether “Usher” or “50 Cent” or “Attila the Hun” refer to people, or to something else completely.
Fact extraction is growing as a method that search engines can use to identify and understand what pages on a website are about, and to collect facts about subjects and answer questions posed by people submitting queries to a search engine.
A recent paper from Google provides a nice overview of some methods being used for fact extraction. A Google patent application published last week explores looking at titles on pages, and anchor text in related pages on the same domain to determine a subject for a document.
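As a rough illustration of combining titles with anchor text, the Python sketch below splits a page title into segments and picks the segment that most often matches the anchor text of links pointing at the page. This is an assumption about how such signals could be combined, not the process claimed in the patent application. The function names and the sample title and anchors are hypothetical.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace so near-identical strings match."""
    return " ".join(text.lower().split())


def guess_subject(title, anchor_texts):
    """Pick the title segment best supported by anchor text.

    Assumption for illustration: titles are split on " - ", " | " and ": ",
    and a segment earns one vote per anchor text that matches it exactly.
    """
    segments = [
        part.strip()
        for part in re.split(r"\s+[|\-]\s+|:\s+", title)
        if part.strip()
    ]
    if not segments:
        return None
    votes = Counter(normalize(anchor) for anchor in anchor_texts)
    # max() keeps the earliest segment when votes are tied.
    return max(segments, key=lambda segment: votes[normalize(segment)])


# Made-up title and anchors from pages on the same (hypothetical) domain.
subject = guess_subject(
    "Ada Lovelace - Biography | Example Encyclopedia",
    ["Ada Lovelace", "ada lovelace", "Biography"],
)
```

Here the segment "Ada Lovelace" wins because two anchors agree with it, while the site name gets no votes at all.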
It starts with facts imported from one website and takes them as known facts (seed facts). Then it tries to find mentions of the seed facts on other web sites. This involves retrieving relevant pages for each entity and then corroborating the facts found in them.
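The seed-and-corroborate steps described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Google's implementation: "retrieving relevant pages" is reduced to a substring check for the entity name, and a page counts as corroborating a fact when it also mentions the fact's value. The `Fact` class, the `corroborate` function, and all sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    entity: str
    attribute: str
    value: str


def corroborate(seed_facts, pages):
    """Count, for each seed fact, the pages that mention its entity and value.

    `pages` maps a page identifier to its text. Plain substring matching
    stands in for real retrieval and extraction (an assumption).
    """
    support = {}
    for fact in seed_facts:
        entity = fact.entity.lower()
        value = fact.value.lower()
        # Step 1: "retrieve" pages relevant to the entity.
        relevant = [text for text in pages.values() if entity in text.lower()]
        # Step 2: a relevant page corroborates the fact if it has the value.
        support[fact] = sum(1 for text in relevant if value in text.lower())
    return support


# Made-up seed facts and pages, for illustration only.
height = Fact("Example Tower", "height", "120 m")
city = Fact("Example Tower", "city", "Springfield")
pages = {
    "site-a": "The Example Tower is 120 m tall and opened long ago.",
    "site-b": "Visitors climb the Example Tower, 120 m of steel.",
    "site-c": "Springfield is known for its food.",
}
support = corroborate([height, city], pages)
```

In this toy data the height fact is supported by two pages, while the city fact is supported by none, since the only page mentioning Springfield never mentions the tower. A fact with little or no support would be a candidate for lower confidence.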