All Your Knowledge Bases Belong to Google

In a Google Inside Search blog post, Introducing the Knowledge Graph: Things, not strings, we’re told of a new initiative from Google to show us more information about the things we search for within the search results themselves. This is a potentially paradigm-shifting view of what a search engine does. The post tells us:

The Knowledge Graph enables you to search for things, people or places that Google knows about “landmarks, celebrities, cities, sports teams, buildings, geographical features, movies, celestial objects, works of art and more” and instantly get information that’s relevant to your query. This is a critical first step towards building the next generation of search, which taps into the collective intelligence of the web and understands the world a bit more like people do.

It’s not a surprise that Google’s been working towards reinventing itself and what it does. With an increased emphasis on social and real-time search results, Google’s been transforming itself into a near real-time monitor of activities and events in the world, rather than just a repository of links to web pages that might satisfy situational and informational needs.

With this move towards displaying information directly within search results about specific people, places, and things, Google is tapping into resources like Wikipedia, Freebase (which Google acquired when they purchased Metaweb), and other places on the Web, as well as the knowledge derived from what people are searching for, and how they might do things like refining search queries.

A recent paper from three Google Engineering team members, Extracting Unambiguous Keywords from Microposts Using Web and Query Logs Data (pdf), even provides details on how they might be working to better understand the ideas and concepts expressed in Tweets, Status Updates, and other social media, using statistical analysis performed on documents found on the Web and information gleaned from their query logs. So, social media is also potentially a source of the kind of information that could be included within a knowledge base.

A patent application that Google published in August of 2010, Identifying Query Aspects, hinted at this kind of knowledge base search and how it might be used to transform search results. I wrote about it when it was published in the post, Google and Metaweb: Named Entities and Mashup Search Results?

The inventors listed in that patent application have also written a whitepaper that describes how combining information from knowledge bases like Wikipedia and Freebase about “named entities,” or specific people and places and things, with information from search query logs could help to identify different aspects of those named entities, and help the search engine decide what to display about them. The paper is Identifying Aspects for Web-Search Queries, and it was initially published in the Journal of Artificial Intelligence Research in March of 2011.

The paper gives us an idea of why Google might have decided to move towards a knowledge bases model of search with the opening statement:

Many web-search queries serve as the beginning of an exploration of an unknown space of information, rather than looking for a specific web page. To answer such queries effectively, the search engine should attempt to organize the space of relevant information in a way that facilitates exploration.

So how do knowledge bases results help “facilitate exploration?” How does Google decide what to show about specific people or places or things? The paper tells us that it looks to at least a couple of sources to understand different “aspects” of a particular named entity:

Aspector combines two sources of information to compute aspects. We discover candidate aspects by analyzing query logs and cluster them to eliminate redundancies. We then use a mass-collaboration knowledge base (e.g., Wikipedia) to compute candidate aspects for queries that occur less frequently and to group together aspects that are likely to be “semantically” related.
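To make that description a little more concrete, here is a minimal sketch of the first step only: pulling candidate aspects for an entity out of a query log and grouping near-duplicates. It is not the paper's method. The function names, the sample log, the prefix matching, and the word-overlap (Jaccard) threshold are all assumptions made for illustration, and the knowledge base step is left out entirely.

```python
from collections import Counter


def candidate_aspects(entity, query_log):
    """Count refinements of `entity`: queries that start with the entity
    and add more words (an assumed, simplified notion of an "aspect")."""
    entity = entity.lower()
    counts = Counter()
    for query in query_log:
        q = query.lower().strip()
        if q.startswith(entity + " "):
            counts[q[len(entity):].strip()] += 1
    return counts


def jaccard(a, b):
    """Word-level Jaccard similarity between two phrases."""
    words_a, words_b = set(a.split()), set(b.split())
    return len(words_a & words_b) / len(words_a | words_b)


def cluster_aspects(counts, threshold=0.5):
    """Greedy clustering: visit aspects from most to least frequent and
    attach each one to the first cluster whose representative (its first,
    most frequent member) is similar enough; otherwise start a new cluster."""
    clusters = []
    for aspect, _ in counts.most_common():
        for cluster in clusters:
            if jaccard(aspect, cluster[0]) >= threshold:
                cluster.append(aspect)
                break
        else:
            clusters.append([aspect])
    return clusters


# Hypothetical query log, invented for this example.
sample_log = [
    "vietnam travel", "vietnam travel", "vietnam travel guide",
    "vietnam food", "vietnam food recipes", "vietnam war",
    "vietnam", "hanoi hotels",
]

print(cluster_aspects(candidate_aspects("vietnam", sample_log)))
# [['travel', 'travel guide'], ['food', 'food recipes'], ['war']]
```

Under these assumptions, "travel" and "travel guide" collapse into one aspect, which is the kind of redundancy the quoted passage says clustering is meant to remove. A real system would need something much sturdier than shared words to decide that two refinements mean the same thing.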

It really shouldn’t come as a surprise that Google would venture off into this knowledge bases direction. It’s an interest on the part of Google that could be seen as far back as Sergey Brin’s publication of the paper Extracting Patterns and Relations from the World Wide Web (pdf) in the 90s around the time that he and Lawrence Page worked to transform their Backrub search engine into Google.

Google projects like the one described in the paper WebTables: Exploring the Power of Tables on the Web (pdf) also show Google attempting to extract information from structured tables found within the unstructured pages of the Web to understand the semantic relatedness of data about people, places, and things. Google Squared was powered by analysis and understanding derived from projects like that one.

This knowledge bases approach isn’t even very novel at Google if you look at how Google has been collecting and ranking results for Google Maps for more than a couple of years. In addition to looking at databases from telecom directories for information about distinct businesses and organizations at specific locations, Google has also been crawling the Web to find mentions of those businesses where geographic location information is also included.

Google may also determine that a particular site or page is the authoritative page for a business or organization, but there are businesses and locations and landmarks that don’t even have webpages that are included in Google Maps.

Google’s knowledge bases results provide information about entities that might appear within queries, and may even anticipate and answer subsequent queries, but they are summaries that both provide value to searchers and may help lead to the further exploration of topics and ideas around aspects related to searches. If you’re a site owner, it might not hurt to be perceived as an authority for those topics.

25 thoughts on “All Your Knowledge Bases Belong to Google”

  2. Thanks for the great post. I for one am excited about the changes that this more knowledge based approach will bring to search results at Google, especially for its implications for using social media as a basis for more intuitive results. Will wait to see what implication this has for the practice of SEO at small businesses.

  3. It makes sense.

    My thoughts are that they have to go to display over influencing rankings because they simply cannot increase the weight of the signal, due to the fact that they have no real data to go off of.

    If they launch this too aggressively, if it is displayed on many SERPs, there will be a drastic decrease in revenue by many online retailers.

    They are basically spicing up wiki pages. Bing is showing you engagement, recommendations, and so forth. They still have a long way to go, but with the new design and ability to use FB, I’m much more eager to see where they end up. With Google, it’s now “OK, well how do we fix this now?” with every little update.

  4. It seems that Google is duplicating Amazon’s business model to a degree: “allowing sellers to sell goods, then picking the cream for themselves.” Google is using the data gleaned from millions of web pages and searches, then just serving up that data on the search result page, cutting out the owners of the websites.

    Makes for a good user experience for sure if you don’t need to make multiple clicks to find what you need.

  5. Hi Daniel,

    Thank you. The patent filing and the papers I wrote about don’t really tie together how social media such as Google Plus and this knowledge base approach might work together. But it’s fairly obvious that Google Plus is treating participants as if they are named entities. It is going to be interesting to see what the future has for the intersection of a knowledge base approach and Google’s social media strategies.

  6. Hi Steve.

    Thanks for the link to the Semantic Web post.

    I understand why Google used their own vocabulary in their blog post announcing their knowledge graph search results. They were writing for a mainstream audience, in a way that they could hopefully best understand the changes.

    I’d love to see Google give a nod to all of the people who have been working on the Semantic Web.

  7. Hi Brent,

    I’m not convinced that Google showing summaries of Wikipedia information, and links to more information that they believe might be related based upon things like an analysis of query logs related to specific entities, is going to take away from searches for pages that provide great information. Even Wikipedia pages.

    If a page relies upon answering a simple factual question to receive searches, then maybe it really isn’t doing as much as it could.

    This knowledge base update seems to be related to helping people who are performing exploratory searches on topics that they don’t know much about. A casual searcher who may just want to know some simple fact may not search much further than learning about someone’s date or place of birth, and so on. But the document summary that Google’s showing likely is just the first part of a deeper level of search that might encompass multiple pages.

  8. Hi Terry,

    I’m not sure that particular page would really have been too useful in this discussion since it focuses more upon providing reviews or recommendations from organizations that people might consider to be authorities.

    I pointed out the way that Google attempts to find “authority” pages for specific Google Maps listings because it’s possible that Google might do some similar analysis of associating different aspects or attributes about an entity with a specific website or web page.

  9. Hi neale,

    Google’s been providing document summaries of websites for years to deliver people to web sites that contain more information. While this knowledge base approach might focus upon specific named entities and provide facts and information about them, it also ties those document summaries to sources that people can click through to find more information.

    We can look at this as Google preempting some searches, but we can also look at it as an opportunity to be the source of the information that Google displays, and a place where searchers can learn even more.

    For many searches that are exploratory in nature, the knowledge base results shouldn’t produce fewer searches, but likely more of them, in that they provide additional topics for searchers to search upon that might not be so obvious from just a number of titles, snippets and URLs from some web sites.

  10. It makes sense to me too, I mean that’s what Google is supposed to do…they make data “findable” so, in turn, all databases will be indexed and categorized by Google. Great post, thanks for sharing!

  11. Hi Molly,

    Thanks. In many ways, I think just indexing the content found at different URLs on the Web is easier than indexing content related to different entities and concepts, and trying to organize it in a meaningful way. It’s a challenge, but it’s the kind of challenge that could result in more useful and meaningful search results. We’ll see. 🙂

  12. All your base are belong to Google. 😀 All Internet meme aside, it’s a fact that so many people depend on Google for information. For one, Google is my newspaper, my grammar checker, my spelling checker, my encyclopedia, etc. This is how millions of people around the world feel about Google, too.

  13. Hi Bree,

    I was talking with one of my neighbors last week about how he’s been increasingly looking things up on his phone to find out information on things that come up during conversations, on TV, and elsewhere. We do have a lot more information right at our fingertips than we ever have before. It’s in Google’s best interest to make search as good an experience as possible, and I think these knowledge graph results are a step in that direction. In addition to possibly answering some fairly simple questions, they are also aimed at helping you better understand the different aspects of a topic as you’re searching for them so that your searches can be more fruitful.

  14. I think that this could make SERPs more substantial, so that it would be quicker for the users to find what they are looking for.
    But like Daniel, I also think it will be interesting to see what implications it will have for practicing SEO at small businesses.
