How Google Might Top Search Results with Additional Information

How would you feel about Google showing information above search results (taken from one of those results) in response to a query, to give searchers a sense of the information provided in the results on that page?

For example, imagine performing a search at Google for the word “burns” and getting back a set of search results with a paragraph or two above the results that provide information on how to treat burns, taken from a page in the search results. Does that sound like a good idea?

Choosing What Information to Display

How would Google decide to display information about burn treatment instead of the medical condition itself, or information about someone with the name Burns (such as the comedian George Burns)?

Many of the queries that people perform searches with are ambiguous, and might trigger pages that cover a wide range of topics. A search for “java” could bring back search results about a programming language, an island, or a type of coffee. A search for “jaguar” could result in listings of pages about cars or about animals.

If Google looked at a classification associated with the query itself to decide what kind of information to display, it might have trouble making a decision. If it classified information related to search results for a query, and used those classifications to decide what information to show, the search engine might get a better sense of what may be appropriate to display at the top of search results.

Before the search engine decided what kind of information to display above the search results, it might classify a certain number of the top search results to see which classifications appear the most frequently. It might do that by looking at different parts of those results, such as their URLs, titles, snippets associated with them, and labels that may have been attached to those pages.
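To make that idea a little more concrete, here is a minimal sketch in Python of classifying each top result from its URL, title, and snippet, and then counting which classification shows up most often. The keyword lists, the field names, and the function names are all my own assumptions for illustration; the patent application doesn't describe a specific classifier.

```python
import re
from collections import Counter

# Hypothetical keyword lists; the patent application does not publish a lexicon.
CATEGORY_KEYWORDS = {
    "medical": {"burn", "burns", "treatment", "first", "aid", "skin"},
    "people": {"george", "comedian", "actor", "biography"},
}


def classify_result(result):
    """Return the category whose keywords best match a result's URL, title, and snippet."""
    text = " ".join([result["url"], result["title"], result["snippet"]]).lower()
    words = set(re.findall(r"[a-z]+", text))
    scores = {cat: len(words & kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # If no keyword matched at all, leave the result unclassified.
    return best if scores[best] > 0 else None


def most_frequent_category(results, top_n=10):
    """Classify the top_n results and return the most common category, if any."""
    counts = Counter()
    for result in results[:top_n]:
        category = classify_result(result)
        if category is not None:
            counts[category] += 1
    return counts.most_common(1)[0][0] if counts else None


# Made-up results for the query "burns".
sample_results = [
    {"url": "http://example.com/first-aid/burns",
     "title": "How to Treat Burns",
     "snippet": "First aid treatment for minor burns."},
    {"url": "http://example.org/skin/burn-care",
     "title": "Burn Care",
     "snippet": "Caring for skin after a burn."},
    {"url": "http://example.net/george-burns",
     "title": "George Burns",
     "snippet": "Biography of the comedian George Burns."},
]

print(most_frequent_category(sample_results))
```

In this toy example, two of the three results look medical, so a paragraph about treating burns would be the likely choice over a biography of George Burns.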

A newly published patent application from Google explores the use of this classification information to decide the topic for information that might be shown at the top of results for a specific query:

Methods and Systems for Classifying Search Results to Determine Page Elements
Invented by Tania Bedrax-Weiss, Ramanathan Guha, Patrick Riley, and Corin Anderson
Assigned to Google Inc.
US Patent Application 20090100036
Published April 16, 2009
Filed: October 11, 2007

Abstract

This invention relates to determining page elements to display in response to a search. A method embodiment of this invention determines a page element based on a search result. The method includes:

(1) determining a set of result classifications based on the search result, wherein each result classification includes a result category and a result score; and

(2) determining the page element based on the set of result classifications.

In this way, a classification is determined based on a search result and page elements are generated based on the classification. By using the search result, as opposed to just the query, page elements are generated that corresponds to a predominant interpretation of the user’s query within the search results. As result, the page elements may, in most cases, accurately reflect the user’s intent.
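The two steps in the abstract can be sketched roughly as follows. Each result classification carries a category and a score, as the abstract says; summing the scores per category and looking up a page element in a table are my own assumptions, since the application doesn't spell out how the page element is chosen. The `PAGE_ELEMENTS` mapping and the function name are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ResultClassification:
    """A result category paired with a result score, as described in the abstract."""
    category: str
    score: float


# Hypothetical mapping from a category to the page element shown above results.
PAGE_ELEMENTS = {
    "medical": "Paragraph on how to treat burns",
    "people": "Short biography of George Burns",
}


def choose_page_element(classifications):
    """Sum scores per category (an assumed rule) and return the matching page element."""
    totals = defaultdict(float)
    for classification in classifications:
        totals[classification.category] += classification.score
    if not totals:
        return None
    top_category = max(totals, key=totals.get)
    return PAGE_ELEMENTS.get(top_category)


sample_classifications = [
    ResultClassification("medical", 0.9),
    ResultClassification("people", 0.6),
    ResultClassification("medical", 0.5),
]

print(choose_page_element(sample_classifications))
```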

Why Provide Extra Information at the Top of Search Results?

The following line from the abstract also shows up in the description from the patent application a number of times, and presents one of the major assumptions behind the method presented in the patent application:

In this way, a classification is determined based on a search result, and a page element is generated based on the classification. By using the search result, as opposed to just the query, page elements are generated that correspond to the predominant interpretation of the user’s query within the search results. As result, the page elements may, in most cases, accurately correspond to the user’s intent.

If you’re looking for information about George Burns, and you type in “burns” as a query, the first thing you might see is a paragraph about treating burns. That would probably be a good indication that most of the search results that you would see in response to your query are related to a medical condition, and to the treatment of that condition. If the intent behind your search was medical treatment, the paragraph may answer your question, and it might provide you with some confidence that a number of the search results may also be helpful. If you’re looking for George Burns, that paragraph might be a good sign that you would be better off making your query more specific.

Classifications Based upon Different Elements of Documents in Search Results

The patent application provides some information about how it might look at different elements such as page titles and snippets and URLs, and classify those to learn about the topics that are covered in search results, but it doesn’t go into much depth on how it might analyze those elements.

A whitepaper co-written by Google Researcher Monika Henzinger for the WWW 2009 conference in Madrid titled “Purely URL-based Topic Classification” describes how classifications might be determined from the URLs of pages, but unfortunately that paper doesn’t seem available this morning. The URL for the paper was: http://www.sheridanprinting.com/09-www-cd35mxg/docs/p1109.pdf

Part of the abstract from that paper provides a number of reasons why a search engine might want to try to understand the topic of a page while only looking at its URL:

Usually, web pages are classified using their content, but a URL-only classifier is preferable,

(i) when speed is crucial,
(ii) to enable content filtering before an (objectionable) web page is downloaded,
(iii) when a page’s content is hidden in images,
(iv) to annotate hyperlinks in a personalized web browser, without fetching the target page, and
(v) when a focused crawler wants to infer the topic of a target page before devoting bandwidth to download it.
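As a rough illustration of what classifying a page from nothing but its URL could look like, here is a small Python sketch that splits a URL into word-like tokens and compares them against per-topic vocabularies. This is not the method from the paper, which I can't quote from here; the vocabularies, topic names, and function names are assumptions made up for the example.

```python
import re
from urllib.parse import urlparse

# Hypothetical per-topic vocabularies; a real classifier would learn these from data.
TOPIC_TOKENS = {
    "health": {"health", "medical", "burns", "firstaid", "clinic"},
    "programming": {"java", "code", "docs", "api", "tutorial"},
    "travel": {"travel", "island", "hotels", "indonesia"},
}


def url_tokens(url):
    """Split the host and path of a URL into lowercase alphabetic tokens."""
    parsed = urlparse(url)
    raw = f"{parsed.netloc} {parsed.path}".lower()
    return [token for token in re.split(r"[^a-z]+", raw) if token]


def classify_url(url):
    """Return the topic sharing the most tokens with the URL, or None if nothing matches."""
    tokens = set(url_tokens(url))
    overlaps = {topic: len(tokens & vocab) for topic, vocab in TOPIC_TOKENS.items()}
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else None


print(classify_url("http://www.example.com/travel/java-island.html"))
print(classify_url("http://docs.example.com/java/api/"))
```

Note that the page itself is never fetched, which is the point of reasons (i), (ii), and (v) above. The ambiguous token "java" appears in both URLs, and it is the surrounding tokens that tip the classification one way or the other.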

Of course, attempting to classify results based upon the titles of pages could be helpful, especially if the page authors used titles that were descriptive of the content of those pages. Unfortunately, that doesn’t happen all of the time.

Taking classification information from a snippet might also be helpful, since the search engine decides upon a snippet to show for a page that is relevant to the query used to find the page. Pages that show up in search results in response to a query should be relevant in some way to that query, and a snippet might hold some useful information that can be used to classify search results.

The patent application also discusses the use of “labels” associated with pages that appear in search results, but doesn’t tell us much about where these labels come from. Are they labels that the search engine has developed that relate to the top queries or the top phrases that the search engine might believe pages are related to? Are they labels that people annotating a page through something like SearchWiki have added? Are they labels taken from advertising keywords that people using something like AdWords might have chosen to use to lead viewers to the page? Do these labels come from somewhere else? We can’t tell with any certainty from the patent application.

The patent application also tells us that it might look at the labels used for the documents that appear in search results, and determine which ones appear the most frequently to determine an overall classification based upon those labels.

What other information might be used to determine a classification for a set of search results?

We are also told that the search engine might give extra weight to classifications from the URLs and snippets and titles and labels that show up in higher ranked search results.
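Here is one way that kind of rank weighting might work, sketched in Python. The patent application doesn't give a formula, so the 1/rank weight used here is purely an assumption, as are the function names and the sample labels.

```python
from collections import defaultdict


def weighted_label_scores(ranked_labels):
    """Score labels so that higher ranked results count for more.

    ranked_labels is a list of label lists, where index 0 belongs to the
    top search result. The 1/rank weight is an assumed choice, not one
    taken from the patent application.
    """
    scores = defaultdict(float)
    for rank, labels in enumerate(ranked_labels, start=1):
        weight = 1.0 / rank
        for label in set(labels):  # count each label once per result
            scores[label] += weight
    return dict(scores)


def top_label(ranked_labels):
    """Return the label with the highest weighted score, or None for no results."""
    scores = weighted_label_scores(ranked_labels)
    return max(scores, key=scores.get) if scores else None


# Made-up labels for four results: two "medical" and two "people".
sample_ranked_labels = [["medical"], ["people"], ["people"], ["medical"]]

print(weighted_label_scores(sample_ranked_labels))
print(top_label(sample_ranked_labels))
```

A plain count would call this a tie at two results each. With the extra weight for higher ranked results, "medical" comes out ahead because it is attached to the first result.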

Deciding upon Page Elements to Use

Classification information from the different parts of search results (URLs, titles, snippets, labels, and histograms of labels) would be combined to determine an overall classification. We’re told that the overall classification would then be used to decide upon the information (a page element) that Google might show at the top of search results.

But we aren’t shown much about the actual method that the search engine would use to decide upon a specific classification based upon those result classifications, or how it would use that classification to determine what information to show at the top of search results. It’s possible that may be the topic of another patent application that hasn’t been published yet.

Conclusion

Would you find it useful if Google started showing some information at the top of search results from one of the pages of the results, based upon a classification that the search engine determined to be the most relevant to the results that showed up for a query?

Would showing that information be helpful to searchers? Would it be helpful or harmful to site owners whose pages might show up in those search results?

18 thoughts on “How Google Might Top Search Results with Additional Information”

  1. Hi Nick,

    Thanks for your kind words, and for taking the questions that I asked at the end of the post, and coming up with some thoughtful reasons why this approach might be a good idea over at your blog. I like your term “SERP summaries” for the page extract that they might show at the top of search results. I guess we wait and see if SERP summaries are something that Google adopts.

  2. While I understand search engines wanting to get an advantage over competitors and add value to the internet search experience; all of these search results ‘extras’ that Google search, Yahoo search and other search engines are starting to add to their SERPS are starting to look like overkill.

    At some point all the search engine errata is starting to become information overload for the average search engine user.

  3. I can remember a class in elementary school where we were taught how to use (search) for information in a library. Maybe we now need classes to teach users how to search for data on the web.

    Searching a URL for data does not always prove efficient. You can set up a url for a post on a blog to include /category/ and then /title/, and hopefully the title will relate to the topic of the post. However, it is easier to not change the settings where you have urls with indicators like a page number. You could have a meta description in your head space, but I have seen Google ignore those. The idea of adding extra info into a result can be truly beneficial, but can it reflect the content accurately would be the concern. When perusing the results from my Google Alerts, I really have to wonder how certain posts were chosen for the keyword inputted. It might be better for the engines to work on delivering results from what the assumed user intent is, and for users to learn how to search better.

  4. Hi People Finder,

    I think that is a very real risk that they are taking in adding new features to search results – too much information, too much clutter, too many potential distractions. Looking at the success behind the simplicity of Google’s front page, there is some value in letting searchers focus on what they are looking for. At what point does it become too much?

  5. Hi Frank,

    You have me wondering if they now have classes in elementary school on how to search for information on the Web.

    I agree with you that classifications from URLs and titles and meta descriptions may not accurately reflect the topic of a page for a number of reasons, and that it’s possible that a search summary may be misleading. I’ve seen some confusing results from Google Alerts as well.

    I do think that suggested queries that the search engines sometimes show can be helpful, but I am wondering if this kind of summary at the top of search results would be helpful or misleading.

  6. While I think that getting page titles and a meta description is usually sufficient information, I think a little extra information wouldn’t go amiss. Very often you see a list of links on a SERP and you see title and meta tags spammed the hell out of meaning it’s hard to actually tell what type of site it is. Perhaps info such as a screenshot, does this site have an rss feed or blog, is it e-commerce etc. would provide more info.

    But remember a SERP should only be a SERP, not a web directory.

  7. Hi Gabriel,

    I think you’ve come up with an interesting approach. I’ll have to spend some time with it before I can give you any serious feedback, but I see more than a few things that I like about it.

  8. Hi Adam,

    But remember a SERP should only be a SERP, not a web directory.

    Good point. :)

    I do wish that people would spend more time with their titles and meta descriptions, and content, and recognize that those will be displayed outside of the context of their pages in places like search results, and that it’s more likely that someone will follow them if they are interesting and engaging.

    The evolution of search results is pretty interesting, with the addition of things like sitelinks and single line sitelinks, and some onebox results with stock tickers and contact information and other kinds of information for some search results. How much is too much, though?

  9. I’d rather deal with a lot of unrelevant results than let google take away my privacy. It’s not like my opinion matters to them but oh well.

  10. Hi czerwona sciana,

    I agree with you. I’d rather see results that don’t quite match “my intent” than have a search engine track my every step on the Web. I suspect that there are many others who feel the same way.

  11. Hi Pozycjonowanie,

    Search results pages are being transformed from a simple set of descriptions of web pages to portal like pages, filled with much more than links to pages. I think that sometimes it’s helpful, and other times it is too much.

  12. while i understand why they want to do this, i yet again see Google attempting to trample on people’s rights. I don’t mean to sound so harsh but Google doesn’t seem to walk that line very well between offering a service and people’s privacy. I’d rather input a more specific search term than have google track me.

  13. Hi Craig,

    The patent that I’ve written about in this post doesn’t involve tracking people in any way, or infringing upon anyone’s privacy.

    It does involve Google possibly putting some additional information at the tops of pages when it seems like the search results for a particular query might focus upon a certain type of information.

    Where this additional kind of information might be especially useful is when the topic is something that a searcher may not know too much about, and is exploring a topic to learn more. The additional information can help provide them with ideas for other things to explore and search for, and help them learn more about that topic.
