Search engines have transformed the way that we locate information and learn about the world around us. When we type a term into a search box, we are presented with pages of search results that bring a wealth of information to our fingertips.
The results that we see often include more than just a list of web pages. A search for [baseball] at Google provides links to web pages, videos, news articles, book results, and related search queries.
The top result I received was a link to the Major League Baseball (MLB) site, with a list of sitelinks to eight additional pages related to that domain. Interestingly, four of those sitelinks are to different subdomains on the MLB site, to team pages for the Boston Red Sox, the New York Yankees, the Los Angeles Dodgers, and the Baltimore Orioles.
There may be many pages that show up in search results relevant to a query that we perform. In my search for [baseball], I was shown “Results 1 – 10 of about 197,000,000 for baseball.” I’m not going to look at all 197 million pages, and chances are that I might not make it past the first page of the search results.
The top results for a query term are the ones that most people will visit, or they might change the query terms that they’ve used to something broader or more specific if those top results don’t look encouraging.
Search results pages are supposed to be listed in an order that places the most relevant and important results at or near the top of the list. There are times when many of the most relevant pages are from the same domain, so you could possibly have a result from that domain on the first page of search results, and another page from that domain on the fifth page.
Or, you could see results from the same domain filling up a number of spots on the front page of search results.
When multiple search results from the same domain are relevant to a query, search engines have been showing the most relevant and/or important result as a normal result, and then showing another page from that domain under it as an indented result, possibly with a link to “more results” from that same domain under the indented result. We see this at all of the major commercial search engines.
This indentation of results is helpful for searchers because it provides them with a chance to see that a site might contain multiple relevant pages that may provide information about their query. It is also attractive to site owners, who may like that their pages are shown more than once in top search results in response to a certain query.
A recent patent application published from Microsoft, Domain Collapsing of Search Results (US Patent Application 20080294602), provides some of the technical details behind how search results are indented.
The process itself probably wouldn’t surprise most people who pay a lot of attention to the way that search results are presented to searchers.
Quite simply, a top number of search results are returned from a search index based upon a searcher’s query, and results from the same domain are associated and clustered together so that two or more search results might be presented as a single cluster of search results rather than presented individually. An option to see more search results from the same domain may be provided to the searcher.
A domain is identified under this process by looking at the structure of the URLs for pages. When a URL ends with a country-specific tag, the domain would include the last three words of the URL before the first forward slash (/). So, the domain of the URL www.msn.co.in/ is “msn.co.in”. When the URL does not end with a country-specific tag, the domain would include the last two words of the URL before the first forward slash. So, the domain of the URL www.msn.com is “msn.com”.
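The domain rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as summarized here, not code from the patent filing. The function name collapse_domain and the small COUNTRY_TAGS set are my own assumptions, and a real system would need a much fuller list of country-specific endings.

```python
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny set of country-specific endings.
# The patent filing does not spell out a list; this is an assumption.
COUNTRY_TAGS = {"in", "uk", "au", "jp", "br"}


def collapse_domain(url: str) -> str:
    """Return the domain used for grouping, following the rule above."""
    # urlsplit only recognizes the host when a scheme is present,
    # so add one for bare URLs such as "www.msn.com".
    if "://" not in url:
        url = "http://" + url
    host = urlsplit(url).hostname or ""
    labels = host.lower().split(".")
    # Keep three trailing "words" for country-specific endings, otherwise two.
    keep = 3 if labels[-1] in COUNTRY_TAGS else 2
    return ".".join(labels[-keep:])


print(collapse_domain("www.msn.co.in/"))  # msn.co.in
print(collapse_domain("www.msn.com"))     # msn.com
```

One thing the sketch makes visible: taken literally, the rule would treat a URL such as www.example.in as its own three-word domain, so an actual implementation would probably need more care than the summary suggests.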
So, under this approach, when more than one URL from the same domain is in a top number of results for a specific query, those URLs may be clustered together, with the main result, an indented result or results, and a link to more results from that domain.
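Here is a rough sketch of how that clustering step might look, assuming each result already carries the domain it was assigned. The function name collapse_results, the top_n and max_indented parameters, and the example.com and example.org URLs are all made up for illustration. The patent filing doesn’t commit to specific numbers.

```python
def collapse_results(ranked, top_n=10, max_indented=1):
    """Cluster the top_n ranked results by domain.

    `ranked` is a best-first list of (url, domain) pairs. Each cluster
    takes the position of its best-ranked page.
    """
    clusters = {}  # dicts keep insertion order, so rank order is preserved
    for url, domain in ranked[:top_n]:
        clusters.setdefault(domain, []).append(url)

    page = []
    for domain, urls in clusters.items():
        page.append({
            "domain": domain,
            "main": urls[0],                               # normal result
            "indented": urls[1:1 + max_indented],          # shown under it
            "more_results": len(urls) > 1 + max_indented,  # "more results" link
        })
    return page


# Hypothetical ranked results, best first.
ranked = [
    ("www.example.com/a", "example.com"),
    ("www.example.org/x", "example.org"),
    ("www.example.com/b", "example.com"),
    ("www.example.com/c", "example.com"),
]
for cluster in collapse_results(ranked):
    print(cluster)
```

In this example the three example.com pages end up as one main result, one indented result, and a “more results” link, while the example.org page stays a normal result on its own.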
Domain Collapsing and Page Titles
Most people who have used one of the major search engines for a while have probably seen indented search results at some point in time, and the process above probably comes as no surprise. I don’t know if it’s novel enough to be the subject of a patent filing, but there was one idea presented that was interesting and might be new to most people who have seen indented search results before.
Each page of a site should ideally have a unique title that describes the content of the page that it appears upon. Unfortunately, some pages of a domain share the same title.
When more than one page from the same domain is determined to be amongst the most relevant and important for the same query term, and they share the same title, an indented result might not be shown to searchers.
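A small sketch of that title check might look like the following. The names pages_to_indent and normalize_title are hypothetical, and comparing titles while ignoring case and extra spaces is my assumption, since the filing doesn’t say how titles would be compared.

```python
def normalize_title(title):
    # Assumption: compare titles ignoring case and extra whitespace.
    return " ".join(title.lower().split())


def pages_to_indent(main, others):
    """Pick same-domain pages that could be shown indented under `main`.

    `main` and each item of `others` are dicts with "url" and "title" keys.
    A page is skipped when its title matches one that is already shown.
    """
    shown_titles = {normalize_title(main["title"])}
    indented = []
    for page in others:
        key = normalize_title(page["title"])
        if key in shown_titles:
            continue  # same title as a result already shown, so no indent
        shown_titles.add(key)
        indented.append(page)
    return indented
```

With this approach, a second page titled “Example Store” would be dropped from the indented slot under a main result with that same title, while a page titled “Example Store: Shipping” would still qualify.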
Unanswered Questions Involving Domain Collapsing
This patent filing doesn’t address what happens when there are different sites that share a domain, like at wordpress.com.
We are told that a search engine can turn on or turn off domain collapsing, and can possibly enable searchers to turn the feature on or off too. But, can domain collapsing be turned on for some kinds of queries and not others?
There is no discussion of site links – which are links that might appear below the top search result for a query, and which are intended to be navigational shortcuts to pages that are related to that top listing.
Those site link results appear to work differently than this domain collapsing process in that collapsed results pages are pages that show up as relevant and important to the query searched for, while site links are links that might be final destination pages when someone is performing a navigational query.
I remember going to the library and finding books by looking through big books of categories, searching at dumb terminals or microfilm indexes, and walking through bookshelves and scanning the titles of books.
The web brings a library of information to our fingertips, and search engines provide indexes of information which can be searched much more quickly than the shelves of a library and can potentially deliver much more information to us.
The way that search results are presented to us determines what we end up finding when we are looking for information, or when we are attempting to perform a task on the Web. It’s worth paying attention to how a search engine might cluster together results from the same domain or how it might provide results of different types such as news or web results or videos or books.
While indented results might not be something new to most site owners or searchers, one of the biggest takeaways from this patent application for site owners might be to make sure that each page of their site has unique page titles, so that if a search engine determines that more than one page of the site might be relevant for a query, it will present the pages together as a clustered result, and show searchers that the site contains multiple pages that are relevant and important for that query term.
15 thoughts on “Domain Collapsing, Indented Pages, and Search Results”
As far as I know, Google can also read the content of e-forms, radio buttons, cascade lists and frames, but do you happen to know if there is any correlation between the text on HTML buttons and on-site SEO (a bigger importance relative to regular text or links)?
Thanks in advance,
Your comment ventures pretty far away from the topic of this post. 🙂
I really haven’t seen anything from any of the search engines that might describe the relevance value of text appearing on form fields, radio buttons, or select lists from a straight forward on-page SEO analysis.
However, I have seen some discussion of the use of that text in whitepapers from Google when it comes to their efforts to index content from the deep web. For instance, check out the 2008 paper Google’s Deep-Web Crawl (pdf), where it describes how some form text from select lists and radio buttons can help the search engine understand data input types that might help it find pages containing information behind a form.
I have read it in a book I recently bought. If you are interested we can make a short exchange of ideas and strategies for online marketing. I also have a couple of ideas.
A great post, There is a lot to know about SERPS and it can be a full time job just keeping up with the latest tips. I am glad that there are great bloggers/researchers like you that can help us understand and learn and be more effective.
I’d enjoy talking with you more about SEO and online marketing.
Thank you very much for your kind words. Search is evolving, and it is very interesting, as well as time consuming, keeping an eye out for the changes.
Thank you for your kind words.
Domain collapsing means that pages in search results for a specific query (maybe just within the top 100 or so results) from your site’s domain “phone.com” may be collapsed together. The first page from phone.com would show up as a regular search result, and the second page from phone.com would be shown immediately below it as an indented result. If there are other pages from the site that are relevant for that query, a link might appear below the indented result saying something like “click here to see more relevant pages from phone.com.”
A normal link or image-based advertisement on your pages to Verizon.com might not play much of a role in what you see under domain collapsing.
If you create a page on your domain about Verizon, that page might show up as relevant to a search query for [verizon]. If you have more than one page about verizon on your site, and they are very relevant for a search for the query [verizon], pages from your phone.com domain might be collapsed together so that you have indented results showing up in the search results. But search results from phone.com and verizon.com would not be collapsed together, since they are different domains.
Sometimes domains are purchased and a redirect is set up on those domains to point to a different domain. An example would be “www.mickeymouse.com”, which is redirected to http://disney.go.com/mickey/
Since those are different domains, even though going to each would bring you to the same place, they likely wouldn’t be collapsed together under the domain collapsing process described in this patent. Chances are that one of the domains, possibly the domain being redirected (depending upon the kind of redirect being used – a temporary one or a permanent one) might end up being filtered or dropped out of search results at some point.
An example is the major league baseball page for the New York Yankees. A search for [new york yankees] in Google shows “yankees.mlb.com/” as the only URL for the site. That URL redirects (using a temporary 302) to http://newyork.yankees.mlb.com/index.jsp?c_id=nyy
The search results from these different URLs weren’t collapsed, but one version of the URL isn’t being shown in search results, and is likely filtered out of the results. That’s different than domain collapsing, but the reason for it is similar – Google doesn’t want to show a bunch of relevant search results that all end up pointing to the same page, or substantially similar pages. With domain collapsing, results from the same domain are limited, and clustered together. With URLs at different domains that redirect to the same page, results are filtered out of search results.
What a fascinating and informative site you have! My question/comment comes from an advertising perspective, though I’m newly consumed by the ongoing science of SEO.
If I owned a generic domain like Phone.com and was contacted by Verizon to advertise on my site (phone.com), would I have to use Verizon.com as one of my “unique site titles” to have it show up as a search result with regards to Domain Collapsing? Also, could I keep the searcher on my phone.com, or would they be directed away to Verizon.com?
Thanks in advance
Thank you for your prompt response, Can I possibly pm you with a few other questions I have regarding the relationship of SEO and advertising?
Sure. You can send me a message on my contact page at:
My experience with Verizon from a promotion angle was that I had an email from verizon telling me to remove my advertising with any mention of verizon. I had to email google, youtube and digg not to mention other sites to get them to manually delete the listings in the serps. This was a real chore.
I would suggest that if you decide to use any intellectual property from any sizable corporation, you should contact them with your intentions first.
Hope this helps.
I was only using Verizon as an example- but, I do wish I had Phone.com in my arsenal.
Thanks first of all for this nice, informative article. In fact, I have thought about this procedure quite often, as there are a number of practical implications.
* Grouping certainly makes sense from the visitor’s point of view (at least in my opinion)
* Anyhow, my impression of the results is very mixed. In contrast to the “site links”, which usually are quite good in their relevance, the combinations coming out of collapsing domains are much less precise. Makes sense, as most likely they are just created “on the fly”. But the results are – as said – not convincing, as usually only a few pages are extracted, and those do not give the best picture of the site
* No systematic approach. I still do not understand when Google collapses sites – and when not. Some keywords show a lot of collapsing, others not
* Hidden collapsing. Indeed I’m pretty sure that Google is collapsing often, but doesn’t show it. Just an example: My site with the keyword “Haarausfall” is ranking exactly at 8 in this moment. But with my programme which is watching the position it’s 13. For another site, where collapsing is visible, both numbers are identical, but INCLUDING the collapsed sites
* Last point: Collapsing definitely increases the impact of the site. Having 2 or 3 entries grouped gives a pretty strong facing …
You’re welcome, Alopezie
Thanks for asking some tough questions, and raising some interesting points.
The purpose behind Site Links and Domain Collapsing are very different.
Consider that when a search engine shows site links to searchers, the search engines are likely viewing the search query as if it were a navigational type search, as opposed to an informational or transactional search. The additional site links under the first result are an attempt by the search engine to provide additional navigation to a final destination page, to help a searcher find the information (or transactional activity) that they are attempting to locate with their initial query. Chances are that site links may present information about a site much better than results which are collapsed because they are from the same domain, because the search engine has determined that the query was navigational in nature.
The purpose behind domain collapsing is much simpler – trying to present relevant search results for a specific query near each other, if the pages are from the same domain. Domain collapsing involves none of the analysis that takes place with site links, which is aimed at trying to guess which pages are the best to show to searchers.
Domain collapsing doesn’t happen for every query term, and results may vary based upon your search settings, so that you might see collapsed results if you have set the search engine to show you only 10 results at a time, and you might not if you have your preferences set to see 50 or 100 results at a time. It’s also likely that if you are using an Application Programming Interface (API) from a search engine with a program to retrieve search results, the results it retrieves might not be collapsed based upon a domain.
It may be possible that for some query terms, there may be a desire on the part of search engineers to focus upon other considerations than showing results from the same domain collapsed together, such as results that contain the freshest information, or evidence some kind of commercial intent, or providing a diversity of results from different categories or different kinds of documents (news, blogs, images, videos, web pages). There are many possible reasons that a search engine might reorder the search results that are shown to searchers, and some of those might be considered more important than showing pages from the same domain together in the search results.
Thanks very much William.
This is a very informative thread, and I am very pleased and glad that people like you are doing so much research on these types of things, which can help a lot of people like me who don’t know why those things happen. Like indented collapsing. I never knew why two similar pages from the same site show up right after each other. Your report is exhaustive and very informative. I really appreciate this kind of information.
I have one little concern about this. The first page for a query shows about 10 of the top-ranked pages from a search engine, and if, say, 3 different sites each have an indented result on the first page, they would be occupying six spots. So the rest of the sites may be relegated to page two.
Just my thought
That’s a very good point. I’m not sure that I’ve seen too many search results where there were that many indented pages from multiple domains, but it sounds like it is possible. I’m wondering if the search engines might purposefully avoid showing too many indented results in the top ten. It’s possible that they may.