Son of SEO is Undead (Google Caffeine and New Product Refinements)

Some changes to search and search engines are easy to spot and understand. Others are highly visible, but their actual impact is less transparent. Still others are somewhat mystifying, with little information about them published by any of the search engines. And some aren’t very visible at all, while the search engines stay fairly silent about them.

Last week, Google launched a new feature on search results pages called Google Instant Previews. If you click on the magnifying glass that shows up next to a search result, you’ll see a preview of the web page in the right column of the results.

In some results, where snippets are taken from the content on the page (as opposed to those from meta descriptions), the area where that snippet text appears on the page is highlighted, and magnified for viewers.

A number of reactions online claim that these previews will harm SEO and Web Design, or will transform both in ways that are burdensome to SEOs and designers. The opposite is more likely true: sites that follow good SEO and Design practices stand a good chance of getting more click-throughs because of Google’s Instant Previews.

Google’s Amit Singhal mentioned in a Businessweek interview last year that Google makes updates to search fairly frequently:

A: We launch hundreds of changes every year. Some are small, such as getting an acronym right, some are as big as Universal Search.

Some Google updates stand out significantly, especially ones that are more visible and change the look and feel of Google’s search results. A few examples include Google’s Instant Previews, Google Instant, and Google’s Universal Search.

Other updates may have less of a visual aspect, like Google’s recent addition of query refinements during product web searches. These refinements provide links to search results on “brands, stores, and types.” Some critiques of these changes state that Google is showing a preference for large brands over small web businesses.

The reality of the change may be more revolutionary than that. It’s possible that Google is relying upon outside sources of information to power those query refinements, like the information found in Google Custom Search Engines, and the labels and context files that may be created with those.

Other updates from Google generate a lot of noise on the Web, but little in the way of actual visual impact. An example of one of those is Google Caffeine, which has been the target of a lot of speculation about ranking changes for web sites, even though Google representatives have stressed that it’s an infrastructure change in how Google operates rather than a ranking change.

A few other algorithm changes are made quietly, and fortunately sometimes come to light in community discussions. Examples include the Florida Google update from November 2003, which saw many ecommerce sites lose considerable search rankings just before the busy holiday season, and the May Day update from earlier this year, which impacted long tail searches on many other ecommerce sites.

Google didn’t provide much information about the Florida update when it happened, but did give us a little information about the May Day change, which was supposed to provide “higher quality” results for long tail searches.

SEOs are often called upon by their clients and others to voice an opinion on how changes like those might impact search. An answer can sometimes be hard to provide, especially since search engines make so many changes and share so little about the methods and processes behind them.

It can be hard work to pull together enough hints from different sources to get an idea of what might actually be behind some changes. A couple of examples are Google Caffeine and the new Brand, Store, and Type query refinements that Google is now sometimes showing when it thinks you are performing a product search.

Google Caffeine

The major changes behind Google Caffeine had nothing to do with the way that pages are ranked in Google. Instead, Google Caffeine was a change to the way that Google stores, retrieves, and updates information within its file system.

In a nutshell, those changes involved:

  1. Reducing the default size of the blocks within which files are stored in the Google File System, from 64 MB to 1 MB, which enables hard drives to hold considerably more information when they contain small files.
  2. Allowing metadata on a master server to be distributed to multiple master servers, so that searches can more easily be split into parts.
  3. Placing information about specific pages on multiple smaller files instead of one larger file, and only updating the parts that change for a page instead of all of the information about that page.
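To make the second item a little more concrete, here is a minimal sketch of one simple way that file metadata could be spread across several master servers. It is an illustration only, not Google’s implementation: the hash-based assignment and the names master_for and partition_metadata are my own assumptions.

```python
import hashlib


def master_for(path: str, num_masters: int) -> int:
    """Pick a master index for a file path using a stable hash.

    Hypothetical scheme for illustration; the real system may assign
    metadata to masters in an entirely different way.
    """
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_masters


def partition_metadata(paths, num_masters):
    """Group file paths by the master that would hold their metadata."""
    shards = {i: [] for i in range(num_masters)}
    for path in paths:
        shards[master_for(path, num_masters)].append(path)
    return shards


if __name__ == "__main__":
    example_paths = ["/index/page-1", "/index/page-2", "/index/page-3"]
    for master, held in partition_metadata(example_paths, 3).items():
        print(master, held)
```

With the metadata divided this way, a lookup only needs to contact the one master responsible for a given file, so several lookups can proceed on different masters at the same time.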

Google Caffeine allowed Google to get considerably more storage out of the same hard drives. It enabled the search engine to search for information about a query faster by splitting up parts of that search.

It also enabled the search engine to update its index more quickly by only having to make changes to relevant files – for instance, if a new link was found to be pointing to a page, but nothing else about the page had changed, only a file about the links pointing to the page would be updated, and not the rest of the information about the page.
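The idea of updating only the part of a page’s record that changed can be sketched in a few lines. This is a simplified illustration under my own assumptions, not a description of Google’s code: the PageRecord class, the part names ("content", "links"), and the version counters are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PageRecord:
    """Information about one page, stored as separate named parts."""

    parts: dict = field(default_factory=dict)     # part name -> stored data
    versions: dict = field(default_factory=dict)  # part name -> number of writes

    def update(self, new_parts: dict) -> list:
        """Rewrite only the parts whose data differs from what is stored.

        Returns the names of the parts that were actually rewritten.
        """
        rewritten = []
        for name, data in new_parts.items():
            if self.parts.get(name) != data:
                self.parts[name] = data
                self.versions[name] = self.versions.get(name, 0) + 1
                rewritten.append(name)
        return rewritten


if __name__ == "__main__":
    record = PageRecord()
    record.update({"content": "page text", "links": ["site-a"]})
    # A new inbound link is found; the page content itself is unchanged.
    print(record.update({"content": "page text", "links": ["site-a", "site-b"]}))
```

In the second call only the "links" part is rewritten, which mirrors the example above of a newly discovered link leaving the rest of the page’s information untouched.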

Google was granted a patent a couple of weeks ago that actually gives us a pretty good picture of how Google’s File System worked before the Google Caffeine changes. The patent, Maintaining data in a file system, describes how a master server is used to store information about the location of files on other servers connected to it.

The clearest explanation of the changes to the Google File System that Caffeine delivered is probably found in the ACM Queue interview, Case Study GFS: Evolution on Fast-forward, a discussion between Kirk McKusick and Sean Quinlan about the origin and evolution of the Google File System.

One of the offshoots from Google Caffeine is that the search engine is supposed to be able to update information about pages on the Web faster.

Document treadmilling system and method for updating documents in a document repository and recovering storage space from invalidated documents provides a look at how files might be updated incrementally under a system like Google Caffeine.

If someone asks you about how the rankings of web pages have changed because of Google Caffeine, the easy answer is to say that the update didn’t directly influence how pages are ranked in the search engine.

The indirect impact, though, is that it enabled the search engine to become more efficient and index pages faster, and freed up some processing power that Google could use to try other things that might influence the ways that pages are ranked.

Brand, Store, and Type Query Refinements

If you have searched for a product name on Google recently, you may have started to notice some query refinement suggestions at the top of the search results.

These can include suggestions for “brands” that might be related to the search, ecommerce “stores” where you might find the product, and “types” of those products that can be used to narrow down your search.

For instance, a type of laptop might be one with a touch screen, or a mini, or one used for gaming. Here is an example of these refinements on a search for [laptop]:

A Google screenshot of brand, product, and type query refinements in a search for the word laptop.

An Official Google Blog post from the end of October refers to these as “New product search refinements,” and tells us that:

These refinements are unpaid and ranked algorithmically to show the most relevant searches you may be interested in.

Where do the refinements come from?

One likely source is context files of the type that can be created through Google Custom Search Engines.

I recently wrote a three-post series that explores how those context files might be used to rerank search results in Google’s web search, or to create query refinements.

If I were to come up with a list of things to do to learn SEO in 2010, unquestionably one of the items on that list would be to experiment with Google Custom Search Engines, and to develop a few context files to use with those search engines.

The use of context files developed from external sources by people who might have expertise in a vertical market, like computers or digital cameras or mystery fiction, is something of a sea change for Google: a radical departure from the search engine’s reliance upon its own data.


Search is constantly changing and evolving, and changes at the search engines reflect new advancements in technology and new perceptions about how people search and how best to help them with those searches.

An important aspect of SEO is investigating those changes, and trying to understand their impacts. Some changes appear to be cosmetic in nature, like Google Instant Previews and Google Instant, but there’s a possibility that the way that search results are presented to searchers may influence how people search.

Other changes have less of a visual impact, like the new product query refinements, but may signal a new approach from a search engine, such as a reliance upon information taken from vertical search engines set up by people with expertise in a specific field.

Still others may be less clear on the surface, such as the May Day change from earlier this year, and the role that Named Entities might play in search in the future. In the last part of the SEO is Undead series, I’ll explore those topics, and some of the possible reasons for those changes.

It’s likely that we’ll continue to see “SEO is Dead” posts in the future, but it’s the opposite that’s true – SEO is a vibrant, growing field that constantly evolves with search and the search engines.

The first part of this series is SEO is Undead 1 (Links and Keyword Proximity)

The third and final part of the series is SEO is Undead Again (Profiles, Phrases, Entities, and Language Models)


Author: Bill Slawski
