Imagine gathering together 10 extremely knowledgeable search engineers, locking them in a room for a couple of days with walls filled with whiteboards, with the intent of having them brainstorm ways to keep stale content and web spam from ranking highly in search results. Add to their challenge that the methods they come up with should focus upon “the nature and extent of changes over time” to web sites. Once they’ve finished, imagine taking what appears on those whiteboards and condensing it into a patent.
The end result would likely look like Google’s patent Information Retrieval based on Historical Data. When this patent was originally published as a pending patent application awaiting prosecution and approval back on March 31, 2005, it caused quite a stir in the SEO community. Here are a few of the many reactions in forums and blog posts as a result:
I like looking at patents and whitepapers and other primary sources from search engines to help me in my practice of SEO. I’ve been writing about them for more than 5 years now, and am putting together this series of the 10 Most Important SEO Patents to share some of what I’ve learned during that time. These aren’t patents about SEO, but rather ones that I would recommend to anyone interested in learning more about SEO by looking at patents from sources like Google or Microsoft or Yahoo.
The first PageRank patent application was never published by the United States Patent and Trademark Office (USPTO), it was never assigned to a particular company or organization, and it was never granted. It avoids the dense legal language and mathematics that can make reading patents difficult, and it captures the excitement of a Ph.D. candidate, Larry Page, who had just come up with a breakthrough in indexing webpages that had the potential to be a vast improvement over the other search engines of the time.
The decision process that you go through when deciding to make changes to your site can be tough. Even if those changes are necessary, determining the best way to implement them can make you pause and spend a lot of time considering all the potential alternatives you might have. You can do a cost/benefit analysis, where you consider how much change you might make to your site, what the benefits of making that change might be, and what the costs might be both in making the change and in deciding not to.
It shouldn’t require much thought to do things like make your website more usable, but it can, especially if the changes alter the look and feel of your pages and the way that people interact with them. A good example is the set of changes taking place at Google, where the search engine has implemented a number of new design elements over the past year or so, including new colors and formatting for its search results pages, a different look for how local search results are presented within Web search results, URLs now appearing under page titles and above snippets for pages, and Instant Previews, which show a thumbnail of a page and call out boxes of text showing where query terms appear within those thumbnails.
On the subject of those Instant Previews, one of the challenges that search engines face is presenting web pages returned from a search in a way that helps searchers locate the information they want to find. A typical search result for a web page includes a page title, a URL for the page, and a short snippet that might be taken from a meta description or from text found on the page itself. A searcher is shown a page filled with these document representations to choose from, but sometimes that’s not enough to make a decision about which page to click through to.
Once upon a time, when you searched the Web at Google, the results displayed were limited to a list of 10 pages, each with a page title, a snippet of text from the meta description or page content, and a URL to that page. We’ve been seeing the search engines diversify what they might display for certain pages, with special formats for things like forum posts, Q&A listings, pages that include events, and sometimes sitelinks or quicklinks to other pages as well.
The URL shown for some pages might have hinted at the structure of a site and the locations of pages within its hierarchy, if it showed directories and subdirectories within the paths to pages. Some websites include breadcrumb navigation on their pages to show you more explicitly where you are within a site, and to provide an easy way to visit higher-level categories. Google has started showing those breadcrumb listings for some pages, to make those listings more useful to searchers, and to make it clearer where those pages sit within the hierarchy of a site.
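As an illustration, a breadcrumb trail on a page is usually just ordinary linked markup that mirrors the site’s hierarchy, with each level linking back up to a broader category. The page and category names below are hypothetical, purely for the sake of example:

```html
<!-- A breadcrumb trail showing where this page sits in the site
     hierarchy; each level links back to a broader category page. -->
<div class="breadcrumbs">
  <a href="/">Home</a> &gt;
  <a href="/coats/">Coats</a> &gt;
  <a href="/coats/winter/">Winter Coats</a>
</div>
```

A trail like this gives both visitors and search engines a clear picture of where a page lives within a site, which is the kind of signal that can surface as a breadcrumb listing in search results.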
For years, the New York Times website was a great example I could point people to of a very high profile site getting one of the basics of SEO very, very wrong.
If you visited the site at “http://newyorktimes.com/” you would see a toolbar PageRank of 7 for its homepage. If instead you visited the site at “http://www.newyorktimes.com/” you would see a toolbar PageRank of 9. The New York Times pages resolved at both sets of URLs, with and without the www hostname. Because all indexed pages of the site were accessible with and without the “www”, those pages weren’t getting all the PageRank that they should have been, splitting PageRank between the two versions of the site, and that probably cost the Times in rankings at Google, and in traffic from the Web. Google also likely wasted its own bandwidth, and the Times’s, by returning to crawl both versions of the site instead of just one.
A few years ago, someone with at least a basic knowledge of SEO came along and fixed the New York Times site so that if you followed a link to a page on the site without the “www”, you would be sent to the “www” version via a 301 redirect. The change ruined the example that I loved showing people, which demonstrated that even very well known websites make mistakes and ignore the basics. That’s one of the things that makes the Web a place where small businesses can compete against much larger companies with much higher budgets.
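A fix like the one the Times applied is usually done at the server level. As a sketch only, and assuming an Apache server with mod_rewrite enabled (the actual setup the Times used isn’t something I know), rules in an .htaccess file might look something like this:

```apache
# Permanently (301) redirect any request on the bare domain
# to the canonical "www" hostname, preserving the request path.
RewriteEngine On
RewriteCond %{HTTP_HOST} ^newyorktimes\.com$ [NC]
RewriteRule ^(.*)$ http://www.newyorktimes.com/$1 [R=301,L]
```

The important detail is the 301 status code: it tells search engines that the non-www URLs have moved permanently, so PageRank consolidates on a single version of each page instead of being split between two.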
Timing is everything. On Monday, I was asked if I would give a presentation at the Internet Summit 2011 in Raleigh, NC on November 15th and 16th on an advanced SEO topic. I thought about it and agreed, and on Tuesday night decided to give a presentation on how social media has been transforming search. On Thursday morning, Google delivered a present in the form of a post at the Official Google Blog titled Giving you fresher, more recent search results.
The title for my presentation is “Next Level SEO: Social Media Integration” and the basic premise is that social media has changed the expectations of searchers. Searchers want fresher content, they want to see what their friends and contacts have to say, and they want access to experts and authority figures and their thoughts on timely events and news. Search engines have no choice but to respond.
I didn’t see the Google blog post until yesterday afternoon, and quickly wrote up some of my thoughts at Google Plus regarding Fresher Results at Google? There are a number of other very thoughtful reactions to the change, and I thought I might point those out, and maybe expand upon my thoughts from yesterday.
Sometimes when you search at Google, you might not see any results that you find interesting, and may search again using a somewhat similar query. Chances are that you don’t want to see the same sites or pages all over again. A newly granted patent from Google describes how the search engine might demote results for pages from sites that showed up in an earlier search when they appear in a subsequent search during the same query session.
For example, imagine that you search for [black jacket] and don’t see any results that you like on the first page, regardless of whether you clicked upon any of them. Instead of going to the second page, you search for [black coat]. Since the queries are related, it’s possible that you might see results from some of the same sites in both searches, which the patent refers to as “repetitive” search results. Google may take your decision to search again as an indication that you weren’t satisfied with the pages shown in the first set of results, and may demote some of the “repetitive” sites or pages from that first query so they aren’t as prominent in the second set of search results.
So, your search for [black jacket] might show a page from the site “Winter Coats Online.” You might click upon it, or you might not. Regardless, when you move on to search for [black coat], if a page (the same page or another page) from “Winter Coats Online” would have ranked highly for that search, Google might push it down in search results so that it isn’t listed as prominently.
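The patent doesn’t spell out its scoring details, but the behavior it describes can be sketched as a simple re-ranking step. Everything below is my own illustration, with made-up site names and a made-up demotion factor, not Google’s actual implementation:

```python
def demote_repetitive(results, seen_sites, demotion_factor=0.5):
    """Re-rank results for a follow-up query in the same session.

    results     -- list of (site, score) pairs, highest score first
    seen_sites  -- sites already shown for an earlier, related query
    Sites the searcher has already seen have their scores reduced,
    so "repetitive" results are pushed down rather than removed.
    """
    rescored = [
        (site, score * demotion_factor if site in seen_sites else score)
        for site, score in results
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# Sites shown for the first query, [black jacket]:
seen = {"wintercoatsonline.example"}

# Candidate results for the follow-up query, [black coat]:
candidates = [
    ("wintercoatsonline.example", 0.9),
    ("coatdepot.example", 0.8),
    ("jacketworld.example", 0.6),
]

for site, score in demote_repetitive(candidates, seen):
    print(site, score)
```

Here the page from “wintercoatsonline.example” would have ranked first for the second query, but because the site already appeared in the first set of results, its score is halved and it drops below the sites the searcher hasn’t seen yet.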
You go to a site that you’ve enjoyed and bookmarked sometime in the past but haven’t visited in a while, and it’s changed. The topics it discusses are different, or the writing style isn’t quite the same, or it suddenly has links within its content to commercial pages that it probably wouldn’t have linked to before, or all of those things. It also seems heavily focused upon more commercial terms and content. It’s changed, and its pages now have the appearance of what many might call “doorway pages.”
Doorway pages have also been referred to by terms like gateway pages, entry pages, bridge pages, and portal pages; their primary purpose is to attract visitors from search engines in order to send them on to other places.
As a site owner, you don’t want Google to start identifying your pages as doorway pages. Google’s Webmaster Guidelines tell us to: