Garrett Rogers, at Googling Google, writes of new domain names registered to Google, including a number that hint at a Google Archive service, perhaps similar to that offered by the Internet Archive’s Wayback Machine.
Some comments in a post over at ResourceShelf take up the idea, and offer some additional commentary in Yet Another Day and More Google Domain Names.
A patent application published this May and assigned to Google in June, Multiple index based information retrieval system, describes an archive system that Google could offer. Here’s a snippet:
 c) Indexing Instances of Documents for Archival Retrieval
 Another embodiment of the present invention allows the capability to store and maintain historical documents in the indices, and thereby enable archival retrieval of date specific instances (versions) of individual documents or pages. This capability has various beneficial uses, including enabling a user may search for documents within a specific range of dates, enabling the search system 120 to use date or version related relevance information in evaluating documents in response to a search query, and in organizing search results.
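To make the idea of retrieving "date specific instances" a little more concrete, here is a minimal sketch in Python. It is only my own illustration, not the method described in the patent application: the `ArchiveIndex` class, its method names, and the example URL are all hypothetical, and a real system would use inverted indices rather than scanning every stored page.

```python
from bisect import bisect_right
from datetime import date


class ArchiveIndex:
    """Toy store of dated document instances (hypothetical, not Google's design)."""

    def __init__(self):
        # url -> list of (crawl_date, text), kept sorted by crawl_date
        self._versions = {}

    def add(self, url, crawl_date, text):
        """Record one instance (version) of a page as seen on crawl_date."""
        versions = self._versions.setdefault(url, [])
        versions.append((crawl_date, text))
        versions.sort(key=lambda v: v[0])

    def as_of(self, url, when):
        """Return the text of the latest instance crawled on or before `when`."""
        versions = self._versions.get(url, [])
        dates = [d for d, _ in versions]
        i = bisect_right(dates, when)
        return versions[i - 1][1] if i else None

    def search(self, term, start, end):
        """Return (url, crawl_date) pairs for instances dated within
        [start, end] whose text contains `term` as a whole word."""
        term = term.lower()
        hits = []
        for url, versions in self._versions.items():
            for crawl_date, text in versions:
                if start <= crawl_date <= end and term in text.lower().split():
                    hits.append((url, crawl_date))
        # Oldest instances first, ties broken by URL
        return sorted(hits, key=lambda h: (h[1], h[0]))


# Example usage with made-up data
index = ArchiveIndex()
index.add("example.com/a", date(2005, 1, 10), "google registers new domains")
index.add("example.com/a", date(2006, 5, 18), "google archive patent published")
older_copy = index.as_of("example.com/a", date(2005, 12, 31))
may_hits = index.search("patent", date(2006, 5, 1), date(2006, 5, 31))
```

The point of the sketch is that each page keeps every dated copy rather than only the newest one, so a search can be limited to a range of dates or answered as of a particular day.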
I wrote about this document the day after it was published, in a blog post titled Google Aiming at 100 Billion Pages? As I noted there, the inventor listed in the patent filing is Anna Patterson, who developed a beta search engine for the Internet Archive. That search engine was removed from the site sometime around when she went to work for Google. There’s a possibility that the technology she developed accompanied her in the move to Mountain View, according to a news article from Stefanie Olsen:
Stanford continues in its role as a breeding ground for search projects. Since 2003, Google has purchased at least two projects hatched at Stanford–personalization search tool Kaltix and a project from Anna Patterson, a Stanford computer science research associate.
So, Google has an employee with experience in building a search system for such an archive, it has intellectual property assigned to it that describes such a system, and it’s possible that it licensed or purchased a search system that successfully worked as a beta in performing searches on the Internet Archive.
The newly registered domain names may refer to some other type of service, one that offers historical records of things like newspapers, magazines, and other periodicals. But Google seems to have a lot of pieces in place to offer an archive system like the Internet Archive’s.
Anna Patterson is the listed inventor on a number of patent applications that appear to be related. The first two below are included in the USPTO Assignment Database, with assignments recorded to Google, while the remainder aren’t.
- Multiple index based information retrieval system (20060106792)
- Phrase-based searching in an information retrieval system (20060031195)
- Phrase-based indexing in an information retrieval system (20060020607)
- Phrase-based generation of document descriptions (20060020571)
- Phrase identification in an information retrieval system (20060018551)
In addition to an archive system, these patent applications also include such things as a description of a supplemental index, personalization methods, presentation of search results, classification of documents to topics, and an annotation method that could lessen the impact of Google bombing.
Added 9/6/2006 – Google has added a news archive search. Garrett Rogers writes more about it in Google News Archive Search released today. I’ve tried it out – News Archive Search – and it is a nice addition to the searches that Google provides. It has more than news sources. For instance, Ancestry.com told me that my grandfather’s WWI draft registration card is available for me to see if I sign up with them.
I’m not seeing many results in the News Archive Search that don’t require a subscription or a pay-per-view fee.