If you were asked to point out the patent that describes PageRank, and you went searching at the US Patent and Trademark Office (USPTO), you might quickly get confused. A little more confusion comes today, with the granting of a new patent on PageRank to Stanford University. I’ve also located the very first PageRank patent filing, which I haven’t seen anywhere other than in the USPTO information retrieval system.
In the late 90s, Lawrence Page filed a number of related patents addressing different aspects of PageRank, and a stream of continuation patents later updated the originals. Many of these patents either claim priority to earlier filings or state that they are continuations of some of the earlier ones.
The earliest filing was a provisional patent application (application number 60035205), filed on January 10, 1997, which was never officially assigned or published. It is titled Improved Text Searching in Hypertext Systems (PDF, 1.7 MB). The patent office information retrieval system holds it in a document described as “Miscellaneous Incoming Letter,” which contains the provisional patent filing and an appendix describing the processes being applied for. It is highly recommended reading if you’re interested in the history of PageRank and Google.
Here’s a snippet from the First PageRank Patent filing:
Existing search engines on the web produce very poor results when the query matches large numbers of documents. Yet, these simple queries are very frequently issued by users.
Described here is a system that yields radically improved results for these queries using the additional information available from a large database of web links. This database of web citations is used to determine a citation importance ranking for every web page, which is then used to sort the query results.
This system has been implemented, and yields excellent results, even on a relatively small database of four million web pages. Not only does the system yield better results, but it does so at a significantly reduced computational cost, which can be a very large expense for web search engines.
Demonstrating the improvement is as easy as picking a general query, for example, “weather,” and comparing the results to the results from a traditional web search engine, like AltaVista (the results section shows some sample queries).
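To make the idea in that snippet concrete, here is a minimal sketch of ranking pages from a database of links and then sorting query matches by that rank. It is not taken from the provisional filing: it uses the commonly described power-iteration form of PageRank, and the damping value of 0.85, the function names, and the tiny link graph are all assumptions chosen for illustration.

```python
def rank_pages(links, damping=0.85, iterations=50):
    """Compute a citation importance score for every page.

    `links` maps each page to the list of pages it links to.
    The damping factor and iteration count are illustrative assumptions.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among the pages it cites.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def sort_results(matching_pages, rank):
    """Order the pages that match a query by descending rank."""
    return sorted(matching_pages, key=lambda p: rank[p], reverse=True)


# Hypothetical link database: three pages cite "hub", which cites "a".
links = {
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub", "a"],
    "hub": ["a"],
}
rank = rank_pages(links)
results = sort_results(["b", "hub", "c"], rank)
print(results[0])
```

In this toy graph the heavily cited "hub" page sorts ahead of the uncited pages, which is the behavior the snippet describes for broad queries such as “weather.”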
On January 9, 1998, a new patent application was filed that claimed priority to that provisional filing: Method for node ranking in a linked database (US Patent 6,285,999).
That patent filing was updated with Method for node ranking in a linked database (US Patent 7,058,628), originally filed on July 2, 2001.
Another patent filing a few days later, on July 6th, 2001, Method for scoring documents in a linked database (US Patent 6,799,176), isn’t mentioned in the newest patent to be granted, but it is related and states that it is a continuation of the first provisional patent and U.S. Pat. No. 6,285,999.
Scoring documents in a linked database (US Patent 7,269,587), filed on December 1, 2004, claims to be a continuation of US Patent 7,058,628.
Finally, Stanford University was granted a patent today titled Annotating links in a document based on the ranks of documents pointed to by the links, which is a continuation of this line of PageRank patents.
The claims in the patent are written very differently from those in some of the earlier patent filings, but they cover substantially the same ground as many of the earlier versions. The difference is that they also cover annotating the links in a document based on the ranks of the pages those links point to.
This is the part of the claims that describes how a link on a page might be annotated:
24. The method of claim 18, where annotating the one or more links includes: associating an icon or text indicative of the one or more determined ranks with the one or more links.
25. A method performed by a computer, the method comprising: determining, by the computer, a rank for each of a plurality of documents in a database, the documents including linking documents and linked documents, one of the linking documents including a link to one of the linked documents; annotating, by the computer, the link in one of the linking documents, based on the determined rank of one of the linked documents, to form a modified document; and providing, by the computer, the modified document to a user.*
26. The method of claim 23, where annotating the link includes: associating an indicator of the determined rank with the link within one of the linking documents
* My emphasis.
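As a rough illustration of what claim 25 describes, the sketch below takes a document, looks up a rank for each page it links to, and returns a modified document with a text indicator next to each link. Everything here is an assumption for illustration: the function name, the rank values, the `[rank N]` indicator format, and the use of a simple regular expression rather than a real HTML parser. The patent does not specify any of these details.

```python
import re

# Matches a simple anchor element and captures its href value.
# A regular expression is a fragile way to read HTML; it is used here
# only to keep the sketch short.
LINK_PATTERN = re.compile(
    r'(<a\s[^>]*href="([^"]*)"[^>]*>.*?</a>)',
    re.IGNORECASE | re.DOTALL,
)


def annotate_links(html, ranks):
    """Return a copy of `html` with a rank indicator after each ranked link.

    `ranks` maps a linked document's URL to its determined rank.
    Links whose targets have no known rank are left unchanged.
    """
    def add_indicator(match):
        link_markup, target = match.group(1), match.group(2)
        if target not in ranks:
            return link_markup
        return f'{link_markup} <span class="rank">[rank {ranks[target]}]</span>'

    return LINK_PATTERN.sub(add_indicator, html)


# Hypothetical linking document and hypothetical rank for one linked page.
page = (
    '<p>See <a href="http://example.com/a">page A</a> and '
    '<a href="http://example.com/b">page B</a>.</p>'
)
ranks = {"http://example.com/a": 7}
modified = annotate_links(page, ranks)
print(modified)
```

The indicator could just as easily be an icon, as claim 24 mentions; text is used here only because it is simpler to show.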
I’m not sure if we will start seeing Google annotating links with the PageRank for the pages those links point to any time in the future, but that seems to be the major difference between this patent and the earlier ones.
The claims in the newest version of Stanford’s PageRank patent do seem to explain PageRank in plainer and more understandable language than the earlier patents did, whether or not the annotation system they describe ever appears in practice.
The most interesting discovery for me in researching this newest patent was the letter I linked to above, which contains the very first PageRank provisional patent, Improved Text Searching in Hypertext Systems. I hadn’t seen that before, and I’m not sure if it’s available anywhere other than here and in the USPTO information retrieval system.