Xerox Helps Google Fill in Some Search Gaps: From Pre-Web to Post Panda

If Google had launched in the early 90s, it might have come out with technology for searching some of the pre-Web electronic databases of the day, such as Lexis or Dialog. It would have developed useful ways to visualize results from those systems, along with custom user interfaces. It might have developed a progress bar to show you that your search was still running and the system hadn’t failed, back when searches took more than milliseconds.

If Google had gotten its start before “www” appeared at the front of addresses in a browser’s address bar, it might have developed technology very similar to what it’s working on today, but with a slightly different approach, one that can be sensed when reading through a number of Web-based patents from a company like Xerox.

Google was assigned 94 patent filings from Xerox (90 granted patents and 4 pending applications), as indicated by an assignment recorded by the United States Patent and Trademark Office last week, on February 16th, 2012. The execution date of the assignment is November 10, 2011. The USPTO assignment database doesn’t include any details of the transaction, such as financial terms.

My last post linking Google and Xerox together was titled “Xerox Brings Patent Infringement Suit Against Google, Yahoo, and YouTube.” A look at the PACER records for the case (1:10-cv-00136-UNA) in US District Court for the District of Delaware shows it being closed on December 15th, 2011. The case docket includes a stipulation between Google and Xerox dismissing Google from the case on 11/11/11, the day after the assignment of these patents was executed. It appears that the assignment might have been related in some way to the stipulation, though the patents Xerox claimed Google and YouTube were infringing weren’t included in the assignment.

While a number of the patent filings fall outside of search and information retrieval, such as a few involving handheld devices, printing over a network, distributed networking systems, optical character recognition, and workflow processes, many of the patents do seem related to search-based services that Google provides.

A number of the patents focus upon reviews and collaborative filtering of those reviews, caching of webpages in part and in whole, managing online documents, and what seems to be a large family of patents with the same or similar names that focus upon comparing and determining the quality of documents. Reading through a number of those, I was reminded that today is the one-year anniversary of Google’s announcement of its Panda algorithm.

The patents that focus upon document quality could potentially influence some aspects of the quality scoring of web pages that might be classified based upon an algorithmic machine learning approach such as Panda. Here’s the abstract from one of those patents:

Text, images, and/or graphics of electronic documents should be organized and laid out in a two-dimensional format for presentation to the viewer. The best such layout depends upon the content present, the creator’s intent, the output device, and the viewer’s interests. To analyze the qualitative nature of the layout in quantifiable terms, the electronic document is measured using various quantifiable factors; such as, balance, uniformity, white space management, alignment, consistency, legibility, etc.; that impact a qualitative nature of a document. Such quantifiable factors are then used to quantize the aesthetics, ease of use, eye-catching ability, interest, communicability, comfort, and convenience of the document.
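To make the idea of “quantifiable factors” a little more concrete, here’s a rough sketch in Python of how two of the factors named in the abstract, white space and balance, might be measured and combined into a single number. This is not taken from the patent and isn’t a claim about how Xerox or Google does it. The `Box` class, the function names, the 40% white space target, and the equal weights are all hypothetical choices of mine for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """A rectangular block of content on a page (hypothetical representation)."""
    x: float
    y: float
    width: float
    height: float

    @property
    def area(self) -> float:
        return self.width * self.height


def white_space_fraction(boxes, page_width, page_height):
    """Share of the page not covered by content. Assumes boxes do not overlap."""
    covered = sum(b.area for b in boxes)
    return max(0.0, 1.0 - covered / (page_width * page_height))


def horizontal_balance(boxes, page_width):
    """1.0 when content is centered left to right, lower as it drifts to one edge."""
    total = sum(b.area for b in boxes)
    if total == 0:
        return 1.0  # an empty page is treated as trivially balanced
    centroid_x = sum(b.area * (b.x + b.width / 2) for b in boxes) / total
    half = page_width / 2
    return 1.0 - abs(centroid_x - half) / half


def layout_score(boxes, page_width, page_height, weights=(0.5, 0.5)):
    """Weighted mix of the two factors; the weights are arbitrary assumptions."""
    ws = white_space_fraction(boxes, page_width, page_height)
    # Illustrative assumption: about 40% white space is treated as ideal.
    ws_score = max(0.0, 1.0 - abs(ws - 0.4) / 0.6)
    balance = horizontal_balance(boxes, page_width)
    return weights[0] * ws_score + weights[1] * balance


# Two equal columns side by side versus one narrow strip along the left edge.
balanced = [Box(0, 0, 50, 60), Box(50, 0, 50, 60)]
skewed = [Box(0, 0, 20, 100)]
print(layout_score(balanced, 100, 100))
print(layout_score(skewed, 100, 100))
```

The balanced two-column page scores higher than the page with everything pushed to the left edge. A real system would presumably measure many more factors, such as the alignment, consistency, and legibility mentioned in the abstract, and would need some principled way to choose the weights.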

I haven’t had the chance to read through all of these and pick them apart. I’ll probably be doing that as time permits, but I thought it might be easier with more eyeballs on the patent filings. Here are the granted and pending patents that were included in the USPTO assignment:

Granted Patents

Pending Patent Applications


Google has been acquiring a large number of pending and granted patents from other companies in the past couple of years. A number of those covered a very wide range of technologies, from sensor technology for driverless cars, to fiber optics networking processes and devices, to computer and database architecture, and more.

This acquisition seems a little more focused upon some of the core search technologies that Google is best known for, from some fairly old patents still focused upon search, to some newer patents that might help Google with its move towards improving its processes for reviews and recommendations and determining quality scores for documents on the Web. For anyone interested in how Google is evolving towards machine learning processes to rank web pages, there can be some value in spending some time going through these patents.


Author: Bill Slawski
