On December 12, 2007, Girafa.com Inc. filed a lawsuit against Amazon Web Services and a number of other parties for patent infringement over a patent titled “Framework for providing visual context to www hyperlinks” (6,864,904).
The case claimed that defendants Amazon Web Services LLC, Amazon.com, Inc., Alexa Internet, Inc., IAC Search & Media, Inc., Snap Technologies, Inc., Yahoo! Inc., Smartdevil Inc., Exalead, Inc. and Exalead S.A. were infringing upon Girafa.com’s patent by displaying thumbnail images of websites as described in that patent. The case was closed in the U.S. District Court for the District of Delaware in September of 2009 after claims against the parties involved were either dismissed or settled or both.
Google wasn’t a named party in the suit, but has been displaying Instant Previews of websites on desktop search results since last November, and on search results on mobile devices since March 8, 2011. Earlier this month, Google was assigned the Girafa.com patent at the heart of the earlier lawsuit, along with an updated continuation patent.
There’s some evidence that the Panda updates to Google’s ranking algorithm may be based upon a decision tree approach to classifying and creating quality scores for web pages and sites. Curious as to whether Google might be using a decision tree approach to classify other information, I went digging through some of Google’s other patent filings that I might not have covered here in the past.
I found one that may have interesting implications regarding how queries are classified and different data that may be stored and emphasized at different Google data centers or data partitions.
When you design a web page with fixed dimensions, set for a specific display resolution, sometimes visitors will arrive at your page with a higher display resolution. This means that there can be empty space showing in their browser window when they view your page. At other times, a visitor’s browser window isn’t filling their whole monitor display, and they might enlarge it, which can also cause unused browser space to appear.
A Google patent application published this morning describes how Google might identify when such unused space exists, and include content within that space. The patent filing tells us that this content can include text, images, videos, animations, and other types of content that can be displayed in a browser.
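As a rough illustration of the idea of unused space, here is a minimal Python sketch. It only shows the arithmetic for a fixed-width page in a wider browser window; the function names and the `min_slot_width` threshold are hypothetical and are not taken from the patent filing.

```python
def unused_horizontal_space(viewport_width: int, page_width: int) -> int:
    """Pixels of browser width left uncovered by a fixed-width page."""
    # A page wider than the viewport leaves no unused space (it scrolls instead).
    return max(0, viewport_width - page_width)


def has_room_for_content(viewport_width: int, page_width: int,
                         min_slot_width: int = 160) -> bool:
    """True if the unused width could hold a block of extra content.

    min_slot_width is an assumed threshold chosen for illustration only.
    """
    return unused_horizontal_space(viewport_width, page_width) >= min_slot_width
```

For example, a 960-pixel-wide page viewed in a 1440-pixel-wide window leaves 480 pixels unused, which would clear the assumed 160-pixel threshold.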
A patent application was published at the USPTO this morning that describes an interesting new application from Apple, enabling people to find others with common interests or common experiences or both, based upon location. The patent is fairly detailed, and I’ve only scratched the surface with my description below. If you’re interested in location-based services and social networking, it’s definitely worth a read.
It also has some of the more interesting images that I’ve seen so far in a patent filing this year (the person shown in them looks a little like a comic book villain), and they do a very good job of displaying an example of how this system could be used.
In 2005, Google’s John Lamping gave a presentation to a class at Berkeley on the Quality of Information, titled On the internet, nobody knows you’re a dog (pdf). In his talk he raised questions such as:
- Why is the Daily Californian advertising German pages?
- How much can the spam industry make by spamming search engines?
During his speech, he pointed out ways that people have attempted to manipulate search results, such as mad libs-like insertions of keywords into templates for pages like in his slide above, cloaking and other spam approaches to optimizing pages, and paid links and comment spamming. In addition to talking to academic audiences about search quality, he has been working on doing something to increase the quality of search results.
I don’t know who said that novelists read the novels of others only to figure out how they are written. I believe it’s true. We aren’t satisfied with the secrets exposed on the surface of the page: we turn the book around to find the seams.
In a way that’s impossible to explain, we break the book down to its essential parts and then put it back together after we understand the mysteries of its personal clockwork.
- Gabriel Garcia Marquez Meets Ernest Hemingway
I’m told that if you want to be a good photographer, you should look at a lot of photos. If you want to be a good painter, you should look at a lot of paintings. I believe that the same holds true with being a blogger, and seeing how other bloggers present their messages, tell their tales, and report their news.
Google, Yahoo, and Bing have joined forces to enable web publishers to include additional HTML that adds more structure to their pages, which may make those pages easier to index and give publishers a little more control over what shows up in search results for them. There’s some controversy over the approach, and some questions about the impact of related patents that all three search engines have been granted. Web publishers should be paying attention to the possible impacts of this initiative from the search giants.
Google’s Author Markup
Yesterday, Google announced that they were introducing a way to add HTML code to a page to indicate who the author of the page might be. This code would appear as part of a link pointing to an author’s page on the same site, so that a search engine might associate the content of that page with the author who wrote it. The announcement was made in the Google Inside Search blog, in the post Authorship markup and web search, which told us how Google would use rel="author" and rel="me" to learn about who may have authored what on the Web.
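To make the markup a little more concrete, here is a minimal Python sketch of how a crawler might pick out links carrying those two rel values. This is only an illustration using Python’s standard `html.parser` module, not a description of how Google processes the markup; the class name, function name, sample HTML, and URLs are all hypothetical.

```python
from html.parser import HTMLParser


class AuthorLinkParser(HTMLParser):
    """Collect href values of <a> tags whose rel includes 'author' or 'me'."""

    def __init__(self):
        super().__init__()
        self.links = {"author": [], "me": []}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        # rel may hold several space-separated tokens, or be missing entirely.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        for token in rel_tokens:
            if token in self.links and href:
                self.links[token].append(href)


def find_author_links(html: str) -> dict:
    """Return the author and me link targets found in an HTML string."""
    parser = AuthorLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment: a byline link and a profile link.
SAMPLE_HTML = """
<p>Written by <a rel="author" href="/authors/jane-doe">Jane Doe</a></p>
<p><a rel="me" href="https://example.com/profiles/jane-doe">My profile</a></p>
<p><a href="/about">About this site</a></p>
"""

links = find_author_links(SAMPLE_HTML)
```

In this sample, the byline link ends up under `author`, the profile link under `me`, and the plain About link is ignored.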
Unlike Web pages, there are no links in books for Google to index and use to calculate PageRank. There’s no anchor text in links to use as if it were metadata about pages being pointed towards. Books aren’t broken down into separate pages that have a somewhat independent existence of their own the way that Web pages do, with unique title elements and meta descriptions and headings. There isn’t a structure of internal links in a book, with file and folder names between pages or sections that a search engine might use to try to understand and classify different sections of a book, like it might with a website.
A Google patent granted today describes some of the methods that Google might follow to index content found in books that people might search for. It’s probably not hard for the search engine to perform simple text-based matching to find a specific passage that might be mentioned in a book. It’s probably also not hard to find all of the books that include a term or phrase in their title or text, or that were written by a specific author. But how do you rank those? How do you decide which to show first, and which should follow?