Google Sidewiki enabled people to leave a comment on virtually any page on the Web, and could be accessed through the Google toolbar. A 1999 survey of Web annotation services showed that they have been around since the earliest days of the Web, and they differ from commenting systems in that they’ve been aimed at providing ways for people to leave private or public notes about web pages, sometimes but not necessarily with the participation of the authors of those pages. When Google announced that they were closing down Sidewiki last September, they told us that:
I’ve been faced with a pretty difficult decision: choosing the last of the patents, or patent families, to include in this series of posts about the most important search-related patents for people who promote sites on the Web. I find I just can’t choose one.
For the last few weeks, I’ve been arguing with myself over a choice between at least two sets of patents. One patent that I wanted to include involved responding to informational needs by going beyond matching keywords, expanding the query terms used in a search to include synonyms and returning pages on related concepts. There are a number of related patents granted to Google that describe how the search engine might identify synonyms, and it’s worth spending some time with all of them.
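To make the idea of synonym-based query expansion a little more concrete, here is a minimal sketch in Python. It is not taken from any of the Google patents; the synonym table, the function names, and the simple "every term group must match" rule are all assumptions made up for illustration.

```python
# Hypothetical synonym table; a real search engine would learn these
# relationships from data rather than hard-code them.
SYNONYMS = {
    "car": {"auto", "automobile"},
    "cheap": {"inexpensive", "affordable"},
}


def expand_query(query, synonyms=SYNONYMS):
    """Turn each query term into a group containing the term and its synonyms."""
    return [{term} | synonyms.get(term, set()) for term in query.lower().split()]


def matches(document, expanded_query):
    """A document matches if it contains at least one word from every group."""
    words = set(document.lower().split())
    return all(words & group for group in expanded_query)


expanded = expand_query("cheap car")
print(matches("Inexpensive automobile rentals", expanded))  # matched via synonyms
print(matches("Cheap bike rentals", expanded))              # no word for "car"
```

A page that never uses the words "cheap" or "car" can still match the query here, which is the basic benefit of going beyond exact keyword matching.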
Thursday evening I visited the Philadelphia offices of Seer Interactive to give a free presentation, hosted by Wil Reynolds and the Seer Interactive team, on some of the changes in Search and Social activities involving SEO. The possible changes I pointed out included more emphasis on search as a knowledge base, with more Q&A results, and a greater emphasis on information extraction around entities as described in the Wall Street Journal article.
Nuance Communications, which partners with Apple to provide the voice recognition software behind Apple’s intelligent assistant Siri, had four patent applications published today at the USPTO that focus upon search and search technology. While the company has at least 274 granted patents and 104 pending patents listed as assigned to it at the US Patent and Trademark Office, these appear to be the first that focus upon the operations of a search engine. They reference the Dragon Search application built for iPhones:
The topics covered in the Nuance patent portfolio primarily involve speech recognition technology, but include some areas that companies like Google have been focusing upon within a few of their patents as well, such as statistical language models and document segmentation algorithms, as well as a browser for the voice web, described in a patent filed in 1998.
When a judge writes a judicial opinion on a case, the opinion often includes more than just the ruling. It usually contains an analysis of the present law, the legal atmosphere, and how the ultimate holding on the case was arrived at. Those written rulings can also include some legal opinions on issues that don’t necessarily play an essential role in the outcome of the case at hand, and those are often referred to as “dicta.”
When you read a patent, you’ll see that it’s broken into a number of parts. The most important of those is the claims section, which is what a patent examiner focuses upon when examining a patent application and deciding whether or not it should be granted. There are also description sections in patents which give a richer and more detailed look at how the technology behind a patent might be implemented (with emphasis on the “might”). Often those descriptions include material that isn’t reflected within the claims section of a patent, and in many ways, those description sections could be considered similar to the dicta that I mentioned sometimes appear within judicial opinions.
Stanford University was granted two new patents today under the name, Scoring documents in a database, both of which were filed at the United States Patent and Trademark Office on January 19, 2010. These two patents, assigned to Stanford and listing Lawrence Page as inventor, are described as continuation patents of the following patents assigned to Stanford which focus upon PageRank:
Link evaluation. We often use characteristics of links to help us figure out the topic of a linked page. We have changed the way in which we evaluate links; in particular, we are turning off a method of link analysis that we used for several years. We often rearchitect or turn off parts of our scoring in order to keep our system maintainable, clean and understandable.
A lot of people were guessing which “method of link analysis” might have been changed, from PageRank being turned off, to anchor text being devalued, to Google ignoring rel=”nofollow” attributes in links, to others. I was asked my opinion by a few people, and mentioned that there were a number of potential approaches that Google might have changed.
According to Google’s Director of Research, Peter Norvig, if you look at Google Trends for trends related to “full moon” or “ice cream”, you’ll see that Google searches for those terms imitate actual physical trends in the world. With a very large number of queries performed for those terms, searches for “full moon” peak every 28 days. Searches for “ice cream” peak every summer, 365 days apart. Large amounts of data make interesting things possible.
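One simple way to look for that kind of repeating cycle in a daily series of search counts is autocorrelation: compare the series with a shifted copy of itself and see which shift lines up best. The sketch below is only an illustration under stated assumptions. It uses a synthetic series built with a 28-day cycle rather than real Google Trends data, the function names are made up, and autocorrelation is my choice of technique here, not something attributed to Peter Norvig.

```python
import math


def autocorrelation(series, lag):
    """Correlation between the series and a copy of itself shifted by `lag` days."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag)
    )
    return covariance / variance


def dominant_period(series, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    return max(
        range(min_lag, max_lag + 1),
        key=lambda lag: autocorrelation(series, lag),
    )


# Synthetic daily "search volume" with a built-in 28-day cycle (assumed data,
# standing in for something like searches for "full moon").
synthetic = [100 + 30 * math.cos(2 * math.pi * day / 28) for day in range(28 * 12)]

print(dominant_period(synthetic, min_lag=7, max_lag=60))
```

On this synthetic series the best-matching lag is the 28-day cycle that was built in. Real query logs are much noisier, which is part of why very large volumes of data matter for making patterns like this visible.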
If you’re interested in how search engines work, and how large amounts of data can help them do what they do more effectively, it’s highly recommended that you read the paper The Unreasonable Effectiveness of Data (pdf), written by Alon Halevy, Peter Norvig, and Fernando Pereira, from Google. Even more highly recommended is a presentation of the same name from Peter Norvig, given as part of a Distinguished Lecture Series at the University of British Columbia last fall, which sadly has fewer than 1,000 views on YouTube at present:
In the early days of Google, when you performed a search, the results you received were just links to pages found on the Web, showing page titles, snippets, and URLs. Google started adding other types of searches to its Web search, such as:
While these launched as separate search repositories, they weren’t going to stay that way, and may never have been planned as solely standalone data repositories. At a presentation called Searchology in May of 2007, Google announced Universal Search, which incorporated video, news, books, image and local results into Web search results. According to the Official Google Blog post, the roots of Universal Search can be traced back to 2001, with a lot of effort leading to its launch:
Over several years, with the help of more than 100 people, we’ve built the infrastructure, search algorithms, and presentation mechanisms to provide what we see as just the first step in the evolution toward universal search. Today, we’re making that first step available on google.com by launching the new architecture and using it to blend content from Images, Maps, Books, Video, and News into our web results.