Link evaluation. We often use characteristics of links to help us figure out the topic of a linked page. We have changed the way in which we evaluate links; in particular, we are turning off a method of link analysis that we used for several years. We often rearchitect or turn off parts of our scoring in order to keep our system maintainable, clean and understandable.
A lot of people guessed which “method of link analysis” might have been changed, from PageRank being turned off, to anchor text being devalued, to Google ignoring rel=”nofollow” attributes on links, among other theories. A few people asked my opinion, and I mentioned that there were a number of potential approaches that Google might have changed.
I love local search. In many ways, it’s similar to Google’s Web search, but with its own unique features. In addition to Googlebot, local search has Street View cars. In addition to looking at links, local search also looks for mentions of businesses that appear alongside location-based information. Instead of robots.txt files, local search is stopped by signs like “military base” or “private street.”
I also appreciate the local search ranking factors that a good number of people involved in local search have been compiling every year, but I’m also a little apprehensive about them, and I’m going to illustrate why with this post. Imagine, for instance, that Google considers the names of businesses when ranking them in local search, but that it doesn’t treat every name the same way. For example, “Frost Diner” might be treated one way by Google’s Local Search.
And, because it has a somewhat longer name, “Red Truck Bakery” might be treated differently by the algorithms in Google’s Local Search that use business names as a ranking signal:
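To make the thought experiment concrete, here’s a minimal sketch of one way a name-based signal could end up treating names of different lengths differently. Everything below is invented for illustration; the function, the scoring, and the behavior are hypothetical, not anything Google has described.

```python
# Hypothetical sketch: a ranking signal that scores how well a query
# covers a business name. A shorter name like "Frost Diner" is easier
# to cover fully than a longer one like "Red Truck Bakery", so the
# two names end up treated differently by the same formula.
# All names and scoring choices here are illustrative assumptions.

def name_match_score(business_name: str, query: str) -> float:
    """Fraction of the business name's terms matched by the query."""
    name_terms = business_name.lower().split()
    query_terms = set(query.lower().split())
    if not name_terms:
        return 0.0
    matched = sum(1 for term in name_terms if term in query_terms)
    return matched / len(name_terms)

print(name_match_score("Frost Diner", "frost diner"))        # 1.0
print(name_match_score("Red Truck Bakery", "truck bakery"))  # 0.6666666666666666
```

The point of the sketch is only that a single, reasonable-looking formula can behave differently for two businesses through no fault of their own, which is why treating “business name” as one uniform ranking factor can be misleading.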
If Google had launched in the early 90s, it might have come out with technology that could be used to search some of the electronic databases of the day, prior to the World Wide Web, such as Lexis or Dialog. It would have developed ways to visualize results from those systems in useful ways, along with custom user interfaces. It might have developed a progress bar to show you that your search was taking place and that the system hadn’t failed, back when searches took more than milliseconds.
If Google got its start before “www” had a place in front of its name in a browser address bar, it might have developed very similar technology to what it’s working on today, but with a slightly different approach, one that can be sensed when reading through a number of Web-based patents from a company like Xerox.
Google was assigned 94 patents (90 granted and 4 pending) from Xerox, as indicated by an assignment recorded at the United States Patent Office last week, on February 16th, 2012. The execution date of the assignment is November 10, 2011. The USPTO assignment database doesn’t include any details of the transaction, such as financial terms.
Google has been busy over the past couple of years acquiring a good number of small startups, including some that may have helped contribute features to Google Plus, such as Fridge, the Tweet-counting SocialGrapple, the people-sorting Katango, the team behind JustSpotted, the social-ranking PostRank, and the social movie recommendation service fflick.
Google hasn’t publicly announced every acquisition that it has made, and the search engine has also purchased intellectual property, such as pending and granted patents, from some companies without necessarily buying the companies behind the patents. For example, in August of 2010, Google was assigned a handful of patent filings from Appmail, LLC, recorded at the USPTO in May of 2011. A pending and a granted patent from that group appear to be related to Grouptivity, a social service run by Appmail that used a social mail service to enable people to share content they found on the Web with others, either privately or publicly. That service allowed for the creation of groups to “keep your personal contacts separate from co-workers and other categories.” As a publisher-centric web service, Grouptivity was described as a service that:
According to the United States Patent and Trademark Office (USPTO) assignment database, Google has acquired the pending patent applications of one-time search rival Cuil, touted at its launch as a potential “Google killer.”
On July 28, 2008, the search engine Cuil went live with hopes from many that it would rival Google in technological know-how and create some competition for the search engine. Those hopes were fueled in part by the fact that the search engine was started by former Google employees Anna Patterson and Russell Power, along with co-founder Tom Costello, formerly of IBM; they were later joined by AltaVista creator and ex-Googler Louis Monier as well. The company received a fair amount of funding before it launched, likely in part due to the past employment history of its founders.
When Cuil launched, it supposedly had within its index more than three times as many Web pages as Google, and ten times as many as Microsoft. It promised not to retain information about searchers’ past search histories or surfing patterns as a way of distinguishing itself from Google. Bloomberg News called it one of the most successful startups of 2008, and there were high hopes that the new search engine would rival Google.
Things seemed to start going south for Cuil shortly after launch. After a month, Louis Monier left the company following disagreements with CEO Tom Costello. Search results were presented in a two-column format rather than a single column, and were accompanied by thumbnail images. I noticed a few complaints at the time about the two-column format, and in my personal experience using the site, the thumbnails presented often weren’t good choices, and weren’t representative of the pages or topics being returned in search results. The Cuil website shut down in September of 2010, with news surfacing a week later that a mysterious acquisition had fallen through.
At Hewlett-Packard’s global partner summit in Las Vegas yesterday, President and CEO Meg Whitman gave a keynote presentation on the state of the company and made a prediction about Google’s Android operating system:
“We decided to contribute WebOS to the open source community and this will take three to four years to play out,” said Whitman. “I think there is room for another operating system. iOS is great but it is a closed system. I think that Android may end up as a closed system because of [Google’s] relationship with Motorola.”
Interesting timing on a statement like that, as I noticed the appearance yesterday afternoon of the assignment of 97 patents from HP to Google in the USPTO’s patent assignment database. Then again, the assignment is listed as having been executed on 10/25/2011, and it wasn’t recorded until 02/06/2012. The patents cover a wide range of technologies, including at least one search-related patent on Dynamic query expansion, which Hewlett-Packard acquired at some point from Digital Equipment Corporation’s search engine AltaVista, with search pioneer Louis Monier listed as a co-inventor.
There are also a couple of patents involving Java, as well as a number involving computer architecture and distributed networking, multi-threaded processing and operating systems, telecommunications and video, and software and hardware monitoring, among others. There’s also one on auxiliary propane fuel tanks for vehicles (driverless cars?), and another on papermaking.
Google was granted a patent yesterday on Blog Search, and on how the search engine might filter blog posts out of blog search results based upon a number of factors. The patent was originally filed in 2006, and it’s the first patent filing I’ve seen from Google that uses the term “splog.” The screenshot from the patent below shows some of that potential filtering process.
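As a rough illustration of how that kind of factor-based filtering might work, here’s a hedged sketch that combines a few per-post signals into a spam score and drops posts above a threshold. The specific signals, weights, and threshold below are my own placeholders for the purpose of illustration; they are not taken from the patent itself.

```python
# Hedged sketch of factor-based splog filtering: each invented signal
# contributes a weight to a spam score, and posts scoring at or above
# a threshold are filtered out. Signals, weights, and threshold are
# illustrative assumptions, not the patent's actual factors.

def splog_score(post: dict) -> float:
    score = 0.0
    # Many outbound links relative to the amount of content can
    # suggest a link-farm-style spam blog.
    if post["link_count"] > 0.05 * post["word_count"]:
        score += 0.4
    # Heavy repetition of the same few terms is another common trait
    # of machine-generated posts.
    words = post["text"].lower().split()
    if words and len(set(words)) / len(words) < 0.3:
        score += 0.4
    # Posting at machine-like frequency looks automated.
    if post["posts_per_day"] > 20:
        score += 0.2
    return score

def filter_posts(posts: list, threshold: float = 0.5) -> list:
    """Keep only posts whose spam score falls below the threshold."""
    return [p for p in posts if splog_score(p) < threshold]
```

A real system would likely learn such weights from labeled examples rather than hand-tuning them, but the structure (several weak signals combined into one filtering decision) matches the general idea the patent describes.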
According to Google’s Director of Research, Peter Norvig, if you look at Google Trends for terms like “full moon” or “ice cream”, you’ll see that Google searches for those terms mirror actual physical trends in the world. With a very large number of queries performed for those terms, searches for “full moon” peak every 28 days, and searches for “ice cream” peak every summer, 365 days apart. Large amounts of data make interesting things possible.
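Norvig’s observation can be illustrated with a toy example: simulate a daily query-volume series with a built-in 28-day cycle, then recover that period with a simple autocorrelation scan. The data below is synthetic, not real Google Trends data, and the whole sketch is only a demonstration of the underlying idea.

```python
# Toy illustration of periodicity in query volume: build a synthetic
# daily series with a 28-day "full moon" cycle, then find the lag with
# the highest autocorrelation. The data is simulated, not from Trends.
import math

def best_period(series, min_lag=2, max_lag=60):
    """Return the lag (in days) with the highest autocorrelation."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)

    def autocorr(lag):
        return sum((series[i] - mean) * (series[i + lag] - mean)
                   for i in range(n - lag)) / var

    return max(range(min_lag, max_lag + 1), key=autocorr)

# Simulated "full moon" query volume over a year: a 28-day cosine cycle.
volume = [100 + 50 * math.cos(2 * math.pi * day / 28) for day in range(365)]
print(best_period(volume))  # 28
```

With enough real queries, the same kind of analysis surfaces the periodic patterns Norvig points to; the point of the paper is that the sheer volume of data makes even simple methods like this effective.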
If you’re interested in how search engines work, and how large amounts of data can help them do what they do more effectively, I highly recommend reading the paper The Unreasonable Effectiveness of Data (pdf), written by Alon Halevy, Peter Norvig, and Fernando Pereira of Google. Even more highly recommended is Peter Norvig’s presentation of the same name from a Distinguished Lecture Series at the University of British Columbia last fall, which sadly has fewer than 1,000 views on YouTube at present: