According to Google’s Director of Research, Peter Norvig, if you look at Google Trends data for “full moon” or “ice cream”, you’ll see that Google searches for those terms mirror actual physical cycles in the world. With a very large number of queries performed for those terms, searches for “full moon” peak every 28 days. Searches for “ice cream” peak every summer, 365 days apart. Large amounts of data make interesting things possible.
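To make the idea of a repeating search pattern concrete, here is a minimal sketch of how a dominant cycle length could be recovered from a series of daily query counts using autocorrelation. This is not how Google Trends works, and it uses no real data: the `dominant_period` helper, the synthetic `daily_counts` series, and the baseline and amplitude values are all assumptions made up for illustration, with the cycle length set to the 28-day figure quoted above.

```python
import math


def dominant_period(series, min_lag=7, max_lag=None):
    """Return the lag (in samples) with the highest autocorrelation.

    Hypothetical helper for illustration. Lags below min_lag are skipped
    because a smooth series always correlates strongly with itself when
    shifted by only a day or two.
    """
    n = len(series)
    if max_lag is None:
        max_lag = n // 2
    mean = sum(series) / n
    centered = [x - mean for x in series]

    def autocorr(lag):
        # Unnormalized autocorrelation: sum of products of the series
        # with a copy of itself shifted by `lag` samples.
        return sum(centered[t] * centered[t + lag] for t in range(n - lag))

    return max(range(min_lag, max_lag + 1), key=autocorr)


# Synthetic daily query counts (invented numbers, not real search data):
# a baseline of 1000 queries with a swing that repeats every 28 days.
PERIOD_DAYS = 28
daily_counts = [
    1000 + 300 * math.cos(2 * math.pi * day / PERIOD_DAYS)
    for day in range(13 * PERIOD_DAYS)
]

estimated = dominant_period(daily_counts, max_lag=60)
print(estimated)
```

Real query logs are noisy, so the peak would be less clean than in this synthetic series, but with a very large number of queries the cycle still stands out, which is the point being made about large amounts of data.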
If you’re interested in how search engines work, and how large amounts of data can help them do what they do more effectively, it’s highly recommended that you read the paper The Unreasonable Effectiveness of Data (pdf), written by Alon Halevy, Peter Norvig, and Fernando Pereira, from Google. Even more highly recommended is a presentation of the same name by Peter Norvig, from a Distinguished Lecture Series at the University of British Columbia last fall, which sadly has fewer than 1,000 views on YouTube at present:
In the early days of Google, when you performed a search, the results you received were just links to pages found on the Web, showing page titles, snippets, and URLs. Google started adding other types of searches to its Web search, such as:
While these launched as separate search repositories, they weren’t going to stay that way, and may never have been planned as solely standalone data repositories. In May of 2007, at a presentation called Searchology, Google announced Universal Search, which incorporated video, news, books, image and local results into Web search results. According to the Official Google Blog post, the roots of Universal Search can be traced back to 2001, with a lot of effort leading to its launch:
Over several years, with the help of more than 100 people, we’ve built the infrastructure, search algorithms, and presentation mechanisms to provide what we see as just the first step in the evolution toward universal search. Today, we’re making that first step available on google.com by launching the new architecture and using it to blend content from Images, Maps, Books, Video, and News into our web results.
A few days ago, I asked the question, Is Google Aiming at Building Faster Networks and Data Transmissions? Google had acquired some interesting patent applications that have the potential to increase the speed and quality of data transmissions. An even more recent intellectual property acquisition by Google points to a growing interest in networking technology.
Google is planning on bringing ultra high speed broadband access to Kansas City, with fiber optic cable connections to homes that Google promised will deliver 1 gigabit-per-second speeds, or a speed that’s “20,000 times faster than dial-up and more than 100 times faster than a typical broadband connection!” That’s pretty fast. According to the Official Google Blog post, Google may be in talks with other cities to bring them this kind of high speed internet access as well.
How much might one page on a website influence the rankings of other pages? When I joined an agency in 2005, our focus was on rankings for individual pages – optimizing their content for specific terms and phrases, and making sure that they had links from other pages, both onsite and off. I found myself unable to color just within those lines. It was impossible to ignore the impact of global issues on a website when trying to optimize individual pages for terms. Every page on a site has the ability to impact how other pages might be crawled, indexed, and displayed by search engines.
For example, if the home page of a site was accessible at multiple URLs, there was the very real risk that PageRank for that page could be split multiple ways, such as amongst:
Last July, a Google Blog post titled More Wood Behind Fewer Arrows announced the closing of Google Labs, where a number of experimental projects taking place at Google were available for the public to explore and try out. Many of those projects sprouted out of Google’s 20 percent time approach, where engineers are encouraged to spend one day a week, or 20 percent of their time, working on projects that aren’t necessarily part of their job description. Among the projects that started out as 20 percent time projects are Gmail, AdSense for content, Orkut, and Google Suggest. We’ve been told that the 20 percent initiative isn’t going away, but Google seems to be growing a little more secretive.
When Eric Schmidt stepped down as CEO of Google, and Larry Page took over that role, Co-Founder Sergey Brin’s position at the company was redefined as well, and we were told that he would be in charge of “special projects” at Google. A New York Times article published in November of last year told us about Google’s Lab of Wildest Dreams or a “top-secret lab in an undisclosed Bay Area location where robots run free,” referred to as Google X. This is the home of Google’s driverless cars. It’s a place where “shoot for the stars” type technology is being explored.
It might also now be the home to a project that has roots in a technology essential to the laying of the transatlantic cable back in the 1860s, developed by Oliver Heaviside.
Has an improvement in how Google understands the layout of pages, and understands and classifies different elements found on a page, had an impact on the titles and snippets that we see in search results? Google may classify queries to decide what to show for those page titles and snippets in search results, but it’s possible that they might also be classifying the contents of “original titles and snippets and URLs” when deciding to show different titles and expanded snippets. Might Google do that in combination with a classification of page elements (a portion of HTML containing some text) found on the pages in search results to try to determine the best representation of a search result in response to a query?
Google May Choose Titles and Snippets for Pages
When you search at Google, the search results displayed for web pages include titles, URLs, and snippets for the pages listed in the results. In those, the query terms you used, or sometimes synonyms for them, may be included in the title and snippet, and Google will highlight those. As a site owner, you should have unique and engaging titles and meta descriptions for each page you want indexed by search engines. Not only does that make it more likely that search engines will crawl, index, and display those pages, but if you use the keywords you’re optimizing those pages for within those titles and descriptions, Google may show your choice of title and meta description within search results.
Somewhere in an alternative universe, it’s possible that one of the most feared hitters in baseball might have instead been known as one of its greatest pitchers. Babe Ruth started out as a pitcher for the Boston Red Sox in 1914, and when approached about getting his bat into the lineup on a daily basis in 1918, his manager Ed Barrow responded that “I’d be the laughingstock of baseball if I took the best lefthander in the league and put him in the outfield.” A couple of years later, Ruth was sold to New York’s team for an unprecedented $125,000 where he proceeded to hit 54 home runs for the Yankees, and begin a pretty good career hitting a baseball instead of throwing it at people.
In 1920, anyone looking for information about the Babe probably wasn’t too interested in his pitching career. Likewise, when someone searches today for [world series champion], it’s likely that they are looking for fresh results. How does a search engine like Google determine when searchers might prefer fresh results, and when they might prefer older results?
A patent application was published today which describes the kind of intelligent automated assistant that we see in use on Apple’s iPhone 4S, known as Siri. But the patent isn’t necessarily limited to the iPhone application itself, and it describes how such a system could be used in a number of ways, including with mobile phones, PDAs, tablets, game consoles, embedded computer systems in cars, and possibly others. This assistant might provide information and services on a single client device or multiple devices, and possibly in combination with applications and information on servers as well.
It could also act as an active participant in messaging platforms such as email, instant messaging, discussion forums, group chat sessions, and customer support sessions.