Google and Location Searches

Ever go to a search engine to find out more about a specific place, such as a street or park or business? Want to see what the area around a historic monument is like? These types of searches are often referred to as location searches because the intent behind them is to find information about a specific location.

You can perform a location search in map-oriented search engines such as Google Maps, Yahoo Local, or Bing Maps, but search engines may also show map-type results within their Web search results. Before Universal Search was part of Google, maps had started showing up in Google’s web search results. If you searched for a business name or category with some geographic information included in your query, you may have been shown a map in your web search results alongside a listing relevant to your search.

A Google patent granted this week explores some of the challenges that a search engine may face when performing a location search. The way that search engines respond to those challenges shows off some of the technical abilities of search engines, and the methods that they use.

For instance, in location searches, some of the issues that search engines may have to resolve can include:

Continue reading “Google and Location Searches”

How Search Engines May Use Geography and Population Info in Deciding to Show News in Web Searches

How does a search engine choose whether to show news items in web search results and when not to?

If you live in Bealton, Virginia, chances are that you may not be too interested in news of a car crash in Brooklyn, New York, when searching for information about Brooklyn. If you’re from Brooklyn, and want to find vacation information about the parks in Wisconsin, you may not be very concerned about the latest winning numbers in the Wisconsin lottery. Yet, someone searching for information about one of the states bordering the Gulf of Mexico these days might well want to see news about the oil spill in the region.

A Yahoo patent filing published recently describes how Yahoo might use a prediction system based upon the search engine’s query logs to decide whether or not to show news results. To make that determination, the prediction system uses a mixture of geographic information related to queries and to searchers, as well as information about how “newsworthy” a location might be. The patent tells us that Yahoo might create similar prediction models to determine whether or not to show other types of results as well. The patent application is:

System and Method of Geo-Based Prediction in Search Result Selection
Invented by Rosie Jones, Fernando Diaz, and Ahmed Hassan Awadallah
US Patent Application 20100161591
Published June 24, 2010
Filed December 22, 2008
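
To make the idea of mixing geographic signals with a “newsworthiness” signal concrete, here is a minimal sketch in Python. It is not the method from the patent filing: the feature names, the weights, the bias, and the threshold are all hypothetical values chosen for illustration, where a real system would presumably learn them from query logs.

```python
import math
from dataclasses import dataclass


@dataclass
class QueryContext:
    # All three fields are hypothetical features, assumed for illustration.
    distance_km: float        # distance between the searcher and the place named in the query
    location_population: int  # population of the place named in the query
    newsworthiness: float     # score in [0, 1] for how "newsworthy" the place currently is


def news_probability(ctx, weights=(-0.002, 0.3, 3.0), bias=-3.0):
    """Toy logistic model: estimated probability that news results are wanted.

    The weights and bias are made-up numbers, not values from any real system.
    """
    w_distance, w_population, w_news = weights
    z = (
        bias
        + w_distance * ctx.distance_km
        + w_population * math.log10(max(ctx.location_population, 1))
        + w_news * ctx.newsworthiness
    )
    return 1.0 / (1.0 + math.exp(-z))


def should_show_news(ctx, threshold=0.5):
    """Show news results only when the estimated probability clears the threshold."""
    return news_probability(ctx) >= threshold
```

With these made-up weights, a searcher close to a large city that is currently in the news would be shown news results, while a distant searcher asking about the same city on a quiet news day would not.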

Continue reading “How Search Engines May Use Geography and Population Info in Deciding to Show News in Web Searches”

Teaching Computers to Read Newspapers: How a Search Engine Might Use OCR to Index Complex Printed Pages

Optical Character Recognition, or OCR, is a technology that enables a computer to look at pictures that include text, and translate those visual representations of text into actual text. If you have words within images on your web pages, there’s a good chance that search engines are ignoring those words when it comes to indexing your pages.

But that might change sometime in the future.

While OCR has been around for a while, search engines haven’t been using the technology when crawling and indexing the content of Web pages. Google’s webmaster guidelines tell us:

Try to use text instead of images to display important names, content, or links. The Google crawler doesn’t recognize text contained in images. If you must use images for textual content, consider using the “ALT” attribute to include a few words of descriptive text.

Yahoo’s page, How to Improve the Position of Your Website in Yahoo! Search Results, provides the following tip:

Continue reading “Teaching Computers to Read Newspapers: How a Search Engine Might Use OCR to Index Complex Printed Pages”

How Demand Media May Target Keywords for Profitability

Earlier this month I wrote about a granted Google patent, and a continuation of that patent filed earlier this year, that describe How Google Might Suggest Topics for You to Write About, by providing information to web publishers on queries and topics that are either under-represented in search results or where there’s more demand for information about those topics or queries than there are search results to meet that demand.

The topic struck home with a number of people, especially journalists, and I had a chance to have a conversation with Financial Times reporter Kenneth Li about Google’s patents. The Financial Times ran with two different stories on the topic (Google shadow over new media groups, and Google eyes Demand Media’s way with words), focusing primarily on how the technology involved in the patents could bring Google into competition with companies such as Demand Media, Associated Content, and AOL.

While searching through patent filings this morning, I came across an interesting newly published patent application from Demand Media. In the article on Demand Media, we’re told that:

Continue reading “How Demand Media May Target Keywords for Profitability”

Google as an Internet Archive?

Interested in what people were saying the day after Barack Obama was elected president in 2008? Or how people reacted on the Web to the Chicago White Sox winning the World Series in 2005? Or the early news on the Gulf oil spill on April 20, 2010?

When you search at Google, you can click on “more search tools” in the left column, and enter a “from” and “to” date in the custom range section. If you want to see what pages were showing up on Google on a search for Barack Obama on the day after the election, you can enter 11/4/2008 in the from and to fields. To see what pages were ranking on Google on the day after the White Sox series ended, enter 10/28/2005 into the date range text boxes.

A custom date range search at Google for Barack Obama on November 4, 2008.
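
For anyone who would rather build a custom date range search as a link than fill in the form, here is a minimal Python sketch. It assumes that Google’s search URL accepts a `tbs` parameter of the form `cdr:1,cd_min:…,cd_max:…` with dates written as month/day/year. That format comes from observing result page URLs rather than from any documented API, so treat it as an assumption that may change. The function name `date_range_search_url` is hypothetical.

```python
from datetime import date
from urllib.parse import urlencode


def date_range_search_url(query, start, end):
    """Build a Google search URL restricted to a custom date range.

    Assumption: the undocumented `tbs=cdr:1,cd_min:...,cd_max:...` parameter
    carries the "from" and "to" dates in M/D/YYYY form.
    """
    if start > end:
        raise ValueError("start date must not be after end date")

    def fmt(d):
        # M/D/YYYY without zero padding, matching the dates typed into the form
        return f"{d.month}/{d.day}/{d.year}"

    params = {
        "q": query,
        "tbs": f"cdr:1,cd_min:{fmt(start)},cd_max:{fmt(end)}",
    }
    return "https://www.google.com/search?" + urlencode(params)


# Same "from" and "to" date as the Barack Obama example in the text
url = date_range_search_url("Barack Obama", date(2008, 11, 4), date(2008, 11, 4))
```

Passing the same date for both arguments mirrors entering one date in both the from and to fields.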

If you click on any of the results that appear, you see the pages listed in the results as they appear today. If you click on the Google cache links for those entries, you see the most recent cached versions of those pages. But what if you could see a copy of a page as it appeared within the selected date range? What if Google decided that it would create an archive of the Web, where it showed older copies of web pages, and used the custom date range to help you find those pages?

Continue reading “Google as an Internet Archive?”