A good number of recent posts at SEO by the Sea about the use of geography and location in searches have involved specialized local searches from the major search engines. Chances are good that more people use the main index and results from those search engines than the local searches to find information involving places (see Danny Sullivan’s article: Searching with Invisible Tabs).
There has been a significant amount of research about searches for information related to specific geographical areas, though, and some of the work worth looking at carefully is coming out of Microsoft Research. One of those papers is Detecting Dominant Locations from Search Queries, which focuses upon the “location intent” of a search and attempts to understand the “dominant location” of a query, based upon an agreement between “a majority of people who know the answer for that query.”
The processes described in that paper have been filed with the US Patent and Trademark Office, and were officially published this past week:
Search query dominant location detection
Invented by Chuang Wang, Joshua Forman, Lee Wang, Xing Xie, Ying Li
Assigned to Microsoft
US Patent Application 20060271518
Published November 30, 2006
Filed: May 27, 2005
A system and method for location-specific searching. The invention correctly identifies explicit and implicit locations in a search query and provides an appropriate dominant location. Top search results are obtained and analyzed to determine which terms in the query often appear in combination, and the query is tokenized based on the analysis. An explicit location indicating a location intent is most likely treated as an individual token, and the explicit location is treated as the dominant location of the query. In the case of a false positive, wherein the explicit location in a query is not the location intent, the explicit location is likely to be present with other terms that provide context. A token will likely include these terms together. The explicit location will therefore not be used to generate location-specific results in the case of a false positive.
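To make the abstract a little more concrete, here is a minimal sketch in Python of the idea as I read it: adjacent query terms that often appear together in the top results get merged into one token, and a location only counts if it survives as a token on its own. The function names, the 0.5 threshold, the sample snippets, and the tiny set of known locations are all my own assumptions for illustration; the patent application doesn't specify them.

```python
import re
from typing import List, Optional, Set


def _words(text: str) -> List[str]:
    """Lowercase a string and split it into words."""
    return re.findall(r"\w+", text.lower())


def _contains_phrase(words: List[str], phrase: List[str]) -> bool:
    """True if the words of `phrase` appear consecutively in `words`."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))


def tokenize_query(query: str, snippets: List[str], threshold: float = 0.5) -> List[str]:
    """Merge adjacent query terms that co-occur in enough result snippets.

    `snippets` stands in for text from the top search results, and
    `threshold` is an assumed cutoff, not a value from the patent.
    """
    docs = [_words(s) for s in snippets]
    tokens: List[List[str]] = []
    for term in _words(query):
        if tokens and docs:
            candidate = tokens[-1] + [term]
            hits = sum(_contains_phrase(d, candidate) for d in docs)
            if hits / len(docs) >= threshold:
                tokens[-1] = candidate  # terms usually appear together
                continue
        tokens.append([term])
    return [" ".join(t) for t in tokens]


def standalone_location(tokens: List[str], known_locations: Set[str]) -> Optional[str]:
    """Return the first token that is, by itself, a known location name."""
    for token in tokens:
        if token in known_locations:
            return token
    return None


# Hypothetical data for illustration only.
LOCATIONS = {"indiana", "seattle"}
jones_snippets = [
    "Indiana Jones and the Last Crusade",
    "Harrison Ford returns as Indiana Jones",
    "Indiana Jones film series",
]
seattle_snippets = [
    "Best places to eat in Seattle",
    "A restaurant guide for downtown Seattle",
    "Restaurant reviews: Seattle and Tacoma",
]

jones_tokens = tokenize_query("Indiana Jones", jones_snippets)
seattle_tokens = tokenize_query("Seattle restaurant", seattle_snippets)
```

With these made-up snippets, “Indiana Jones” stays together as one token, so “Indiana” never stands alone and isn’t treated as a location, while “Seattle” is left as its own token in “Seattle restaurant.”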
Explicit and Implicit Locations in Queries
The patent defines some terms that are helpful when reading the paper and patent from Microsoft:
Location intent – an indication in a search that the searcher is looking for something related to a geographic area. For example, a search for “Seattle Restaurant” may indicate to a search engine that the searcher is looking for web pages for restaurants based in Seattle. This can get confusing for a search engine, though. For example, a search for “Kentucky Fried Chicken” isn’t necessarily a search for places in Kentucky that serve fried chicken. The wording of a query and the type of information sought may create the possibility of false negatives and false positives.
There are at least two different types of search queries that can show location intent – ones which express explicit locations, and others that indicate implicit locations.
Explicit location – a geographical name is present in the query. So, the term “Seattle” in the query “Seattle Restaurant” is seen as an explicit location. But, as we’ve seen from the Kentucky Fried Chicken example, an explicit location in a query may not be the actual location intent of the query.
The patent application uses another example of a query that uses a geographical location that doesn’t express a location intent:
“Indiana” is the explicit location of the query “Indiana Jones” but it is not the location intent.
Implicit location – A query with an implicit location doesn’t contain a location name within the query but is associated with a location intent. So, a query of “restaurant around Space Needle” is an implicit query because it names a landmark rather than a geographic location. The implication is that the searcher is looking for restaurants in downtown Seattle.
If the search engine fails to recognize that there is a “location intent” in that implicit query and returns results that ignore it, the patent application tells us that it is an example of a false negative.
So a false positive in geographic-based searches is when a search engine sees a term in a context that isn’t geographically related and returns location-based results. A false negative is when a location is implied in a query, and the results don’t include location-based results.
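Those two definitions boil down to a simple comparison between what the searcher wanted and what the search engine did. A small Python sketch of that comparison follows; the function name and its two boolean inputs are my own shorthand, not terms from the patent application.

```python
def classify_outcome(has_location_intent: bool, returned_location_results: bool) -> str:
    """Label a search outcome using the definitions above.

    Both arguments are assumed to be known in advance; in practice,
    working out the searcher's true intent is the hard part.
    """
    if returned_location_results and not has_location_intent:
        return "false positive"   # e.g. "Indiana Jones" answered with Indiana pages
    if has_location_intent and not returned_location_results:
        return "false negative"   # e.g. "restaurant around Space Needle" ignored
    return "correct"
```

So a query like “Indiana Jones” that triggers Indiana-based results would come back as a false positive, and “restaurant around Space Needle” answered with plain keyword matches would come back as a false negative.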
IP-Based Results vs. Dominant Location Results
One method that could be used to provide searchers with location-specific results relies upon performing a reverse IP lookup of the user searching, and basing results upon their physical location. That’s not helpful when someone is looking for information about a distant location. The patent application instead looks to a “dominant location” for a query:
A dominant location is, for example, a prominent location that is agreed upon by a majority of people who know the answer to the query.
If a query has a dominant location, it may be used as the location intent for that query.
However, detecting a dominant location is difficult because it is a subjective and collective measure: it is the location existing in the collective human knowledge.
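The “majority of people who know the answer” idea can be pictured as a simple vote. The Python sketch below assumes we somehow had a list of answers from informed people, which is a hypothetical stand-in; the patent application works from search results rather than polling anyone, and the function name and majority cutoff here are mine.

```python
from collections import Counter
from typing import Iterable, Optional


def dominant_location(answers: Iterable[Optional[str]], majority: float = 0.5) -> Optional[str]:
    """Return the location more than `majority` of respondents agree on.

    `answers` holds one location name per respondent, or None when a
    respondent sees no location in the query. Returns None when no
    single location wins a majority.
    """
    answers = list(answers)
    if not answers:
        return None
    counts = Counter(a for a in answers if a is not None)
    if not counts:
        return None
    location, votes = counts.most_common(1)[0]
    return location if votes / len(answers) > majority else None


# Hypothetical responses for illustration only.
space_needle = dominant_location(["Seattle", "Seattle", "Seattle", "Seattle", "Tacoma"])
indiana_jones = dominant_location(["Indiana", None, None, None])
```

In this toy example, four out of five respondents tie the Space Needle query to Seattle, so Seattle is the dominant location, while the Indiana Jones query has no dominant location at all.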
I haven’t gone into detail on how a query is broken down into tokens, as described in the patent application, to try to locate information about the different terms in a query. The paper from Microsoft on this topic explains the process in an easier-to-understand manner than the patent application.
The process described in this patent application looks to see whether a query contains an explicit location or an implicit location, whether there is a location intent behind it, and, if there is, whether a dominant location is associated with that location intent. This process is intended to avoid false positives and false negatives appearing in search results.
A closely related patent is cited in the patent application, and I’ve discussed it (and an associated paper) in a post from September: Location Still Matters on the Web: Types of Location Information. It looks at the actual geographic location of the owner of a web resource, the location that the content of a site may be about, and the geographic scope of the audience that the site aims to reach.
Some Other Documents About Geography in Search Results
The first three papers I’ve listed are Microsoft-related documents. I also found a paper about efforts from Russian researchers looking at sites in the Yandex search engine.
Detecting Geographical Serving Area of Web Resources (pdf)
Indexing implicit locations for geographical information retrieval (pdf)
Web Resource Geographic Location Classification
4 thoughts on “How Search Engines May Look at Queries Which Include Locations”
Just another confirmation, Bill, if such were needed of the importance of Local Search to the majors. Live seems to be the weakest of the four (now including Ask City), but concepts such as this may help them beef up what they offer.
I’ve been trying to test live.com to see if it looks like they are presently using this, and I’m not sure that they are. The results I get for a search of something like “restaurants around space needle” don’t seem to be returning anything more than keyword matchup results, without recognizing that I’m more interested in restaurants than I am in pages that include the phrase “space needle” in them.
Then again, Google, Yahoo, and Ask.com aren’t serving anything better in response to such a query. I do like the idea of addressing and understanding a location intent, and having a search engine being able to respond to that type of query in normal search results.
I don’t know how to identify the location around the Space Needle, to get meaningful responses to my query in local search. At this point, the Yahoo Trip Planner that I wrote about a couple of days ago might be the best way to find restaurants near the Space Needle. But even that doesn’t make it easy to find specific locations around a well-known landmark (getting to the Space Needle page meant finding it on someone else’s travel journal first).
geographic identifiers are a critical component of targeted services and targeted advertising. without geography information, web advertising loses much of its fundamental advantage over traditional media companies by losing the ability to target the consumer. geospatial data is already being incorporated into almost all searches in one form or another and the trend is accelerating exponentially. this is why MSFT and others are attempting to patent geographic processes. MSFT actually appears to be aggressively taking on the issue to provide a competitive advantage over google. MSFT, however, is late to the game. an inventor filed a patent in January 1996 for ‘geographic search’. US Patent No. 5,930,474 was granted in 1999. MSFT has cited 5,930,474 at least three times as prior art in their patents. seems MSFT will need to license (or buy) this patent to prevent infringement claims.