Are Google’s query-based social circles the answer to Facebook’s Graph Search?
Not too long ago, Facebook launched its Graph Search, which enables people to search for things like “My Friends who live in San Francisco,” “My Friends who like Surfing,” and “Places my Friends like.”
Imagine if Google Plus allowed you to perform searches such as “People who take the same bus as me into the city,” “People who like to eat at the Red Truck Bakery,” or “People attending the Dave Matthews Band concert next Friday,” and created in response a social network circle that other people might be invited to join, even temporarily, or could join anonymously. Or Google Plus might dynamically create such a query-based social circle and recommend that you share with it as you create a post about a music festival you’re going to, or a meal you’re reviewing from a local hotel.
The image above from the patent filing shows a query-based circle for a “Music Festival” and a query-based circle for a “Grand Hotel,” as well as a button to only display query-based circles in the interface.
Google’s patents have provided a great number of hints over the past 10 years about local search and how Google treats businesses and landmarks in Maps and Web results and elsewhere. I’ve been fortunate enough to have uncovered some of these patents and written about many of the algorithms and approaches that Google has used, including concepts like location prominence, location sensitivity, Maps in Universal Search, Google’s Crowdsensus Algorithm, and more.
I am going to be the keynote speaker at Local U Advanced in Baltimore, running from Friday night, March 8 at 7:00 pm through Saturday, March 9 at 5:00 pm (there’s an early bird discount of $100 if you sign up before Feb. 8th). This Local University presentation will take place in Hunt Valley, MD. There’s an amazing group of speakers lined up for the event, covering the local, mobile, and social aspects of local search.
When we talk about indexing and crawling content on the Web, it’s usually within the context of pages being ranked on the basis of a number of signals found on Web pages that might be ranked in response to queries. Google has told us that the future of search involves Knowledge Bases, and the indexing of Things, Not Strings. Gianluca Fiorelli explored Google’s ideas of Search in the Knowledge Graph Era earlier this week.
A few years back, I wrote some posts about some Google Patents that explored how Google might be extracting and visualizing facts, and using Data Janitors to process that information and clean it up and sort it. Google was granted another patent this week that’s very much related, looking at how Google might understand locations for places collected from Web pages. One of the inventors, Andrew Hogue, gave this Google Tech Talk presentation last year:
Imagine the Earth broken down into a series of cells, each of those cells broken down into a series of even smaller cells, then into smaller cells again, and so on, in a spatial index. Each level becomes increasingly narrow, representing increasingly precise areas, or zoom levels, of the surface of the Earth.
As these cells decrease in size, they increase in number, which has the effect of increasing the zoom level and the accuracy of the areas represented in such an index. It might even work well in a place like China, where latitude and longitude data are banned for export as munitions. Such a set of cells might be part of a geospatial analyzing module that links specific businesses and points of interest (parks, public regions, landmarks, etc.) to specific places on this model or index of the Earth. That might be one index for businesses and one index for points of interest, or a combined database that includes both.
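As a rough illustration, this kind of hierarchical cell index can be sketched in a few lines of Python. This is a hypothetical toy (a simple quadtree over latitude and longitude), not the scheme from the patent; the point is that nearby places share a long cell-id prefix, so sharing a cell at some level is a strong “nearby” signal:

```python
# Toy hierarchical spatial index: the Earth is split into 4 cells,
# each cell into 4 smaller cells, and so on (quadtree-style).
# Hypothetical sketch only; real systems (e.g. S2-style cells) differ.

def cell_id(lat, lng, level):
    """Return a string cell id; each extra level quadruples precision."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    digits = []
    for _ in range(level):
        lat_mid = (lat_lo + lat_hi) / 2
        lng_mid = (lng_lo + lng_hi) / 2
        quad = 0
        if lat >= lat_mid:       # upper half of the current cell
            lat_lo = lat_mid
            quad += 2
        else:
            lat_hi = lat_mid
        if lng >= lng_mid:       # right half of the current cell
            lng_lo = lng_mid
            quad += 1
        else:
            lng_hi = lng_mid
        digits.append(str(quad))
    return "".join(digits)

# Two nearby Manhattan points share a long prefix; a distant one does not.
empire_state = cell_id(40.7484, -73.9857, 12)
bryant_park = cell_id(40.7536, -73.9832, 12)
print(empire_state[:8] == bryant_park[:8])
```

Each added digit narrows the cell by half in each direction, which is exactly the “smaller cells, more precision” idea described above.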
Sometimes that index might include a business and a landmark within the same cell. While that could be correct in some instances, such as a shop located within the Empire State Building, it’s often an error, and sometimes even an intentional one. People will sometimes enter incorrect information into a geographic system like this to try to gain some kind of advantage.
If people search for something like a motel “near” a particular park, for instance, a motel that appears to be next to, or even within the boundaries of, that park might seem to have something of an advantage in terms of distance from that park when it comes to ranking. And sometimes Google doesn’t seem to do the best job in the world at putting businesses in the right locations on Google Maps.
When Google ranks businesses at locations in Google Maps, they turn to a number of sources to find mentions of the name of the business coupled with some location data. They can look at the information that a site owner might have provided when verifying their business with Google, Bing, and Yahoo. They may look at sources that include business location information, such as telecom directories like superpages.com or yellowpages.com, or business location databases such as Localeze. They likely also look at the website for the business itself, as well as other websites that might include the name of the business and some location data for it.
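One simple way to reconcile those sources, purely as an illustration (the source list, addresses, and normalization here are invented, not drawn from any patent), is a majority vote across lightly normalized addresses:

```python
# Hypothetical sketch: reconcile a business's address across several
# sources (owner-verified listing, directories, the business's own site)
# by majority vote, flagging disagreement for review.
from collections import Counter

def reconcile(addresses):
    """Return the address most sources agree on, plus the agreement rate."""
    normalized = [a.lower().strip().rstrip(".") for a in addresses]
    best, votes = Counter(normalized).most_common(1)[0]
    return best, votes / len(normalized)

sources = [
    "123 Main St, Havre de Grace, MD",   # owner-verified listing
    "123 Main St, Havre de Grace, MD",   # telecom directory
    "123 Main st, Havre de Grace, MD",   # another directory (case differs)
    "1 Park Ave, New York, NY",          # stale or spammy entry
]
address, agreement = reconcile(sources)
print(address, agreement)  # 3 of 4 sources agree
```

A low agreement rate would be one hint that some source is stale, mistaken, or spammy.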
What happens when the information from those sources doesn’t match? Even worse, what happens when one of these sources includes information that might be on the spammy side? A patent granted to Google this week describes an approach Google might use to police such places. The patent warns against titles for business entities that include terms such as “cheap hotels,” “discounts,” or “Dr. ABC - 555 777 8888.” It also might identify spam in categories for businesses that include things such as “City X,” “sale,” “City A B C D,” “Hotel X in City Y,” and “Luxury Hotel in City Y.”
In the context of a business entity, information that skews the identity of or does not accurately represent the business entity or both is considered spam.
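The kind of title policing the patent describes could be approximated with a simple pattern check. This sketch is hypothetical: the term list and phone-number pattern are my own illustrations, not Google’s actual rules:

```python
import re

# Hypothetical spam check for business titles/categories, modeled on the
# examples the patent warns against (promotional terms, embedded phone numbers).
SPAM_TERMS = {"cheap hotels", "discounts", "sale", "luxury hotel"}
PHONE = re.compile(r"\b\d{3}[\s.-]?\d{3}[\s.-]?\d{4}\b")

def looks_spammy(title):
    """Flag a title that embeds a phone number or a promotional term."""
    lowered = title.lower()
    if PHONE.search(title):
        return True
    return any(term in lowered for term in SPAM_TERMS)

print(looks_spammy("Dr. ABC - 555 777 8888"))  # phone number in a title
print(looks_spammy("Luxury Hotel in City Y"))  # promotional category
print(looks_spammy("Red Truck Bakery"))        # a plain business name
```

A real classifier would of course use many more signals than a fixed term list, but the idea of matching known spammy patterns in names and categories is the same.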
Google’s local search may be getting smarter one streetview scene at a time. A few years back, I jokingly made a robots.txt sign for my front door that had the following statement in it:
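Judging from the next paragraph, the sign presumably carried the standard disallow-all directives that a real robots.txt file would use:

```
User-agent: *
Disallow: /
```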
In the root-level directory of a website, a robots.txt file containing those two lines would tell Google’s crawler not to crawl, and thus not index, any pages from the site. On the front of a home in my small town, it might have gotten some odd looks, but that’s about it. I had expected that at some point Google would send a Street View car or two down my street, and I would have been able to write a blog post with a Street View image of the front of my house, with a title along the lines of “Google Ignores Robots.txt File: Indexes My House.” I ended up not leaving the sign up, but I’m second-guessing that now that I know Street View cars can read.
That really shouldn’t have been a surprise back then. I wrote a post in 2007 titled Better Business Location Search using OCR with Street Views, which described how Google might use OCR to gather information from signs captured in Street View footage. The patent filing I wrote about didn’t really discuss how that information might be used, but it presented the possibility of its use. I suspect my real-life robots.txt file would have been ignored back then, though the drivers of those cars had learned by that point that signs like “Private Street” and “Military Base” marked areas they couldn’t film.
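When sign text gathered by OCR is compared against listings already in the index, a fuzzy string match is one crude way to picture the scoring. This is my own sketch, not the method from either patent, and the business names are invented:

```python
import difflib

# Hypothetical sketch: score how well text OCR'd from a street-level sign
# agrees with a listing in the index. A low score might flag the listing
# (or the OCR result) for further checking.
def calibration_score(ocr_text, listing_name):
    """Fuzzy similarity in [0, 1] between sign text and listing name."""
    return difflib.SequenceMatcher(
        None, ocr_text.lower(), listing_name.lower()
    ).ratio()

print(calibration_score("RED TRUCK BAKERY", "Red Truck Bakery"))  # 1.0
print(calibration_score("RED TRUCK BAKRY", "Red Truck Bakery"))   # high: minor OCR error
print(calibration_score("GRAND HOTEL", "Red Truck Bakery"))       # low: different place
```

A real pipeline would also weigh the geographic match between where the sign was photographed and where the listing claims to be.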
Google was granted a patent last week that gives us a look at how information from street level signs might be collected and indexed by Google, and compared to online information about the same locations to try to “calibrate” and “score” any information about the places being listed in Google’s index. Here’s an image from the patent that shows at a glance the kinds of information it might attempt to read:
A pending Google patent published this past week describes how the locations of entities included in queries might be identified from information found in the search engine’s query logs, based upon click histories and other information. Query log information may also be used to associate locations with websites and web pages.
Are the Empire State Building and the Golden Gate Bridge places, or are they things? A search for just [washington monument] or [eiffel tower] doesn’t actually specify a physical address. Search for the [Statue of Liberty] and chances are that you want the one in New York Harbor, but if your search was conducted in Paris, France, you might have wanted to see one of the ones in Paris (yes, there’s more than one). There are a number of replicas of the statue worldwide.
A search for [concord point lighthouse hotels] returns a number of pages that correctly point out that the lighthouse is in Havre de Grace, Maryland, even though my query doesn’t mention the actual location. Is the search engine just finding the most relevant results for those keywords, or is it identifying the location of the lighthouse, and then trying to find websites that are the best match for both the query term and the location?
When you perform some searches, Google might include Maps results within the web search results for those queries, or it might include some local results that change when you change your location in Google. Those queries are ones that don’t include geographic information within them, yet Google somehow decides that there’s some geographic relevance to the terms being searched for.
Some query terms likely have no geographic relevance to them, such as a query like [linux], which pretty much has a meaning unrelated to any specific location. Other queries may evidence an intent to find a location near a searcher, such as [restaurant]. A patent granted to Google this past week describes an approach that Google may be using to assign an implicit local relevance to a query term or phrase when that query doesn’t contain any explicit references to a location.
A friend asked a few months ago why Google might decide that a particular phrase has geographical relevance in his region, but not show localized or Google Maps search results in other locations. My answer was that Google likely had developed a statistical geographical model that triggers localized results based upon a combination of the query used and the location of the person searching. I’ve written a few posts in the past about a Yahoo! paper on geographic intentions, as well as a Yahoo! patent covering similar territory.
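Such a statistical model might, at its simplest, look at what fraction of historical clicks for a query went to location-specific results. The sketch below uses made-up query-log counts and a made-up threshold, purely to illustrate the idea of implicit local intent:

```python
# Hypothetical sketch: estimate implicit local intent from query logs.
# The counts are invented; a real model would use far richer signals.
CLICK_LOG = {
    # query: (clicks on localized results, total clicks)
    "restaurant": (8200, 10000),
    "linux": (40, 10000),
    "concord point lighthouse hotels": (950, 1200),
}

def local_intent(query, threshold=0.5):
    """Return (local-click ratio, whether to trigger localized results)."""
    local_clicks, total = CLICK_LOG.get(query, (0, 0))
    score = local_clicks / total if total else 0.0
    return score, score >= threshold

for q in CLICK_LOG:
    print(q, local_intent(q))
```

Under this toy model, [restaurant] would trigger localized results while [linux] would not, matching the intuition in the paragraphs above. Making the click counts region-specific would also explain why a phrase triggers Maps results in one region but not another.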