When we talk about how web sites are related, it’s not unusual for us to talk about links between sites and pages. Google pays a lot of attention to such links, and they are at the heart of one of its best-known ranking signals – PageRank. PageRank is now more than 15 years old, predating the origin of Google itself in the BackRub search engine.
Google is exploring other signals that may be used to rank pages in search results, including social signals that may result in reputation scores for authors, relationships between words that might appear together on pages ranking for the same queries, and relationships between pages that show up in the same search results and in the same search sessions. A Google paper presented at an October 2013 natural language processing conference, Open-Domain Fine-Grained Class Extraction from Web Search Queries (pdf), provides some interesting hints at a possible Google of the future.
Google also seems to be very interested in building a knowledge base of concepts that better understands things like what different businesses or entities are ‘Known for’, or that defines entities better through ‘is a’ relationships. Sometimes pages for specific entities show up at the top of search results because they seem to be the pages that people are looking for when they include that entity within a query, like the first two results on a search for [Roald Dahl], as seen in the image below:
Continue reading Entity Associations with Websites and Related Entities
When specific people, places, and things show up in queries or in web pages, that can be a signal to search engines to do something special in the results that they show. How prepared are you to understand and anticipate how the search engines treat them? Do you have a strategy in place?
Named entities show up in a lot of queries – they may even be one of the kinds of things that people look for most online. A 2010 white paper from Microsoft, Building Taxonomy of Web Search Intents for Name Entity Queries (pdf), tells us how large a role “named entities” play in search:
According to an internal study of Microsoft, at least 20-30% of queries submitted to Bing search are simply name entities, and it is reported 71% of queries contain name entities.
Continue reading Do You Have a Named Entity Strategy for Marketing Your Web Site?
In the last installment of this series, we looked at how Google may be using phrase-based indexing to re-rank pages, relying on the fact that many phrases tend to co-occur with other phrases within the content of web pages. When we look at phrases, we also need to drill down to a special set of phrases describing named entities, or specific people, places, or things. In addition to trying to understand which phrases might tend to co-occur with those named entities, the search engines may look to other sources such as Wikipedia, Freebase from Metaweb, the Internet Movie Database (IMDB), and different map databases to attempt to understand when a phrase indicates an actual (or fictional) entity.
Google, Bing, and Yahoo all look for named entities on web pages and in search queries, and will use their recognition of named entities to do things like answer questions such as “where was Barack Obama born?”
Continue reading 10 Most Important SEO Patents: Part 6 – Named Entity Detection in Queries
When someone performs a search at a search engine, they tend to use only a handful of words or fewer to try to find information about a topic. That presents a search engine with the challenge of finding web pages and other results in response while also attempting to understand the intent behind that search.
If someone enters “new york pizza sunnyvale” (without the quotation marks) into a search box at Google or Yahoo or Bing, it’s not quite clear whether they are looking for: (1) pizza in New York, in a neighborhood or area referred to as Sunnyvale, (2) New York style pizza in a place called Sunnyvale, (3) a place called “New York Pizza,” in Sunnyvale, or (4) some other result.
One approach that could be followed to try to understand the intent behind a query like this is to break down the words in the query into entity types, and apply labels to those entities. With the “new york pizza sunnyvale” example, that could be done a few ways:
[new york pizza]/food [sunnyvale]/location
[new york pizza]/business [sunnyvale]/location
[new york]/location [pizza]/food [sunnyvale]/location
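The labeling above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any search engine actually does it: the `GAZETTEER` dictionary, the `tag_query` and `format_tagging` functions, and the entity types assigned to each phrase are all assumptions made up for this example. The sketch tries every way of splitting the query into phrases it knows about, and emits one interpretation per combination of labels.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical lookup table of known phrases and the entity types each
# could have. A real system would draw on much larger data sources.
GAZETTEER: Dict[str, Set[str]] = {
    "new york": {"location"},
    "new york pizza": {"food", "business"},
    "pizza": {"food"},
    "sunnyvale": {"location"},
}

Tagging = List[Tuple[str, str]]


def tag_query(words: List[str]) -> List[Tagging]:
    """Return every split of `words` into known phrases, one label per phrase."""
    if not words:
        return [[]]  # one way to tag an empty query: with nothing
    results: List[Tagging] = []
    for end in range(1, len(words) + 1):
        phrase = " ".join(words[:end])
        # Unknown phrases yield no labels, so that split is abandoned.
        for label in sorted(GAZETTEER.get(phrase, ())):
            for rest in tag_query(words[end:]):
                results.append([(phrase, label)] + rest)
    return results


def format_tagging(tagging: Tagging) -> str:
    """Render a tagging in the [phrase]/type notation used above."""
    return " ".join(f"[{phrase}]/{label}" for phrase, label in tagging)


interpretations = [
    format_tagging(t) for t in tag_query("new york pizza sunnyvale".split())
]
for line in interpretations:
    print(line)
```

With this made-up gazetteer, the sketch produces the same three interpretations listed above. Choosing among them, for instance by looking at which results searchers tend to click on, is a separate problem that this example does not attempt.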
Continue reading How a Search Engine Might Interpret Ambiguous Queries through Entity Tags
When I’m looking for something at a search engine, I will often start out with a particular query and then, depending upon the kinds of results I see, change the query terms I use. It appears that Google has been paying attention to this kind of search behavior from people who search like me. A patent granted to Google earlier this month describes watching the queries performed by a searcher during a search session. It may give more weight to the words and phrases used earlier in a session like that, and less weight to terms that are added on as the session continues.
This patent seems like part of an evolution of algorithms from Google that has brought us to their Hummingbird update.
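To make the idea of weighting earlier session terms more heavily a little more concrete, here is a minimal sketch in Python. It is not the method claimed in the patent: the `session_term_weights` function, the `decay` parameter, and the halving of weight for each later query are all assumptions chosen only to illustrate the general behavior.

```python
from typing import Dict, List


def session_term_weights(queries: List[str], decay: float = 0.5) -> Dict[str, float]:
    """Weight each term by the position of the query where it first appeared.

    Terms from the first query in the session get weight 1.0. Terms first
    introduced n queries later get decay ** n. The decay value is an
    illustrative assumption, not a figure from the patent.
    """
    weights: Dict[str, float] = {}
    for position, query in enumerate(queries):
        for term in query.lower().split():
            # setdefault keeps the weight from a term's earliest appearance.
            weights.setdefault(term, decay ** position)
    return weights


session = [
    "roald dahl",
    "roald dahl books",
    "roald dahl books for children",
]
weights = session_term_weights(session)
for term, weight in weights.items():
    print(term, weight)
```

In this toy session, the terms from the original query keep the highest weight, while terms tacked on in later refinements count for less.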
Continue reading Evolving Google Search Algorithms
In November, Twitter disclosed in an amendment to its S-1 filing that IBM was demanding licenses for three patents issued in 2006 that it claimed Twitter was infringing. As far as we know, IBM didn’t file a lawsuit against Twitter, and this took place shortly before Twitter held its initial public offering.
This dispute appears to have been resolved, but we don’t know all of the details, and it’s questionable whether we will ever learn them. Here’s what the amendment said about the matter:
From time to time we receive claims from third parties which allege that we have infringed upon their intellectual property rights. In this regard, we recently received a letter from International Business Machines Corporation, or IBM, alleging that we infringe on at least three U.S. patents held by IBM, and inviting us to negotiate a business resolution of the allegations.
Continue reading Twitter’s New Patent Trove (943 Patents) from IBM
The example for the post I was writing for today appears to have been hijacked by the Simpsons. They made an apology to Judas Priest, after referring to the band as a death metal band. The image below is from a Guardian news article on the apology which is presently highly ranked on a search for the word “Judas”. See the search results below:
I wanted to show a set of search results from Google that may have been based upon Google matching the topic of a post rather than keywords. According to a Google patent granted on the last day of 2013, which uses that example, this kind of matching might help improve the relevance of search results for videos and media rich results.
Continue reading Will Keywords be Replaced by Topics for Some Searches?