When named entities such as specific people, places, and things show up in queries or web pages, that can be a signal to search engines to do something special in the results that they show. How prepared are you to understand and anticipate how the search engines treat them? Do you have a named entity strategy in place?
Named entities show up in many queries; they may even be one of the kinds of things that people look for most online. A 2010 white paper from Microsoft, Building Taxonomy of Web Search Intents for Name Entity Queries (pdf), tells us how large a role “named entities” play in search:
According to an internal study of Microsoft, at least 20-30% of queries submitted to Bing search simply names entities, and it is reported 71% of queries contain name entities.
I’m working on a post about a new patent application published by Google that describes a process Google may use when it comes across named entities in a query. I decided to take a look back to see how much I had written about entities in the past. I was surprised to find that I had written almost 130 posts that mention the word “entities” within them, out of a total of 1,223 posts published on SEO by the Sea since 2005.
Given how frequently Named Entities appear in queries and how important they are to local search, and Google’s growing Knowledge Base, it started me thinking about what kind of role named entities had in my strategies and approaches for content creation and marketing sites. Those are fragmented, and possibly should be more unified.
With local search, for instance, and its focus upon specific businesses and organizations and specific locations, it didn’t surprise me that many of the posts I’ve written on Maps, and Google Maps, in particular, mention entities. In promoting a business within local search, simple things like being consistent in using business location information at different sites for businesses can make a significant difference in how well a business might rank in local search. The categories that you associate with a business in profiles, and in how you write about a business can also play a significant role in how search engines rank those businesses as well. This is why a named entity strategy can be valuable.
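As a rough illustration of what "consistent business location information" can mean in practice, here is a minimal Python sketch that normalizes name, address, and phone details from several listings and flags the ones that disagree with the majority. The abbreviation table, the site names, the sample listings, and the ten-digit phone assumption are all hypothetical choices for the example, not anything a search engine has documented.

```python
import re
from collections import Counter

# Hypothetical abbreviation table; a real one would be much longer.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "suite": "ste", "road": "rd"}


def normalize_listing(name, address, phone):
    """Reduce a listing to a comparable (name, address, phone) tuple."""
    def clean(text):
        # Lowercase, replace punctuation with spaces, and abbreviate common words.
        words = re.sub(r"[^a-z0-9 ]", " ", text.lower()).split()
        return " ".join(ABBREVIATIONS.get(word, word) for word in words)

    # Keep digits only; assume (for this sketch) the last 10 digits identify the number.
    digits = re.sub(r"\D", "", phone)
    return clean(name), clean(address), digits[-10:]


def inconsistent_listings(listings):
    """Return the sites whose normalized listing differs from the most common one.

    `listings` maps a site name to a (name, address, phone) tuple.
    """
    normalized = {site: normalize_listing(*info) for site, info in listings.items()}
    most_common, _ = Counter(normalized.values()).most_common(1)[0]
    return sorted(site for site, info in normalized.items() if info != most_common)


# Made-up listings for illustration.
listings = {
    "site-a": ("Silver Diner", "123 Main Street, Suite 4", "(703) 555-0100"),
    "site-b": ("Silver Diner", "123 Main St. Ste 4", "703-555-0100"),
    "site-c": ("Silver Diner Reston", "123 Main St", "703.555.0199"),
}
print(inconsistent_listings(listings))
```

A check like this only surfaces differences in formatting and content; deciding which version of the listing is the correct one is still a manual step.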
With Google’s and Bing’s growing approach to Knowledge Bases and showing knowledge panels along with search results, much of what I’ve written about the knowledge bases of concepts and things that each search engine is building also mention entities. You can learn a lot about entities by looking at whether or not knowledge base results show up for them, and what else might be shown, or even in suggested query refinements for them.
But I was still surprised by some of my mentions of entities that I found while looking back at some of my posts.
For example, I remembered that when Microsoft wrote about how they might attempt to identify the most important parts of web pages to analyze the content of those pages and identify the main content as opposed to sidebars and footers and heading sections, named entities are one of the linguistic features that they may use to determine what might be the main content area on a page.
A post from 2006, Google’s Query Rank, and Query Revisions on Search Result Pages, describes how Google might track changes that people make to queries when they are searching for something. One of the things they mention is that they often see people guessing the names of entities within query logs. So, in Google Knowledge Panels, we see “People also search for” results related to a search for a named entity. Google likely finds those in query sessions that include that entity and may also include other named entities.
Named Entity Disambiguation
A Google patent and paper that I wrote about in Google on Using a Knowledge Base of Articles to Make Searches Smarter describes how the disambiguation feature at Wikipedia might be useful in telling the search engine about Danny Sullivan the race car driver and Danny Sullivan, the writer at Search Engine Land. If you’re writing about a specific person, place, or thing, do you look at sources such as Wikipedia or the Internet Movie Database (IMDB) or similar sources to learn more about those entities?
Named Entity Association
Google may recognize that a query includes a named entity and other words and show several pages from a specific site that they have associated with that specific entity at the top of search results. I wrote about that in Boosting Brands, Businesses, and Other Entities: How a Search Engine Might Assume a Query Implies a Site Search. Knowing that Google might do this can help you better understand some query spaces and search results that include named entities within the original query, and how hard it might be to rank for a keyword phrase that includes that entity.
A Yahoo! patent filed in 2009 describes how Yahoo! might look for entities within queries by adding labels to words within those queries and applying confidence scores to the labels to gauge what the query is most likely about. That post is Not Brands but Entities: The Influence of Named Entities on Google and Yahoo Search Results.
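To make the idea of labeling query words and scoring those labels a little more concrete, here is a minimal Python sketch. It is not the method from the Yahoo! patent; it assumes a tiny hand-made lookup table (`GAZETTEER`) with invented labels and confidence values, and uses a simple longest-match rule to tag phrases in a query.

```python
# Hypothetical lookup table: phrase -> (label, confidence). All values are invented.
GAZETTEER = {
    "new york": ("place", 0.90),
    "new york yankees": ("sports_team", 0.95),
    "tickets": ("product", 0.60),
}


def label_query(query, gazetteer=GAZETTEER, max_len=3):
    """Tag phrases in a query, preferring the longest phrase found in the table."""
    tokens = query.lower().split()
    labeled = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in gazetteer:
                label, confidence = gazetteer[phrase]
                labeled.append((phrase, label, confidence))
                i += n
                break
        else:
            # No phrase matched here, so mark the single token as unlabeled.
            labeled.append((tokens[i], "other", 0.0))
            i += 1
    return labeled


def most_likely_subject(labeled):
    """Return the labeled phrase with the highest confidence, or None."""
    candidates = [item for item in labeled if item[1] != "other"]
    return max(candidates, key=lambda item: item[2], default=None)


labeled = label_query("New York Yankees tickets")
print(labeled)
print(most_likely_subject(labeled))
```

In this toy version the confidence values are fixed per phrase; a real system would presumably derive them from data such as query and click logs.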
We do see Google working towards building a knowledge base filled with concepts and entities in many ways. Still, one of the most visible manifestations of that is through acquisitions such as the one I described in Google Gets Smarter with Named Entities: Acquires MetaWeb. If you’re not keeping a careful eye upon knowledge panel results and how they may be evolving, it’s worth starting to do so. Google does include knowledge panel results for celebrities and bigger businesses. Still, the people and places and things included within those will likely grow as their knowledge graph gets bigger and more detailed.
The Growth of a Need for a Named Entity Strategy
A Google patent that appears to have been filed before Google acquired MetaWeb mentions MetaWeb specifically, and how information about named entities from the knowledge base might be used in search results in a mashup format – Google and Metaweb: Named Entities and Mashup Search Results? The patent includes some screenshot examples, and those formats haven’t made an appearance in search results (that I’ve seen), but they might.
Google Maps, in my opinion, is Google’s effort towards building a knowledge base about entities in a way that ties businesses to specific locations. It has used searchers’ queries to help build information about those entities. See: How Google Might Use Query Logs to Find Locations for Entities. It’s been interesting watching how Google Maps has grown over the years, and Google has made several missteps in its growth. Still, with competition from sources like Apple, Google has a lot of incentive to improve how their mapping and navigation information grows.
With the Google Hummingbird update, it’s become clear that Google is working hard at rewriting long and complex queries, and using information from sources such as query and click logs and search results for specific queries to learn more about entities, which may play a role in what they present in Google Maps, and possibly even search results as well. I pointed this out in How Google Finds ‘Known For’ Terms for Entities, and I wrote about identifying “is a” relationships for concepts and entities in the post How Google Attempts to Understand What a Query or Page is About Based Upon Word Relationships.
A Named Entity Strategy for All Search
It would be a mistake to focus solely upon Google in discussing named entities and search engines. Both Yahoo and Bing have been exploring ways to identify and use named entities in useful ways as well. I wrote about some patents I’ve seen from both of those search engines last year in Search Engines and Entities. Blaise Agüera y Arcas, who oversaw the development of Bing Maps, has recently left Microsoft for Google. What information about entities might he bring from Bing to Google Maps? So a named entity strategy should go beyond just Google.
Named Entities play many different roles in how search engines interpret queries and index content on web pages. With the growth of Knowledge Bases and query rewriting efforts like Hummingbird, those roles are likely to become even more important.
To repeat the question I asked in the title of this post, do you have a named entity strategy?
I’ve written a few posts about named entities. These are some that I wanted to share:
- Do You Have a Named Entity Strategy for Marketing Your Web Site?
- How I Came to Love Entities and Start Doing Entity Optimization
- How Google Uses Named Entity Disambiguation for Entities with the Same Names
- How Named Entities Connected to Trending Topics can be used to Address Real Time Search Results
- Not Brands but Entities: The Influence of Named Entities on Google and Yahoo Search Results
- How Knowledge Base Entities can be Used in Searches
- Finding Entity Names in Google’s Knowledge Graph
- Google Gets Smarter with Named Entities: Acquires MetaWeb
- Entity Associations with Websites and Related Entities
- How Google Might Identify Entity Synonyms Using Anchor Text
- Extracting Facts for Entities from Sources such as Wikipedia Titles and Infoboxes
- Extracting Semantic Classes and Corresponding Instances from Web Pages and Query Logs
- How Google May Identify Main Entities
- How Google’s Knowledge Graph Updates Itself by Answering Questions
Last Updated June 26, 2019.
From what I know, a named entity is a phrase that clearly identifies one item from a set of other items that have similar attributes. How does one implement this on a website? Can you please help a novice like me? Thanks.
Named-entity recognition (NER) (also known as entity identification and entity extraction) is a subtask of information extraction that seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc. This is what I got from a search on the web. But can you please elaborate on how to use it on a website?
I assume you mean a citations strategy where you drop the named entity on a third-party site. For manufacturers or retailers that have multiple brands, or publishing companies with multiple imprints, this kind of strategy would be necessary for each brand or imprint.
The natural order is for the parent entity’s website to appear in the search results when that entity is searched for. If brands and imprints don’t have separate websites, then the parent entity’s site likely appears in the SERP if those brands or imprints are mentioned on the site. Otherwise, content would need to be pushed out to third-party websites in order to get those named entities a ranking profile worth mentioning. Is this what you are talking about?
Hi Gautam,
Named entities are specific people, places, and things, including concepts. These don’t have to have similar attributes at all, though it’s likely that each has its own unique attributes. For example, Thomas Jefferson, the Reston Silver Diner, and Democracy are all named entities. If you are writing about one of these, including information about the unique attributes that each is known for is one way of beginning to optimize for each of them.
I’ve filled this post with many links to posts I’ve written about named entities that link to many patents that describe different ways that the search engines may be using named entities.
Hi Allen,
I’m not exactly sure I understand what you are asking or are suggesting.
Hi Dan,
Named entity recognition is one aspect of how a computer might recognize when it comes across a named entity on a page, but it’s not really what this post is about.
I’ve included a lot of links to blog posts I’ve written that include links to patents that describe different ways that search engines might use named entities.
For example, under a visual segmentation approach, a search engine that is indexing a page might look for the presence of named entities in the main content areas of web pages to determine if that section is a main content area of a page.
To better understand that a specific named entity mentioned on a page is one Danny Sullivan (the writer) or another Danny Sullivan (the race car driver), a search engine might collect unique information about both from outside sources, such as Wikipedia that may mention search engines and Google for the first, and the Indy 500 for the second Danny Sullivan.
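A very simplified Python sketch of that kind of disambiguation might look like the following. The context terms for each Danny Sullivan are hand-picked assumptions standing in for what a search engine might collect from a source such as Wikipedia, and counting overlapping words is only a stand-in for whatever scoring a real system would use.

```python
import re

# Hypothetical context terms for each entity, standing in for facts gathered
# from an outside source such as Wikipedia.
PROFILES = {
    "Danny Sullivan (writer)": {"search", "engines", "google", "seo", "land"},
    "Danny Sullivan (race car driver)": {"indy", "500", "racing", "driver", "indianapolis"},
}


def disambiguate(text, profiles=PROFILES):
    """Pick the entity whose context terms overlap most with the page text.

    Returns (best_entity_or_None, scores), where scores maps each entity to
    the number of its context terms found in the text.
    """
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    scores = {name: len(words & terms) for name, terms in profiles.items()}
    best = max(scores, key=scores.get)
    # With no overlapping terms at all, decline to choose an entity.
    return (best if scores[best] > 0 else None), scores


print(disambiguate("Danny Sullivan raced in the Indy 500"))
print(disambiguate("Danny Sullivan wrote about Google and search engines"))
```

For someone writing a page, the practical takeaway is the reverse of the code: mentioning the attributes an entity is known for gives a search engine more to match against.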
For a search engine to associate a specific site with a specific entity, it might look for other sites that point to that specific site with anchor text using the name of that entity. For example, on a search for “ESPN”, if a certain percentage of the top 100 sites that rank for the term point to one website in particular “about” ESPN then that website might be associated with the named entity.
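Here is a minimal Python sketch of that anchor text idea, under stated assumptions: the input is a made-up mapping of top-ranking pages to their outgoing links, the `threshold` share is an arbitrary value, and matching is a plain substring test on the anchor text. None of those details come from a patent; they are only there to make the logic runnable.

```python
from collections import Counter
from urllib.parse import urlparse


def associate_site(entity, ranking_pages, threshold=0.5):
    """Return the host most often linked with the entity's name as anchor text.

    `ranking_pages` maps each top-ranking page URL to a list of
    (anchor_text, target_url) links found on that page. A host is returned
    only if at least `threshold` of the ranking pages link to it that way.
    """
    if not ranking_pages:
        return None
    votes = Counter()
    for page, links in ranking_pages.items():
        # Count each ranking page at most once per target host.
        hosts = {
            urlparse(target).netloc.lower()
            for anchor, target in links
            if entity.lower() in anchor.lower()
        }
        votes.update(hosts)
    if not votes:
        return None
    host, count = votes.most_common(1)[0]
    return host if count / len(ranking_pages) >= threshold else None


# Invented sample of four top-ranking pages for the query "ESPN".
pages = {
    "https://a.example/1": [("ESPN", "https://www.espn.com/"),
                            ("weather", "https://weather.example/")],
    "https://b.example/2": [("Watch ESPN live", "https://www.espn.com/watch")],
    "https://c.example/3": [("sports news", "https://news.example/")],
    "https://d.example/4": [],
}
print(associate_site("ESPN", pages, threshold=0.5))
```

Counting each ranking page only once keeps a single page with many links from dominating the result, which seems closer to the "certain percentage of the top 100 sites" wording than counting raw links.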
Thanks, Bill, for the detailed reply. Now I get it.
Hi Allen,
Thanks for explaining further – that’s definitely a highly recommended approach worth considering as part of an overall strategy involving named entities.
Since the search engines are using named entities in a number of different ways, a strategy might include some other tactics as well. For example, if you are writing about Danny Sullivan, the race car driver, and you see in places like Google’s knowledge panel that he is known for racing in the Indianapolis 500, adding a page about his Indy racing is probably a good idea as well, since it’s an “attribute” or “aspect” of his that the knowledge panel shows that people search for. I’d also suggest looking at suggested query refinements in search results to see if there are other things that might be worth writing about, too.
Hi Dan,
Good to hear. You’re welcome.
Bill, you hit upon it with your ESPN example. By a “named entity strategy,” I’m supposing that you are talking about using your brand and its associated sub-brands as the named entities. If the goal is to improve your search profile for those named entities, then you want to go beyond your own website and have your brands mentioned on other sites. That is, if I’m understanding this post correctly.
“it started me thinking about what kind of role named entities had in my strategies and approaches for content creation and marketing sites. Those are fragmented, and possibly should be more unified.”
Creative works can be the layer that ties everything together for an entity based strategy.
Brand, Publisher, Authors, Personas, Articles, (Schema), Terms, Phrases etc. all play a role, each having their own influence upon the Entity, its relationships, trust and authority.
Having varying levels of control over these key elements, we can assist in conveying, adjusting and fine tuning the overall strategy, phrases, terms, messaging, etc, thereby optimizing accordingly.
Thanks, Jason
Great example of how thinking about entities can play a very strong role in the creation of content on the Web.
Most of the links I included within my post point to other ways that search engines identify and use entities when indexing pages and interpreting queries and presenting search results. I wrote that my approach to entities has been fragmented because I hadn’t put together an overall approach, but was treating each of these independently. I like the unified approach to entities you’ve suggested for creative works, and it’s a good start towards building an overall strategy.
Am I the only person who has no idea what this post is going on about?
Ah, I read the comments and understand it a bit more. Do we have anything in place to assist this? No. It seems very pedantic and minimalistic to do this, but I guess it’s minor changes like this that can see your traffic go through the shed roof.
I think named entities are nothing more than a semantically correct way of attaching uniquely identifying information to content, in a way that makes the web more semantic. Think of categorizing some [widgets]. It can be anything from movies, music, articles, or books to a more robust citation system for the web.
I think most of us here understand this. The problem arises when you don’t have a holistic approach to your website or the information that you put on the web. Let’s say for example you have a design firm coming up with the perfect look of your website, then you have another firm that actually codes the website and then another team that adds the content.
With all these different moving parts, if you don’t have a complete understanding of the entire process, it’s easy to drop the ball. Say, for example, the firm that programs your website doesn’t have a good understanding of these trends. They might just create a website that works well but doesn’t expose the required information to make it easy for the content creators to actually put the data online in a named entity friendly way.
Now Google is doing a lot to help with these issues, especially with their new tools in Google Webmasters that allow you to go in and highlight semantic data, but that’s just a bandaid on the overall problem that a lot of web design/development companies don’t truly understand how the web works in a holistic way. I wrote a short blog post about that recently; you can check it out in the link above.
Hi Bill,
Thanks for describing this information in detail. I have searched some queries in Google and find that entities are very frequently captured by its algorithm. For example, I searched for a restaurant by name and it showed me local results on top. It seems to correlate the restaurant name with an industry, and then relate that industry to its local search engine.
But is that part of entity search or of query rewriting? Please let me know. I think it is entity search because it shows me web results related to the name. Please explain.
Exciting post …!!! Often these are issues and challenges not many content marketers think about so great idea to put some focus on these challenges.
I actually don’t agree this is the strategic level though. It’s more like the tactical level. The strategic level would be more about why we do things, how it’s organized, and how processes are structured. These are more practical (time, schedule etc.)
Great article! Bing and Yahoo continue to lag way behind Google in searches and value, as most of us have found out. We have found that while the new changes have affected us, we are still able to regroup and move up in rankings. As for strategy…It’s all about the S & C, being a social creature and writing intelligent content. Take Facebook…I heard this morning on NPR that the 12-16 year olds are already stating that FB is dead! So in 10 years where will we be when they are looking to purchase real estate? I will definitely recommend anyone to talk to you for their seo. Happy New Year!
So, to answer the question at the end of your post: not an explicit strategy on this aspect, but it is covered by other elements of our seo and marketing strategy.
As one major objective of the entity definition is differentiation from non-entities, and another is differentiation from other entities, this has two implications:
1. Define the entities you want to promote (company, brand, people, concept), mainly setting the ‘boundaries’
2. Differentiate these entities from existing other entities.
Example – we have a storage division. Not related to hangers or drawers, but hard drives, servers and cloud. So we need to define what ‘company storage’ means (likely something around data storage) and then take a look at competitors. The goal should be to have enough overlap to get into the right category (data storage) and enough uniqueness to differentiate from others.
Hi Andreas,
Actually, I think you’ve done a good job of condensing an important aspect of why and how a strategy around entities works, with a useful example. Thank you.
So, brand mentions have been an interesting topic for a while, and it seems that there’s been a correlation with rankings, at least with what I’ve seen, although this could be due to other factors, such as the increased branding signals that come along with it, increased CTR, higher brand search and the like.
Something I’ve been thinking about for a while, and I wonder what your thoughts are: do you think we may one day even see a penalty like Penguin related to brand mentions and link ratios as Google becomes more sophisticated in the way it looks at connecting these things?
For example, with Penguin we saw money term anchor text vs brand anchors, because having a high money ratio was obviously manipulative. So do you think we may be heading towards a world in which having a link ratio which is too high against relevant mentions in contextual content could cause a penalty?