Image Search and Trends in Google Search Using FreeBase Entity Numbers

Google is organizing more and more things in its index based upon entity numbers. I have a couple of examples for you that show how they are being used.

It’s possible that you missed a reference to Freebase entities in a Google Research Blog post from 2013. I missed it myself. The post is “Improving Photo Search: A Step Across the Semantic Gap.”

In the post, the author (Chuck Rosenberg) tells us how they improve image searching at Google by labeling images with entities, rather than text strings. The entities they used are entities that you would find at a source such as Freebase. He tells us that they use Freebase Machine ID numbers for those labels:

As in ImageNet, the classes were not text strings, but are entities, in our case we use Freebase entities which form the basis of the Knowledge Graph used in Google search. An entity is a way to uniquely identify something in a language-independent way. In English when we encounter the word “jaguar”, it is hard to determine if it represents the animal or the car manufacturer. Entities assign a unique ID to each, removing that ambiguity, in this case “/m/0449p” for the former and “/m/012x34” for the latter.

You can see those labels in the Machine ID numbers in the Freebase URLs for them, and in the Freebase entries.
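
To make the disambiguation concrete, here is a minimal Python sketch. It assumes a hypothetical in-memory lookup table: the `ENTITIES` dict and the `candidates` function are illustrative names, not part of any Google or Freebase API. Only the two Machine IDs come from the Research Blog post. The point is that one text label can match several entities, while each Machine ID matches exactly one.

```python
# Hypothetical lookup table for illustration only; the two MIDs are the
# ones quoted in the Google Research Blog post.
ENTITIES = {
    "/m/0449p": {"label": "jaguar", "description": "animal"},
    "/m/012x34": {"label": "jaguar", "description": "car manufacturer"},
}


def candidates(label):
    """Return the sorted MIDs of every entity whose label matches."""
    wanted = label.lower()
    return sorted(mid for mid, info in ENTITIES.items() if info["label"] == wanted)


# The text string is ambiguous: it matches two entities.
print(candidates("Jaguar"))
# A Machine ID is not ambiguous: it identifies exactly one entity.
print(ENTITIES["/m/012x34"]["description"])
```

Labeling an image with a key from a table like this, rather than with the string “jaguar”, is what removes the ambiguity the post describes.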

Jaguar Car Entity

"Paris Motor Show 2012 (8065248951)" by Nan Palmero from San Antonio, TX, USA - Mondial De L'automobile Paris 2012 | Paris Motor Show 2012. Uploaded by FAEP. Licensed under CC BY 2.0 via Commons.

Freebase URL: http://www.freebase.com/m/012x34

Freebase Information for Jaguar Cars; note the Machine ID Number is similar to "/m/012x34"

Jaguar Cat Entity

"Junior-Jaguar-Belize-Zoo" by Bjørn Christian Tørrissen - Own work by uploader, http://bjornfree.com/galleries.html. Licensed under CC BY-SA 3.0 via Commons.

Freebase URL: http://www.freebase.com/m/0449p

Freebase entry information for Jaguar Cats. Note the Machine ID number is similar to "/m/0449p"

Freebase Entities and Google Trends

Last July, Barbara Starr and I gave a joint presentation to the Lotico San Diego Semantic Web Meetup (Barbara and I are co-administrators) and the SEO San Diego Meetup, titled “Ranking in Google Since The Advent of The Knowledge Graph.” In it, Barbara pointed out something she had noticed: the Machine ID numbers for entities in Freebase were showing up, in URL-encoded form, in Google Trends URLs (see page 26 of the presentation).

For example, the Google Trends URL for IBM (Computer Hardware Company) is https://www.google.com/trends/explore#q=%2Fm%2F03sc8

The Freebase Machine ID number for IBM appears in that URL. The Freebase URL is http://www.freebase.com/m/03sc8, and the Google Trends URL carries the same ID in percent-encoded (URL-encoded) form as “%2Fm%2F03sc8”, which decodes to “/m/03sc8”.
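
The encoding relationship can be checked with a few lines of Python. This is a minimal sketch using only the standard library's `urllib.parse`; the helper names `mid_from_trends_url` and `trends_url_for_mid` are made up for illustration, and the URL prefix is simply the one shown in the IBM example, which Google may change at any time.

```python
from urllib.parse import quote, unquote

# URL pattern taken from the IBM example in this post (an assumption
# about the format, not a documented Google Trends API).
TRENDS_PREFIX = "https://www.google.com/trends/explore#q="


def mid_from_trends_url(url):
    """Extract and percent-decode the Machine ID that follows '#q='."""
    encoded = url.split("#q=", 1)[1]
    return unquote(encoded)


def trends_url_for_mid(mid):
    """Percent-encode a Machine ID, slashes included, into a Trends URL."""
    return TRENDS_PREFIX + quote(mid, safe="")


ibm_url = "https://www.google.com/trends/explore#q=%2Fm%2F03sc8"
print(mid_from_trends_url(ibm_url))    # the decoded Machine ID
print(trends_url_for_mid("/m/03sc8"))  # rebuilds the URL above
```

Passing `safe=""` to `quote` matters here: by default `quote` leaves “/” unencoded, and the Trends URL encodes each slash as “%2F”.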

The Freebase IBM Entry. Note that the Machine ID number is similar to “/m/03sc8”

The future of Google search appears to be based upon entities. So, Google is using Machine ID numbers as Entity labels in Reverse Image Searches and is also using Machine ID numbers to track Trends for Entities.

59 thoughts on “Image Search and Trends in Google Search Using FreeBase Entity Numbers”

  1. Great stuff. I tried to help as many clients as possible fill out their Freebase entries and create connections therein before it was too late.

    Any thoughts on what to do since additional edits are no longer possible?

  2. Hi James,

    Thanks. Freebase is read-only now. It’s possible that IDs from Wikidata might end up replacing Freebase IDs. If so, helping clients become actually notable, so that they can be listed in Wikipedia, might be a goal to consider.

  3. Great post! And an amazing catch by Barbara Starr! Bill, do you know of any way to leverage Freebase URLs with structured data markup on a website to improve SEO?

  4. Google have been using meta data such as this for a while. It’s good to see the inclusion of Freebase as a differentiator between ambiguous data sets. Images must be difficult to interpret in other ways. Using sameas links seems an obvious move for some sites.

  5. Hello Bill,
    what do you think will happen with these IDs now that Freebase is transferring data to Wikidata? Entities within Wikidata do have references to IDs from Freebase, but Wikidata itself seems to be using a different thing as its primary ID. At some point new entities in Wikidata will not be able to be linked to Freebase just because they weren’t there in the first place. And this raises the question of how long it will be until Google comes up with its own ID system.

  6. Hi Bartek

    Under Freebase, the ID system is Google’s. There are machine ID numbers in Wikidata that are similar to the Freebase numbers now, and chances are that Google will start using those the way they’ve been using the numbers from Freebase. The details of the transition from one ID number type to another haven’t been described publicly, but chances are it’s being worked upon.

  7. Fairly unique topic and perspective, I must say. The post was not only informative for the fresher minds in the field, it was a lesson well revised for the experienced readers too. Thanks for sharing your perspective.

  8. Hi Bill
    It is true that images are the lifeline of any web pages and these should be optimized properly. You have explained everything about it in a unique way.
    Thanks for sharing this with us.

    -Abhishek

  9. Hi Abhishek

    To optimize images that you may add to a web page under an entity approach, you would ideally want to include schema vocabulary upon the page that image appears upon, and use a sameas link that goes to the Freebase page of that entity. That would be one way of making that kind of connection between the entity and the image.

  10. Hey Bill – intriguing article.

    I recently began a discussion about the very idea about “sameAs-ing” entities by linking them to a web vocabulary of some kind in Aaron Bradley’s Semantic Search Marketing Community.

    The question I have however is when to use what vocab database. You have productontology.org (for “additionalType” schema), dbpedia, getty vocab, etc.

    Are any more effective or reputable than others in your experience? Is there one grandaddy master reference we should use for all entity declarations with “sameAs” or “additionalType?”

    Thanks

  11. Hi Bill
    I would just like to thank you for sharing this article, and for taking the time to answer the comments and add the link to Barbara Starr’s article. Really helpful.

  12. This is definitely interesting to think about, but has it been proven that utilizing these entity numbers has any sort of positive effect on SEO (or anything else)? I would just think Google would be patently against anything that’s supposed to directly “tell” Google what a site is supposed to be about (like meta keywords, for instance).

  13. Hi Maxime,

    You’re welcome. Glad to hear that you liked the article. There’s a lot to learn about the changes that are taking place on the Web, and if you work on the Web, these are exciting times.

  14. Hi Ryan,

    I definitely recommend people visit and participate in Aaron Bradley’s community on Google+. There’s a lot of thoughtful and helpful discussion going on there.

    I think it’s worth paying a lot of attention to schema.org, especially since there is a schema extension process in place there, which means that it will grow over time. Keep an eye on the Schema blog for news of additions to the site.

    “mainEntity” seems like a useful reference when you want to make sure that people understand what a page is truly about (https://schema.org/mainEntity).

  15. Hi Chris,

    The head of search at Google, Amit Singhal, just announced that he will retire from Google at the end of February, and he is going to be replaced by the founder of Metaweb, John Giannandrea. Metaweb is the company that brought Machine ID numbers to Google through its Freebase knowledge base. If you’ve been following how Google Trends works, you can see how Google has been tracking searches for entities by Machine ID number, as mentioned in this post.

    Google stopped using meta keywords for rankings because people would often put dictionaries of things they wanted to rank for within those. If you include a sameas link on a page that contains an entity mentioned on the page where you’ve placed that link, and it is the same entity, so that the use of the Machine ID is consistently correct, then there shouldn’t be a problem; but if you put incorrect entity numbers on a page, then Google might not like that, and could potentially penalize a page, and they have said that they might take action against incorrect usage of schema vocabulary on pages of a site. Google isn’t against the use of correct information; it’s when you set out to deceive and manipulate rankings that Google has a problem with your using something like a sameas link.

  16. This website is very informative to read. I am a huge follower of the things you talk about. I also love reading the comments, but it seems like a great deal of readers need to stay on topic to try and add new things in the original topic.

  17. I wonder if Google will monetize this into some sort of paid model? Bing has done so on the shopping front but it’s not very well organized.

  18. I’m curious to find out what blog system, you are utilizing? I’m having some minor security problems with my latest website and I would like to find something safer. Do you have any suggestions?

  19. Google Trends is such a useful tool! We used it successfully to find creative new blog post ideas which brought a tremendous amount of traffic to our client sites.

  20. Hey Bill!
    Nice meeting you.. 🙂

    Really one of the best post shared by you. And I agree with you that images play an important role for making our site absolutely fantastic and better. This post will really help beginner’s mind because it is so informative. Deep description makes this post easy to understand and more clear.

    Thanks a lot for sharing a something different…. 🙂
    Keep writing.. 🙂

    Have a nice day ahead..

    – Ravi.

  21. Hi Ravi,

    It’s good to meet you, too. Happy that you liked this post. Images can make a difference in how web pages are received by visitors; and are a strong part in the transformation of Search towards a more entity-based endeavor. You have a great day, too.

  22. Hi Alin,

    Search Engines are shaped in large part by the expectations of searchers, and how they look for information on the Web. People do look for entities, and that has become a large part of how Google tracks information on the Web.

  23. Hi Michael,

    Trends is a really useful tool; it provides a glimpse into what people are actively engaged in researching on the Web, and where their interests are pointed.

  24. Wonderful work! This is the kind of info that are supposed to be shared around the net. Disgrace on Google for now not positioning this post upper! Come on over and visit my website . Thank you.

  25. That a good number of list. Thanks a lot for share.. loved your article..
    Its really important to a newbie like me to have a look n use these tools!.. thanks for the share

  26. I’ve heard that Bing has done this for shopping, but since Google is more about simplification and people want to jump on their bandwagon, I wonder how Google will look to make money off of this.

  27. best tip to be strong in seo is selection of best keywords. if you will choose best kw then you can get good business

  28. Definitely another step forward when it comes to Google search. These IDs look like something we can all benefit from. Especially the SEO experts 😀

  29. Hi Nikolay,

    It’s nice learning the MIDs, especially if you use the Knowledge Graph Search API – which can give you an idea of how Google might perceive multiple entities that share the same or very similar names.

  30. It was a good read Bill.

    Something I’ve been trying to understand about Freebase entities.

    I think that in the future, those Wikidata IDs could be very handy for ranking at the top 😉

  31. Hi Bill,
    I am visiting your blog for the first time, and this is the first time I have heard the term Freebase entity. Thanks for sharing this valuable information on image search with us.

  32. Really great article.

    Google stopped using meta keywords for rankings because people would often put dictionaries of things they wanted to rank for within those.

    Thanks for sharing your opinion.

  33. Great post! And an amazing catch by Barbara Starr! Bill, do you know of any way to leverage Freebase URLs with structured data markup on a website to improve SEO?

  34. Hi dk,

    If you point to the freebase URL in organizational schema markup for a business, you are telling Google that your business is the same entity as that described on the Freebase page a link is being provided to. That adds a preciseness that otherwise wouldn’t be there, which is helpful.

  35. Google has been constantly on the look out to mess up the people who are not the real things when comes to the blogging and running websites.
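
Several of the replies above mention pointing a schema.org sameas link at an entity’s Freebase page. As a rough illustration of what that markup could look like, here is a minimal Python sketch that builds schema.org Organization markup and prints it as a JSON-LD script tag. The function name `organization_jsonld` is hypothetical, the IBM values are only an example built from the Freebase URL quoted in the post, and nothing here is a claim that the markup affects rankings.

```python
import json


def organization_jsonld(name, url, same_as):
    """Build schema.org Organization markup whose sameAs lists entity pages."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": list(same_as),
    }


# Illustrative values only: the sameAs target is the Freebase URL for
# IBM quoted in the post.
markup = organization_jsonld(
    name="IBM",
    url="https://www.ibm.com/",
    same_as=["http://www.freebase.com/m/03sc8"],
)

# Wrap the JSON in the script tag that would carry it on a page.
print('<script type="application/ld+json">')
print(json.dumps(markup, indent=2))
print("</script>")
```

Because `sameAs` is a list, other pages describing the same entity (a Wikidata or Wikipedia page, for example) could be added alongside the Freebase URL, provided they really do describe the same entity.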
