Tim Berners-Lee, the inventor of the Web, revised his original vision for it when he began writing about the Semantic Web. He published an article in Scientific American, "The Semantic Web," which is recommended reading.
Search on the Web has been evolving to focus on finding things, not strings: identifying entities rather than simply matching keywords in queries to keywords in documents on the Web.
This is seen in Google's original fact repository, followed by its Knowledge Graph, and in Microsoft's Concept Graph. Google, Bing, Yahoo, and Yandex all support schema.org structured data markup, which can be used to show rich results in search.
When Google crawls the Web, it extracts facts from the content of the pages it finds, as well as from the links on those pages. How much information does it extract about facts on the Web, and how much of it goes into providing fact-based answers? Microsoft showed off an object-based approach to search about 10 years ago in the paper Object-Level Ranking: Bringing Order to Web Objects.
The team from Microsoft Research Asia tells us in that paper:
Existing Web search engines generally treat a whole Web page as the unit for retrieval and consuming. However, there are various kinds of objects embedded in the static Web pages or Web databases. Typical objects are products, people, papers, organizations, etc. We can imagine that if these objects can be extracted and integrated from the Web, powerful object-level search engines can be built to meet users’ information needs more precisely, especially for some specific domains.
If you do SEO and aren't familiar with GS1, you probably should be. They pioneered the use of bar codes in retail, and they also came up with GTINs (Global Trade Item Numbers), which are used online at places such as eBay, Amazon, and Google Product Search. A recent blog post by GS1 Vice President Rich Richardson is also worth reading: Why bar code numbers matter.
In February, GS1 published an extension to the schema.org Web vocabulary for products. Extensions like this are how search and SEO are growing. The Schema blog told us about it in:
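To make the idea of product markup concrete, here is a minimal sketch of the kind of schema.org JSON-LD a product page might carry, including a GTIN. The property names ("@context", "@type", "gtin13", "brand") are standard schema.org vocabulary; the product name, brand, and GTIN value are invented for illustration.

```python
import json

# A minimal sketch of schema.org Product markup carrying a GTIN.
# Property names are standard schema.org / JSON-LD; the product
# values themselves are invented examples.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Coffee Maker",          # hypothetical product
    "gtin13": "0012345678905",               # invented 13-digit GTIN
    "brand": {"@type": "Brand", "name": "ExampleBrand"},
}

# On a real page this JSON would sit inside a
# <script type="application/ld+json"> block for search engines to read.
markup = json.dumps(product, indent=2)
print(markup)
```

Search engines that support schema.org can use markup like this to connect a product listing to a globally unique trade item number, which is what makes GS1's extension relevant to SEO.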
2. What other people are searching for, including trending searches. Trending searches are popular stories in your area that change throughout the day. Trending searches aren’t related to your search history.
3. Relevant searches you’ve done in the past (if you’re signed in to your Google Account and have Web & App Activity turned on).
Note: Search predictions aren’t the answer to your search, and they’re not statements by other people or Google about your search terms.
Visitors to a website may want to perform certain actions related to Entities (specific places or people or things) that are displayed to them on the Web.
For example, on a page for a restaurant (an entity), a visitor may want to make a reservation or get driving directions to the restaurant from their current location. Doing so may require a number of steps: selecting and copying the name of the restaurant, pasting it into a search box, submitting it as a search query, selecting the restaurant's site from the search results, determining whether a reservation can be made on that site, and then providing the information necessary to make one. Getting driving directions may likewise require multiple steps.
Performing those steps on a touch screen device may be even more difficult, since input is limited to touch. This patent is very much about touch screens.
A patent granted to Google this week describes a way to easily identify an entity, such as a restaurant, on a touch device, select it, and take some action associated with that entity based upon the context of the site where it appears. Actions such as booking a reservation at a restaurant found on a website, or getting driving directions to it, could be selected easily by a visitor.
In the post, the author (Chuck Rosenberg) tells us how they improve image searching at Google by labeling images with entities, rather than text strings. The entities they used are entities that you would find at a source such as Freebase. He tells us that they use Freebase Machine ID numbers for those labels:
As in ImageNet, the classes were not text strings, but are entities, in our case we use Freebase entities which form the basis of the Knowledge Graph used in Google search. An entity is a way to uniquely identify something in a language-independent way. In English when we encounter the word "jaguar", it is hard to determine if it represents the animal or the car manufacturer. Entities assign a unique ID to each, removing that ambiguity, in this case "/m/0449p" for the former and "/m/012x34" for the latter.
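The disambiguation idea in the quote can be sketched in a few lines. The two Machine IDs (/m/0449p for the animal, /m/012x34 for the car maker) come from the quoted post; the context-keyword overlap heuristic below is an invented illustration, not Google's actual method.

```python
# Toy sketch of entity-based labeling: an ambiguous word maps to
# unique, language-independent Freebase Machine IDs (MIDs).
# MIDs are from the quoted Google post; the context-overlap
# heuristic is an invented illustration.
ENTITY_SENSES = {
    "jaguar": {
        "/m/0449p": {"animal", "cat", "wildlife", "jungle"},   # the animal
        "/m/012x34": {"car", "engine", "sedan", "manufacturer"},  # the car maker
    }
}

def disambiguate(word, context):
    """Pick the entity ID whose context keywords overlap the query context most."""
    senses = ENTITY_SENSES.get(word, {})
    return max(senses, key=lambda mid: len(senses[mid] & context), default=None)

print(disambiguate("jaguar", {"fast", "car", "engine"}))  # /m/012x34
print(disambiguate("jaguar", {"jungle", "wildlife"}))     # /m/0449p
```

The point of the unique ID is exactly what the quote says: once "jaguar" is resolved to a MID, downstream systems no longer care which language or surface string produced it.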
How Google May Use Synonym Substitutions to Rewrite Queries
A couple of months ago, I wrote about a Google patent involving query rewriting, in the post Investigating Google RankBrain and Query Term Substitutions. There's likely a lot more to how Google's RankBrain approach works, but I came across a patent that seems related to the one I covered there, Using concepts as contexts for query term substitutions, and thought it was worth sharing and discussing. The new patent's title is very similar (Synonym identification based on categorical contexts), and it was granted on December 1st of this year.