This post is about a patent from a well-known Google engineer that describes ranking search query results at an internet search engine such as Google.
Google aims to identify resources of different types, such as web pages, images, text documents, and multimedia content, that may be relevant to a searcher’s situational and informational needs, and to do so in a manner that is as useful as possible to the searcher. It does this while responding to queries submitted by searchers.
One of the inventors on this patent carries one of the most well-known names at Google: the surname “Panda,” which became well known because of a Google update that was named after him in February 2011.
A focus of that update was improving the quality of the sites that it targeted, and Navneet Panda specializes in site quality at Google. When I saw that a patent granted this week listed him as an inventor, I looked forward to reading it and seeing how it might attempt to define site quality, and how that definition might be used to rank search query results.
Added 11:48 AM (PST) May 3, 2015: H/t to Natzir Turrado. Incoming news is that Google+ is introducing a new feature called Collections, and the announcement from The Windows Club features the word “curation” prominently, as do the two Google patent applications I write about in this post. Here’s how Susannah Lindsay uses the concept in The Windows Club article:
Google Plus users will get an opportunity to curate pieces of content into their collection, with others holding the permission of viewing, sharing, and following those collections as they please.
A lot of government Web sites have made the data that they collect and compile freely available to the public. The licenses that the data has been released under are described on the following pages:
If you are considering starting a project using that kind of data, you should read the Open Data Handbook, which goes into considerable detail. Much more information is available on Data.gov, which offers a broad overview of the topics that data is available about, including:
A Google patent granted this week describes how Google might try to understand entities that appear on Web pages, and how that awareness might influence the results that the search engine shows.
An entity is a specifically named person, place, or thing (including ideas and objects) that could be connected to other entities based upon relationships between them. Some pages may make a certain entity the main subject of the page, while others may include additional information about entities that are related in some manner to that first entity. When some entities appear on pages, they may be presented in an ambiguous manner that doesn’t make them the main topic of the page they appear upon.
Entities are said to exist in a graph that connects them to other entities based upon relationships between them. For instance, Google and Bing are both search engines, both internet domains, and both employers of many search engineers, and both have CEOs, vice presidents, marketing staff, headquarters, data centers, and Web indexes. There are a lot of related entities that might show up on Web pages about both.
This view of entities being related to each other, and belonging to an “Entity Graph,” is very similar to what is described in the Microsoft patent I wrote about recently in How Bing May Expand Queries Based upon Finding Entities Within them. A number of the ideas behind that patent and this one are similar, in that some knowledge about an entity might cause a search engine to display information about related entities.
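To make the idea of an entity graph a little more concrete, here is a minimal sketch in Python. It is not taken from either patent; the EntityGraph class, its method names, and the relationship labels ("is_a", "has", "owned_by") are all hypothetical names chosen for illustration. It simply stores entities as nodes with labeled relationships, and shows how knowing about one entity could surface related ones.

```python
from collections import defaultdict


class EntityGraph:
    """Toy entity graph: hypothetical structure, not from any patent."""

    def __init__(self):
        # Maps an entity name to a set of (relationship, target) pairs.
        self._edges = defaultdict(set)

    def add(self, entity, relationship, target):
        """Record that `entity` is connected to `target` by `relationship`."""
        self._edges[entity].add((relationship, target))

    def related(self, entity):
        """Return every entity directly connected to `entity`."""
        return {target for _, target in self._edges.get(entity, set())}

    def shared(self, first, second):
        """Return the entities that both `first` and `second` connect to."""
        return self.related(first) & self.related(second)


graph = EntityGraph()

# Relationship labels here are assumptions made for the example.
for engine in ("Google", "Bing"):
    graph.add(engine, "is_a", "Search Engine")
    graph.add(engine, "is_a", "Internet Domain")
    graph.add(engine, "has", "Data Centers")

graph.add("Bing", "owned_by", "Microsoft")

# Entities that pages about either search engine might both mention.
print(sorted(graph.shared("Google", "Bing")))
```

A real entity graph would be far larger and would carry typed, weighted relationships, but even this small version shows why a page about one search engine might lead to results that mention the other.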
Below is a Creative Commons image from Flickr. The caption to the photo shows the kind of attribution that a Creative Commons Attribution License calls for when using an image like this from Flickr:
When you choose to use a photo or data available under a Creative Commons license, you’re giving other people information about their rights to use your copyrighted materials. This means that you should understand what the different licenses mean.
Notice that the information on these sites provides Open Data available through licenses that allow people to create something new or useful.
I’ve shared links to, and information about, the open licenses that the data there has been published under below, so that the information can be shared, and so that it can help encourage others to create using Linked Data and to share data under open data licenses like the ones described.
If you’ve been doing SEO for a while, one of the papers that you may have read describes how Google was attempting to index content found on the Web that might be difficult for their crawlers to access, such as financial statements from the SEC. The search engine would have to try to access this information by filling out a form and guessing good queries, because that was the only way to access the information – they couldn’t crawl it without querying it first. This paper describes efforts that Google undertook to access that information:
“Examples of entity graphs include Microsoft Corporation’s Satori and Google’s Knowledge Graph, or Facebook’s semantic graph.”
A Microsoft patent application on semantic search was published at the World Intellectual Property Organization this week. It describes how Microsoft’s understanding of entities may influence the search results you might see at Bing.