Open Data Commons Opportunities

Many government Web sites have made the data that they collect and compile freely available to the public. The licenses that data has been released under are described on the following pages:

ODC Public Domain Dedication and License (PDDL)
Open Data Commons Open Database License (ODbL)
Open Data Commons Attribution License

If you are considering starting a project using that kind of data, you should read the Open Data Handbook, which goes into considerable detail. Much more information is available on Data.gov, including a broad overview of the topics that data is available about:

  • Agriculture,
  • Business,
  • Climate,
  • Consumer,
  • Ecosystems,
  • Education,
  • Energy,
  • Finance,
  • Health,
  • Local Government,
  • Manufacturing,
  • Ocean,
  • Public Safety,
  • Science & Research.

There’s some interesting discussion of licensing, openness of data, and attribution in a post on the Open Knowledge Blog, titled Open Data: Openness and Licensing.

Continue reading “Open Data Commons Opportunities”

How Google May Identify Central Entities from Resources

A Google patent granted this week describes how Google might try to understand entities that appear on Web pages, and how that awareness might influence the results that the search engine shows.

An entity is a specifically named person, place, or thing (including ideas and objects) that could be connected to other entities based upon relationships between them. Some pages may make certain entities the main subject of the page, while others may include additional information about entities that are related in some manner to those first entities. When some entities appear on pages, they may be presented in an ambiguous manner that doesn’t make them the main topic of the page they appear on.

Entities are said to exist in a graph that connects them to other entities based upon relationships between them. For instance, Google and Bing are both search engines, both internet domains, and both employers of many search engineers, and both have CEOs, vice presidents, marketing staff, headquarters, data centers, and Web indexes. Many related entities might show up on Web pages about both.
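The idea of entities connected by relationships can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the relationship labels (`is_a`, `employs`) and the helper names `build_graph` and `shared_relations` are hypothetical, and nothing here reflects how Google or Bing actually store or query entities.

```python
from collections import defaultdict

# Hypothetical toy entity graph: each edge is (entity, relationship, entity).
# The entities come from the example above; the relationship labels are made up.
EDGES = [
    ("Google", "is_a", "Search Engine"),
    ("Bing", "is_a", "Search Engine"),
    ("Google", "is_a", "Internet Domain"),
    ("Bing", "is_a", "Internet Domain"),
    ("Google", "employs", "Search Engineers"),
    ("Bing", "employs", "Search Engineers"),
]


def build_graph(edges):
    """Index edges by source entity: entity -> set of (relationship, target)."""
    graph = defaultdict(set)
    for source, relationship, target in edges:
        graph[source].add((relationship, target))
    return graph


def shared_relations(graph, first, second):
    """Return the (relationship, target) pairs that two entities have in common."""
    return graph[first] & graph[second]


graph = build_graph(EDGES)
common = shared_relations(graph, "Google", "Bing")
for relationship, target in sorted(common):
    print(f"Google and Bing both: {relationship} -> {target}")
```

Storing each entity's outgoing edges as a set makes "what do these two entities have in common?" a simple set intersection, which is the kind of overlap that could lead related entities to show up on pages about both.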

This view of entities being related to each other, and belonging to an “Entity Graph”, is very similar to what is described in the Microsoft patent I wrote about recently in How Bing May Expand Queries Based upon Finding Entities Within them. A number of the ideas behind that patent and this one are similar, in that some knowledge about an entity might cause a search engine to display information about related entities.

Continue reading “How Google May Identify Central Entities from Resources”

Using Photos & Data Under a Creative Commons License

Below is a Creative Commons image from Flickr. The caption to the photo shows the kind of attribution that a Creative Commons Attribution License calls for when using an image like this from Flickr:

Don McCullough
Some Rights Reserved
Sunset and silhouette

When you choose to use a photo or data available under a Creative Commons license, you’re giving other people information about their rights to use your copyrighted materials. This means that you should understand what the different licenses mean.

Notice that the information on these sites provides Open Data available through licenses that allow people to create something new or useful.

Continue reading “Using Photos & Data Under a Creative Commons License”

Licensing Requirement Information found in Linked Open Data

In 2009, Tim Berners-Lee, the inventor of the World Wide Web, recorded the following TED Presentation about Linked Open Data.

This video describes his next idea for a use for the Web, where data is shared and can flow freely, published by people under open licenses.

The L4LOD Vocabulary Specification 0.2 includes information about the different licenses that cover the data sets of the LOD cloud.

I’ve shared links to and information about the open licenses that data there has been published under below, to help encourage others to create using Linked Data, and to share data under open data licenses like the ones described.

Creative Commons

Continue reading “Licensing Requirement Information found in Linked Open Data”

How Google May Index Deep Web Entities

If you’ve been doing SEO for a while, one of the papers that you may have read describes how Google was attempting to index content found on the Web that might be difficult for their crawlers to access, such as financial statements from the SEC. The search engine would have to try to access this information by filling out a form and guessing good queries, because that was the only way to access the information – they couldn’t crawl it without querying it first. This paper describes efforts that Google undertook to access that information:

Google’s Deep-Web Crawl

From the abstract to the paper:

Continue reading “How Google May Index Deep Web Entities”

How Bing May Expand Queries Based upon Finding Entities Within them

“Examples of entity graphs include Microsoft Corporation’s Satori and Google’s Knowledge Graph, or Facebook’s semantic graph.”

Bing’s Satori Result needs updating on a search for [Satori knowledge graph]

A Microsoft patent application published at the World Intellectual Property Organization this week covers semantic search issues, and describes how Microsoft’s understanding of entities may influence the search results you might see at Bing.

As the Microsoft patent application tells us:

Continue reading “How Bing May Expand Queries Based upon Finding Entities Within them”