Open Data Commons Opportunities

Many government websites make the data that they collect and compile freely available to the public. The licenses that this data has been released under are described on the following pages:

ODC Public Domain Dedication and License (PDDL)
Open Data Commons Open Database License (ODbL)
Open Data Commons Attribution License

If you are considering starting a project that uses this kind of data, you should read the Open Data Handbook, which covers the details. Much more information is available on Data.gov, including a broad overview of the topics that data is available for:

  • Agriculture,
  • Business,
  • Climate,
  • Consumer,
  • Ecosystems,
  • Education,
  • Energy,
  • Finance,
  • Health,
  • Local Government,
  • Manufacturing,
  • Ocean,
  • Public Safety,
  • Science & Research.

Continue reading Open Data Commons Opportunities

How Google May Identify Central Entities from Resources

A Google patent granted this week describes how Google might try to understand Entities that appear on Web pages, and how that understanding might influence the results that the search engine shows.

An Entity is a specifically named person, place, or thing (including ideas and objects) that could be connected to other entities based upon relationships between them. Some pages may make certain Entities the main Subject of a page, while others may include additional information about entities that are related in some manner to those first entities. When some entities appear on pages, they may be presented in an ambiguous manner that doesn’t make them the main topic of the page they appear upon.

Entities are said to exist in a graph that connects them to other entities based upon relationships between them. For instance, Google and Bing are both Search Engines, both internet domains, and both employers of many search engineers, and both have CEOs, Vice Presidents, Marketing staff, headquarters, data centers, and Web indexes. There are a lot of related entities that might show up on Web pages about both.

This view of Entities being related to each other, and belonging to an “Entity Graph”, is very similar to what is described in the Microsoft Patent I wrote about recently in How Bing May Expand Queries Based upon Finding Entities Within them. A number of the ideas behind that patent and this one are similar, in that some knowledge about an entity might cause a search engine to display information about related entities.
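
The idea of entities connected by relationships can be illustrated with a small Python sketch. This is only an illustration and not how either patent implements its graph: the triples, the relation names (is_a, has, owned_by), and the function names are all assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical, hand-written (entity, relation, related entity) triples.
# The relation names are illustrative assumptions, not taken from any patent.
TRIPLES = [
    ("Google", "is_a", "Search Engine"),
    ("Bing", "is_a", "Search Engine"),
    ("Google", "is_a", "Internet Domain"),
    ("Bing", "is_a", "Internet Domain"),
    ("Google", "has", "Headquarters"),
    ("Bing", "has", "Headquarters"),
    ("Google", "has", "Data Centers"),
    ("Bing", "has", "Data Centers"),
    ("Google", "owned_by", "Alphabet"),
    ("Bing", "owned_by", "Microsoft"),
]


def build_graph(triples):
    """Map each entity to the set of (relation, related entity) pairs it has."""
    graph = defaultdict(set)
    for subject, relation, obj in triples:
        graph[subject].add((relation, obj))
    return graph


def shared_relationships(graph, first, second):
    """Return the relationships that two entities have in common, sorted."""
    return sorted(graph[first] & graph[second])


ENTITY_GRAPH = build_graph(TRIPLES)

for relation, related in shared_relationships(ENTITY_GRAPH, "Google", "Bing"):
    print(relation, related)
```

The loop prints only the relationships that Google and Bing share, which is the kind of overlap that might lead related entities to show up on pages about both.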

Continue reading How Google May Identify Central Entities from Resources

Using Photos & Data Under a Creative Commons License

Below is a Creative Commons image from Flickr. The caption to the photo shows the kind of attribution that a Creative Commons Attribution License calls for when using an image like this from Flickr:

Don McCullough
Some Rights Reserved
Sunset and silhouette

When you choose to publish a photo or data under a Creative Commons License, you’re giving other people information about their rights to use your copyrighted materials. This means that you should understand what the different licenses mean.

Notice that the information on these sites provides Open Data available through licenses that allow people to create something new or useful.

Continue reading Using Photos & Data Under a Creative Commons License

Licensing Requirement Information found in Linked Open Data

In 2009, Tim Berners-Lee, the inventor of the World Wide Web, recorded the following TED Presentation about Linked Open Data.

This video describes his next idea for a use for the Web, where Data is shared and can flow freely, published by people under open licenses.

The L4LOD Vocabulary Specification 0.2 contains information about the different licenses that cover the data sets of the LOD cloud.

Below, I’ve shared links to and information about the Open licenses that data there has been published under, to help encourage others to create using Linked Data and to share data under Open data licenses like the ones described.

Creative Commons

Continue reading Licensing Requirement Information found in Linked Open Data

How Google May Index Deep Web Entities

If you’ve been doing SEO for a while, one of the papers that you may have read describes how Google was attempting to index content found on the Web that might be difficult for their crawlers to access, such as financial statements from the SEC. The search engine would have to try to access this information by filling out a form and guessing good queries, because that was the only way to access the information – they couldn’t crawl it without querying it first. This paper describes efforts that Google undertook to access that information:

Google’s Deep-Web Crawl

From the abstract to the paper:

Continue reading How Google May Index Deep Web Entities

How Bing May Expand Queries Based upon Finding Entities Within them

“Examples of entity graphs include Microsoft Corporation’s Satori and Google’s Knowledge Graph, or Facebook’s semantic graph.”

Bing’s Satori Result needs updating on a search for [Satori knowledge graph]

A Microsoft patent application on Semantic Search was published at the World Intellectual Property Organization this week. It describes how Microsoft’s understanding of Entities may influence the search results you might see at Bing.

As a Microsoft patent tells us:

Continue reading How Bing May Expand Queries Based upon Finding Entities Within them

How Named Entities Connected to Trending Topics can be used to Address Real Time Search Results

When someone performs a search at one of the major search engines, the search engine focuses upon returning as quick and helpful an answer as possible. Part of that can involve looking the query up in a “trending topics” database to see if there’s some recent news that should be reported to the searcher. This is how the search engines are increasingly becoming a real time monitor of world events.
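
A lookup against a “trending topics” database could look something like the Python sketch below. It is a minimal illustration under assumptions of my own, not the method from any patent: the TRENDING table, its placeholder suggestions, and the trending_suggestions function are all hypothetical, and the matching is a plain substring test rather than real named-entity recognition.

```python
# Hypothetical in-memory "trending topics" table, keyed by a named entity.
# The suggested queries are placeholders, not real trending data.
TRENDING = {
    "mark zuckerberg": [
        "placeholder trending query 1",
        "placeholder trending query 2",
    ],
}


def trending_suggestions(query, trending=TRENDING):
    """Return suggested queries for any trending entity named in the query."""
    # Lowercase and collapse whitespace so that matching is forgiving.
    normalized = " ".join(query.lower().split())
    suggestions = []
    for entity, related_queries in trending.items():
        # Simplifying assumption: a substring match stands in for
        # proper named-entity recognition.
        if entity in normalized:
            suggestions.extend(related_queries)
    return suggestions


print(trending_suggestions("Mark  Zuckerberg interview"))
print(trending_suggestions("weather forecast"))
```

A query that names a trending entity gets that entity's suggestions back, while an unrelated query gets an empty list and would fall through to ordinary results.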

Yahoo suggests a number of real time news results on a query for Mark Zuckerberg

A recently granted patent at Yahoo (Bing has taken over crawling of web pages for Yahoo, but the deal between the two companies allows Yahoo to massage the data they receive and show off the results they want to) describes how they might “identify… and recommend… queries related to trending topics based on a query received from a user of an information retrieval system.”

The patent describes its focus and the challenges it intends to overcome as follows:

Continue reading How Named Entities Connected to Trending Topics can be used to Address Real Time Search Results

SEO from Google’s Direct Answers

Google has started showing Direct Answers to questions related to SEO. That has made me wonder how much someone could learn about SEO at Google with those direct answers, and I wanted to see which terms Google was showing answers for and which sources it was using. I expect there may be a lot of churn in the sources Google shows answers from.

I started off by asking about SEO itself:

what is seo

I then wanted to look at some topics that might have questionable answers and advice, and asked about the next three topics to see if SEO myths were being promoted by Google Direct Answers. It seems like they are, given the following three answers about Reciprocal links, Keyword Density, and LSI (Latent Semantic Indexing):

What are reciprocal links?

Continue reading SEO from Google’s Direct Answers

Getting Information about Search, SEO, and the Semantic Web Directly from the Search Engines