Those of us who practice Search Engine Optimization (SEO) have been looking at URLs filled with content and at the links between that content. We have watched how algorithms such as PageRank (based upon links pointing between pages) and information retrieval scores (based upon the relevance of that content) determine how well pages rank in search results in response to the queries searchers enter into search boxes. Web pages connected by links have been seen as nodes (information points) connected by edges. This was the first generation of SEO.
Chances are good that many of the methods we have been using to do SEO will remain the same as new features appear in Knowledge Graph-based search, such as knowledge panels, rich results, featured snippets, structured snippets, search by photography, and expanded schema covering many more industries and features than it does at present.
We know that Google doesn’t penalize duplicate pages on the Web, but it may try to identify which version it prefers to other versions of the same page.
Earlier this week, I came across this statement from Dejan SEO about duplicate pages on the Web, wondered about it, and decided to investigate further:
If there are multiple instances of the same document on the web, the highest authority URL becomes the canonical version. The rest are considered duplicates.
The above quote is from the post Link inversion, the least known major ranking factor. (It is not something I am saying with my post. I wanted to see if there might be something similar in a patent. I found something close, but it doesn’t say the same thing that Dejan predicts.)
I read that article from Dejan SEO about duplicate pages and thought it was worth exploring more. As I was looking through Google patents that include the word “Authority,” I found this patent. It doesn’t quite say the same thing that Dejan does, but it is interesting because it describes ways to distinguish between duplicate pages on different domains based upon priority rules, which might help determine which duplicate page is the highest authority URL for a document.
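To make the quoted claim concrete, here is a minimal sketch of the "highest authority URL becomes the canonical version" idea. It illustrates the Dejan statement only, not the patent's priority rules and not how Google actually chooses a canonical. The `authority` numbers, the example URLs, and the function names `content_fingerprint` and `pick_canonicals` are all hypothetical, and duplicates are detected with a simple hash of normalized text, which is an assumption of mine.

```python
import hashlib
from collections import defaultdict


def content_fingerprint(text: str) -> str:
    """Hash lowercased, whitespace-normalized text so exact copies collide."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def pick_canonicals(pages):
    """Group pages by fingerprint and pick one canonical URL per group.

    pages: iterable of (url, text, authority) tuples, where authority is a
    hypothetical score (higher means more authoritative).
    Returns a dict mapping each canonical URL to its list of duplicate URLs.
    """
    groups = defaultdict(list)
    for url, text, authority in pages:
        groups[content_fingerprint(text)].append((authority, url))

    result = {}
    for members in groups.values():
        # Highest authority first; ties are broken by URL for determinism.
        members.sort(key=lambda m: (-m[0], m[1]))
        canonical = members[0][1]
        result[canonical] = [url for _, url in members[1:]]
    return result


# Hypothetical pages: the first two carry the same document.
pages = [
    ("https://example.com/a", "Same   document text", 0.4),
    ("https://example.org/copy", "same document text", 0.9),
    ("https://example.net/other", "A different document", 0.2),
]
canonicals = pick_canonicals(pages)
```

In this toy data the copy on `example.org` has the higher made-up authority score, so it is treated as the canonical version and the `example.com` URL is listed as its duplicate. A rule-based approach like the one the patent describes would replace the single `authority` number with an ordered set of priority checks.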