The great thing about HTML is that it’s so flexible and offers so many ways to do things. The worst thing about HTML is that it’s so flexible and offers so many ways to do things. I’ve looked at a lot of websites, and I still see people doing things in new ways.
A problem common to many websites is that a single page can be reached at more than one URL. A site owner might set things up that way for a number of reasons, and in a number of ways. It can also be a side effect of the content management system being used.
A patent application published by Google explores how the search engine might recognize when a URL it finds through a web crawl and another URL it finds through a feed, such as a product feed, refer to the same page even though the two URLs are structured differently.
This seems like potentially a lot of work to me, and the patent filing has me shaking my head that Google might spend resources sorting out duplicated content on a site, even if doing so might help the search engine better understand URLs and the products and other information associated with them.
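To give a rough sense of why matching a crawled URL to a feed URL takes some work, here is a minimal Python sketch of one naive approach: reduce each URL to a comparison key and see whether the keys match. This is my own illustration, not the method the patent filing describes. The `IGNORED_PARAMS` list, the function names, and the example.com URLs are all hypothetical, and treating `http`/`https` and `www`/non-`www` versions as the same page is an assumption that won't hold on every site.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical list of query parameters assumed not to change the page shown.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "ref"}


def normalize(url):
    """Reduce a URL to a comparison key: (host, path, sorted query pairs).

    Assumptions (not true for every site): the scheme and port are ignored,
    a leading "www." is dropped, and a trailing slash is not significant.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    # Keep only the parameters that might identify the page, in a stable order.
    query = tuple(sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ))
    return (host, path, query)


def same_page(url_a, url_b):
    """Guess whether two differently structured URLs point at the same page."""
    return normalize(url_a) == normalize(url_b)


# One URL as a crawler might find it, one as a product feed might list it.
crawled_url = "http://www.example.com/products/widget/?id=42&utm_source=newsletter"
feed_url = "https://Example.com/products/widget?id=42"
other_url = "https://example.com/products/widget?id=43"
```

With these examples, `same_page(crawled_url, feed_url)` is true and `same_page(crawled_url, other_url)` is false. A simple key like this falls apart quickly, though. It has no way of knowing that `/products/widget?id=42` and something like `/item/42` might be the same product, which is the kind of structural difference that makes the problem hard.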