There may be more than one URL for a single page on a website, which can cause problems when a search engine attempts to crawl and index pages on that site.
If the search engine can work out the rules that produce these different URLs for a page, and pick just one of them to index, then it can save time and processing power by crawling and indexing only that one version.
The “canonical” version of a URL is the single, standard version chosen when there is more than one way to represent the URL (or address) of a page.
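To make the idea concrete, here is a minimal sketch of how several spellings of an address might be reduced to one canonical form. The rules are assumptions chosen for illustration, not rules that any particular search engine is known to use: the `IGNORED_PARAMS` set, the treatment of `index.html`, and the `canonicalize` function name are all hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical rule set: query parameters assumed not to change the page content.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign"}
DEFAULT_PORTS = {"http": 80, "https": 443}


def canonicalize(url: str) -> str:
    """Return one standard spelling for URLs assumed to point at the same page."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # Keep the port only when it differs from the scheme's default.
    port = parts.port
    if port is None or DEFAULT_PORTS.get(scheme) == port:
        netloc = host
    else:
        netloc = f"{host}:{port}"

    # Assumption: a directory's index.html is the same page as the directory.
    path = parts.path or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")]

    # Drop ignored parameters and sort the rest so ordering does not matter.
    pairs = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )

    # The fragment (#...) is never sent to the server, so it is discarded.
    return urlunsplit((scheme, netloc, path, urlencode(pairs), ""))


if __name__ == "__main__":
    for u in (
        "HTTP://Example.com:80/index.html?utm_source=feed#top",
        "http://example.com/shoes?color=red&sessionid=abc123",
        "http://example.com/shoes?sessionid=xyz&color=red",
    ):
        print(canonicalize(u))
```

With these assumed rules, the second and third addresses both reduce to the same string, so a crawler would only need to fetch that page once.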
Web crawlers can download only a finite number of documents or web pages in a given amount of time. Therefore, it would be advantageous if a web crawler could identify URL equivalence patterns in multiple different URLs that reference substantially identical pages and download only one document, as opposed to downloading all the substantially identical documents addressed by the multiple different URLs.
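One way a crawler might discover such an equivalence pattern is to compare pages it has already downloaded and look for query parameters whose value never changes what comes back. The sketch below is an illustrative heuristic only, not the method any search engine is confirmed to use. The `irrelevant_params` function and the sample URLs are hypothetical, and it treats pages as identical only when their content hashes match exactly, which is a simplification of "substantially identical."

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit, parse_qsl


def fingerprint(body: str) -> str:
    """Exact-match content hash (a stand-in for real near-duplicate detection)."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def irrelevant_params(samples):
    """Given (url, body) pairs, return parameter names whose value was seen
    to vary without ever changing the page content."""
    # (param, host, path, other params) -> {param value -> content hashes seen}
    groups = defaultdict(lambda: defaultdict(set))
    for url, body in samples:
        parts = urlsplit(url)
        params = dict(parse_qsl(parts.query, keep_blank_values=True))
        content_hash = fingerprint(body)
        for name, value in params.items():
            others = tuple(sorted((k, v) for k, v in params.items() if k != name))
            key = (name, parts.netloc.lower(), parts.path, others)
            groups[key][value].add(content_hash)

    varied, changed = set(), set()
    for key, by_value in groups.items():
        name = key[0]
        if len(by_value) < 2:
            continue  # only one value seen here, so no evidence either way
        varied.add(name)
        if len(set().union(*by_value.values())) > 1:
            changed.add(name)  # a different value produced different content
    return varied - changed


if __name__ == "__main__":
    samples = [
        ("http://example.com/item?id=1&sid=aaa", "Page one"),
        ("http://example.com/item?id=1&sid=bbb", "Page one"),
        ("http://example.com/item?id=2&sid=aaa", "Page two"),
    ]
    print(irrelevant_params(samples))
```

In this made-up sample, changing `sid` leaves the page unchanged while changing `id` does not, so only `sid` is reported. A crawler could then skip any URL that differs from one already fetched only in that parameter.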