It’s fun to see something interesting come from a search engine that isn’t one of the big names.
A new patent application from Become, Inc., looks at links in a different way: links still play a role in the rankings of pages, but not every link holds the same value.
The summary of the patent points out some of the problems with on-page factor analysis and link structure analysis. The inventors also write about the “artificial web,” which involves the use of scripts to write:
…millions or billions of simple web pages that contain links to a few websites to be promoted. As the number of these artificial web pages can be comparable to that of the major portion of the real Web, the spammers can wield undue influence in manipulating the link structure of the entire Web, thereby affecting the computation of PageRank.
We’ve seen this “artificial web” as a significant issue recently with Google, as reported on Search Engine Watch in Google Yanks Sites 5 Billion Pages After Spam Complaint. Does Become.com have a solution to this type of problem?
Method for assigning relative quality scores to a collection of linked documents
Inventors: Rohit Kaul, Marcin Kadluczka, Yeogirl Yun, and Seong-Gon Kim
Assigned to Become, Inc.
US Patent Application 20060143197
Published June 29, 2006
Filed December 23, 2005
A method for assigning relative quality scores to a collection of linked documents is presented. The method includes constructing a spring network according to a connectivity graph of a linked database and determining the strength of inter-nodal springs based on the link structure of the network and the displacements on end-nodes. The method may further include computing the displacements of the nodes in a spring network through an iterative process and obtaining the quality scores for documents from the converged displacements of nodes. The method may also include obtaining the relative quality scores for groups of documents. The method may further include assigning topic-specific quality scores to documents in a linked database.
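The abstract doesn’t spell out the actual equations, but the general shape of the idea, building a spring network on top of the link graph and reading quality scores off the converged node displacements, can be sketched in a few lines. Everything below is an illustrative assumption on my part, not the patent’s math: I anchor each page to zero with a weak spring, let each inbound link act as a spring that pulls the target page toward the linking page’s displacement plus one (weighted by the source’s out-degree, a PageRank-style normalization), and relax the network iteratively until the displacements settle.

```python
def spring_scores(links, n, anchor_k=0.15, iters=200):
    """Toy spring-network relaxation over a link graph.

    links: list of (source, target) directed edges between pages 0..n-1.
    Each page is anchored to displacement 0 by a weak spring (anchor_k)
    and pulled toward (source displacement + 1) by each inbound link,
    with link stiffness 1 / out-degree of the source. All constants
    here are made up for illustration; the patent does not give them.
    """
    outdeg = [0] * n
    inlinks = [[] for _ in range(n)]
    for s, t in links:
        outdeg[s] += 1
        inlinks[t].append(s)

    x = [0.0] * n  # node displacements, read as quality scores
    for _ in range(iters):
        new_x = []
        for i in range(n):
            # Stiffness of each inbound spring: 1 / out-degree of source.
            ks = [1.0 / outdeg[s] for s in inlinks[i]]
            # Equilibrium of node i: weighted balance between the anchor
            # (pulling toward 0) and each inbound spring (pulling toward
            # the source's displacement + 1).
            num = sum(k * (x[s] + 1.0) for k, s in zip(ks, inlinks[i]))
            den = anchor_k + sum(ks)
            new_x.append(num / den)
        x = new_x
    return x
```

On a tiny graph where page 0 links to pages 1 and 2, and page 1 also links to page 2, the relaxation leaves page 0 (no inlinks) at zero and gives page 2 the highest displacement, since it is pulled up by two springs, which matches the intuition that scores should accumulate along the link structure rather than treat every link identically.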