Microsoft on Determining Search Engine Spam From Email Spam
Are there enough connections between email spam and search engine spam that understanding how they relate could help search engines fight search spam?
A patent recently granted to Microsoft explores some ways that might help a search engine eliminate spam from the search results it shows by paying more attention to spam in emails that people receive.
It’s not unusual for a spam email to contain a web address where the recipient can go to learn more about the product or service described in the message.
The patent goes into a lot of detail about how spam might be identified both in emails and on web pages, and it is worth spending some time with if you are interested in learning more about how search engine spam and email spam might be detected.
The most interesting part of the patent is how it uses the relationship between email spam and search spam.
Web addresses listed in email spam, and information about the sites where that spam may originate, might be helpful in fighting search spam related to those messages.
A number of filters built for email spam apply a large set of rules about terms that appear in spam messages to calculate a confidence level indicating how likely it is that an email is spam. People who receive emails may also rate whether or not they are spam.
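To make the idea of a confidence level concrete, here is a minimal sketch of a rule-based filter blended with recipient ratings. The patterns, weights, and the 50/50 blend are hypothetical choices made for illustration; the patent does not specify them, and real filters use far larger rule sets.

```python
import re

# Hypothetical rules as (pattern, weight) pairs. Neither the patterns
# nor the weights come from the patent; they are illustrative only.
RULES = [
    (re.compile(r"\bfree\b", re.I), 0.3),
    (re.compile(r"\bact now\b", re.I), 0.4),
    (re.compile(r"\bno prescription\b", re.I), 0.5),
]


def rule_score(text):
    """Sum the weights of every rule that matches, capped at 1.0."""
    total = sum(weight for pattern, weight in RULES if pattern.search(text))
    return min(total, 1.0)


def spam_confidence(text, user_votes=(), rule_weight=0.5):
    """Blend the rule score with the share of recipients who flagged the email.

    user_votes is an iterable of booleans (True means "this is spam").
    With no votes, the rule score is returned on its own.
    rule_weight is an assumed tuning knob, not something from the patent.
    """
    rules = rule_score(text)
    votes = list(user_votes)
    if not votes:
        return rules
    flagged = sum(votes) / len(votes)
    return rule_weight * rules + (1 - rule_weight) * flagged


if __name__ == "__main__":
    print(spam_confidence("Lunch at noon?"))
    print(spam_confidence("FREE pills, no prescription, act now",
                          user_votes=[True, True, False, True]))
```

A message that trips no rules and has no ratings scores 0.0, while one that trips several rules and is flagged by most recipients lands close to 1.0.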
Using those filters and human ratings, a web page associated with emails that show a high confidence level of being spam may be demoted in search results, or removed from the results altogether.
The search engine would likely also look carefully at those web pages to see if they show any of the patterns that might indicate search engine spam before letting that determination influence rankings.
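The demotion and removal steps described above might be sketched roughly as follows. Everything here is an assumption made for illustration: the thresholds, the penalty factor, matching pages to emails by host name, and the page_looks_spammy callback standing in for whatever page-level checks a search engine would run. None of these details are taken from the patent.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Simple URL matcher for the sketch; real email parsing is more involved.
URL_RE = re.compile(r"https?://[^\s<>]+")


def hosts_in(body):
    """Return the set of host names linked from an email body."""
    hosts = set()
    for url in URL_RE.findall(body):
        host = urlparse(url).hostname
        if host:
            hosts.add(host)
    return hosts


def spam_evidence(emails, threshold=0.9):
    """Count high-confidence spam emails per linked host.

    emails is an iterable of (body, spam_confidence) pairs.
    The 0.9 threshold is a hypothetical cut-off.
    """
    counts = defaultdict(int)
    for body, confidence in emails:
        if confidence >= threshold:
            for host in hosts_in(body):
                counts[host] += 1
    return dict(counts)


def adjust_results(results, evidence, page_looks_spammy,
                   demote_at=1, remove_at=3, penalty=0.5):
    """Demote or drop results whose host appears in spam emails.

    results is a list of (url, score) pairs. A page is only touched when
    the hypothetical page_looks_spammy(url) check also agrees.
    """
    adjusted = []
    for url, score in results:
        hits = evidence.get(urlparse(url).hostname, 0)
        if hits >= demote_at and page_looks_spammy(url):
            if hits >= remove_at:
                continue  # enough evidence: remove from results entirely
            score *= penalty  # some evidence: demote
        adjusted.append((url, score))
    return sorted(adjusted, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    emails = [
        ("Buy now at http://pills.example/buy", 0.95),
        ("Cheap! http://pills.example/x and http://deals.example/", 0.97),
        ("See http://pills.example/offer", 0.99),
        ("Lunch? http://news.example/story", 0.10),
    ]
    results = [
        ("http://pills.example/", 0.9),
        ("http://deals.example/", 0.8),
        ("http://news.example/", 0.7),
    ]
    print(adjust_results(results, spam_evidence(emails), lambda url: True))
```

Requiring the page-level check to agree before any penalty applies reflects the caution in the paragraph above: a web address appearing in spam email is treated as a signal to investigate, not as proof on its own.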
Search engine spam detection using external data
Invented by Bama Ramarathnam, Eric B. Watson, Janine Ruth Crumb
Assigned to Microsoft
US Patent 7,349,901
Granted March 25, 2008
Filed: May 21, 2004