Microsoft on Determining Search Engine Spam From Email Spam

Are email spam and search engine spam connected closely enough that understanding the relationship between them could help search engines fight search spam?

A patent recently granted to Microsoft explores some ways that might help a search engine eliminate spam from the search results it shows by paying more attention to spam in emails that people receive.

It’s not unusual for a spam email to contain a web address where the recipient can go to learn more about the product or service described in the message.

The patent goes into considerable detail about how spam might be identified both in emails and on web pages. It is worth spending some time with if you want to learn more about search engine spam, email spam, and some of the ways each might be detected.

The most interesting part of the patent is how it uses the relationship between email spam and search spam.

Web addresses listed in email spam, along with information about the sites where that spam may originate, might be helpful in fighting search spam related to those messages.

A number of filters built for email spam look at a large set of rules about terms that appear in email spam to determine a confidence level indicating whether an email might be spam. People who receive emails may also rate them as to whether or not they are spam.
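As a rough illustration of how term rules could add up to a confidence level, here is a minimal sketch. The terms, the weights, and the way they are combined are assumptions made for this example; the patent does not prescribe them, and real filters use far larger rule sets.

```python
import re

# Hypothetical term rules: each maps a phrase to a weight between 0 and 1.
# Neither the terms nor the weights come from the patent.
RULES = {
    "free": 0.3,
    "winner": 0.5,
    "click here": 0.4,
    "no prescription": 0.8,
}


def spam_confidence(text, rules=RULES):
    """Return a 0.0-1.0 confidence that `text` is spam.

    Each matching rule is treated as independent evidence, so the
    confidence is 1 minus the product of (1 - weight) over the matches.
    This combination rule is an assumption chosen for simplicity.
    """
    lowered = text.lower()
    not_spam = 1.0
    for term, weight in rules.items():
        # Word boundaries keep "free" from matching inside "freedom".
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            not_spam *= 1.0 - weight
    return 1.0 - not_spam
```

A message that matches no rule scores 0.0, and each additional matching term pushes the score closer to 1.0 without ever exceeding it.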

Using those filters and human ratings, a web page that is associated with emails rated as spam with high confidence might be demoted in search results, or removed from the results altogether.

The search engine would likely also look carefully at those web pages for patterns indicating search engine spam before letting the email evidence influence rankings.
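The flow described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the `Email` record, the thresholds, the rule that either a high filter score or a user rating flags a message, and the `page_looks_spammy` input (standing in for whatever on-page analysis a search engine runs) are all hypothetical and do not come from the patent.

```python
import re
from collections import defaultdict
from dataclasses import dataclass

URL_RE = re.compile(r"https?://[^\s<>\"']+")


@dataclass
class Email:
    body: str
    filter_confidence: float        # assumed 0.0-1.0 score from a spam filter
    user_marked_spam: bool = False  # assumed human rating


def extract_urls(text):
    """Pull web addresses out of a message, trimming trailing punctuation."""
    return [u.rstrip(".,;:!?)") for u in URL_RE.findall(text)]


def email_evidence(emails, threshold=0.9):
    """Count, per URL, the emails flagged as spam that mention it."""
    counts = defaultdict(int)
    for email in emails:
        if email.filter_confidence >= threshold or email.user_marked_spam:
            for url in set(extract_urls(email.body)):
                counts[url] += 1
    return dict(counts)


def ranking_action(url, evidence, page_looks_spammy, demote_at=2, remove_at=5):
    """Return "none", "demote", or "remove" for a URL.

    Email evidence alone never triggers an action; the page itself must
    also look like search spam, which limits the harm a spammer could do
    by mailing out links to an innocent site.
    """
    count = evidence.get(url, 0)
    if count < demote_at or not page_looks_spammy:
        return "none"
    return "remove" if count >= remove_at else "demote"
```

Requiring both signals is one way to reflect the caution in the post: a page that shows up in spam emails but looks clean on inspection is left alone.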

Search engine spam detection using external data
Invented by Bama Ramarathnam, Eric B. Watson, Janine Ruth Crumb
Assigned to Microsoft
US Patent 7,349,901
Granted March 25, 2008
Filed: May 21, 2004

11 thoughts on “Microsoft on Determining Search Engine Spam From Email Spam”

  1. Hi Kimberly,

    Akismet is a great example of a spam filtering system that also has human voting.

    Imagine if a Google or Yahoo or Live.com took the information collected by Akismet, and looked at pages linked to in comments to see if they were spam. It might help make it easier for the search engines to find search spam pages.

    Hi Rob,

    You’re welcome. Thanks for stopping by.

  2. Bill,

    This seems like an excellent idea, which can’t become reality soon enough. I’d assume the correlation is pretty high right now between email and search spam.

    Just thinking out loud… I wonder, then, whether there would be a chance of email spammers causing extra trouble by putting links to innocent “shields” in their spam?

    Regards,

    Kelly

  3. Hiya Bill, cool post that must resonate with almost all users of the internet. If I have to find another penis enlargement, cracked software, or false college degree email that has slipped past the spam filters on our server, I will scream!

    It is high time that services like Akismet are given more recognition in the quest to eliminate spam, which apparently has overtaken regular email traffic in volume!

    Microsoft is definitely onto something that will add value here!

  4. Hi Kelly,

    Thanks. There is a possibility that spammers might try to poison a system like this by including links to sites that aren’t paying to have spamming done on their behalf. Hopefully the search engine will be smart enough to not just use indications of spam from emails alone when demoting or removing pages from rankings, but will also try to identify spam on the sites themselves.

    Hi Jaques (Net Marketing) and People Finder,

    I’m pretty sick of those emails too, Jaques. Unfortunately, I’ve seen a number of false positives in Akismet. Mostly the program identifies comment spam pretty well. But sometimes it doesn’t, sadly. I do think that it would be interesting for a search engine to look at information like that which might be provided by Akismet.

  5. Thank you for sharing your thoughts, Bill. You are always very interesting. I figure this kind of correlation between e-mail spam and URL spam must be carefully reviewed by humans; otherwise people can use e-mail spam to hurt their competitors’ rankings in the search engines.

  6. This sounds to me like another golden opportunity for SEO saboteurs to gun down their competitors’ sites. This overzealous response to spam has opened the door to a far more insidious form of spam — the phenomenon that is loosely referred to as “negative SEO”.

    When search engines attempt to identify ways to punish sites that employ various forms of promotional spam, they enable these thugs to duplicate their antique spam techniques on behalf of a competitor, who many times unknowingly suffers the consequences of this malevolent behavior.

  7. I wrote an article on spam and the effects it has on your site’s SEO, PR etc. I would love for some pros to answer. Take a look at My article about spam links to your site

  8. Hi Karl,

    I agree. I think that the search engines may be wise enough to know that the risk you describe exists. :)

    Hi Peter

    I definitely understand your concern.

    The patent document does note that, instead of relying on email spam alone to decide that a site might be spamming, the search engine would also look at the site itself before making that determination. Hopefully, if they implement this, they will be careful not to cause harm because of the actions of some malicious people.

    Hi Russell,

    Your concern is one worth considering. I’ve seen the term “Google Bowling” used to describe the problem that you refer to – the impact of spamming sites linking to yours. It’s difficult to say how a search engine would react completely, but one possible action that they might take is to not give any credit to links from sites like the ones that you describe.

    If you can get links to your site from other sites that don’t appear spammy, that may help to mitigate any harm, if there is any.
