How Google might fight Web Spam in Social Networks

For as long as most SEOs can remember, Google has had one person leading its fight against web spam. His name is Matt Cutts, and over the years his position evolved into that of a spokesperson for Google, speaking about actions that Google might take to fight web spam and low-quality content. Matt Cutts is presently on an extended leave of absence from Google.

my fingerprint (index, left hand), Stefano Mortellaro, Some Rights Reserved

News came out a few days ago that Google would be replacing Matt Cutts as its Head of Web Spam. According to that news, the new person in charge of web spam at Google won’t be as vocal as Matt Cutts had been, and Google won’t reveal his or her identity.

So, how will Google move forward?

A patent granted today to Google targets detecting spam across a social network. It starts early on with a definition of spam that informs the solution it presents:

Spam is a rampant problem across the Internet. Spam (e.g., junk or unsolicited bulk messages) is generally identical messages sent to numerous recipients who did not request it. While spam is typically thought of in the e-mail context, it is also becoming widespread on social networks. A spammer can attempt to elicit random users to help monetize. The spammer makes a large number of identical or nearly identical posts. This can be done by email, online service pages, or comments on a blog or a social network. A relevant aspect to a spammer is coverage. The greater the number of unique users that view a spam post, the greater the number of possible monetize-able events that may occur. That is, random users responding to or otherwise following the instructions in a spam post result in the spammer accruing money.

Some social networks allow public posting. A user can comment on a public post. Spammers can also comment on a post.

This patent specifically targets spam in social networks and the features that are unique to them. It tells us that one such feature is the way a social network finds and identifies popular content shared within it. That allows popular posts to be surfaced to other members of the social network, who can then view these trending posts. Spammers have supposedly been taking advantage of the “high number of views the comments of such posts will receive.”

Fingerprint Detection

Flowchart from the patent on the use of a fingerprint detector.

The patent proposes an approach to detecting spam across a social network. It relies on a spam detector made up of modules such as a fingerprint generator, a comparison module, and a response module, so that once spam is found and identified, the system can respond to it. The fingerprint generator captures the characteristics of a comment. The comparison module can be used to identify that fingerprint in other comments left in a social network. The patent then tells us:

The fingerprint is compared to other fingerprints previously generated and stored. If the fingerprint matches any previously stored fingerprints, it is considered to be spam and processed accordingly. If the fingerprint does not match any previously stored fingerprints, it is posted in the social network.
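
The compare-then-post flow in that passage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the excerpts quoted here don't say how a fingerprint is computed, so the normalization step and the SHA-256 hash below are stand-ins, and the names `make_fingerprint` and `SpamDetector` are hypothetical rather than taken from the patent.

```python
import hashlib
import re


def make_fingerprint(comment):
    """Reduce a comment to a fixed-length fingerprint.

    Assumption: lowercase the text, collapse whitespace, then hash with
    SHA-256. The patent excerpts do not specify the real algorithm.
    """
    normalized = re.sub(r"\s+", " ", comment.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SpamDetector:
    """Hypothetical detector: stores fingerprints and flags repeats."""

    def __init__(self):
        # Fingerprints of comments that were previously accepted.
        self.stored_fingerprints = set()

    def submit(self, comment):
        fingerprint = make_fingerprint(comment)
        if fingerprint in self.stored_fingerprints:
            # A match with a stored fingerprint is treated as spam.
            return "spam"
        # No match: remember the fingerprint and post the comment.
        self.stored_fingerprints.add(fingerprint)
        return "posted"


detector = SpamDetector()
first = detector.submit("Buy cheap watches at example.com!")
second = detector.submit("  buy CHEAP watches   at example.com! ")
third = detector.submit("Nice write-up, thanks.")
```

Because of the normalization step, the second comment produces the same fingerprint as the first and is flagged, while the unrelated third comment is posted. An exact hash like this would miss "nearly identical" posts that differ by a word or two; catching those would need a fuzzier fingerprint, which is beyond this sketch.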

The patent tells us what advantages it brings:

  • First, the spam detector prevents comments that are spam from being posted in a social network
  • Second, the spam detector can be used with other signals to improve the accuracy of identifying spam
  • Third, the spam detector can be used for any user generated content to reduce or eliminate spam

Spam Detector Shown in Social Network

The patent is:

Detecting spam across a social network
Inventors: Christopher Jones and Stephen Kirkham
Assigned to Google
US Patent 9,043,417
Granted May 26, 2015
Filed: July 10, 2012


A system and method for detecting spam across a social network using a spam detector is disclosed. The system comprises a post receiving module, a fingerprint generator, a fingerprint comparison module, fingerprint storage and a spam response module. A comment is received by the post receiving module and a fingerprint is generated by the fingerprint generator using the comments.

The fingerprint is compared to other fingerprints previously generated and stored by the fingerprint comparison module. If the fingerprint matches any previously stored fingerprints, it is assumed to be spam and processed accordingly by a spam response module. If the fingerprint does not match any previously stored fingerprints, it is posted in the social network.

The patent tells us that its use isn’t limited to just social networks, but it can be used more widely for other types of user generated content:

While the present disclosure will now be described in the context of a social network, it should be understood that the principles of the present disclosure could be applied to any service that collects and posts user generated content. For example, any third party systems that accept comments, reviews, posts, tips, check-ins, comments on WordPress blogs, etc. may use the spam detector described below.

The Spam Detector acts to:

  • Prevent content from being posted or transmitted in the social network.
  • Remove abusive users from the social network.

The patent tells us that it gives the benefit of the doubt to people who post duplicate content by innocent mistake:

If the fingerprint does not match the previously stored fingerprint, then the method continues to post the comment on the social network. However, if the fingerprint matches the previously stored fingerprint, the spam detector has identified that it is a duplicate comment. This may be an innocent error or it may be a spam attempt. If a post of an accepted comment is the same as that of the proposed comment being considered, the proposed comment can be considered as being a duplicate made in error. On the other hand, if the proposed comment is to a different original post, the likelihood of the comment being made in error is much lower and the present method identifies the user as a potential spammer. The method continues to process the proposed comment as spam. As has been noted above, this additional processing may reject the comment for posting, perform automated or manual review of the user’s post, silence the user or disable the user’s account.
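
The distinction the patent draws there, between a duplicate on the same post and a duplicate on a different post, might look something like the following sketch. The class name `DuplicateChecker`, the return labels, and the choice to remember only the first post where a fingerprint was accepted are all simplifying assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DuplicateChecker:
    """Hypothetical helper: tracks where each fingerprint was accepted."""

    # Maps a comment fingerprint to the ID of the post where that
    # comment was first accepted (a simplification for this sketch).
    accepted: dict = field(default_factory=dict)

    def check(self, fingerprint, post_id):
        previous_post = self.accepted.get(fingerprint)
        if previous_post is None:
            # Never seen before: accept and remember where it was posted.
            self.accepted[fingerprint] = post_id
            return "post"
        if previous_post == post_id:
            # Same comment on the same post: likely an innocent double submit.
            return "duplicate_in_error"
        # Same comment on a different post: treat the user as a potential spammer.
        return "potential_spam"


checker = DuplicateChecker()
results = [
    checker.check("abc123", "post-1"),
    checker.check("abc123", "post-1"),
    checker.check("abc123", "post-2"),
]
```

Here `results` ends up as `["post", "duplicate_in_error", "potential_spam"]`: the first submission is posted, the repeat on the same post is treated as an error, and the repeat on another post is handed off for spam processing.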

Spam Comments

For the most part, the patent doesn’t describe how it might recognize spam, but it does include a method to try to “determine the likelihood that a comment is spam by applying various rules such as those identified above as well as other policies to categorize comments based on the signals associated with that comment.” So, it might include “five different categories and each has a different corresponding action that the spam response module will perform based upon the categorization.” The patent doesn’t tell us what these particular rules are.

Users of a social network might be terminated from the service based upon this patent:

For example, if there have been a number of duplicate comments from this user, and they are all within a very short time-frame, or the user account location is identified as one often used by spammers, there may be a very high likelihood that the comment is spam. If so, the method disables the account of the user and the method ends.

Other, less definite signals that someone is potentially engaging in spam might trigger a manual review of their actions.
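
To make the idea of signals mapping to actions concrete, here is a rough sketch. The patent doesn't disclose its actual rules, categories, or thresholds, so every number and name below (`CommentSignals`, `choose_action`, the cutoffs of 5 duplicates and 60 seconds) is an invented placeholder; only the general shape comes from the passages quoted above, where strong signals disable an account and weaker ones trigger a review.

```python
from dataclasses import dataclass


@dataclass
class CommentSignals:
    """Hypothetical signals attached to a user's proposed comment."""

    duplicate_count: int       # earlier duplicates of this comment by the user
    seconds_spanned: float     # time window covering those duplicates
    suspicious_location: bool  # account location often used by spammers


def choose_action(signals):
    # All thresholds are made-up placeholders, not values from the patent.
    rapid_burst = signals.duplicate_count >= 5 and signals.seconds_spanned <= 60
    flagged_repeat = signals.duplicate_count >= 2 and signals.suspicious_location
    if rapid_burst or flagged_repeat:
        # Very high likelihood of spam: disable the account.
        return "disable_account"
    if signals.duplicate_count >= 2:
        # Less definite signals: send the user's activity to manual review.
        return "manual_review"
    if signals.duplicate_count == 1:
        # A single duplicate: reject the comment but leave the account alone.
        return "reject_comment"
    return "post_comment"


burst = choose_action(CommentSignals(8, 30.0, False))
slow_repeat = choose_action(CommentSignals(3, 3600.0, False))
single = choose_action(CommentSignals(1, 0.0, False))
clean = choose_action(CommentSignals(0, 0.0, False))
```

A real system would presumably weigh many more signals, and would do so less crudely, but the ordering of the checks shows how a handful of categories could each map to a different response.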

OK, the patent doesn’t give us a road map to exactly what spam is, or to what might trigger the negative consequences of spamming. But it probably shouldn’t share information like that: a patent that makes it easier for people to spam the search engine would be highly undesirable, especially to Matt Cutts’s replacement, whoever he or she might be.


39 thoughts on “How Google might fight Web Spam in Social Networks”

  1. Interesting, it sounds similar to how Akismet currently works. I wonder if this technology is being used on Google Plus at the moment?

  2. This seems like a pretty simple and common sense way of finding spam. How could something like this not already be in effect?

  3. Great post Bill, and good to know the technique described above for finding spam. Is there some count at which it will flag a fingerprint as spam? Let’s say 2 or 3 occurrences of such a comment will mark it as spam. Is it something like that?

  4. Technique is pretty good to stop spamming and Matt Cutts replaced by someone else I think that’s a confirmed news but no official announcement by Google..!!

  5. Hi Bill:

    Thanks a lot for another nice post. It will be interesting to see how Google will do this.

    Best Regards
    Miraj Gazi

  6. Interesting. There are a few points IMO…1. As it’s a Google patent presumably it will apply only YouTube, as that is really the only place I see that kind of posting happening anyway. Even if it were a problem on FB or somewhere else they are a different site/company, which may or may not choose to adopt (or even be able to adopt) the technology. 2. As far as those sites are concerned, fresh new content is key. If most of what you get is spam, and especially if it seems to be popular, how likely are they to remove it? There are a few factors bundled into that one. 3. Anyone ever used Hootsuite? Exactly where is the line between efficiency and spamming?

    Presumably all these things have been taken into consideration. The last point in particular. Unless you are a hardcore spammer it is doubtful this will affect anything you do….I think anyway. GSA SER et al, you had a good run.

  7. Hi Aaron (Marketing tops),

    It’s possible that this may be used for Google+ in addition to YouTube, and the patent does state that it might be used for user generated content of all types, including comments on WordPress blogs, of which there are a lot.

  8. Hi Tom,

    Google hasn’t made an official public announcement, but did tell the reporters at Search Engine Land that they’ve picked a person to replace Matt in the role that he held, but that they didn’t want to make a public announcement to avoid making that person the spokesperson that Matt had become.

  9. Hi Dan,

    It’s possible that it seemed like common sense to the people who implemented it as well, but people following up with their actions may have decided that it was worth patenting, and did so.

  10. Very interesting. Spam is a problem and no matter how tight the Spamming (Akismet) software is set at it still seems to get by. I happen to be on Google+ and although I haven’t been hit with Spam I do see it from time to time…great article and looking forward to seeing how this evolves.

  11. My gut says a little too late. I thought they already had this capability to some degree. It is really going after one small aspect of the problem but it looks like a move in the right direction.

  12. Hey Bill,

    Another interesting post you have here. A patent for anti-spamming surely looks promising, however, I have a few concerns about it myself. One of my concerns is about the availability and effective reach of such patent. Google+ and other social related sites affiliated with Google will surely be beneficial with this proposal. However, how about the giant social media havens like Facebook, Twitter and Instagram? They will surely have a say to this.

    I am really hoping that Google will make the first move in approaching these major sites to propose such patent. But to be honest, I have little hope that it will happen. Whatever the outcome though, it is still very good to hear that Google is making their move to prevent and totally dispose what most netizens believe as a notorious problem in the internet before and today; Spam.

  13. Hi Johnny, it’s quite possible that Google had the capability to do something like this but hadn’t protected it with a patent, and decided it would be good to protect with one.

  14. Spam is the bane of my life! Manage to have it under control on most of my blogs and Google+ plus but would love something like this to be rolled out on Facebook for example where fellow therapists try and hijack my page to advertise their services. Now if only they could extend it to telephone calls as well my life would be much happier.

  15. Good to hear that spam is well-managed on WordPress and Google+. Unfortunately, this is Google’s patent, so it won’t be used by Facebook. Shame that members of your industry engage in those types of practices.

  16. What we can say about web spam, it really an serious issue of all web world. Everyone is facing this issue. Google surely taking strict actions on this, which lead to under estimate the web spam. Thanks for sharing this important information.

  17. I think its becoming increasingly more important to not enter into spamming activities, even if Google doesn’t penalise you now, it could easily do it in the future

  18. Hi..
    Thank you for sharing this article and This seems like a pretty simple and common sense way of finding spam. How could something like this not already be in effect.
    thanks you

  19. Hi..
    Thank you for post this article and sharing the information about the great post discussing the issue of spam detection and spam removal technique .It,s technique is pretty good to stop spamming and Matt Cutts replaced by someone else

  20. Its sad to know that Matt Cutts would stop being face of Google’s anti spam efforts. But it is interesting to see the new efforts being taken by Google to prevent spam in social media.

  21. Hi Bill,

    Thank you for give us update on google new tactics to stop spamming. As spammer making new trick and google are ready to kick them with their new strategy. Feel bad for matt cutt as he share a lot info to us.

    Kind Regards
    Yasin Rishad

  22. Hi @Vikas Singh Gusain, i am agree with You.
    Thank you so much @Bill, for contributing your important time to post such an interesting
    & useful collection.It would be knowledgeable & resources are
    always of great need to everyone. Please keep continue sharing.

  23. Typically I really don’t go through document for weblogs, however desire to express that this kind of write-up extremely forced us to think about along with do this! A person’s way with words continues to be surprised me. Thank you so much, pleasant content

  24. Webspam is indeed a big effort. And in fact, it’s an effort needed to be must to lift really the genuine online activities. Organic SEO is based on a lot of work and it ever requires quality sharing and gaining good results in turn. Spam detection, comparison and respond, a good way to tear it apart.

  25. Hey bill, thanks a lot for sharing such an important information. Actually, nowadays spamming is becoming a big problem of the web world. We all’re facing this issue badly, and yeah, big thanks to google. If google doesn’t penalize anyone for spamming than it could be easy to do it. But Yeah the technique which you mention is pretty good to stop spamming.

  26. Hi Olivia,

    You’re welcome. Google has come out with some interesting processes to fight spam, and yes it is a challenge. I feel like I’m seeing less spammy search results than I used to.

  27. It probably has to do with the fact that modern spam-blocking software in G-mail, Hotmail,Yahoo works reasonably nowadays. And few security experts are saying that scammers are more interested in spearfishing attacks on your personal data though other ways, which can produce even better result.
