The other day, I had a very interesting conversation with a friend about what comment spam is and how it works. I thought it would make a good blog topic as well.
What is comment spam?
Comment spam is the weird comments full of links and garbage that you might have seen on some blogs. Unlike junk mail that fills up your mailbox, comment spam targets search engine robots, not humans. The practice is called spamdexing.
It’s an abusive form of SEO that helps spammers drive traffic to commercial websites. Spammers use the huge web of blogs as a distribution platform to cheat search engines: every link planted in a comment looks like one more site pointing to the page they promote. Driving and controlling traffic is what brings value to spammers.
How do they find my blog?
Millions of blogs are created every day and referenced by popular directories such as Technorati or MyBlogLog, so it’s quite easy to build a huge list of potential victims.
Also, if you post a comment on a popular blog, the link back to your own blog is visible to everyone, so you can be sure you will be next.
How do they manage all these blogs?
Spammers run a test campaign on the blogs they have collected before they actually start spamming. You know you are a potential target when you start getting fake comments such as “nice blog” or “keep up the good work”.
Spammers use this technique to gather information about your blogging platform and check whether it’s open, moderated or filtered. If you leave these comments up long enough, your blog will be marked as eligible for spamming.
How do they actually spam?
Spammers are smart programmers, and everything is automated.
As soon as your blog is marked as open for spamming, it will be flooded with link comments. Spammers relay these comments through open proxies (servers left badly configured by negligent system administrators) available across the Internet, which makes the spammers harder to trace.
Spammers also tend to comment on older posts so that the spam doesn’t show up on the main page of your blog but is still accessible to search engines.
What are the solutions?
There is no perfect solution to this problem. In practice, two solutions are effective: automatic filters and CAPTCHAs.
Automatic filters such as Akismet or Defensio, available as plugins for most blogging platforms, check all your comment traffic and use complex algorithms to score each comment. This generally works quite well, but there is a small chance that good comments will be marked as spam and removed (false positives).
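To make the idea of scoring a bit more concrete, here is a toy sketch in Python. It is not how Akismet or Defensio actually work (their algorithms are far more complex and are not described here). The phrases, weights and threshold are all assumptions picked for illustration.

```python
import re

# Hypothetical phrases and weights, chosen for illustration only.
GENERIC_PRAISE = ("nice blog", "keep up the good work")
LINK_PATTERN = re.compile(r"https?://", re.IGNORECASE)


def spam_score(comment: str) -> float:
    """Return a rough score from 0.0 to 1.0; higher means more spam-like."""
    text = comment.lower()
    score = 0.0
    # Each link adds to the score; links alone contribute at most 0.6.
    links = len(LINK_PATTERN.findall(text))
    score += min(links * 0.2, 0.6)
    # Generic praise, like the probe comments spammers send first.
    if any(phrase in text for phrase in GENERIC_PRAISE):
        score += 0.3
    # Very short comments carry little real content.
    if len(text.split()) < 5:
        score += 0.1
    return min(score, 1.0)


def is_spam(comment: str, threshold: float = 0.5) -> bool:
    """Flag a comment when its score reaches the (assumed) threshold."""
    return spam_score(comment) >= threshold


if __name__ == "__main__":
    print(is_spam("Nice blog! http://a.example http://b.example"))
    print(is_spam("I disagree with your point about captchas, here is why."))
```

Even this tiny version shows where false positives come from: a genuine reader who writes a short comment containing a couple of links would be flagged too, which is why real filters rely on much more than a few hand-written rules.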
A CAPTCHA is a scrambled picture that programs can’t easily read but humans can. Users are asked to enter the word shown in the picture before they are allowed to post their comment. That’s generally very effective, but it’s quite a bad experience for users and might discourage people from posting comments.
Spamming is a plague
80% of the email traffic on the Internet is actually junk, and I guess the volume of comment spam is also huge. Sadly, even if we manage to filter spam from our blogs and mailboxes, that traffic will still drain bandwidth.
Comments
Nice post.
Just kidding. I know there is a plugin for WordPress that is supposed to add a field that shows on the screen, but not when the page is scraped the way spammers do it. I haven’t used it yet. For my other blogs I just turn moderation on. This works for me anyway, as I want to be in touch with my commenters and email them or post a reply to their comment.
BeachBum
Hi, thanks for your comment.
I didn’t mention moderation because it is the most obvious way to filter content, but as soon as you get tons of spam, it becomes hard to manage.
Very informative. I just started really playing with blogs and got a few random comments. I was all excited, and then I realized there was no one there at the other end of the line, and I guess this is why. Great post.