If you operate a website with commenting or forms, then you are on the front lines of the battle against web spam. At best, those spam submissions are a mildly annoying waste of time; at worst, they are obscene and distracting. It’s a big enough problem that there are entire companies built on reducing the annoyance of spam. Myriad technologies such as Akismet, Honeypot, Mollom and CAPTCHA variations (reCAPTCHA, audio CAPTCHA, math CAPTCHA and the unsolvable CRAPCHA) exist to save website administrators from the frustration of dealing with spam.
CAPTCHA: Completely Automated Public Turing test to tell Computers and Humans Apart
All those puzzles are trying to do is separate the real audience (humans on your site to engage with you) from the spammers (automated bots and low-paid, usually offshore workers). As the spammers have become more sophisticated, anti-spam solutions have had to become more advanced in order to stay effective. But in the process, more and more humans are snared in the traps designed for spammers. Anyone who’s used the internet long enough can recall a time when they were presented with an image resembling Jackson Pollock’s studio floor and asked to name the word spelled out. Maybe after a failed attempt or two you gave up? As of today, there is no 100% effective anti-spam solution that doesn’t affect the end user’s experience.
The core problem with most of these solutions is that rather than eliminating the frustration, they transfer it to the website’s end users. The question becomes: to what degree are you willing to inconvenience your audience to make administering your website easier?
If you’re a website administrator, you will need to evaluate your unique situation to identify the best spam mitigation practices. Some businesses are better served by wading through relatively high spam numbers to avoid losing a precious lead or conversion. Other sites, particularly those not concerned with engagement, may be better served by more draconian measures.
Akismet automatically filters comment spam using an algorithm that looks for identifying traits such as IP address, language patterns and keywords.
Pros: Low maintenance, moderately effective, available for both Drupal and WordPress with libraries available to extend to other platforms
Cons: Commercial sites will need to pay for a license. There are privacy concerns, as data (including data about your server) is sent to a third-party service, and bandwidth concerns on especially high-volume sites.
Mollom is similar to Akismet, but is designed to protect both comments and form submissions. It evaluates the content entered and assigns it a “quality score.” You can configure your site to automatically approve, deny or, most interestingly, present the user with a CAPTCHA depending on the quality score returned.
Pros: Can be very effective when properly configured, available for both Drupal and WordPress, addresses both comment and form spam.
Cons: Quality score thresholds may need to be adjusted frequently to stay effective without becoming burdensome on end users. Commercial sites require a paid license.
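To make the approve, deny or challenge idea concrete, here is a minimal sketch in Python of how a site might act on a returned quality score. It is not Mollom’s API: the function name `decide`, the 0-to-1 score range (higher meaning better content) and both threshold values are assumptions chosen purely for illustration.

```python
from enum import Enum


class Action(Enum):
    APPROVE = "approve"
    CHALLENGE = "challenge"  # present the user with a CAPTCHA
    DENY = "deny"


def decide(quality_score: float,
           approve_at: float = 0.8,
           deny_below: float = 0.3) -> Action:
    """Map a quality score to an action.

    Assumes scores run from 0.0 (almost certainly spam) to 1.0
    (almost certainly legitimate). Both thresholds are hypothetical
    and would need tuning against a site's own traffic.
    """
    if not 0.0 <= quality_score <= 1.0:
        raise ValueError("quality_score must be between 0.0 and 1.0")
    if deny_below > approve_at:
        raise ValueError("deny_below must not exceed approve_at")

    if quality_score >= approve_at:
        return Action.APPROVE
    if quality_score < deny_below:
        return Action.DENY
    # Anything in the uncertain middle band gets a CAPTCHA.
    return Action.CHALLENGE
```

Widening the middle band (raising `approve_at` or lowering `deny_below`) blocks less outright but shows more users a CAPTCHA, which is the tuning trade-off described in the cons above.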
Honeypot is aptly named because, just as Pooh bear is drawn towards honey jars, spam bots are drawn towards form fields, especially form fields that may grant the ability to link back to their own websites. The Honeypot method inserts a hidden field into forms with a field name like ‘homepage’. End users don’t see the field, so they don’t fill it out. But spam bots (usually running prewritten scripts) usually do see the field and add something to it. The Honeypot module detects this and blocks the form submission if there’s something in the field. Additionally, the Honeypot module for Drupal adds a timestamp-based deterrent. Forms usually take at least a few seconds to fill out when a human is entering data into them, especially surveys, user registration forms and the like. Spam bots try to fill out as many forms as they can in as little time as possible, so they will often fill out a form within a couple of seconds at most. The Honeypot module requires a given interval to pass before a form can be submitted.
Pros: Simple and quick to create. Virtually no impact on end users when properly configured.
Cons: Does not prevent human-authored spam. Bots can frequently leave the trap field blank, so the honeypot method may not be as effective in reducing spam as other techniques.
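The two honeypot checks described above (a hidden trap field plus a minimum fill-out time) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Drupal Honeypot module’s code: the function name `looks_like_bot`, the five-second minimum and the way the render time is passed in are all hypothetical.

```python
import time
from typing import Mapping, Optional

HONEYPOT_FIELD = "homepage"  # hidden from humans with CSS; the name is illustrative
MIN_SECONDS = 5.0            # assumed minimum time a human needs to fill out the form


def looks_like_bot(form: Mapping[str, str],
                   rendered_at: float,
                   now: Optional[float] = None) -> bool:
    """Return True if a submission trips either honeypot check.

    form        -- submitted field names and values
    rendered_at -- Unix timestamp of when the form was served
    now         -- current Unix timestamp (defaults to time.time())
    """
    if now is None:
        now = time.time()

    # Check 1: humans never see the trap field, so any value is suspicious.
    if form.get(HONEYPOT_FIELD, "").strip():
        return True

    # Check 2: the form came back faster than a human could plausibly type.
    return (now - rendered_at) < MIN_SECONDS
```

For example, `looks_like_bot({"name": "Ann", "homepage": ""}, rendered_at=0.0, now=100.0)` returns `False`, while the same call with `"homepage": "http://spam.example"` returns `True`. In a real deployment the render timestamp would need to be stored or signed on the server so that a bot cannot simply forge it.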
Facebook provides a “Comments Box” via its developer tools that allows users to comment on your website as long as they are signed in to Facebook. It is highly effective in reducing spam, as Facebook is fairly vigilant in removing spam accounts and the comment engine itself has spam protection built in.
Pros: Effective, mobile and responsive design friendly, robust moderation and distribution tools. Integrated with the Graph API. Available for most platforms and CMS including WordPress and Drupal.
Cons: Requires that your users be registered with Facebook and willing to comment with their real identity. Performance concerns, as a third-party script must load via Facebook servers. Depending on configuration, can be costly to install and set up.
reCAPTCHA is a variation on a traditional image verification CAPTCHA that seeks to take the 150,000 hours spent worldwide every day on solving CAPTCHAs and channel it into something positive by helping to digitize books, newspapers and old-time radio shows. By typing out the obscured text, you’re transcribing scanned copies of books and documents that can’t be read by computers and thereby digitizing them. Currently, they are helping to digitize old editions of the New York Times and books from Google Books.
Pros: You’re contributing to the democratization of human knowledge, which is quite noble. It’s free.
Cons: Does nothing to address the frustration of a normal CAPTCHA. The distinctive red border can disrupt some designs.
CRAPCHA pulls no punches: it is not designed to filter out spammers; it is designed to filter out everyone and frustrate users in the process. It annoys users by presenting a CAPTCHA with indecipherable text. Sadly, this cynical image CAPTCHA is not too far off from the real thing in terms of difficulty.
Pros: You won’t get any spam submissions.
Cons: You won’t get any submissions at all.