On Coding Horror, Jeff Atwood talks about the effectiveness of CAPTCHA.
While Atwood was generally supportive of CAPTCHA, one of the issues he cited was the idea of using free porn to circumvent it. The idea is simple: when you try to use a web-site protected by CAPTCHA, you simply snaffle the image and display it on an adult web site – offering “free” pornography to any human willing to type in the text.
I’ve heard of this (theoretical? urban legend?) attack before, and it got me pondering about the economics of the issue. I am not a trained economist, but let’s look at where thinking like an economist takes us.
The spammers believe getting their messages posted has a certain utility to them – let’s call it a cents per message.
The web-masters think spam lowers the value of their site by b cents each. Where b is significantly less than a, then there is a market for web-site advertisements.
The web-masters think non-spam (“ham”) comments increase the value of their site by c cents each. If c is too small to bother with, they will turn off comments. Otherwise, they have a problem – spammers will post so many comments that the sum of all the b values will outweigh the c values.
The solution is to charge each user d cents to comment. The cost d must be more than a, but less than the value commenters see as having their comment published.
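To make the pricing condition concrete, here is a minimal sketch in Python. The symbol v (the value a genuine commenter places on being published) and the function name are my own labels for illustration, and the numbers are made up rather than measured.

```python
def comment_fee_works(a, d, v):
    """Return True when a fee of d cents deters spam but not real commenters.

    a: value (cents) a spammer places on one posted message
    d: cost (cents) charged per comment
    v: value (cents) a genuine commenter places on being published
       (v is an assumed symbol; the article does not name it)
    """
    return a < d < v


# Illustrative numbers only.
fee_in_range = comment_fee_works(a=0.5, d=2, v=10)    # fee sits between a and v
fee_too_low = comment_fee_works(a=0.5, d=0.1, v=10)   # spam still pays
fee_too_high = comment_fee_works(a=0.5, d=20, v=10)   # real commenters give up
print(fee_in_range, fee_too_low, fee_too_high)
```

Only the first case satisfies the condition; the other two show the fee missing the window on either side.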
This cost d is implemented by requiring some human effort from the commenter – i.e. parsing some text and typing it in. The power of CAPTCHA is ensuring that the cost, d, cannot be undercut by using a computer.
So, now the difference between a and d is frustrating to the spammer, but the spammer has another asset. The spammer has pornography that has a fixed cost to acquire, but a negligible cost to distribute. Rather than selling that pornography to users for a fair price, they simply charge d by making the user solve the original CAPTCHA for them. As long as the numbers are high enough, the amortization of the fixed cost to acquire the pornography makes this virtually free.
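The amortization argument can be sketched the same way. The function names and the figures below are hypothetical; the point is only that the spammer's cost per solved CAPTCHA shrinks as the number of solves grows, and the attack pays once that cost drops below a.

```python
def relay_cost_per_captcha(fixed_cost, solves, distribution_cost=0.0):
    """Spammer's cost (cents) per solved CAPTCHA when humans are paid in porn.

    fixed_cost: one-off cost (cents) of acquiring the pornography
    solves: number of CAPTCHAs solved by visitors
    distribution_cost: marginal cost (cents) of serving one visitor
    """
    if solves <= 0:
        raise ValueError("solves must be positive")
    return fixed_cost / solves + distribution_cost


def attack_pays(a, fixed_cost, solves, distribution_cost=0.0):
    """The relay attack is worthwhile when each solve costs less than a."""
    return relay_cost_per_captcha(fixed_cost, solves, distribution_cost) < a


# Made-up figures: 10,000 cents of content, a = 0.5 cents per posted message.
small_scale = attack_pays(a=0.5, fixed_cost=10_000, solves=100)
large_scale = attack_pays(a=0.5, fixed_cost=10_000, solves=1_000_000)
print(small_scale, large_scale)
```

With these figures the attack does not pay at 100 solves but does at a million, which is the "as long as the numbers are high enough" condition.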
So, how do we protect against this attack?
Atwood suggests it isn’t a real problem we see in the field, and therefore doesn’t need to be protected against. Pragmatic, but too boring.
Including the name of the original site (or some explanatory text) in the CAPTCHA image would help the problem be detected faster. The porn viewer could see what was going on and report the web-site to the authorities. However, that would decrease the usability of the original site, and it requires porn-seekers to come forward to the authorities.
I can see a better solution – a method of undermining the whole economy here.
As long as d is the cheapest cost for porn on the web, users will continue to “purchase” porn from spammers. But what if there was a cheaper source?
Anti-spam campaigners need to host mammoth sites of free porn available to all users. As long as users can easily get porn for free, why would they bother to pay d (i.e. fill in CAPTCHA forms) on spammers’ sites?
When you look at free porn like an economist does, you can see this is truly the best way to prevent comment spam.
Comment by Jeff Atwood on November 2, 2006
Interesting.
I’d argue that this has already happened. The actual opportunity cost of porn, at least static-image-type porn, is already pretty close to zero. If you want to look at naked people, it’s not difficult. I think adding a CAPTCHA to that would already be too much work, compared to the scads of free porn you could click directly through to with no CAPTCHA in sight.
Maybe, but somehow I doubt that porn-seeking CAPTCHA farmers would care too much about doing the right thing. If they even exist.
Comment by Julian on November 3, 2006
Jeff,
I would shout “Yay! Those free porn-sites are already protecting us from spam!” but my tongue is stuck too firmly in my cheek to make out the words!
I agree. We would need to give them incentives.
I once saw a sign in a fast-food restaurant in a U.S. airport that said “If you don’t get a receipt with your meal, the meal is free.” It turned every patron into an auditor, ensuring the cashier wasn’t pocketing the money.
Imagine if the CAPTCHA image contained the following text:
It would be a great tool for locating the bad guys, but it would be a usability nightmare on the original site.
Maybe it would be enough to, for example, include a web-logo watermark in the CAPTCHA image, and widely advertise your policy. That might be sufficient to work for the big boys (Yahoo, Google, etc.).
Comment by Alastair on November 3, 2006
I don’t understand why you can’t stop CAPTCHA abuse in the same way that we currently stop bandwidth stealing for images: by checking the referer header.
Or blacklisting. Even that is a feasible defence in the case of CAPTCHA farming.
Comment by Julian on November 4, 2006
Alastair,
I had imagined that the attack would involve the bad guy copying the image and sending it to the user. Otherwise, the dynamically-generated CAPTCHA image seen by the dupe and the CAPTCHA image seen by the bad guy wouldn’t match.
So, that would prevent blocking by referer header.
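For what it’s worth, here is a rough sketch of the kind of referer check I understand Alastair to mean. The host name and function name are hypothetical, and a real site would hook this into whatever serves its CAPTCHA images.

```python
from urllib.parse import urlparse

# Hypothetical host that is allowed to embed our CAPTCHA images.
ALLOWED_HOSTS = {"blog.example.com"}


def referer_allowed(referer, allowed_hosts=ALLOWED_HOSTS):
    """Return True if the Referer header names a page on one of our hosts."""
    if not referer:
        return False
    return urlparse(referer).hostname in allowed_hosts


# A hot-linked image (embedded directly on another site) is caught...
hotlinked = referer_allowed("http://freeporn.example/gallery")
# ...but a normal request from our own comment form passes.
own_page = referer_allowed("https://blog.example.com/post/1#comments")
print(hotlinked, own_page)
```

This stops hot-linking, but not the copy-and-rehost attack: the bad guy’s bot fetches the image itself and can send whatever Referer it likes, and the dupe’s browser only ever talks to the bad guy’s server.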
(Are CAPTCHAs dynamically generated and unique? I’ve heard of cases where there has only been a set of a dozen or so pre-cached images, so at least sometimes the answer is “No”. Expensive image-generation per simple request = simple denial of service attack opportunity – one machine could generate more requests than one server could fulfill – so maybe they shouldn’t be generated on the fly.)
I am not really clear where the “sweet spot” of blacklists is, so I am not clear whether they would help. I suspect they are good at blocking open-relays, where the naive system administrator hasn’t stopped spam being relayed. I suspect they are somewhat useful for blocking the flood of personal machines taken over by botnets. I suspect they are somewhat useful at over-aggressively blocking entire IP ranges from anti-social ISPs who are known for deliberately accepting spammers as customers as part of their business model.
However, would blacklists be successful against this attack, which could probably be done undetected right under the nose of even a responsible ISP? Do motivated people really find it that hard to move their machines onto a new IP address when the old one is burned?
Comment by Julian on November 2, 2007
Oh no! According to Trend Micro, this theoretical attack has become reality: TROJ_CAPTCHAR.A
At the time of writing, it has become an epidemic of SIX people.
Time for action. Release the porn!
Comment by Julian on February 10, 2008
I asked (in an aside in the article above):
ReCAPTCHA answers that question: Human-assisted OCR.