Computerworld

Researchers crack Microsoft, eBay, Yahoo, Digg audio captchas

Researchers have figured out how to crack audio captchas, making it possible to launch automated attacks against sites such as Microsoft, eBay and Digg, where opening phony accounts could be turned into cash.

Software written by researchers at Stanford University and Tulane University can interpret human speech well enough to crack audio captchas between 1.5 per cent and 89 per cent of the time, depending on the scheme. That is often enough to leave sites that use them vulnerable to the creation of fake user accounts, the researchers say.


Called Decaptcha, the program was able to decode Microsoft's audio captchas about half the time. It cracked the toughest audio captcha from reCAPTCHA just 1.5 per cent of the time and Authorize.com's audio captchas 89 per cent of the time.

It solved eBay's audio captchas 82 per cent of the time, Microsoft's 48.9 per cent of the time, Yahoo's 45.5 per cent of the time and Digg's 42 per cent of the time, say the researchers, led by Elie Bursztein, a post-doctoral researcher at Stanford.
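
To put those rates in perspective, here is a rough back-of-the-envelope sketch of what a given per-attempt success rate means for an attacker. It assumes every attempt is independent and that the site does not throttle retries, neither of which the researchers state; the function names and the 10,000-attempt volume are hypothetical and chosen only for illustration.

```python
import math


def expected_solved(success_rate: float, attempts: int) -> float:
    """Expected number of solved captchas, assuming independent attempts."""
    return success_rate * attempts


def attempts_for_one_success(success_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of attempts that yields at least one solved captcha
    with probability >= confidence, assuming independent attempts."""
    if not 0.0 < success_rate < 1.0:
        raise ValueError("success_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no success in n tries) = (1 - p)^n, so solve (1 - p)^n <= 1 - confidence.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - success_rate))


# Per-attempt success rates as reported in the article.
REPORTED_RATES = {
    "reCAPTCHA": 0.015,
    "Digg": 0.42,
    "Yahoo": 0.455,
    "Microsoft": 0.489,
    "eBay": 0.82,
}

if __name__ == "__main__":
    attempts = 10_000  # hypothetical attack volume, not a figure from the paper
    for site, rate in REPORTED_RATES.items():
        print(f"{site}: about {expected_solved(rate, attempts):.0f} solved "
              f"out of {attempts} attempts")
```

Under these assumptions, even the 1.5 per cent rate against reCAPTCHA needs only 199 attempts for a 95 per cent chance of at least one success, which is trivial for an automated script.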

"[A] computer algorithm that solves one captcha out of every 100 attempts would allow an attacker to set up enough fraudulent accounts to manipulate user behavior or achieve other ends on a target site," the researchers say.

Visual captchas (completely automated public Turing tests to tell computers and humans apart) display distorted numbers and letters that a person has to identify and key in. Audio captchas present a voice reading numbers and letters that are partially obscured by noise, music or competing voices, and the person solving them has to key in the characters being read.

The Decaptcha program samples the audio and identifies what are likely to be numbers and letters based on numbers and letters that have previously been read to it. It then tries to match the suspected character with one of the characters in its library, choosing the one that makes the best match.
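
The matching step described above can be illustrated with a very small nearest-match classifier. This is only a sketch of the general idea, not the researchers' implementation: the article does not say which audio features or which classifier Decaptcha uses, so the three-number feature vectors, the Euclidean distance and every name below are assumptions made for the example.

```python
import math
from typing import Dict, List, Sequence

Features = Sequence[float]


def distance(a: Features, b: Features) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(segment: Features, library: Dict[str, List[Features]]) -> str:
    """Return the label whose stored example lies closest to the segment."""
    best_label = None
    best_dist = math.inf
    for label, examples in library.items():
        for example in examples:
            d = distance(segment, example)
            if d < best_dist:
                best_label, best_dist = label, d
    if best_label is None:
        raise ValueError("library contains no examples")
    return best_label


def decode(segments: Sequence[Features], library: Dict[str, List[Features]]) -> str:
    """Classify each suspected character in order and join the guesses."""
    return "".join(classify(segment, library) for segment in segments)


# Hypothetical library built from previously labeled characters.
# Real features would be derived from the audio; these are toy numbers.
LIBRARY: Dict[str, List[Features]] = {
    "3": [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1]],
    "7": [[0.1, 0.9, 0.3]],
    "k": [[0.2, 0.2, 0.9]],
}

if __name__ == "__main__":
    unknown = [[0.85, 0.15, 0.15], [0.15, 0.8, 0.3], [0.1, 0.3, 0.8]]
    print(decode(unknown, LIBRARY))
```

Each suspected character is compared against every stored example, and the closest one supplies the guess, which mirrors the "best match" choice the article describes.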

According to the researchers, training the program requires it to "listen to" captchas that have been accurately labeled. "Decaptcha requires 300 labeled captchas and approximately 20 minutes of training time to defeat the hardest schemes," the researchers say in a paper describing their results. After that, the trained program can solve tens of captchas per minute.

To make it difficult for computer programs to identify the characters, various types of distractions are played over them, such as random white noise, loud noises between characters and other voices. Some audio captchas use purposely low-quality recordings.

White noise is relatively easy to filter out, but competing voices and sounds whose patterns resemble letters and numbers are the most difficult for Decaptcha to discern, the researchers say. These are called semantic distractions, and sorting them out with a high degree of accuracy requires human intelligence.
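
One reason random noise is the easier case is that it is uncorrelated from one sample to the next, so simple averaging tends to cancel it, while a competing voice has the same structure as the target speech and survives such filtering. The sketch below shows moving-average smoothing on a toy signal. It is an illustration of that general point only, not the filter the researchers used, and the signal values are made up.

```python
from typing import List, Sequence


def moving_average(samples: Sequence[float], window: int) -> List[float]:
    """Average each run of `window` consecutive samples."""
    if window < 1 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    return [
        sum(samples[i:i + window]) / window
        for i in range(len(samples) - window + 1)
    ]


# Hypothetical toy signal: a steady level of 2.0 with an alternating
# +1/-1 disturbance standing in for noise.
noisy = [1.0, 3.0, 1.0, 3.0, 1.0, 3.0]
smoothed = moving_average(noisy, window=2)

if __name__ == "__main__":
    print(smoothed)
```

In this toy case the disturbance averages away completely and the underlying level is recovered. Real recordings are far messier, and a distraction that sounds like speech cannot be removed this way.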

Working in Decaptcha's favor is the fact that the creators of audio captchas have to make them simple enough for humans to figure out the letters and numbers the vast majority of the time. Striking a balance between simple enough for humans to solve and difficult enough for computers to fail is tricky, the researchers say.

The researchers recommend tightening up the security of audio captchas through the use of more semantic noise.

The researchers say they are working on ways to break audio captchas that use entire words rather than just characters, to see whether those are more secure.

They also want to analyze the differences between the ways humans make mistakes decoding captchas and the ways computers make mistakes. That way captchas can be designed to make it more difficult and costly to devise programs that defeat them, they say.
