Captcha-ing More Than You Know
CAPTCHAs: we all know them, we’ve all used them. They’re for security, right? To tell us from the robots? You’d be right to think that they are a security feature used on hundreds of thousands of websites, or at least that was their initial purpose.
Standing for Completely Automated Public Turing test to tell Computers and Humans Apart, CAPTCHAs came to life in 2000 and were the brainchild of Carnegie Mellon University graduate Luis von Ahn. The system was then taken a step further: the project was renamed reCAPTCHA when it was redesigned to add further layers of distortion on top of the text to beat the hackers. But that wasn’t all. The next development was the true genius behind the system. The proof? Google bought it.
Luis realised that more than 100 million CAPTCHAs were being completed every day, which provided the opportunity to use words tagged as unreadable during the digitising of books and other printed materials as CAPTCHA phrases. In doing this, Luis displayed an excellent example of combining O’Reilly’s first and second core patterns of Web 2.0: Harnessing Collective Intelligence, as discussed last week, and the idea that ‘Data is the next Intel Inside’.
The latter concept refers to the increased importance of, and reliance on, data in Web 2.0 applications. The reCAPTCHA system exhibits this by fulfilling four of the five best practices of the core pattern.
Seek to own a unique, hard to recreate source of data: Luis von Ahn did just this in the creation of reCAPTCHA, as evidenced through Google’s decision to purchase it from him. Seeing the value and complexity of the system, Google capitalised on the opportunities created through the acquisition.
Enhance the core data: Under the reCAPTCHA project, the data was enhanced and re-purposed to provide the dual benefit of security, along with the deciphering of printed texts.
Let users control their own data: Prior to acquisition by Google, Luis had created the opportunity for users to submit ‘unreadable’ words to the system for deciphering. However, it is unclear whether Google now keeps this privilege exclusively for itself.
Define a data strategy: There is a very clear data strategy behind reCAPTCHA. It provides a security system to ensure only humans are accessing or posting data, and at the same time puts that human effort to a second use by deciphering texts tagged as unreadable by current text recognition software.
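To make the dual purpose a little more concrete, here is a rough Python sketch of how one challenge could serve both security and deciphering. It assumes each challenge pairs a "control" word with a known answer and an "unknown" word that the text recognition software could not read. That pairing, the function names and the three-vote threshold are my own assumptions for illustration, not details confirmed anywhere in this post.

```python
from collections import Counter, defaultdict

# Hypothetical threshold: how many matching human readings are required
# before a transcription is accepted. The real rules are not described here.
VOTES_NEEDED = 3

# unknown word id -> tally of normalised human readings
votes = defaultdict(Counter)


def normalise(text):
    """Compare answers case-insensitively and ignore surrounding spaces."""
    return text.strip().lower()


def submit(control_answer, control_guess, unknown_id, unknown_guess):
    """Check the control word; if it passes, record the unknown-word reading.

    Returns True when the user is treated as human.
    """
    if normalise(control_guess) != normalise(control_answer):
        return False  # failed the security check, so the reading is discarded
    votes[unknown_id][normalise(unknown_guess)] += 1
    return True


def transcription(unknown_id):
    """Return the agreed reading once enough users agree, otherwise None."""
    if not votes[unknown_id]:
        return None
    reading, count = votes[unknown_id].most_common(1)[0]
    return reading if count >= VOTES_NEEDED else None
```

In this sketch the control word does the security work, while the unknown word quietly collects votes from everyone who passes, which is the 'enhance the core data' idea in miniature.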
Google really is CAPTCHA-ing more than we thought.
Thanks for reading and please feel free to leave feedback!