Ted Reads
MASSIVE-SCALE ONLINE COLLABORATION<br />
Luis von Ahn - April 2011<br />
Luis von Ahn is an associate professor of Computer Science at Carnegie Mellon University,<br />
and he’s at the forefront of the crowdsourcing craze, building systems that combine<br />
humans and computers to solve large-scale problems that neither can solve alone. His work<br />
takes advantage of the ever-growing Web-connected population to achieve collaboration<br />
in unprecedented numbers. His projects aim to leverage the crowd for human good. His<br />
company reCAPTCHA, sold to Google in 2009, digitizes human knowledge (books),<br />
one word at a time. His new project is Duolingo, which aims to get 100 million people<br />
translating the Web in every major language.<br />
“<br />
How many of you had to fill out some sort of web form where you’ve been asked to read a distorted sequence of<br />
characters like this? How many of you found it really, really annoying? Okay, outstanding. So I invented that. (Laughter)<br />
Or I was one of the people who did it.<br />
That thing is called a CAPTCHA. And the reason it is there is to make sure you, the entity filling out the form, are<br />
actually a human and not some sort of computer program that was written to submit the form millions and millions<br />
of times. The reason it works is because humans, at least non-visually-impaired humans, have no trouble reading these<br />
distorted squiggly characters, whereas computer programs simply can’t do it as well yet. So for example, in the case of<br />
Ticketmaster, the reason you have to type these distorted characters is to prevent scalpers from writing a program that<br />
can buy millions of tickets, two at a time.<br />
CAPTCHAs are used all over the Internet. And since they’re used so often, a lot of times the precise sequence of random<br />
characters that is shown to the user is not so fortunate. So this is an example from the Yahoo registration page. The<br />
random characters that happened to be shown to the user were W, A, I, T, which, of course, spell a word. But the best<br />
part is the message that the Yahoo help desk got about 20 minutes later: “Help! I’ve been waiting for over 20 minutes,<br />
and nothing happens.” (Laughter) This person thought they needed to wait.<br />
The CAPTCHA Project is something that we did here at Carnegie Mellon over 10 years ago, and it’s been used everywhere.<br />
Let me now tell you about a project that we did a few years later, which is sort of the next evolution of CAPTCHA. This<br />
is a project that we call reCAPTCHA, which is something that we started here at Carnegie Mellon, then we turned it<br />
into a startup company. And then about a year and a half ago, Google actually acquired this company.<br />
So this project started from the following realization: It turns out that approximately 200 million CAPTCHAs are<br />
typed every day by people around the world. When I first heard this, I was quite proud of myself. I thought, look at the<br />
impact that my research has had. But then I started feeling bad. See, here’s the thing: each time you type a CAPTCHA,<br />
essentially you waste 10 seconds of your time. And if you multiply that by 200 million, you get that humanity as a whole<br />
is wasting about 500,000 hours every day typing these annoying CAPTCHAs. So then I started feeling bad. (Laughter)<br />
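The arithmetic behind that figure can be checked with a minimal sketch. It assumes exactly 10 seconds per CAPTCHA and 200 million CAPTCHAs per day, the round numbers quoted in the talk; the constant and function names are illustrative only. With those inputs the total comes out to roughly 556,000 hours, which the talk rounds to "about 500,000".

```python
# Figures quoted in the talk (round numbers, treated here as exact assumptions).
SECONDS_PER_CAPTCHA = 10
CAPTCHAS_PER_DAY = 200_000_000
SECONDS_PER_HOUR = 3600


def hours_spent_per_day(captchas_per_day=CAPTCHAS_PER_DAY,
                        seconds_each=SECONDS_PER_CAPTCHA):
    """Return the total human time spent typing CAPTCHAs per day, in hours."""
    total_seconds = captchas_per_day * seconds_each
    return total_seconds / SECONDS_PER_HOUR


hours = hours_spent_per_day()
print(f"{hours:,.0f} hours per day")  # prints "555,556 hours per day"
```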
And then I started thinking, well, of course, we can’t just get rid of CAPTCHAs, because the security of the Web sort<br />
of depends on them. But then I started thinking, is there any way we can use this effort for something that is good for<br />
humanity? So see, here’s the thing. While you’re typing a CAPTCHA, during those 10 seconds, your brain is doing<br />
something amazing. Your brain is doing something that computers cannot yet do. So can we get you to do useful work<br />
for those 10 seconds? Another way of putting it is, is there some humongous problem that we cannot yet get computers