In the loop

In July 2012, Luis von Ahn was named a recipient of the Presidential Early Career Award for Scientists and Engineers, the highest honor bestowed by the United States government on science and engineering professionals in the early stages of their independent research careers.

Von Ahn, 33, is the A. Nico Habermann Associate Professor of Computer Science at Carnegie Mellon. He was one of 96 PECASE recipients announced by the White House and was one of 20 recipients nominated by the National Science Foundation. 

A native of Guatemala, von Ahn is a graduate of Duke University who earned his Ph.D. in computer science at CMU in 2005, where his advisor was Manuel Blum.

Von Ahn's past awards include a MacArthur Fellowship in 2006, a Packard Fellowship and Sloan Research Fellowship in 2009, and the Association for Computing Machinery's Grace Murray Hopper Award earlier in 2012. Last year, the Spanish edition of Foreign Policy magazine named him the most influential new thought leader of Latin America and Spain.

He spoke with Link Editor Jason Togyer.

How did you get started in computers?

I was 8 years old and I wanted a Nintendo, but my mother bought me a Commodore 64 instead. I had no idea how to use it, but I had to figure out some very basic programming in order to run some games. I started getting better mainly because my mom wouldn't buy me many games, and I had to figure out how to break the copy protection they had. So I guess I got started by pirating games!

What convinced you that this was a field you wanted to work in?

I actually majored in math in college, but one thing I didn't like about math was that when I visited grad schools, the profs would tell me things like, "I've been working on this problem, and no one has solved it in 300 years." Whereas in computer science, people were like, "I solved an open problem yesterday and I'm working on another one." It was a much more vibrant field, mainly because it's only been around for 50 years instead of 2,000 years. There's still a lot left to do.

Your work here with Manuel Blum is what led to the original Captcha. Are you surprised at how ubiquitous it's become?

Yes, I didn't expect that--it's literally used by every major website in the world. The first Captcha that I wrote for Yahoo! was not meant to be used by more than 100,000 people. It was kind of a toy. Now, about 250 million times a day, someone types a Captcha.

Would that be a bad thing to be remembered for?

Probably not, though I find the other research that I've done to be more exciting.

You've referred to "games with a purpose." What are games with a purpose?

I started working on crowdsourcing in 2001. At the time, that word didn't exist--it was just "well, there's the web and we're trying to get people to help us do some stuff."

I developed a game called the ESP Game, which was played by a few million people, and as people were playing it they were helping us improve image searches. It was one of the first examples of a game that was used for scientific purposes. Then it occurred to me that I could combine these games with crowdsourcing, which is where the idea for ReCaptcha came about. I think ReCaptcha is the crowdsourcing idea that's been used by the largest number of people on the planet. About 1 billion people have helped digitize books with ReCaptcha, which is several orders of magnitude bigger than anything else.

Do the people who are typing these ReCaptchas realize they're digitizing books?

No. Although we don't try to hide it at all, in many cases they don't know why they're typing these squiggly characters. The key to making these things work is making them a part of what people would do anyways. Similarly, with "games with a purpose," we're kind of getting something for nothing.

What makes a compelling game?

It's a complete art. It's very similar to making a good movie. You can have the right actors and the right director and in the end, it flops. And it's not just games--it's what makes a compelling user experience. This new project that we're working on, Duolingo, is not a game, but we've spent a lot of effort trying to make it more and more compelling.

Give me the elevator pitch for Duolingo.

Duolingo began when we asked the question, "How can we get 100 million people to translate the web into every major language?" Google Translate is getting better, but it's crude. If I really want to translate the whole web accurately, I literally need millions of people. And--I can't pay them. If that's how you start, you quickly run into some major obstacles.

Such as?

First, a lack of bilingual people. There just aren't that many. Second, how are you going to motivate people to translate for free? It's normally something you have to pay professional translators about 10 cents a word to do. We were stuck on this problem for a while until we figured out there's a way to solve both of these problems with one solution. It turns out there are about a billion people in the world who want to learn a foreign language right now. So what we've been working on is Duolingo, a language learning site where you learn for free, but the twist is that you learn by translating real-world sentences.

How does it work?

When you first show up, if you're just a beginner, we give you very simple sentences to translate, and we tell you what each word means. As you do more, you get more and more complex sentences. The crazy thing is that it really, really works--people really do learn another language. Instead of paying with money, the student pays with time. As they're learning, they're also creating value with these translations. It's time that would need to be spent anyway. That's nice. I like that.

Where do the texts come from?

Basically, we're crawling the web for Creative Commons stuff. If you want your stuff translated, sure, we'll do it, but you've got to upload it. We don't want to get sued for stealing your content.

Have any major publishers signed up?

Yeah, but I can't talk about it yet. Some of the largest publishers in the world are very interested. Translation is huge, it's about a $30 billion market, but it turns out that only about 0.1 percent of what needs to be translated actually gets translated. The other 99.9 percent isn't worth 10 cents a word. News is an example--most news outlets are not doing all that great, and they can't afford 10 cents a word. Those are exactly the kinds of stories we want to be translating.

Do those kinds of articles work for someone who's learning a language?

They're super-interesting if you're learning a language! Instead of translating, "the girl jumps, the girl eats, the boy jumps, the boy eats," now you're learning a language by reading something that actually has a thread. I'm very excited about it.

Where would you like to see Duolingo in a couple of years?

I'd like it to be the language learning standard, and I think we can do that. It's free and it's really good. And because of that, I want to be able to be translating at a high rate. Translating a million articles a month, something like that.

What's next?

Duolingo. That's it. There's one big difference between me and most people. They have 10 projects. I have one. When I started working on my Ph.D., I tried working on 10 different projects, and I ended up with 10 (lousy) projects.

What would be your career advice to someone who's 13 or 14?

Use a computer! The earlier you learn how to use a computer, the better. The kids that we're seeing as undergrads here started learning how to use a computer way earlier than I did. It's as if I started learning French at age 8, whereas they started learning at age 4. They're much better. If you want to be part of the future, you should be using a computer. I understand there are cases of addictive behavior, but otherwise I would hope that kids are using a computer as much as they can.

For More Information: 

Jason Togyer | 412-268-8721 | jt3y@cs.cmu.edu