On Campus: It's All in the Game

With Foldit and EteRNA, computers and humans work together to crack genetic codes--and the results are being translated into real laboratory experiments

By Ken Chiacchia

Human brains are becoming part of a vast, extended computing network that's creating new molecules of ribonucleic acid--RNA, one of the building blocks of all known forms of life.

They're doing it through EteRNA, an online program that pools players' ingenuity and then translates their insights directly into laboratory experiments.

Launched in January, the game was designed by Adrien Treuille, an assistant professor in Carnegie Mellon's Robotics Institute, along with physicist Rhiju Das of Stanford University, and Jeehyung Lee, a Carnegie Mellon computer science graduate student.

Cells in all living creatures are composed largely of proteins that must fold into three-dimensional shapes in order to carry out vital functions.

Understanding how proteins fold is central to understanding how they work and how they can be used to create favorable interactions within cells.

The recipe for each protein used by cells is encoded in DNA--deoxyribonucleic acid. Biologists long thought that RNA was a simple messenger that translated that code into the proteins that express genes, but recent research has shown that RNA can also have important catalytic functions, filling the normal role of proteins; and it can have regulatory functions, interacting with the genes in a distinctly DNA-like manner.

One of the reasons why RNA is so promising as a bioengineering agent is that it can affect cellular processes in multiple ways. Proteins that fold incorrectly can lead to diseases, but even "good" folding can sometimes be harmful. For instance, we don't want HIV proteins in an infected cell to fold correctly. That's why geneticists may want to block the action of certain proteins.

In EteRNA, a player begins with a target molecular shape and then tries to deduce the sequence of RNA subunits (called bases) that would cause the molecule to naturally fold into that shape. The goal is the creation of new molecules that might chemically block a virus from binding to its host cells, short-circuit a pathway necessary for a genetic disease to develop, or catalyze a new or improved industrial process.
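For readers who think in code, the design task can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not EteRNA's actual software: the target shape is written in dot-bracket notation (a common text convention in which matching parentheses mark paired bases and dots mark unpaired ones), the pairing rules are limited to A-U, G-C and the G-U "wobble" pair, and every function name is hypothetical. The check only confirms that a sequence could form the target's pairs; it says nothing about whether the molecule really would fold that way, which is the question the lab experiments answer.

```python
# Toy sketch only: the names and the simplified pairing rules are assumptions,
# not EteRNA's real code or scoring.
CAN_PAIR = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def parse_pairs(structure):
    """Return (i, j) index pairs from a dot-bracket string such as '((...))'."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced structure")
    return pairs

def is_compatible(sequence, structure):
    """True if every paired position in the target holds bases that can pair."""
    if len(sequence) != len(structure):
        return False
    return all((sequence[i], sequence[j]) in CAN_PAIR
               for i, j in parse_pairs(structure))

def naive_design(structure):
    """Fill paired positions with G-C and unpaired positions with A."""
    seq = ["A"] * len(structure)
    for i, j in parse_pairs(structure):
        seq[i], seq[j] = "G", "C"
    return "".join(seq)

if __name__ == "__main__":
    target = "((...))"  # a tiny hairpin: two pairs around a three-base loop
    candidate = naive_design(target)
    print(candidate, is_compatible(candidate, target))
```

Many different sequences pass a check like this for the same target, which is part of why the puzzle leaves room for human strategy.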

Arguably, EteRNA's most significant innovation is that the gamers' work feeds directly into wet-lab research. On a weekly basis, the online community picks the most promising structures, which biochemists then synthesize and test.

Marvels Treuille: "And they're really just doing it because they want to beat their neighbor at some game."

Two earlier collaborative efforts were important predecessors of EteRNA. One was the SETI@home screensaver, which searched for extraterrestrial radio messages using volunteers' surplus computer time. Another was a program called Rosetta, designed by a team at the University of Washington led by biochemist David Baker.

Like SETI@home, Rosetta harnessed the surplus computer power of many volunteers, but instead of searching for radio messages, it calculated theoretical protein structures from their amino acid sequences--the reverse of the problem EteRNA tackles. Rosetta used the distributed computing power to find structure solutions quickly and displayed them on participants' screens. But they weren't necessarily the best solutions.

Then something interesting happened. The participants, who had passively been watching the program fold amino-acid chains, told the researchers they thought they could improve on the structures they were seeing. That left the researchers wondering how they could harness the minds of these thousands of users to help find better solutions.

There's a precedent, of course, and it's right at Carnegie Mellon. Luis von Ahn, assistant professor of computer science at CMU, co-developed the familiar CAPTCHAs that use human image-recognition capacity to authenticate web users.

After calculating that people were spending about 500,000 hours per day interacting with CAPTCHAs, von Ahn decided to see if that time could be put to more productive use. He invented reCAPTCHA, which authenticates people by making them type two words--one a security test, the other a digitized word that computers have failed to identify. In the process, users are now helping to digitize thousands of books and newspaper articles that would otherwise be unsearchable in online databases.
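The two-word idea can be sketched in Python. This is a minimal illustration, not reCAPTCHA's real implementation: the function names are hypothetical, and the rule that a transcription is accepted once three users agree on it is an assumption made for the example.

```python
from collections import Counter

def check_response(control_answer, typed_control, typed_unknown, votes):
    """Authenticate on the known word; record the unknown word as a vote.

    Hypothetical helper: only a user who types the known (control) word
    correctly passes, and only then is the guess at the unknown word counted.
    """
    passed = typed_control.strip().lower() == control_answer.strip().lower()
    if passed:
        votes[typed_unknown.strip().lower()] += 1
    return passed

def consensus(votes, min_votes=3):
    """Return the leading transcription once it has enough votes, else None.

    The threshold of three is an assumption chosen for illustration.
    """
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    return word if count >= min_votes else None

if __name__ == "__main__":
    votes = Counter()
    for _ in range(3):
        check_response("river", "River", "morning", votes)
    print(consensus(votes))
```

The known word does the security work, while agreement among many users stands in for the answer that the computer could not supply.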

Similarly, von Ahn took advantage of human image-recognition capability and created the ESP Game, which matches pairs of online gamers to produce searchable labels for images.

Taking a page from von Ahn's work, Baker teamed up with Treuille and Zoran Popović, associate professor of computer science and engineering at Washington and a Carnegie Mellon PhD alumnus, to design a game called Foldit. The new game applied human problem-solving capability to the protein-folding task.

In an August paper in Nature, the researchers compared Foldit players' blinded solutions to Rosetta's for select known protein structures. The humans came closer to the real structures than the computers did. More important, perhaps, was that the human players employed distinct and more diverse strategies than the computers. Unlike the machines, players also changed their strategies between the early, mid-, and end games. This holds lessons for artificial intelligence design, Treuille notes.

EteRNA differs from Foldit in three important ways: it solves the inverse problem, going from a target structure to a sequence rather than the reverse; it uses RNA rather than protein structures; and it creates a much more direct link between the computer simulations and real experiments done in biochemistry labs.

"Generally speaking, inverse folding is more interesting, because it allows you to design new things rather than just predict shapes," Treuille says. "But you have to understand how folding works before you can solve the inverse folding."

An important actor in many biological processes, RNA serves as both a vector of information and a functional structure in its own right. In protein synthesis, it translates genetic information derived from DNA sequences into functional proteins and also catalyzes that process. In gene silencing, it can mask genetic signals to prevent unneeded genes from being expressed.

The base molecules that form the building blocks of RNA structures may prove a more malleable and useful tool than the far more complex and diverse amino acids that make up protein molecules.

RNA's relative simplicity also should make EteRNA easier to learn than Foldit. And because it allows the inverse-folding approach, RNA should prove far more flexible in creating tailor-made structures with useful functions.

The competition between players in both EteRNA and Foldit can be intense. Interestingly, so can the collaboration. Foldit players have produced and shared widgets that carry out minor folding tasks, tips on strategizing the game, and encouragement via a wiki. "We don't have a single person who always wins," Popović says. "What we have is a group of very good people out of whom at least one always comes up with a solution." Similar collaborations are emerging in the EteRNA community.

Shawn Douglas, a Wyss technology development fellow at Harvard University and an expert on using DNA as a molecular building material, believes such games can help democratize science.

While professionals will still play a central role, amateur involvement "kind of raises all the boats," Douglas says. "People feel better about funding science, there are more people who are educated about what's going on in science."

"I think the biggest question is ... what can we do with this?" von Ahn says. "Humanity's really large-scale achievements--the pyramids, the Panama Canal, the moon shots--all involved about 100,000 people. Before the Internet, coordinating more than [that] was just about impossible."

But reCAPTCHA is now using the brains of 750 million people daily--a little over 10 percent of the human race. Now that Foldit and EteRNA show that you can use online gaming to crack problems far more complex than digitizing text, the potential seems vast.

For More Information:

Jason Togyer | 412-268-8721 | jt3y@cs.cmu.edu