Alumni Snapshot: Jerry Zhu

  • B.S., computer science, Shanghai Jiao Tong University, 1993
  • M.S., computer science, Shanghai Jiao Tong University, 1996
  • Ph.D., Language Technologies Institute, Carnegie Mellon University, 2005

If any machine-learning research can be considered "retro," that might be an apt description of the work Xiaojin (Jerry) Zhu is pursuing at the University of Wisconsin at Madison.

Zhu, an assistant professor of computer science, is investigating the ways that human cognition can be studied using machine-learning techniques and vice versa. He says his work is almost a throwback to what's now considered "classical" artificial intelligence research as performed by Herbert Simon, Allen Newell and other AI pioneers a half-century ago, but using the latest advances in machine learning.

"I'm interested in finding the fundamental mathematical principles that govern learning across the spectrum, using both human subjects and computers," says Zhu, who collaborates closely with UW-Madison's psychology department.

There are strong comparisons to be made between the ways that humans and computers acquire new knowledge, he says. Take the problem of over-fitting. Over-fitting happens when a machine is given a "training set" of data and creates a model fitted too closely to that data--one that captures not the true underlying pattern, but the idiosyncrasies of that particular training set.
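For readers who want to see over-fitting in action, here is a minimal sketch in Python. It is purely illustrative and is not taken from Zhu's research: the data, the 20 percent label noise, and the two toy models (a one-threshold rule and a "memorizer" that copies the label of the nearest training point) are all assumptions made for the example.

```python
import random

def make_data(n, noise, rng):
    """Draw n points x in [0, 1). The true label is x > 0.5,
    but each label is flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = x > 0.5
        if rng.random() < noise:
            label = not label
        data.append((x, label))
    return data

def fit_threshold(train):
    """Simple model: choose the cut-off t that makes the fewest
    training mistakes for the rule 'x > t'."""
    candidates = [0.0] + sorted(x for x, _ in train)
    return min(candidates, key=lambda t: sum((x > t) != y for x, y in train))

def fit_memorizer(train):
    """Flexible model: remember every training point and predict
    the label of the nearest one (noise included)."""
    def predict(x):
        return min(train, key=lambda p: abs(p[0] - x))[1]
    return predict

def error_rate(predict, data):
    """Fraction of points in `data` that `predict` gets wrong."""
    return sum(predict(x) != y for x, y in data) / len(data)

rng = random.Random(0)                 # fixed seed so runs are repeatable
train = make_data(200, 0.2, rng)       # small, noisy training set
test = make_data(5000, 0.2, rng)       # fresh data from the same source

t = fit_threshold(train)

def simple(x):
    return x > t

memorizer = fit_memorizer(train)

simple_train, simple_test = error_rate(simple, train), error_rate(simple, test)
memo_train, memo_test = error_rate(memorizer, train), error_rate(memorizer, test)

print(f"threshold rule: train error {simple_train:.2f}, test error {simple_test:.2f}")
print(f"memorizer:      train error {memo_train:.2f}, test error {memo_test:.2f}")
```

The memorizer makes no mistakes on the training set because it has stored every example, noise and all. On fresh data its error jumps, and that gap between training and test performance is the signature of over-fitting.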

"It turns out this is relevant in humans as well," Zhu says. In one experiment at UW-Madison, students were given a list of five words, each labeled with a category. For example, "daylight" was listed as a word in "category A," while the words "hospital," "termite," "envy" and "scream" were in "category B." Students were then asked to predict the category of additional words. Zhu says students came up with elaborate explanations (that is, they over-fit) for why "daylight" belonged in category A, while the others belonged in B. The actual reason was simple: Category A represented words with positive connotations, while category B represented words with negative connotations.

"Of course it is hard to figure out the actual rule with only five words, and easy to come up with wrong guesses," Zhu says. "The real question, however, is whether we can derive a precise mathematical formula on how badly humans will over-fit given any training set--be it small or large, words or pictures." Using a machine-learning concept known as Rademacher complexity, he and his collaborators developed a mathematical model that predicted exactly that. Such models, though highly theoretical, could have applications in education, he says--for instance, in predicting how likely students are to grasp underlying concepts from the examples they see in classes or textbooks.
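Rademacher complexity, roughly speaking, measures how well a family of candidate rules can match purely random labels: the better the family can fit coin flips, the more room it has to over-fit. The Python sketch below estimates the standard textbook quantity by simulation. It is an illustration only and does not reproduce the model Zhu and his collaborators built; the function name `empirical_rademacher`, the ten sample points, and the two toy rule families are assumptions made for the example, and textbooks differ slightly in how they scale the quantity.

```python
import random

def empirical_rademacher(functions, xs, trials=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity:
    the average, over random +1/-1 labels sigma, of the best
    correlation (1/n) * sum(sigma_i * f(x_i)) any rule f achieves."""
    rng = random.Random(seed)
    n = len(xs)
    # Evaluate every rule on the sample once; each output is +1 or -1.
    outputs = [[f(x) for x in xs] for f in functions]
    total = 0.0
    for _ in range(trials):
        sigma = [rng.choice((-1, 1)) for _ in range(n)]
        total += max(
            sum(s * o for s, o in zip(sigma, out)) / n for out in outputs
        )
    return total / trials

xs = [i / 10 for i in range(10)]        # ten sample points: 0.0, 0.1, ..., 0.9

# A family with a single rule: always answer +1.
one_rule = [lambda x: 1]

# A richer family: "x > t" rules and their mirror images,
# with cut-offs t placed between the sample points.
cutoffs = [k / 10 - 0.05 for k in range(11)]
threshold_rules = [
    (lambda x, t=t, sign=sign: sign * (1 if x > t else -1))
    for t in cutoffs
    for sign in (1, -1)
]

simple_score = empirical_rademacher(one_rule, xs)
rich_score = empirical_rademacher(threshold_rules, xs)

print(f"single rule:     {simple_score:.3f}")
print(f"threshold rules: {rich_score:.3f}")
```

A single fixed rule cannot adapt to random labels, so its score hovers near zero. The threshold family scores higher because some rule in it will line up with the coin flips by chance. A family able to reproduce every possible labeling would score 1, the maximum.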

"I hope machine learning will eventually come back to address more of the cognitive science problems that classic AI considered," says Zhu, who jokes that the Machine Learning Department at CMU might then have to change its name to simply the "Learning Department."

In his spare time, Zhu enjoys amateur astronomy, sometimes looking at the night sky near Madison through his own 8-inch Dobsonian telescope. "Madison is slightly better than Pittsburgh in terms of light pollution," he says. "There are also more clear nights." Zhu, his wife and their children, ages 2 and 6, also participate in family fossil-hunting trips organized by Madison's geology museum.

For More Information:

Jason Togyer | 412-268-8721 | jt3y@cs.cmu.edu