Much more than the sum of the parts

Modern artificial intelligence research integrates different approaches and disciplines—and so does a writer trying to understand the state of the field in the 21st century

Illustration by Jeffrey Katrencik (A 1985)

I first heard the Indian legend of “The Blind Men and the Elephant” in a junior high school English class, via John Godfrey Saxe’s famous poem:

It was six men of Indostan
To learning much inclined,
Who went to see the Elephant
(Though all of them were blind),
That each by observation
Might satisfy his mind.

Each man encounters a different part of the elephant and imagines what it might look like, based on what he touches—the first, feeling the elephant’s side, believes an elephant must be like a wall; the second, touching its tusk, thinks it like a spear. The third, grabbing the elephant’s trunk, decides an elephant is like a snake, while the fourth, feeling one of the legs, thinks it like a tree. And so on.

They begin arguing about what an elephant is. Because each has felt only one part of the elephant, none really understands what an elephant is. As Saxe concludes his poem, each “was partly in the right, and all were in the wrong!”

I recently spoke about artificial intelligence research with seven different Carnegie Mellon University professors—Emma Brunskill, Eric Nyberg, Ariel Procaccia, Tuomas Sandholm, Aarti Singh, Manuela Veloso and Eric P. Xing. After each conversation, I felt like those blind men of Indostan; I walked away with a fragmented sense of the breadth and depth of the field of AI. Only when I was able to integrate what each of the faculty members had to say did I begin to develop an understanding.

In computer science terms, AI can be considered a very old discipline. The idea of a “thinking machine” that could synthesize concepts and ideas like a human being has fascinated scientists and futurists for generations. The field of “artificial intelligence” itself was formalized and named in the summer of 1956, when a group of researchers that included Carnegie Tech’s Allen Newell (TPR’57) and Herb Simon (H’90) met for two months at Dartmouth College to talk about the work being done with machines that displayed intelligent behaviors. The four organizers of the Dartmouth Summer Research Project on Artificial Intelligence, John McCarthy (then at Dartmouth), MIT’s Marvin Minsky, IBM’s Nathaniel Rochester and Bell Labs’ Claude Shannon, proposed that “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. An attempt will be made to find how to make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves.”

In short, says Ariel Procaccia, an assistant professor in CMU’s Computer Science Department, “the Dartmouth conference had a very broad vision of trying to understand how people think.” Different approaches to simulating human intelligence emerged from the conference. Some researchers favored rule-based systems; Newell and Simon presented their version of a “thinking machine” called the Logic Theorist, which attempted to solve problems using the formal rules a human would use. McCarthy preferred a more abstract approach that came to be known as “circumscription,” which excluded any variables that weren’t explicitly known in an attempt to solve problems more efficiently.

Many approaches to AI were tried in the 1960s and 1970s, including early attempts to construct artificial neural networks. Much of the research focused on rule-based systems, in which programmers would imagine some state or condition, and then devise the rules that a computer would have to follow to move from that state or condition to another state or condition. “Rule-based approaches enabled the first use of computers to perform true symbolic reasoning,” says Manuela Veloso, CMU’s Herbert A. Simon Professor of Computer Science. She calls them a “major contribution” to the field of AI; for the first time, she says, scientists were attempting to emulate human problem solving in a structured, deliberate way. And as rule-based systems evolved, the research evolved as well. Soon, Veloso says, AI developers were focused on ways to most “efficiently search (the) enormous space of possible rules to find a sequence of actions that would transform some current state into a desired final state.”
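The search Veloso describes can be pictured with a toy example. The sketch below treats each rule as a function that turns one state into another and uses breadth-first search to find the shortest sequence of rules that reaches a goal. The two-jug puzzle, the rule names and the `plan` and `replay` helpers are illustrative assumptions, not taken from any system mentioned here.

```python
from collections import deque

# Hypothetical toy domain: a 4-litre jug A and a 3-litre jug B.
# A state is a tuple (litres_in_a, litres_in_b).
CAP_A, CAP_B = 4, 3

def fill_a(s): return (CAP_A, s[1])
def fill_b(s): return (s[0], CAP_B)
def empty_a(s): return (0, s[1])
def empty_b(s): return (s[0], 0)

def pour_a_to_b(s):
    moved = min(s[0], CAP_B - s[1])
    return (s[0] - moved, s[1] + moved)

def pour_b_to_a(s):
    moved = min(s[1], CAP_A - s[0])
    return (s[0] + moved, s[1] - moved)

RULES = [fill_a, fill_b, empty_a, empty_b, pour_a_to_b, pour_b_to_a]

def plan(start, is_goal):
    """Breadth-first search over states; returns the shortest list of
    rule names that reaches a goal state, or None if none exists."""
    frontier = deque([(start, [])])
    seen = {start}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for rule in RULES:
            nxt = rule(state)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [rule.__name__]))
    return None

def replay(start, steps):
    """Apply the named rules in order and return the final state."""
    by_name = {rule.__name__: rule for rule in RULES}
    state = start
    for name in steps:
        state = by_name[name](state)
    return state

if __name__ == "__main__":
    # Goal: either jug holds exactly 2 litres.
    steps = plan((0, 0), lambda s: 2 in s)
    print(steps, replay((0, 0), steps))
```

Even this tiny domain shows why efficiency mattered: the search has to remember every state it has visited, and the number of states grows quickly as rules and objects are added.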

But there were limitations to rule-based approaches. In structured domains, such as playing chess, it was hard enough to write a rule for every possible situation; achieving artificial intelligence that could handle wider, more abstract problems through purely rule-based methods was nearly impossible. Other attempts at AI, such as artificial neural networks, were struggling as well. By the early 1990s, organizations that had been funding research into pure AI, such as the Defense Advanced Research Projects Agency, began questioning whether their investments would ever yield practical results. “It was like, ‘Where are the success stories? What are the big AI systems that have been fielded?’” says Tuomas Sandholm, a professor in the Computer Science Department.

The advent of quicker processors, parallel processing and better methods for storing and retrieving massive quantities of information made it possible to use statistical methods to analyze gigabytes and terabytes of data to look for patterns, and then use algorithms to draw inferences and make predictions. The field that emerged has become known as machine learning, and because of it, computer science “is not what it used to be, many years ago,” says Aarti Singh, the A. Nico Habermann Associate Professor in CMU’s Machine Learning Department.

Statistical machine learning as applied to big data—the catchall term that describes voluminous amounts of both structured and unstructured digitized information—has allowed researchers to make major breakthroughs in fields ranging from speech recognition, language translation and image processing to economics and computational biology. Singh says machine learning has become so prominent in computer science in part because it is making an impact in so many different areas. Her own research includes designing algorithms that, when confronted with a massive quantity of what she refers to as “big and dirty” data, balance the need for accurate statistical approximations with the desire for computational efficiency. The work has broad possible applications in physics, psychology, economics, epidemiology, medicine and analysis of social networks.

“Artificial intelligence research has a history of being very interdisciplinary, especially at CMU, and it is going to continue to be in the future,” Singh says.

But modern methods of achieving artificial intelligence look “totally different” than they used to, says Sandholm, founder and director of CMU’s Electronic Marketplaces Laboratory. In 2014, he led the team that won the championship at the Association for the Advancement of Artificial Intelligence’s annual computer poker competition, and in 2015, his team’s AI, now called “Claudico,” took on four of the world’s best human poker players in 80,000 hands of no-limit Texas hold’em.

“Researchers today have a much more analytical approach, and AI has become more statistical and more rigorous, both theoretically and empirically,” says Sandholm, who holds patents on many methods of optimizing online marketplaces such as auctions. “Instead of building a system and saying, ‘Look, it kind of works,’ there is real evaluation today on the empirical side, proving that things work.”

Sandholm, for instance, has extensively deployed and demonstrated electronic marketplaces. In 1997, Sandholm founded CombineNet, which enabled major companies such as General Mills and Procter & Gamble to move their sourcing of supplies from manual processes and simple reverse auctions to highly sophisticated combinatorial auctions; CombineNet’s sourcing platform, powered by AI, allowed purchasing agents to place bids on extremely diverse combinations of goods and services. CombineNet powered 800 auctions worth $60 billion before being acquired in 2010. Sandholm’s newest startup, Optimized Markets Inc., is again applying sophisticated AI optimization techniques to change how advertising inventory is allocated and sold.

Although statistical machine learning has reshaped the study of artificial intelligence, not everyone was thrilled when those approaches first emerged. Some AI researchers who had been working on rule-based systems derided statistical methods as employing “brute force,” implying that they were somehow crude and less than scientific.

“When statistical methods first came on the scene, there was a (negative) reaction because the traditionalists, who had been working on rule-based systems for years, got frustrated when they found out that by simply observing data, a model could automatically be trained much more quickly than having a human observe and then write the rules,” says Eric Nyberg, a professor in CMU’s Language Technologies Institute.

He remembers a language technology conference in the early 1990s that he and several CMU colleagues attended. During one session, a debate ensued between rationalists and empiricists. The rationalists championed machine translation systems that used rules to describe a grammar that formally dictated correct English as it’s taught to humans learning to read and write. The empiricists supported the idea of doing statistical analysis of texts, and creating rules based on probabilities that would allow computer systems to understand language at something close to human level, even if they weren’t using formal rules of grammar.

“At the conference the people from Carnegie Mellon decided to fly over this controversy and say, ‘You’re missing the point. The point is that we have to combine these techniques because they have complementary strengths,’” Nyberg says. Rule-based systems “are really good at helping encode exceptions,” he says, but statistical methods “are good at teasing out the general rules of a domain.”

“They don’t work equally well on all cases, and until you learn what your domain is all about, you can’t presuppose which approach will be better,” he says.

In the early 1990s, Nyberg says, CMU was a pioneer of Multi-Engine Machine Translation, or MEMT, a method for processing data that uses multiple machine translation approaches, including both statistical and rule-based methods, within a single machine-translation system. This multi-strategy approach has since been leveraged by other language systems, most notably in IBM’s Watson question-answering system, which attracted international attention when it defeated two human champions on the TV game show “Jeopardy!” in 2011. Nyberg’s own research involves the open advancement of question-answering, or QA, systems; he and other CMU researchers collaborated with IBM and other universities to create the algorithms that powered Watson to its “Jeopardy!” victory.

Nyberg predicts that, going forward, artificial intelligence researchers will more and more be blending the two different approaches. “You can have a rule-based component that does a task and a statistically trained component doing the same task and constantly monitor how they’re doing,” he says. “We can combine the components on a task until such time when we realize one has surpassed the other.”
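One way to picture the arrangement Nyberg describes is a wrapper that runs both components on the same task, scores each against known answers, and trusts whichever has the better record so far. The sketch below is a hypothetical illustration only: the question-detection task, the first-word model and the `MonitoredPair` class are assumptions made for the example, not CMU's or IBM's design.

```python
from collections import Counter, defaultdict

def rule_based(sentence):
    """Hand-written rule: a sentence ending in '?' is a question."""
    return sentence.strip().endswith("?")

def train_first_word_model(examples):
    """Tiny statistical stand-in: learn, for each first word, which
    label (question or not) was most frequent in the training data."""
    counts = defaultdict(Counter)
    for sentence, is_question in examples:
        counts[sentence.split()[0].lower()][is_question] += 1

    def model(sentence):
        seen = counts.get(sentence.split()[0].lower())
        return seen.most_common(1)[0][0] if seen else False

    return model

class MonitoredPair:
    """Runs two components on the same task and tracks their accuracy."""

    def __init__(self, rule_component, statistical_component):
        self.components = {"rule_based": rule_component,
                           "statistical": statistical_component}
        self.correct = {name: 0 for name in self.components}
        self.seen = 0

    def accuracy(self, name):
        return self.correct[name] / self.seen if self.seen else 0.0

    def predict(self, x):
        # Trust the component with the better record (ties go to the first).
        best = max(self.components, key=self.accuracy)
        return self.components[best](x), best

    def feedback(self, x, truth):
        # Score both components on every labeled case, not just the chosen one.
        self.seen += 1
        for name, component in self.components.items():
            if component(x) == truth:
                self.correct[name] += 1
```

On unpunctuated transcripts, for example, the punctuation rule would score poorly and the wrapper would shift to the trained component. On clean text the rule might win. The point of monitoring is that neither outcome has to be assumed in advance.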

Over and over again, CMU faculty engaged in artificial intelligence research told me about the importance of blending different approaches. Sandholm says the “silos of learning” inherent in academia have to come down everywhere, but particularly in artificial intelligence. In the 2004 edition of her classic book, “Machines Who Think,” Pamela McCorduck laments that AI researchers have often broken into “subfields” such as “vision, natural language, decision theory, genetic algorithms, robotics” that “hardly have anything to say to each other.”

“Cross-fertilization of ideas is important within AI research, but also between AI and other disciplines,” Sandholm says. “From AI to operations research. From AI to economics. From AI to astrophysics. I tell my students never to write in a paper, ‘a computer scientist says this, but an economist says that.’ You should never think like that.”

Newell and Simon embodied that interdisciplinary approach. Unlike many early computer scientists, neither one came to artificial intelligence research directly from electrical engineering or mathematics; Simon was an economist and political scientist, while Newell was studying group dynamics and decision-making.

Manuela Veloso (CS’89,’92), who today holds the university professorship named in Simon’s honor, came to CMU as a student in 1986, when both Simon and Newell were still active faculty members. At a time when there was a lot of excitement about divergent projects in artificial intelligence, “such as chess, and tools like neural networks in planning and problem-solving techniques, and the different architectures for cognition, and trying to figure out how machines learn,” Veloso says Newell was already arguing that these different research areas had to be integrated. “I was there, in 5409 Wean Hall, when Allen Newell went to the blackboard and picked up the chalk and wrote perception, cognition and action and drew boxes around the words. Underneath, he wrote the word agents. He said that the field of AI was fragmented and that it was time to put these things together.” Veloso, the immediate past president of the Association for the Advancement of Artificial Intelligence, says she personally took his comments “as the research direction for my life.”

In the early 1990s, Veloso founded the research lab called CORAL—for Collaborate, Reason, Act and Learn—which she still directs. With her students, she created robots that could play soccer together as a team, performing coordinated activities against other teams of robots. The first “RoboCup” soccer games were held in 1997 at the 15th International Joint Conference on Artificial Intelligence in Nagoya, Japan, with 10 teams competing in the real robot league, and 29 teams competing in computer simulations. Teams from CMU, led by Veloso, have competed in every “RoboCup” since. Her teams—competing in the “Small-Size League”—have won the competition five times and placed second four times; in 2015, they won all of the games in which they competed, finishing with a combined score of 48-0. “The robot soccer teams are a remarkable example of artificial intelligence, robotics and some machine learning,” Veloso says.

Roaming the halls of Carnegie Mellon’s Gates and Hillman Centers today are additional manifestations of that integration of AI research that Newell called for a quarter century ago. They are the CoBots, created by Veloso and her students, which deliver packages, lead visitors to various offices for scheduled meetings, and, when necessary, seek assistance from humans to push elevator buttons so they can carry out their tasks. To date, the CoBots have logged more than 1,000 kilometers navigating the GHC buildings.

They would never have reached that milestone—would not be able to demonstrate any artificial intelligence—without integration of their various capabilities. Veloso’s vision algorithms, for instance, must be connected to task-planning algorithms that in turn control actuators. For the CoBots to be successful, Veloso says, she realized that they would have to be aware of their own perceptual, physical and reasoning limitations. “We introduced the concept of symbiotic autonomy, in which the robots proactively ask for help from humans, or resort to searching the Web when they realize they lack the capabilities or understanding to perform parts of their service tasks,” Veloso says. The research and development of the robot soccer teams, of the CoBot robots, and of other robots in the CORAL lab requires integration of robotics, engineering and many other disciplines. “It’s a science itself, of determining how we are going to make all these (systems) work together,” she says.

Real-world successes, such as combinatorial sourcing auctions and the CoBots, have validated the integrative approach to artificial intelligence. Another of the real-world success stories in AI is the Kidney Paired Donation Exchange, which is powered by algorithms and software developed in Sandholm’s research group. There are two types of kidney transplants: a kidney swap, where a donor gives a kidney directly to someone (often a blood relative) who is medically compatible, and the much more complicated kidney chain, where a donor gives a kidney to a patient whom the donor does not know. Most kidney transplants worldwide now happen through such chains. Doing the work manually is a “very tough task,” Sandholm says.

The kidney exchange, launched in 2010, medically matches donors who are incompatible with their intended recipients to others in the network who need a transplant. AI algorithms make the transplantation plan autonomously for the entire United States twice a week. Sandholm and Procaccia are collaborating on the kidney exchange research, along with CMU computer science professor Avrim Blum, who developed an early version of the fielded algorithm.

“AI is doing a much better job than any human could possibly do,” Sandholm says. “It’s not because the doctors are dumb. It’s just that they’re sifting through more alternatives than there are atoms in the universe, and that’s not something humans can do.” But computers can.
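A heavily simplified sketch gives a feel for the matching problem. It assumes that compatibility depends on blood type alone and that only two-way exchanges between incompatible patient-donor pairs are allowed. Real matching weighs many more medical factors and also handles longer cycles and chains. The pair names, the `CAN_DONATE_TO` table and the exhaustive `best_plan` search are illustrative assumptions, not the fielded algorithm.

```python
from itertools import combinations

# Simplified ABO rule: recipient blood types each donor type can give to.
CAN_DONATE_TO = {
    "O": {"O", "A", "B", "AB"},
    "A": {"A", "AB"},
    "B": {"B", "AB"},
    "AB": {"AB"},
}

def compatible(donor_type, patient_type):
    return patient_type in CAN_DONATE_TO[donor_type]

def two_way_swaps(pairs):
    """pairs maps a name to (patient_type, donor_type). Returns every
    two-way exchange in which each donor suits the other pair's patient."""
    return [
        (a, b)
        for a, b in combinations(sorted(pairs), 2)
        if compatible(pairs[a][1], pairs[b][0])
        and compatible(pairs[b][1], pairs[a][0])
    ]

def best_plan(pairs):
    """Exhaustively find a largest set of swaps that uses no pair twice."""
    swaps = two_way_swaps(pairs)
    for size in range(len(swaps), 0, -1):
        for subset in combinations(swaps, size):
            used = [name for swap in subset for name in swap]
            if len(used) == len(set(used)):
                return list(subset)
    return []

if __name__ == "__main__":
    # Hypothetical pool: each patient is incompatible with their own donor.
    pool = {
        "P1": ("A", "B"),
        "P2": ("B", "A"),
        "P3": ("A", "B"),
        "P4": ("O", "A"),
        "P5": ("B", "A"),
    }
    print(best_plan(pool))
```

The exhaustive search tries subsets of swaps, so its work can double with every additional possible swap. That growth is the reason a nationwide pool calls for far more sophisticated optimization than this sketch uses.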

Sandholm predicts that within the next five to 10 years, similar artificial intelligence techniques will be helping humans make smarter choices in many other critical areas, both in high-level planning and low-level decision-making. That’s because of the progress in AI algorithms over the past decade, as well as in the amount of data now available, he says.

Yet there are concerns that with their inherent lack of structure, current statistics-based methods are hitting their own limitations, just as purely rule-based methods did in the past. “A large fraction of the contemporary machine learning systems in action is using very simple logical or mathematical rules as of now,” says Eric Xing, professor of machine learning, language technologies and computer science and director of CMU’s Center for Machine Learning and Health. Those rules, he says, are mostly propositional rules—relatively simple inferences such as “if x, then y”—or measurements of dispersion of data. Future approaches to statistical analysis of big data will increasingly apply higher-order logic, Xing predicts, such as “relational or probabilistic rules,” that enable the resulting intelligence mechanisms to interface with data in ways that are both “stochastic and elastic.”

“The marriage of the two approaches will create richer models that are more likely to work in the real world,” Xing says.

Today’s hardware designs also affect the capacity to learn from big data. Because of the heavy computational power required to take advantage of the data, operating systems need to evolve. “In the past, I don’t feel that either the machine learning or artificial intelligence fields have been paying strong attention to the hardware and to the operating systems,” Xing says. “And vice versa. The hardware and system people don’t pay enough attention to the other side.”

He says the merging of hardware engineering and operating system design “is imminent.”

One of the next-generation platforms that brings machine-learning principles and artificial intelligence needs directly into the design of both an operating system and its hardware is called Petuum. Developed by a research group within Xing’s CMU lab, called “Statistical Artificial InteLligence & INtegrative Genomics,” or “SAILING” for short, Petuum is designed specifically for machine learning algorithms. It provides essential distributed programming tools to tackle the challenges of running machine-learning algorithms at large scale.

“The Petuum system ties together a cluster of machines with communication procedures, scheduling procedures and resource allocation and management procedures so that they turn up as a single machine interface to the user,” Xing says. One of those users is the recently established Pittsburgh Health Data Alliance, a joint effort between the Center for Machine Learning and Health, the University of Pittsburgh and Pittsburgh’s UPMC health care system. The Health Data Alliance collects massive amounts of data from sources as varied as electronic health records, genomic profiles and wearable devices. But instead of passively recording that data for humans to sort through later, the Health Data Alliance is using Petuum to analyze the data in real time, find noteworthy patterns, and generate notifications or warnings, if necessary.

“We want to use our strength in artificial intelligence and computing to amplify the value of that data,” Xing says. The Center for Machine Learning and Health is also planning to develop new technologies—including a series of increasingly data-driven apps—that will change the way diseases are prevented and patients are diagnosed and treated.

When the Pittsburgh Health Data Alliance was announced in early 2015, CMU President Subra Suresh predicted it would help caregivers and patients to “move from reactive care to immediate, proactive prevention and remediation, from experience-based medicine to evidence-based medicine, and to augmenting disease-centered models with patient-centered models.” Xing has a catchier phrase for the effort. By mining UPMC’s health data, he says, the alliance hopes to “‘smart-ify’ the entire health care system.”

But are we any closer to developing a computer with human-level intelligence? At the Dartmouth conference in 1956, the organizers proposed that a “significant advance” in simulating human thought could be made as the result of their summer-long brainstorming session. More than 50 years later, we do not yet seem to have created a machine that can demonstrate both human-level intelligence and creativity.

“We have systems that can mimic how we think human intelligence works, but how humans can spontaneously arise with ideas of identity, of self-will, of control, of consciousness, to me those concepts are very, very new,” says Emma Brunskill, an assistant professor in CMU’s Computer Science Department who is also affiliated with the Machine Learning Department. How a machine could replicate those attributes “is incredibly unclear at this point,” she says.

Advances in both processors and storage have allowed researchers to develop artificial neural networks that do simulate the way human brains engage in activities such as solving problems and processing language, Brunskill says. “With the increased focus on data and neural networks, we’re starting to tackle how agents and autonomous systems learn representations of the world from scratch,” she says. The challenge is in developing systems that take in raw data of many different kinds, build up a representation of what the data means, and then act on that representation in abstract ways—not just seeing, for example, a phone on a table, but understanding one of the concepts represented by a phone on a table: communication.

“Those types of systems are really exciting because they are able to go directly from sensory input of the world to real decision-making,” Brunskill says.

One of Brunskill’s research interests is developing online tutoring programs that continually improve their teaching methods as they interact with students. A major focus of concern is making sure it doesn’t take too long for those systems to self-optimize. “We don’t want to teach a million students before we figure out what pedagogical activities are most effective for teaching fractions,” Brunskill says.

In an ongoing research project, Brunskill and her colleagues have developed an intelligent tutoring system for teaching students how to use histograms—graphical representations of statistical data. Many students, Brunskill says, have trouble understanding histograms, even after taking a statistics class. The tutor automatically found a strategy for teaching histograms by iteratively changing how it instructed students. When the tutor was evaluated, the researchers found that within the first 30 students who used it, the quality of instruction was comparable in results to a good, hand-designed tutorial.
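The description above does not say which learning method the tutor uses. One common way to frame “try different approaches, keep what works” is an epsilon-greedy bandit, and the sketch below uses that as a stand-in. The strategy names, pass probabilities and the `StrategyPicker` class are hypothetical and are not drawn from Brunskill’s system.

```python
import random

class StrategyPicker:
    """Epsilon-greedy choice among teaching strategies (an assumed
    stand-in for however a real tutor adapts its instruction)."""

    def __init__(self, strategies, explore=0.1, seed=0):
        self.rng = random.Random(seed)
        self.explore = explore
        self.attempts = {s: 0 for s in strategies}
        self.successes = {s: 0 for s in strategies}

    def success_rate(self, strategy):
        tried = self.attempts[strategy]
        return self.successes[strategy] / tried if tried else 0.0

    def choose(self):
        # Try every strategy once before comparing them.
        untried = [s for s in self.attempts if self.attempts[s] == 0]
        if untried:
            return untried[0]
        # Occasionally explore; otherwise use the best strategy so far.
        if self.rng.random() < self.explore:
            return self.rng.choice(list(self.attempts))
        return max(self.attempts, key=self.success_rate)

    def record(self, strategy, passed):
        self.attempts[strategy] += 1
        self.successes[strategy] += int(passed)

def simulate(pass_probability, n_students=30, seed=0):
    """Teach simulated students one at a time, updating after each."""
    rng = random.Random(seed)
    picker = StrategyPicker(list(pass_probability), seed=seed)
    for _ in range(n_students):
        strategy = picker.choose()
        picker.record(strategy, rng.random() < pass_probability[strategy])
    return picker

if __name__ == "__main__":
    result = simulate({"worked_example": 0.8, "text_only": 0.3})
    print(result.attempts)
```

The `explore` parameter captures the trade-off Brunskill raises: more exploration yields better long-run estimates, but it also means more students receive a weaker lesson along the way.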

Brunskill doesn’t dismiss the possibility of a computer some day achieving human-level intelligence, but she says there are aspects of the human brain that remain elusive. It may be possible that consciousness will evolve from a system once it reaches a certain level of mathematical or computational sophistication, she acknowledges, but that would raise another question. “How would we test if something was conscious?” Brunskill says.

In a 1950 article called “Computing Machinery and Intelligence,” Alan Turing proposed what we now call the “Turing test”: the idea that a computer can be considered “intelligent” if human users can conduct a conversation with it and not realize they’re talking to a machine. If computers aren’t yet there, question-answering systems, such as those being developed by Nyberg and his colleagues, are getting close.

That, in turn, has sparked fears of a computer uprising. “I, for one, welcome our new computer overlords,” joked Ken Jennings, one of the two human “Jeopardy!” champions defeated by Watson.

The pop-culture trope of rogue computers enslaving humans grew out of the classic definitions of artificial intelligence, which envisioned AIs as autonomous agents that didn’t need human input. But Nyberg says Watson actually represents a different approach, one that IBM calls “cognitive computing”—humans and machines collaborating to accomplish tasks better than either humans or machines can do on their own.

A team of CMU researchers led by Ariel Procaccia is exploring one such collaboration. They’ve applied artificial intelligence to social choice and game theory to help human beings allocate resources and make collective decisions, such as who gets a favorite family heirloom when a loved one dies, or how much an individual owes when sharing a cab with friends.

Understanding the “fair division” problem, Procaccia says, has until recently been the domain of economists and political scientists, but not so much computer scientists. Yet artificial intelligence is uniquely suited to providing unbiased advice, unaffected by emotion. Once a group of individuals faced with a decision establish what “fairness” means to each of them, whether it’s a social good or a personal goal, “then we can feed that into a computer and develop algorithms that achieve those notions of fairness,” Procaccia says. In 2014, he and other researchers launched a website, a not-for-profit academic endeavor that utilizes economics, mathematics and AI research to provide people with provably fair methods to resolve everyday dilemmas, such as how to split rent, divide tasks or apportion credit for a project.

They’re now applying their research to more complex scenarios, says Procaccia, who’s working with California’s largest public school system, the Los Angeles Unified School District, to develop a computer program that will make suggestions for the fair allocation of students in classrooms in the district’s 241 charter schools.

“I want to leverage economic theory to develop computer programs that make intelligent suggestions for interactions between multiple people,” he says.

Unlike the blind men of Indostan, who remained steadfast in their conclusions about the nature of an elephant, I came away from my conversations with all seven Carnegie Mellon professors with a better appreciation of the breadth and depth of artificial intelligence—and the magnitude of its impact on our lives.

As the field evolves, one aspect of artificial intelligence on which all seven Carnegie Mellon professors concur is that it will reach into almost every aspect of human existence. “It’s already pretty broadly in our lives today,” says Sandholm, “but in the next 10 to 20 years, it’s going to be running almost everything.”

Another thing that became clear from my conversations is that from the days of Newell and Simon’s participation in the Dartmouth conference, CMU has played a key role—perhaps the most integral role—in moving artificial intelligence research forward.

In October 2000, at CMU’s Earthware Symposium, Herb Simon ranked the computer as one of the top three human inventions, along with language and organization. “It has already become, and will continue to become increasingly, a constant companion and partner of the human mind,” Simon said, arguing that the task of computer scientists is “to design a future for a sustainable and acceptable world, and then devote our efforts to bringing that future about.”

Carnegie Mellon today is arguably living out Simon’s vision. “CMU is not just one of the players in AI research,” Xing says. “It is very unique and has the advantageous position of leading the research.”

Linda K. Schmitmeyer is a freelance writer and editor and teaches non-fiction writing at Pittsburgh’s Point Park University. Link Editor Jason Togyer contributed to this story.

For More Information: 

Jason Togyer | 412-268-8721