THEORY OF COMPUTATION GROUP

Home to one of the world's largest groups of theoretical computer science researchers, the Laboratory for Computer Science (LCS) pursues nearly every major area of the field. Our interests range from basic mathematical theory -- such as computational geometry, complexity theory, and number-theoretic algorithms -- to theoretical work on the foundations of electronic circuitry, communications, biology, cryptography, and computer architectures.

An important goal of theoretical computer science is to create formal models of computation and then explore what is possible within those models. The results not only deepen our understanding of the basics of computer science, but also alter its practice through more efficient algorithms, novel architectures, and a better understanding of a program's "meaning." While many of these models reflect recent technological advances -- parallel or distributed computing, for example -- work is also performed on such traditional models as finite automata and ordinary sequential computers.

MIT is the world's leader in parallel algorithms and architectures. We thus work closely with architects and systems designers to create the next generation of parallel supercomputers. Faculty and students interact with such leading companies as Thinking Machines and IBM to design and analyze communication networks, parallel computation models, efficient parallel algorithms for various applications, and methods for making large-scale parallel machines more fault-tolerant.

Not surprisingly, we are deeply involved in the design and use of the forthcoming "information highway." Efficient network-based communication, in fact, is one of the most important and exciting challenges facing theory researchers.


Tom Leighton, Professor of Applied Mathematics (Parallel Algorithms)
Michel Goemans, Assistant Professor of Applied Mathematics (Efficient Algorithms)

LCS also vigorously researches efficient algorithms for sequential computers. Surprisingly, perhaps, improved algorithms for well-known problems continue to be discovered, and new theoretical problems often arise as spin-offs from advances in computer technology. Some of our work focuses on algorithms for graph problems, computational geometry, number-theoretic problems, and the layout and routing of VLSI circuitry. Other recent projects include on-line algorithms (which do not know all the data in advance), randomized algorithms (which use random numbers to aid decision-making), and approximation algorithms (which are guaranteed to find near-optimum solutions).
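As a concrete illustration of the on-line setting (a textbook example, not a piece of LCS research), the classic "ski rental" problem asks how long to keep renting before buying when the total number of days of use is not known in advance. The simple rule sketched below never pays more than twice what an omniscient algorithm would pay; all names and costs here are illustrative choices.

```python
# Illustrative sketch (not an LCS algorithm): the "ski rental" problem,
# a textbook example of an on-line algorithm that must act without knowing
# all the data (here, the total number of ski days) in advance.

def ski_rental_online(days_skied, rent_cost=1, buy_cost=10):
    """Rent each day until cumulative rent would reach buy_cost, then buy.

    This simple rule is 2-competitive: it never pays more than twice what an
    omniscient algorithm that knew days_skied in advance would pay.
    """
    total = 0
    for day in range(1, days_skied + 1):
        if day * rent_cost >= buy_cost:   # renting any longer costs as much as buying
            total += buy_cost
            break
        total += rent_cost
    return total

def ski_rental_offline(days_skied, rent_cost=1, buy_cost=10):
    """Optimal cost when the number of ski days is known in advance."""
    return min(days_skied * rent_cost, buy_cost)

if __name__ == "__main__":
    for d in (3, 10, 50):
        print(d, ski_rental_online(d), ski_rental_offline(d))
```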

Many fundamental problems that provide insight into the design and analysis of efficient algorithms lie in the area of combinatorial optimization. We have recently seen a surge of exciting developments in approximation algorithms for difficult optimization problems, and LCS is a leader in obtaining novel and general techniques for designing such algorithms. We have developed improved approximation algorithms for a variety of problems, including those related to multicommodity flow and network design, as well as more specific problems such as graph bisection and maximum cut.
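For flavor, the sketch below shows a simple textbook heuristic for maximum cut (a minimal local-search approximation, far weaker than the techniques developed at LCS): repeatedly move any vertex whose move increases the cut, which guarantees that at least half of all edges end up crossing the cut. The function name and graph representation are illustrative choices.

```python
# A minimal sketch (not the LCS method): local-search 1/2-approximation for MAX-CUT.

def approx_max_cut(n, edges):
    """n: number of vertices; edges: list of (u, v) pairs with 0 <= u, v < n."""
    side = [0] * n                      # 0/1 label for each vertex
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    improved = True
    while improved:                     # each flip increases the cut by at least 1,
        improved = False                # so the loop ends after at most |E| flips
        for v in range(n):
            same = sum(1 for w in adj[v] if side[w] == side[v])
            diff = len(adj[v]) - same
            if same > diff:             # flipping v strictly increases the cut
                side[v] ^= 1
                improved = True

    # On termination each vertex has at least half its edges across the cut,
    # so the cut contains at least half of all edges.
    cut = sum(1 for u, v in edges if side[u] != side[v])
    return side, cut

if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
    print(approx_max_cut(4, edges))
```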


Alan Edelman, Assistant Professor of Applied Mathematics (Scientific Computing)
David R. Karger, Assistant Professor of Computer Science and Engineering (Algorithms)

Largely as a result of rapid advances in parallel computing technology, scientific computing has become one of computer science's most active areas. This interdisciplinary area bridges numerical analysis, linear algebra, computer architecture, program analysis and optimization, software engineering, scientific visualization, and various scientific applications.

Problems in scientific computing often strain the resources of modern parallel machines, compelling us to advance new tools and ideas. LCS researchers have pioneered the adaptation of algorithms for the special needs of these scientific applications.

Scientific computing involves various research topics in theoretical computer science, such as finite element and finite difference mesh generation, sparse and dense matrix computations, and the solution of large-scale linear systems. Some of these problems can be translated into or approximated by combinatorial and geometric problems, including network optimization, communication network topology emulation, graph embedding, parallel machine scheduling, dynamic load balancing, geometric modeling, and triangulations.

A fundamental issue in parallel scientific computing is mesh partitioning, in which a large mesh is divided into a given number of pieces of roughly equal weights so that the "boundary" is "small." Efficient partitioning is vital to balance the load and reduce communication in parallel solutions of sparse linear systems. It is also useful in parallel emulation of computational meshes on hypercube and butterfly architectures and out-of-core algorithms for iterative relaxation.
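As a rough illustration of the partitioning task (a minimal sketch assuming two-dimensional vertex coordinates; production partitioners work much harder to keep the boundary small), recursive coordinate bisection simply splits the mesh vertices at the median coordinate, alternating axes, so that the resulting pieces have nearly equal size:

```python
# A minimal sketch (not the LCS partitioner): recursive coordinate bisection,
# one of the simplest mesh-partitioning heuristics.

def coordinate_bisection(points, n_parts, axis=0):
    """points: list of (x, y) vertex coordinates; n_parts: a power of two."""
    if n_parts == 1:
        return [points]
    pts = sorted(points, key=lambda p: p[axis])     # split at the median coordinate
    mid = len(pts) // 2
    next_axis = (axis + 1) % 2                      # alternate x / y splits
    return (coordinate_bisection(pts[:mid], n_parts // 2, next_axis) +
            coordinate_bisection(pts[mid:], n_parts // 2, next_axis))

if __name__ == "__main__":
    mesh = [(x * 0.1, y * 0.1) for x in range(10) for y in range(10)]
    parts = coordinate_bisection(mesh, 4)
    print([len(p) for p in parts])                  # four pieces of equal size
```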

Computational biology represents another new and exciting research area. Accordingly, one of our goals is to expand the computational "toolkit" that is available for numerous biological problems.

Computer science, for example, helps make sense of the vast amount of information now being compiled for the Human Genome Project, such as DNA and amino-acid sequence data. One intra-MIT effort has drawn on the resources of LCS, the Whitehead Institute, and the Biology and Mathematics departments; specific research areas include computational approaches to protein folding, physical and genetic mapping, virus shell assembly, AIDS theories, and sequence homology and alignment.

Yet another illustration of computational biology relates to the so-called "grand challenge" associated with protein folding -- that is, the determination of how a protein will fold three-dimensionally when only its amino-acid sequence is known. An important first step in answering this question is a solution to the motif recognition problem: Given a known 3D structure (or motif), researchers must determine whether the fold occurs in an unknown sequence of amino acids and, if so, in which positions. Techniques from theoretical computer science have been particularly effective in solving such problems.


Ronald L. Rivest, Edwin Sibley Webster Professor of Computer Science and Engineering; Associate Director, LCS (Machine Learning)
Bonnie A. Berger, Assistant Professor of Mathematics (Computational Biology)

On another front, researchers in machine learning study computers' ability to "learn" from experience. The results of our research have stimulated various formalizations that address a range of issues in psychology, artificial intelligence, pattern recognition, and neurobiology. Some recent research themes include the inference of finite automata; learning in the presence of noise; learning an unknown environment "piecemeal" by exploration; learning of "manifest" systems (in which relevant variables are often, but not always, visible to the learner); and models of "teaching."

In general, the group's research is "positive" in nature, in that we strive to develop provably efficient learning algorithms with potentially practical application. In some cases, however, the research leads to equally useful "negative" results by identifying the limits of what is ultimately learnable.

Another major theme is the development of new models of learning that provide better theoretical formulations of real-world learning situations and the appropriate algorithms. One example is learning a concept defined by Boolean formulae from examples of that concept. Another is inferring the structure of a finite-state system by examining the system's input/output behavior. Statistical techniques help determine how much data are needed for a given problem; complexity theory helps assess the difficulty of computing the desired answer from the data.
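For instance, in the standard "probably approximately correct" model (a textbook bound cited here for illustration, not a result quoted from this article), a learner that outputs any hypothesis consistent with the data, drawn from a finite hypothesis class H, needs only

```latex
m \;\ge\; \frac{1}{\epsilon}\left(\ln\lvert H\rvert + \ln\frac{1}{\delta}\right)
```

examples to guarantee, with probability at least 1 - delta, a hypothesis whose error is at most epsilon.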

Machine-learning research is generally theoretical in nature (although some is experimental), and involves the careful specification of models of learning and the precise specification and analysis of learning algorithms. We use a wide range of models to capture different aspects of technical and philosophical relevance, such as learning from noisy data; learning hierarchically structured concepts; learning with neural nets; learning via different output representations; learning to represent a system containing hidden state variables; and trading off simplicity of hypothesis against quality of fit to the data.


Shafrira Goldwasser, Professor of Electrical Engineering and Computer Science (Cryptographic Protocols)
Silvio Micali, Professor of Electrical Engineering and Computer Science (Cryptographic Protocols)

Cryptography is another important area of research within LCS. In its simplest and most ancient form, cryptography relates to secret communication. Cast in the framework of complexity theory, the sender, the recipient, and the adversary are computationally bounded machines. An encryption system is deemed "secure" when it is computationally infeasible for an adversary to obtain information about messages from their encodings. Since proving non-trivial lower bounds on the complexity of NP-complete problems is beyond the current state of the art, a proof of security must instead show that any method for compromising security can be transformed into an efficient algorithm for a problem (such as factoring integers) that is generally believed to be intractable.
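The sketch below illustrates the style of such a reduction argument (a toy illustration, not a proof from LCS): any black box that extracts square roots modulo N = p*q can be turned into an efficient algorithm for factoring N. The primes, the oracle, and the function names are all hypothetical choices made for the example; the "adversary" oracle cheats by knowing p and q, while the reduction uses it purely as a black box.

```python
import math
import random

P, Q = 1000003, 10007          # toy primes, both congruent to 3 mod 4
N = P * Q

def sqrt_oracle(c):
    """Black box returning some square root of c modulo N (it secretly knows P, Q)."""
    rp = pow(c, (P + 1) // 4, P)             # square root mod P (valid since P = 3 mod 4)
    rq = pow(c, (Q + 1) // 4, Q)             # square root mod Q
    rp = random.choice([rp, P - rp])         # the oracle may return any of the four roots
    rq = random.choice([rq, Q - rq])
    # Combine the two residues with the Chinese Remainder Theorem.
    return (rp * Q * pow(Q, -1, P) + rq * P * pow(P, -1, Q)) % N

def factor_using_oracle(n):
    """Use the square-root oracle to split n; each trial succeeds with probability 1/2."""
    while True:
        x = random.randrange(2, n)
        if math.gcd(x, n) != 1:              # lucky: x already shares a factor with n
            return math.gcd(x, n)
        y = sqrt_oracle(pow(x, 2, n))
        if y not in (x, n - x):              # a "different" root reveals a factor
            return math.gcd((x - y) % n, n)

if __name__ == "__main__":
    f = factor_using_oracle(N)
    print(f, N // f)
```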

Achieving privacy is only one area of cryptography research. Another is the design of protocols for authentication, certified electronic mail, and contract signing between two or more mutually suspicious parties. In general, the goal is to perform an arbitrary distributed computation among many processors, each holding some portion of the input. The processors are connected in a network, and the computation must be carried out so that no processor reveals any information other than what is intended.

Protocol research has led to a complexity theory of the amount of knowledge that must be released in order for one processor to prove a fact to another processor -- the theory of "zero-knowledge proofs." Generating pseudo-random numbers and functions is another important field. (Randomness is here defined with respect to a specific model of computation and a specific level of computational resources.)

LCS researchers have contributed to virtually all cryptographic inventions of the past decade, including the first public-key cryptosystem, probabilistic cryptosystems, and zero-knowledge proofs.


Michael Sipser, Professor of Applied Mathematics (Computational Complexity Theory)
Mauricio Karchmer, Assistant Professor of Mathematics (Computational Complexity Theory)

LCS also enjoys a traditional leadership role in computational complexity theory. One of the prime goals in this field is to devise and study natural schemes for classifying problems according to their computational difficulty, and then to place familiar and important problems within the appropriate scheme.

One familiar example is the problem of factoring large integers -- that is, finding all the prime numbers that divide the integer evenly. The exercise is not only theoretically interesting but also relevant to cryptography. The "brute force" method of searching for prime factors one by one is too slow to be useful. While better algorithms are known, determining the intrinsic difficulty of the factoring problem is one of complexity theory's many exciting open questions.
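A minimal sketch of the brute-force approach makes its limitation plain: trial division must test candidate divisors one by one, so its running time grows with the smallest prime factor, which is hopeless for integers of the size used in cryptography. (The function below is an illustrative helper, not an algorithm taken from this article.)

```python
# Brute-force factoring by trial division.

def trial_division(n):
    """Return the prime factorization of n as a list of primes (with repetition)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:       # divide out each prime factor completely
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                   # whatever remains is itself prime
        factors.append(n)
    return factors

if __name__ == "__main__":
    print(trial_division(1998))     # [2, 3, 3, 3, 37]
```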

LCS researchers were the first to show that there are problems of high intrinsic complexity. Now we are investigating the complexity of problems akin to factoring by studying the power of weak computational models, such as branching programs and monotone circuits of bounded depth. These formally constrained models are easier to analyze and thus may help us better understand the standard models. In closely related work, we are studying the power of probabilistic computation, parallelism, randomness and pseudo-randomness, interactive proof systems, and other basic computing concepts.


Albert R. Meyer, Hitachi America Professor of Engineering (Program Semantics and Logic)
Peter Elias, Professor Emeritus and Senior Lecturer (Information Theory)

Elsewhere within LCS, researchers in the theory of programming semantics and logic aim to provide clear mathematical foundations and reliable reasoning principles that conform to the robust functional metaphor programmers use to design, describe, and justify their programs. Programming routinely unites high abstraction with pragmatic design and includes the declaration of procedures, functions, data types, processes, and "objects." The researchers' objective is to lay a solid foundation for this task.

Computer scientists' notion of a "function," however, may depend on when and in what context it is evaluated. That contrasts with the mathematician's classical notion of a function. Bridging this conceptual gap involves elements of algebra, modal and intuitionistic logic, and category-, complexity-, computability-, proof-, and type theory -- all applied to programming-language design, compiler construction, and program optimization. This work is being extended to specifying the meaning, and verifying the properties, of parallel and distributed processes.


Nancy Lynch, Professor of Electrical Engineering and Computer Science (Distributed Computing)
Baruch Awerbuch, Research Scientist (Distributed Computing)

Distributed computation theory aims to clarify the basic capabilities and limitations of concurrent and distributed computing systems. Research results include new algorithms and their analysis, impossibility results, formal concurrent-systems models, and models and techniques for proving the correctness of concurrent algorithms. Typical problems include getting the non-failing processors to agree, synchronizing non-failing processors, fault-tolerant compiling and routing, resource allocation, sharing access to data, and graph-theoretic problems such as performing a breadth-first search or finding minimum-cost spanning trees.

A basic problem in fault-tolerant computing is to cause processors to agree among themselves (about the value of a data item, say, or about a common course of action). While this is a simple exercise in the absence of faults, it may be impossible when faults are present: Individual processors do not have reliable knowledge about the states of other processors. Our work has led to many interesting algorithms and impossibility results that demonstrate conditions under which consensus can or cannot be achieved.
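The flavor of such algorithms can be seen in a minimal sketch (an illustration under simplified assumptions, not an LCS algorithm): in a synchronous system with at most f crash faults, f + 1 rounds of "flooding" every value a processor has seen suffice for all surviving processors to decide consistently, since at least one round must be free of crashes. The data structures and parameter names below are hypothetical.

```python
# Toy simulation of flooding consensus with crash faults in a synchronous system.

def flooding_consensus(inputs, f, crashes=None):
    """inputs: {pid: value}; f: max crashes tolerated;
    crashes: {pid: (crash_round, recipients_reached_in_that_round)} -- a crashing
    processor may reach only some of the others in the round it fails."""
    crashes = crashes or {}
    known = {p: {v} for p, v in inputs.items()}      # values each processor has seen

    for rnd in range(1, f + 2):                      # f + 1 rounds of flooding
        inbox = {p: set() for p in known}
        for sender, vals in known.items():
            crash_rnd, reached = crashes.get(sender, (f + 2, set()))
            if rnd < crash_rnd:                      # still alive: reaches everyone
                for p in inbox:
                    inbox[p] |= vals
            elif rnd == crash_rnd:                   # crashes mid-round: reaches a subset
                for p in reached:
                    inbox[p] |= vals
        for p in known:
            known[p] |= inbox[p]

    # After f + 1 rounds all processors that never crashed hold the same set of
    # values, so deciding on (say) the minimum yields agreement.
    return {p: min(vals) for p, vals in known.items() if p not in crashes}

if __name__ == "__main__":
    # Processor 2 crashes in round 1 after reaching only processor 0.
    print(flooding_consensus({0: 5, 1: 3, 2: 7}, f=1, crashes={2: (1, {0})}))
```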

An important LCS project related to network protocols has been the development of a series of efficient algorithmic transformers. The result of this project is the "compilation" of protocols that were designed for a relatively simple network model into protocols that run in a more complex, but more realistic, environment.

LCS has developed an important formalism -- the Input/Output Automaton Model, a basic mathematical model for concurrent and distributed systems and their components. This simple state-machine model describes the interactions between a concurrent system and its environment. The model is used not only to verify the correctness of algorithms but also to find and fix serious gaps in some basic existing algorithms -- the construction of multi-writer atomic registers, for example.
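A toy sketch conveys the flavor of the model (the class and method names below are hypothetical, and the real formalism also covers composition, fairness, and simulation-based correctness proofs): an automaton has state, input actions that are always enabled and controlled by the environment, and output actions guarded by preconditions and controlled by the automaton itself.

```python
# Toy "I/O automaton"-style description of a reliable FIFO channel.

from collections import deque

class FIFOChannelAutomaton:
    def __init__(self):
        self.queue = deque()                 # the automaton's state

    # Input action: always enabled; invoked by the environment.
    def send(self, m):
        self.queue.append(m)

    # Output action: enabled only when its precondition holds.
    def receive_enabled(self):
        return len(self.queue) > 0

    def receive(self):
        assert self.receive_enabled()        # precondition
        return self.queue.popleft()          # effect on the state

if __name__ == "__main__":
    chan = FIFOChannelAutomaton()
    chan.send("a"); chan.send("b")
    while chan.receive_enabled():
        print(chan.receive())                # prints "a" then "b"
```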