# Neural Information Processing Systems

## Tutorial Program: November 30, 1998

### Session I: 09:30--11:30

• Do We Understand the Neural Code? -- William Bialek, NEC Research Institute
• Independent Component Analysis and Blind Separation of Signals -- Jean-Francois Cardoso, Centre National de la Recherche Scientifique and Ecole Nationale Superieure des Telecommunications, Paris

### Session II: 13:00--15:00

• Neocortical Synapses -- Henry Markram, Weizmann Institute
• Learning Theory and Generalization for Neural Networks and Other Supervised Learning Techniques -- Peter Bartlett, Australian National University

### Session III: 15:30--17:30

• Computational Vision: Principles of Perceptual Inference -- Daniel Kersten, University of Minnesota
• Exploratory Data Analysis and Data Visualization -- Joachim Buhmann, Universitat Bonn

Session I: 09:30--11:30

Do We Understand the Neural Code?
William Bialek, NEC Research Institute

Throughout the brain, signals are encoded in sequences of identical pulses, called action potentials or spikes. All of our perceptions and actions are built out of spikes, and our understanding of the neural code guides our thinking about the mechanisms of neural computation. Although spikes from single neurons have been recorded for seventy years, there is renewed excitement in the field.

Three developments are driving the resurgence of interest in such an old problem. First, new experimental methods allow the simultaneous recording of spike sequences from multiple neurons. Second, it is now feasible to study neural activity under conditions that approximate more closely the complex, natural environment in which the brain is "designed" (or, more precisely, selected) to function. Finally, many of the qualitative questions in the field have been given a precise, quantitative formulation using ideas from information theory. In the background of this discussion are ideas about the computational abilities of single neurons that have to "read" the coded signals sent by (many) other neurons.

I will claim that, at least in one case, we are close to true understanding: we can measure the information content of spike sequences, identify the elementary symbols in the code, characterize the features of the sensory world for which the symbols stand, put these symbols together to read the neural code, and test this understanding in a nearly natural stimulus context. Major parts of this program have been completed in many different systems. Along the way it has been discovered that the nervous system can adapt not just to constant signals (as in light and dark adaptation), but to more complex statistical structures in the distribution of inputs, as has long been conjectured. Finally, several once murky issues are now in focus, suggesting an agenda for the next few years.
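As a rough illustration of what "measuring the information content of spike sequences" can mean, the sketch below computes the plug-in entropy of fixed-length binary "words" cut from a binned spike train. The binning into 0/1 values, the function name `word_entropy`, and the toy data are assumptions made for illustration only; the sketch ignores the sampling-bias corrections a real analysis would need and is not presented as the method used in the tutorial.

```python
import math
from collections import Counter

def word_entropy(spikes, word_len):
    """Plug-in entropy (bits per word) of non-overlapping words of length word_len.

    `spikes` is a list of 0/1 values, one per time bin (an assumed encoding).
    """
    words = [
        tuple(spikes[i:i + word_len])
        for i in range(0, len(spikes) - word_len + 1, word_len)
    ]
    counts = Counter(words)
    n = len(words)
    # Shannon entropy of the empirical word distribution.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Toy spike trains (hypothetical data).
regular = [0, 1] * 8                  # every two-bin word is (0, 1)
varied = [0, 0, 0, 1, 1, 0, 1, 1]     # four different two-bin words, once each

h_regular = word_entropy(regular, 2)
h_varied = word_entropy(varied, 2)
```

A perfectly regular train carries no entropy per word, while a train that uses all four two-bin words equally often carries two bits per word; entropy bounds how much information the words could convey about a stimulus.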

Biography: William Bialek has been at the NEC Research Institute since 1990, shortly after it was founded. Previously a member of the physics and biology faculties at the University of California at Berkeley, he is currently lecturing in the physics department at Princeton University. He also serves as codirector of the summer course on computational neuroscience at the Marine Biological Laboratory in Woods Hole, and as a visiting faculty member in the Sloan Center for Theoretical Neurobiology at the University of California at San Francisco. Together with Rieke, Warland, and de Ruyter van Steveninck, he is coauthor of the recent book *Spikes: Exploring the Neural Code* (MIT Press, 1997).

Independent Component Analysis and Blind Separation of Signals
Jean-Francois Cardoso, Centre National de la Recherche Scientifique and Ecole Nationale Superieure des Telecommunications, Paris

Independent Component Analysis (ICA) and Blind Source Separation (BSS) are emerging techniques of signal processing and data analysis which are receiving increasing interest in the NN community.

The objective of BSS is to recover unobserved signals from several observed mixtures. A typical application is the "cocktail party problem": separate the voices of n speakers when received on a set of n microphones. The key feature of BSS algorithms is that the mixture coefficients are unknown and are discovered together with the source signals. Not relying on assumptions about the mixture makes BSS algorithms very versatile: they can be applied to extract source signals with very little prior knowledge. Successful application domains include digital communication with sensor arrays and biomedical signals recorded with multiple electrodes (electro-encephalography, electro-cardiography, electro-myography, ...).

The objective of ICA is to discover in a random vector a set of components which are "as independent as possible". This is complementary to principal component analysis (PCA) but, while PCA is based on correlations and energy, ICA is based on independence and information. ICA is also closely related to the problem of finding sparse representations and factorial codes.

The underlying models and the algorithms for BSS and ICA are in fact identical and the algorithms are fundamentally based on exploiting the assumption of statistical independence between sources/components. However, for BSS/ICA to make sense, it is necessary to express independence beyond decorrelation and this can be done only within non-Gaussian models. The ICA/BSS challenge is to exploit the non-Gaussianity of real world signals.
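The point that independence must go beyond decorrelation can be made concrete with a small two-source sketch. It whitens two observed mixtures (the PCA-style step), which only removes correlation, and then searches for the rotation that maximizes the summed squared excess kurtosis, one simple non-Gaussianity measure. The mixing coefficients, the uniform sources, the grid search over angles, and all function names are assumptions chosen for illustration; this is not one of the algorithms reviewed in the tutorial.

```python
import math
import random

def kurtosis(x):
    """Excess kurtosis of a sample: close to 0 for Gaussian data."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    return sum((v - mean) ** 4 for v in x) / (n * var ** 2) - 3.0

def whiten(x1, x2):
    """Center two signals, decorrelate them, and scale each to unit variance."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    c11 = sum(u * u for u in a) / n
    c22 = sum(v * v for v in b) / n
    c12 = sum(u * v for u, v in zip(a, b)) / n
    # Principal axes of the symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2 * c12, c11 - c22)
    c, s = math.cos(theta), math.sin(theta)
    var1 = c11 * c * c + 2 * c12 * c * s + c22 * s * s
    var2 = c11 * s * s - 2 * c12 * c * s + c22 * c * c
    z1 = [(c * u + s * v) / math.sqrt(var1) for u, v in zip(a, b)]
    z2 = [(-s * u + c * v) / math.sqrt(var2) for u, v in zip(a, b)]
    return z1, z2

def rotate(z1, z2, angle):
    """Rotate a pair of signals by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    y1 = [c * u + s * v for u, v in zip(z1, z2)]
    y2 = [-s * u + c * v for u, v in zip(z1, z2)]
    return y1, y2

def separate(x1, x2, steps=90):
    """Whiten, then pick the rotation maximizing the summed squared kurtosis."""
    z1, z2 = whiten(x1, x2)

    def contrast(angle):
        return sum(kurtosis(y) ** 2 for y in rotate(z1, z2, angle))

    # The contrast repeats every 90 degrees, so a quarter turn covers all cases.
    best = max((k * math.pi / (2 * steps) for k in range(steps)), key=contrast)
    return rotate(z1, z2, best)

def corr(a, b):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

# Two independent non-Gaussian (uniform) sources and a hypothetical 2x2 mixture.
rng = random.Random(0)
n = 2000
s1 = [rng.uniform(-1, 1) for _ in range(n)]
s2 = [rng.uniform(-1, 1) for _ in range(n)]
x1 = [a + 0.6 * b for a, b in zip(s1, s2)]
x2 = [0.4 * a + b for a, b in zip(s1, s2)]
y1, y2 = separate(x1, x2)
```

The recovered signals match the sources only up to order and sign, which is the usual indeterminacy of the model. With Gaussian sources the kurtosis contrast would be flat in the rotation angle and the search would have nothing to exploit.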

This tutorial provides a unifying view of the approaches developed in the last ten years. It will in particular review:

• information-theoretic foundations of ICA and their interpretation in the light of information geometry,
• comparison between ICA, PCA and projection-pursuit,
• the role of non-linear data transformations and the use of higher-order correlations,
• the existence of "smart gradient" algorithms which are both simple and efficient,
• the underlying statistical structures of ICA/BSS models,
• some successful applications,
• extensions and open problems.

Biography: Jean-Francois Cardoso is with the French CNRS (National Center for Scientific Research) and is based at the Signal and Image Processing department of Telecom Paris. He graduated in physics from Ecole Normale Superieure in 1984. He is a member of the IEEE Technical Committee on Statistical Signal and Array Processing. His research interests are in statistical signal processing and the connections to neural networks and information theory. For further information, see: http://www-sig.enst.fr/~cardoso/stuff.html.

Session II: 13:00--15:00

Neocortical Synapses
Henry Markram, Weizmann Institute

There is a growing variety of types of neocortical synapses, with specific rules emerging as to which types are used by specific neurons to communicate with other neurons. This enormous heterogeneity of synaptic types poses a serious challenge to any theory of brain function. In order to simulate and capture the complex computation performed by neocortical networks, it could therefore be essential to consider the rules which dictate synaptic transmission between specific classes of neurons. Distinct rules have emerged for synapses formed by the pyramidal neurons of the neocortex. The several thousand excitatory synapses that are formed from a single neuron exhibit great diversity in their absolute strengths, their probabilities of neurotransmitter release, and their rates of depression and facilitation. This diversity seems to be dictated by the types of neuron targeted (activity-independent), while the unique history of pre- and postsynaptic neurons seems to determine the precise values of these synaptic parameters (activity-dependent). On the other hand, synapses established by inhibitory interneurons follow an entirely different set of rules. Differential synaptic signaling could therefore have a profound influence on neural network dynamics, but what is this influence? In order to understand the impact of dynamic synaptic transmission and how neural networks may exploit dynamic transmission by using general organizing rules, it is essential for experimentalists and modelers to get to grips with how these little computational devices could operate and learn.

The first half of the tutorial will cover the simple biophysics of synapses: the machinery of synapses, how this machinery is thought to work, and where things may change. The second half of the tutorial will be more theoretical and will focus on the potential impact of changing specific synaptic parameters on neural network dynamics.

Biography: Henry Markram obtained his PhD at the Weizmann Institute of Science with Menahem Segal. He did a postdoc in Bert Sakmann's lab and started at the Weizmann in 1995, where he focuses on the anatomical, physiological and learning principles of connections between specific classes of neurons in the neocortex. For further information, see: http://www.weizmann.ac.il/brain/markram/markram.htm.

Learning Theory and Generalization for Neural Networks and Other Supervised Learning Techniques
Peter Bartlett, Australian National University

This tutorial will provide an introduction to the theory of the generalization performance of supervised learning techniques. It will explain several key models and describe the main results relating the generalization performance of a learning system to its complexity. The discussion will concentrate on pattern classification and real-valued prediction problems, using neural networks as examples, but these results are of considerable importance in understanding a much broader variety of phenomena in machine learning.

The latter part of the tutorial will concentrate on recent advances that exploit these results to provide new analyses of large margin classifiers. Many pattern classifiers, such as neural networks and support vector machines, and techniques for combining classifiers, such as boosting and bagging, predict class labels by thresholding real-valued functions, and tend to have a large margin between the predicted value and an incorrect prediction. This part of the tutorial will focus on large margin classifiers, presenting results on the generalization performance of these classifiers, and explaining why their size is not the most appropriate measure of their complexity.
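To make the notion of margin concrete, the following sketch computes the margin of each example for a classifier that thresholds a real-valued output at zero, and the fraction of examples whose margin falls below a chosen value. The labels in {-1, +1}, the function names, and the toy scores are assumptions for illustration and are not taken from the tutorial.

```python
def margins(scores, labels):
    """Margin y * f(x) of each example; positive means correctly classified."""
    return [y * f for f, y in zip(scores, labels)]

def margin_error(scores, labels, gamma):
    """Fraction of examples whose margin is below gamma."""
    m = margins(scores, labels)
    return sum(1 for v in m if v < gamma) / len(m)

# Hypothetical real-valued outputs f(x) and true labels in {-1, +1}.
scores = [2.1, -0.3, 0.8, -1.5, 0.1]
labels = [1, -1, 1, -1, -1]

misclassified = margin_error(scores, labels, 0.0)   # examples with negative margin
small_margin = margin_error(scores, labels, 0.5)    # also counts narrow wins
```

Raising `gamma` above zero counts examples that are classified correctly but only narrowly, which is the quantity margin-based analyses trade off against a complexity term.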

Biography: Peter Bartlett is a Fellow in the Research School of Information Sciences and Engineering at the Australian National University. His research has concentrated on the areas of computational learning theory and the theory of neural network learning. He was the program committee co-chair (with Yishay Mansour) of the 1998 Conference on Computational Learning Theory, and is co-author (with Martin Anthony) of the soon-to-be-published book, "A Theory of Learning in Artificial Neural Networks".

Session III: 15:30--17:30

Computational Vision: Principles of Perceptual Inference
Daniel Kersten, University of Minnesota

This tutorial will review recent progress in our understanding of human vision as statistical inference. It is now widely appreciated that the problem of visual perception is complex and formally hard. Theoretical work has highlighted a number of problems that constrain our understanding of the nature of the neural machinery underlying vision. One problem, pointed out as far back as Helmholtz, is that interpreting image data is underconstrained: there are multiple interpretations of the world consistent with the image data. A second problem is that for any given visual task (e.g. object recognition), there are image variations (e.g. illumination, clutter, noise) that confound the signal (e.g. object shape). A key to solving the problems of image ambiguity and variation is to understand how vision exploits the inherent statistical structure of natural images for the various tasks vision is used for. We will look at theory and experimental results drawn from several research groups. Topics will include: early visual coding as redundancy reduction; learning and using intermediate-level organizational processes (e.g. surface structure and Gestalt principles); and high-level visual functions (object recognition and localization) as Bayesian inference.

Exploratory Data Analysis and Data Visualization
Joachim Buhmann, Universitat Bonn

Exploratory data analysis addresses the problem of extracting structure from data by grouping and visualization. Different techniques, which can be considered unsupervised learning methods, are employed to search for clusters in data, to project high-dimensional data onto informative linear or nonlinear subspaces, or to embed relational or co-occurrence data in low-dimensional Euclidean spaces. This tutorial gives an overview of different data types and presents methods for data clustering and visualization. All methods are formulated as optimization problems. Algorithms to search for optimized solutions are derived from the maximum entropy principle. The robustness of this data analysis strategy can be understood through statistical learning theory. Experimental results are shown for image retrieval, image segmentation, image quantization and visualization of histogram data.
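One well-known way to derive a clustering algorithm from the maximum entropy principle is deterministic annealing. Here each point is assigned to cluster centers with Gibbs probabilities at a "temperature" that is lowered gradually. The sketch below applies this idea to one-dimensional data with a squared-error cost. The cost function, cooling schedule, parameter values, and function names are assumptions for illustration, not necessarily the algorithms presented in the tutorial.

```python
import math

def soft_assign(x, centers, temperature):
    """Maximum-entropy (Gibbs) assignment probabilities of x to each center."""
    costs = [(x - c) ** 2 for c in centers]
    lowest = min(costs)  # shift costs so the largest exponent is 0 (stability)
    weights = [math.exp(-(cost - lowest) / temperature) for cost in costs]
    total = sum(weights)
    return [w / total for w in weights]

def anneal(data, centers, t_start=10.0, t_end=0.01, cooling=0.8, sweeps=20):
    """Alternate soft assignments and center updates while lowering temperature."""
    centers = list(centers)
    t = t_start
    while t > t_end:
        for _ in range(sweeps):
            probs = [soft_assign(x, centers, t) for x in data]
            for k in range(len(centers)):
                weight = sum(p[k] for p in probs)
                if weight > 0:
                    # Each center moves to the probability-weighted mean.
                    centers[k] = sum(p[k] * x for p, x in zip(probs, data)) / weight
        t *= cooling
    return sorted(centers)

# Toy data with two groups; the slightly different starting centers break symmetry.
data = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
final_centers = anneal(data, [0.0, 0.1])
```

At high temperature the assignments are nearly uniform and the centers stay close to the overall mean; as the temperature drops, the assignments harden and the centers split toward the two groups.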

Biography: Joachim M. Buhmann received a Ph.D. degree in theoretical physics from the Technical University of Munich in 1988. He held postdoctoral positions at the University of Southern California and at the Lawrence Livermore National Laboratory. Currently, he works as an Associate Professor of computer science at the University of Bonn, Germany, where he heads the research group on Computer Vision and Pattern Recognition. His current research interests cover the theory of neural networks and their applications to image understanding and signal processing. Special research topics include data clustering and data visualization, active data selection, statistical learning theory, stochastic optimization techniques, video compression and sensor fusion for autonomous robots. For further information, see: http://www-dbv.informatik.uni-bonn.de

We have attempted to ensure that all information is correct, but we cannot guarantee it.