The long-term goal of the ONSIS project is to map the complete interconnect structure of the nervous systems of small biological organisms, up to the size of a laboratory mouse. "Reverse engineering" the circuit diagram of small insect and mammalian brains, including their connections to eyes, sensory cells, and muscles, is a challenging, multidisciplinary research project just far enough beyond currently available technology to drive research in large-scale signal processing, image processing algorithms, optical sensor technology, system integration, MEMS, fiber optics, and several related areas. The research results of ONSIS are not necessarily limited to contributions to neuroscience: a complete map of all cells in an organism, which is impractical to build by conventional means, also furthers research in embryogenesis and the understanding of how genetic information controls growth.

The ONSIS project has two principal components: the data processing system and the optical imaging system that produces the raw data. A laboratory mouse occupies a volume of approximately 2 × 2 × 4 cm, which needs to be scanned at a resolution of better than 0.2 micrometers in each linear dimension. Furthermore, it is necessary to measure the optical properties of each voxel at several wavelengths and to detect fluorescent dye markers in order to identify neuronal structures via selective labeling (instead of expert analysis of images). This results in more than 18 petabytes of raw data, which exceeds the capacity of practical and affordable storage devices. Therefore, ONSIS requires a large amount of real-time signal processing to fuse the data from multiple channels into a single 3D data set and then compress this data set for archival storage and subsequent off-line analysis.
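The raw-data estimate can be checked with a quick calculation. The specimen size and voxel pitch come from the numbers above; the byte count per voxel is my assumption (several wavelengths, a byte or two each):

```python
# Rough sanity check of the 18-petabyte raw-data estimate. The bytes-per-voxel
# figure is assumed; specimen size and resolution are from the text.
voxel = 0.2e-6                  # m, linear voxel pitch
dims = (0.02, 0.02, 0.04)       # m, mouse bounding box (2 x 2 x 4 cm)
voxels = 1
for d in dims:
    voxels *= round(d / voxel)  # 1e5 * 1e5 * 2e5 = 2e15 voxels
bytes_per_voxel = 9             # assumed: spectral + fluorescence channels
raw_bytes = voxels * bytes_per_voxel
print(raw_bytes / 1e15)         # 18.0 petabytes
```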
Because of the time-critical processes for selective tissue staining, the scanning process must proceed uninterrupted once started, which means that in the full system configuration the signal processing system must handle more than 18 GB/s of sensor data continuously for several months. This requirement is expected to drive the development of a cost-effective, scalable processing system based on state-of-the-art DSP chips and an FPGA-based interconnect system. The optical imaging system for ONSIS consists of four components that are integrated into one fully automated unit:
- The back-scatter imager, a confocal microscope that images into the surface of a fixed biological specimen.
- The microtome, which removes thin slices from the specimen and transfers them onto an optically clear carrier film.
- The staining system, which uses selective dyes to attach fluorescent labels to the sliced specimen, in a manner resembling the development of photographic film.
- The fluorescent imager, which locates the neuronal structures of interest.
Fiducial marks extracted from the first step guide the re-assembly of the data from step 4, which necessitates a large amount of transient storage to bridge the processing time of step 3.

The starting point for ONSIS was confocal microscopy, which was originally proposed by Prof. Minsky at MIT and which has evolved over the last 25 years into highly sophisticated commercial instruments that have both sufficient resolution and selectivity to trace neuronal structures. However, these microscopes lack the throughput to achieve the goals of ONSIS within practical scan times. Furthermore, these instruments are not integrated and require numerous manual processing steps. They also lack the mechanical precision to assemble images into a large, complete 3D stack. Positioning systems based on air bearings and laser interferometry, which are commonly used in silicon wafer processing, provide the required precision and will be used in ONSIS.

While investigating ways to increase the scan speed of confocal microscopes through the use of multiple concurrent channels, interferometric means were explored to minimize the cross-talk between neighboring channels. This led to the proposed system of heterodyne detection, where laser light is split into two beams with an acousto-optic modulator (AOM) so that the frequencies of the beams differ by a few tens of MHz. One beam is used to illuminate the specimen while the other beam serves as a reference in an interferometer. The detector, receiving light from both beams, produces a beat signal that is within the range of conventional RF processing circuitry. Compared to ordinary confocal microscopy, this system has several advantages in addition to channel isolation and higher throughput:
- A solid-state detector can be used instead of the customary photomultiplier tubes (PMTs). Solid-state detectors have much higher quantum efficiency than PMTs (80-90% vs. 20-30%). PMTs are ordinarily more sensitive because of their high internal gain, which cannot be matched with solid-state photodiodes and base-band amplifiers. However, in heterodyne detection only the RF component is of interest, which can be amplified with comparably low noise and much better linearity.
- A solid-state detector has a much larger dynamic range (> 120 dB) than a PMT, which is especially important for backscatter imaging.
- Unlike ordinary intensity detectors, the RF signal from the heterodyne detector preserves the optical phase. This means that, in effect, a microscopic hologram is recorded. This in turn allows more advanced imaging modes, for example index-of-refraction contrast. Holographic wavefront reconstruction should also improve the resolution, especially along the Z-axis, which tends to be lower in confocal microscopy.
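The heterodyne scheme can be sketched in a few lines. All parameters below are illustrative round numbers, not actual instrument values; the point is that IQ demodulation of the beat signal recovers both the amplitude and the optical phase of the scattered light:

```python
import numpy as np

# Minimal sketch of heterodyne detection (hypothetical parameters). The
# detector sees the sum of reference and signal beams whose optical
# frequencies differ by delta_f (set by the AOM). Only the cross term of
# |E_ref + E_sig|^2 matters; it beats at delta_f and carries the amplitude
# and optical phase of the scattered light.
fs = 1e9                     # detector sample rate, 1 GS/s (assumed)
delta_f = 40e6               # AOM offset, a few tens of MHz as in the text
t = np.arange(2000) / fs     # 2 microseconds of signal

a_sig, phi_sig = 0.3, 1.2    # amplitude / phase of the scattered light
beat = 2 * a_sig * np.cos(2 * np.pi * delta_f * t + phi_sig)

# IQ demodulation at delta_f recovers both quantities:
i = 2 * np.mean(beat * np.cos(2 * np.pi * delta_f * t))
q = -2 * np.mean(beat * np.sin(2 * np.pi * delta_f * t))
amp = np.hypot(i, q) / 2     # recovers a_sig
phase = np.arctan2(q, i)     # recovers phi_sig
```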
In addition, a rapidly tunable dye laser is being developed that can generate optical FM chirps. By combining optical chirping with heterodyne detection, it is possible to disambiguate the interference signal so that the length of the scattered light path can be determined exactly; this is analogous to CW/FM radar processing. In fact, because of the heterodyne detection system, many advanced RF signal processing techniques become applicable. By the same token, light scattered from parts of the optical system (say, internal lens surfaces) can be rejected, which should increase image contrast. A number of related methods were developed for the fluorescent imager that can also reconstruct the optical phase, even across the fluorescence process, which normally destroys the phase relation between the excitation and emission light.
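The CW/FM analogy reduces to a one-line relation. The chirp rate and path length below are made-up round numbers:

```python
# CW/FM-style ranging sketch (hypothetical numbers): under a linear optical
# frequency chirp of rate k, light that travels an extra path L arrives
# delayed by tau = L / c and therefore beats at a constant frequency
# f_beat = k * tau. Scatter from fixed surfaces inside the optics sits at
# its own fixed beat frequency and can be filtered out.
c = 3.0e8     # m/s, speed of light
k = 1.0e15    # Hz/s chirp rate (assumed: ~1 GHz sweep per microsecond)
L = 0.03      # m, extra optical path (assumed)

tau = L / c           # 1e-10 s
f_beat = k * tau      # 100 kHz, easily resolved with RF circuitry
```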
The Piranha single-chip multiprocessor (CMP) project is not happening at CMU; rather, it is being developed at the Compaq Western Research Laboratory in Palo Alto. However, it is near and dear to me because I worked on it until I moved to CMU. Basically, Piranha recognizes the fact that ever more complex microprocessors featuring out-of-order execution, multiple functional units (superscalar), deep pipelines, complex branch prediction, speculation, simultaneous multithreading (SMT), and so on lead to very complex designs that are difficult to verify, costly to implement, and do not perform well on many commercial applications. At the same time, these commercial application classes (databases and web servers) have a large amount of inherent parallelism that can be exploited with more processors. Piranha integrates 8 simple Alpha cores onto one chip, along with a shared cache and memory subsystem, I/O, and a scalable interconnect structure.
Thanks to steady technology advances, the size of the components, gates, and flip-flops that make up current computers has been shrinking dramatically. This led to Moore's law, tightly integrated microprocessors, and a dramatic, continuous increase in performance. This trend will certainly continue for some time to come; eventually, however, the current class of computer architectures will experience problems stemming from the fact that the underlying abstractions do not match the physical reality of the implementation technology. For example, today's computers conceptually assume random access memory, where the cost of accessing memory does not depend on the address. In the physical world, storing data requires some structure of finite size, so the volume required for memory grows linearly with memory size, which gives rise to an O(n^(1/3)) cost of accessing memory due to the finite signal propagation speed. There are many other reasons why classic computers are not necessarily the optimal architecture, not the least of which is the steady increase in system complexity. Here are some speculations on alternatives.
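The memory argument can be made concrete with a toy calculation; the storage density and signal speed below are round, assumed numbers:

```python
# Toy illustration of the O(n^(1/3)) argument: if n bits occupy a volume
# proportional to n, the farthest bit sits at a distance ~ n^(1/3), so
# signal propagation alone bounds the access time from below. Density and
# signal speed are made-up round numbers.
def min_access_time_ns(n_bits, bits_per_mm3=1e12, mm_per_ns=300.0):
    volume = n_bits / bits_per_mm3      # mm^3 occupied by the memory
    extent = volume ** (1.0 / 3.0)      # mm, linear dimension of that volume
    return 2 * extent / mm_per_ns       # ns, round trip at light speed

# Growing the memory 8x doubles the propagation-based latency bound:
t1 = min_access_time_ns(1e12)
t2 = min_access_time_ns(8e12)
```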
Numerous 3D graphical display devices have been built, and even more have been patented or proposed. The most common system in use is based on stereoscopic vision, where two images are displayed such that each eye sees only the corresponding image; for example, LC shutter glasses are used to display the left and right images sequentially on a screen. This is a far cry from having a 3D object floating in space, as shown in various sci-fi movies. True 3D displays by their very nature require vastly more bandwidth and compute power, because of the need to render the views of the object from all, or at least many, points of view. Here are some raw ideas on how one might go about doing this.
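A back-of-the-envelope calculation illustrates the bandwidth problem; every number here is hypothetical:

```python
# Hypothetical volumetric display: even a modest 1000^3-voxel volume
# refreshed at 30 Hz with 3 bytes of color per voxel needs ~90 GB/s of
# raw update bandwidth, before any rendering work is counted.
voxels = 1000 ** 3
refresh_hz = 30
bytes_per_voxel = 3
bandwidth = voxels * refresh_hz * bytes_per_voxel   # bytes per second
print(bandwidth / 1e9)    # 90.0 GB/s
```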