Fu-Hua Liu, "Environmental Adaptation for Robust Speech Recognition", Ph.D. Thesis, CMU, June 1994.

Abstract

Lack of robustness with respect to environmental variability is a continuing problem for speech recognition. Many studies have shown that automatic speech recognition systems perform poorly when the acoustics of the training and testing environments differ. Several approaches have previously been considered to compensate for environmental variability, including techniques based on autoregressive analysis, auditory models, and array processing.

This dissertation describes a number of new algorithms that improve the ability of speech recognition systems to adapt to new acoustical environments. These testing environments are assumed to differ from the training environment because of the presence of both unknown additive noise and distortion from unknown linear filtering. The algorithms are based on previous research in which significant environmental robustness was achieved by modifying the cepstral coefficients that are input as features to speech recognition systems. The present work extends those results along the dimensions of improved recognition accuracy, reduced dependence on specialized training data, reduced computational cost, and greater integration of environmental compensation into the matching algorithm of the speech decoder.

Environmental compensation is generally accomplished by applying one of an ensemble of additive corrections either to the features that are input to the recognition system, or to the internal representation of speech inside the recognition system itself. The exact compensation is time-varying.
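The core idea of an SNR-dependent additive correction can be sketched as follows. This is a minimal illustration, not the thesis's actual algorithm: the SNR bin boundaries, the correction table, and the function name are invented for the example, and the cepstral dimension is reduced to three.

```python
import numpy as np

# Hypothetical SNR bin boundaries (dB); real systems learn these from data.
SNR_EDGES = np.array([0.0, 10.0, 20.0])

# One illustrative correction vector per SNR bin (cepstral dimension 3).
# Low-SNR frames receive larger corrections; clean frames need little.
CORRECTIONS = np.array([
    [0.8, -0.20, 0.10],   # SNR < 0 dB
    [0.4, -0.10, 0.05],   # 0-10 dB
    [0.1, -0.02, 0.01],   # 10-20 dB
    [0.0,  0.00, 0.00],   # > 20 dB: essentially no compensation
])

def compensate(frames, snr_db):
    """Add an SNR-dependent correction vector to each cepstral frame.

    frames: (T, D) array of cepstral vectors, one per 20-ms segment.
    snr_db: (T,) array of instantaneous SNR estimates in dB.
    """
    bins = np.searchsorted(SNR_EDGES, snr_db)  # SNR bin index per frame
    return frames + CORRECTIONS[bins]

# Two zero frames: one noisy (5 dB), one clean (25 dB).
frames = np.zeros((2, 3))
out = compensate(frames, np.array([5.0, 25.0]))
```

Because the corrections are simply added in the cepstral domain, selecting the vector per frame makes the overall compensation time-varying, as the abstract describes.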
For each 20-ms speech segment, the choice of compensation vector depends either on physical attributes such as the instantaneous signal-to-noise ratio, or on the putative identity of the phoneme during that segment as hypothesized by the speech decoder. The actual values of the compensation vectors are determined by frame-by-frame comparisons of large numbers of cepstral vectors of speech recorded simultaneously in the training environment and in one of a number of prototype secondary environments. Compensation is performed by first estimating which of the prototype environments most closely resembles the testing environment, and then applying the compensation vectors appropriate for that environment.

The new algorithms are evaluated in terms of their effectiveness in improving environmental robustness and their computational complexity, among other attributes. It is found that further increases in robustness can be obtained by combining algorithms that process the features input to the system with algorithms that modify the system's internal representation of speech. Linear interpolation of compensation vectors from different environments is generally helpful when the system is tested in an environment that was not one of the prototypes used to develop compensation vectors. In a standard ARPA evaluation of a 5000-word system recognizing sentences recorded in unknown environments, the combination of these techniques typically decreases the word error rate by 66% compared to no environmental processing at all, and by 40% when the standard technique of cepstral mean normalization is included in the baseline system.
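The baseline technique mentioned above, cepstral mean normalization (CMN), is standard enough to sketch directly: a fixed linear filter shows up as an additive constant in the cepstral domain, so subtracting the per-utterance mean removes it. The function name and toy data below are illustrative only.

```python
import numpy as np

def cmn(frames):
    """Cepstral mean normalization: subtract the utterance-level mean
    from every cepstral frame. Since stationary linear filtering adds a
    constant offset in the cepstral domain, this cancels that offset.

    frames: (T, D) array of cepstral vectors for one utterance.
    """
    return frames - frames.mean(axis=0)

# Toy utterance: two frames, two cepstral coefficients.
utterance = np.array([[1.0, 2.0],
                      [3.0, 4.0]])
normalized = cmn(utterance)  # each coefficient now has zero mean
```

CMN compensates only for the linear-filtering component of the mismatch; the abstract's 40% relative improvement over a CMN baseline reflects what the additive-noise-aware compensation vectors contribute beyond this simple normalization.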