SPEAKER ADAPTATION IN CONTINUOUS SPEECH RECOGNITION VIA ESTIMATION OF CORRELATED MEAN VECTORS
The present study addressed the problem of speaker adaptation in both feature-based and stochastic model-based continuous speech recognition systems. Effective speaker adaptation procedures must be able to adapt to the characteristics of a new speaker given speaker-specific training data in quantities well below those required for training speaker-dependent systems. The adaptation algorithm must be computationally efficient to allow for a short enrollment process. Since the basic recognition unit in continuous speech recognition systems is at the sub-word level, user feedback of unit labels is impractical. The adaptation algorithm should therefore operate in an unsupervised mode. The approach taken in this thesis was to use multivariate parameter estimation procedures to update the mean values of the component densities which comprise a feature-based system's classifiers, or a stochastic model-based system's codebook. Emphasis was placed on obtaining low initial estimation error with a computationally efficient algorithm. Adaptive filtering techniques were exploited to derive an estimator which met these conditions. The Bayesian optimal (EMAP) estimator was first shown to be equivalent to a minimum mean-square error (MMSE) adaptive filter with time-varying data statistics. A stochastic gradient approximation of the MMSE formulation resulted in a least mean-square estimator, called LMS-C, which with proper initialization produced a faster rate of convergence than the Bayesian estimator. Computational requirements of the LMS-C estimate are approximately one-third of those of the EMAP estimate. Unlike the EMAP estimate, however, the LMS-C estimate is asymptotically biased. This misadjustment is negligible in the context of the speaker adaptation problem.
Expressions which define the LMS-C algorithm and its mean-square estimation error were derived and analyzed assuming correlated, jointly Gaussian data distributions. Compared with maximum likelihood (ML) estimation, the additional expense required for LMS-C (or EMAP) estimation was shown to be justified when the dogmatism of the data is neither very large nor very small, and training data is limited. Relative gains of LMS-C and EMAP estimates over ML estimates were shown to increase with increasing correlation between the data means and with increasing skew in the class prior probabilities.
The general limitations of LMS-C, EMAP, and ML adaptation procedures were assessed in the context of unsupervised speaker adaptation in the Carnegie Mellon ANGEL system, a novel feature-based system called PROPHET, and a semi-continuous version of the CMU SPHINX system. Comparisons between the ANGEL and PROPHET systems indicated the necessity for the adaptation data to obey the Gaussian assumptions made in the derivation of the estimation algorithms. When these assumptions were met (using computer-generated data), adaptation using the LMS-C or EMAP algorithms reduced front vowel classification error rates by 28% after the presentation of 10 unlabeled training samples. Five iterations through the training data were shown to reduce the error rate by an additional 10% over the one-iteration rate.
Unsupervised adaptation experiments with a synthetic HMM indicated that the EMAP and LMS-C estimates were able to produce an estimation error lower than the ML estimate only when the dogmatism of the data was low. It was shown that the unsupervised ML estimate, as specified by the HMM reestimation procedure, produced an estimation error which was initially much larger than that of the supervised form of this estimate. Because the EMAP and LMS-C estimates depend on the ML estimate, the performance of these two algorithms was also reduced. Repeated iteration of the forward-backward algorithm eventually reduced the unsupervised level of error to that of the supervised estimate. It was also shown that the unsupervised form of the ML estimate implicitly models the correlation of the data means, which serves to reduce estimation error as the data means become more correlated.
Mean vector adaptation in SPHINX was less successful than with the feature-based systems because the dogmatism of the data in SPHINX was more than twice that of the feature-based systems. The SPHINX system's performance using LMS-C, EMAP, and ML codebook mean vector adaptation methods was compared with that of the system using no adaptation. Results showed an overall reduction of 2.0 to 3.4% in word error rate due to adaptation for a set of 11 speakers from the DARPA Resource Management task. Using a distance metric applied to the adapted codebooks, word error rates were reduced on average by 15% for those speakers automatically identified as good candidates for adaptation.
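To illustrate why modeling correlation between class means can help when training data is limited, the following is a minimal sketch of a textbook Bayesian (MAP) estimate of jointly Gaussian class means. It is not the thesis's EMAP or LMS-C implementation. The function name `map_correlated_means`, the scalar-feature setup, the shared noise variance, and all numeric values are assumptions chosen for illustration only.

```python
import numpy as np

def map_correlated_means(m0, S0, sums, counts, noise_var):
    """Posterior mean of stacked class means under a joint Gaussian prior.

    Hypothetical illustration: one scalar feature per class, prior
    N(m0, S0) over the vector of class means, and i.i.d. Gaussian
    observation noise with variance noise_var in every class.
    """
    m0 = np.asarray(m0, dtype=float)
    S0 = np.asarray(S0, dtype=float)
    counts = np.asarray(counts)
    seen = counts > 0                      # classes with at least one sample
    if not seen.any():
        return m0.copy()                   # no data: fall back to the prior
    xbar = np.asarray(sums, dtype=float)[seen] / counts[seen]  # ML means
    H = np.eye(len(m0))[seen]              # selects the observed classes
    R = np.diag(noise_var / counts[seen])  # variance of each sample mean
    innovation = xbar - H @ m0
    return m0 + S0 @ H.T @ np.linalg.solve(H @ S0 @ H.T + R, innovation)

# Two classes; four samples summing to 4.0 seen for class 0, none for class 1.
prior_mean = [0.0, 0.0]
correlated = [[1.0, 0.8], [0.8, 1.0]]
uncorrelated = [[1.0, 0.0], [0.0, 1.0]]

est = map_correlated_means(prior_mean, correlated, [4.0, 0.0], [4, 0], 1.0)
est_uncorrelated = map_correlated_means(prior_mean, uncorrelated,
                                        [4.0, 0.0], [4, 0], 1.0)
print(est)               # class 1 moves although it has no samples
print(est_uncorrelated)  # class 1 stays at its prior mean
```

In this toy setting the ML estimate can only update the class that was actually observed, whereas the correlated prior also shifts the mean of the unobserved class. This is consistent with the abstract's statement that gains over ML grow with increasing correlation between the data means.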