In this paper we present a framework for building probabilistic automata
parameterized by context-dependent probabilities. Gibbs distributions
are used to model state transitions and output generation, and parameter
estimation is carried out with an EM algorithm in which the M-step
uses a generalized iterative scaling procedure. We discuss relations to
certain classes of stochastic feedforward neural networks, a geometric
interpretation for parameter estimation, and a simple example of a
statistical language model constructed using this methodology.
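As a concrete illustration of the kind of M-step update referred to above, generalized iterative scaling (GIS) can be sketched for a Gibbs (log-linear) distribution over a finite state set. This is not the paper's implementation; the function name, the toy feature matrix, and the empirical distribution below are all invented for illustration, and a slack feature is appended so that every state's feature sum equals the GIS constant.

```python
import numpy as np

def gis(F, p_emp, iters=2000):
    """Fit a Gibbs distribution p(x) proportional to exp(sum_i lam_i * f_i(x))
    over a finite state set by generalized iterative scaling.

    F     : (n_states, n_features) nonnegative feature matrix.
    p_emp : empirical distribution over the states.
    Returns the fitted parameters and the fitted distribution."""
    F = np.asarray(F, dtype=float)
    C = F.sum(axis=1).max()                       # GIS constant
    # append a slack feature so every row sums exactly to C
    F = np.hstack([F, C - F.sum(axis=1, keepdims=True)])
    lam = np.zeros(F.shape[1])
    target = p_emp @ F                            # empirical feature expectations
    for _ in range(iters):
        logp = F @ lam
        logp -= logp.max()                        # stabilize the exponentials
        p = np.exp(logp)
        p /= p.sum()
        model = p @ F                             # model feature expectations
        lam += np.log(target / model) / C         # multiplicative GIS update
    return lam, p

# toy example (all values invented): 4 states, 2 binary features
F = np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])
p_emp = np.array([0.4, 0.3, 0.2, 0.1])
lam, p = gis(F, p_emp)
# at convergence, p matches the empirical feature expectations
```

Each GIS iteration raises the likelihood, and at the fixed point the model's expected feature values equal the empirical ones, which is the defining property of the maximum-likelihood Gibbs distribution.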