12:00, Wed 18 Sep 1996, WeH 7220

Learning to Emulate an Unknown Probability Distribution
Leemon Baird

If a learning system is repeatedly shown random vectors generated independently according to some unknown distribution, it should be able to learn that distribution. This is simply modeling a stochastic system. After learning, it should be able to report the probability pdf(x) or cdf(x) for any vector x. Even more importantly, it should be able to emulate that distribution by generating random vectors of its own according to the same distribution. Ideally, this could be done using any arbitrary function approximator, not just special ones (such as a sum of Gaussians). This talk will derive a very simple algorithm that does just that.

The algorithm can also learn conditional probabilities of the form "given that I'm in this state and take this action, what state might I end up in on the next time step?" That capability is needed to do reinforcement learning with arbitrary neural networks if you want guaranteed convergence. The algorithm can also be used for inverse control of noninvertible systems, where you say "given where I am and where I want to go, give me one of the multiple actions that will get me there." Finally, it can be used for unsupervised learning, where it learns a function f such that f(x) is uniformly distributed even when x is not.

The talk will end with a discussion to see if anyone knows of other areas where this algorithm might be useful.
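The announcement does not describe the algorithm itself. As a point of reference, though, the three ideas it names — learning cdf(x) from samples, emulating the distribution by generating new samples, and mapping x to a uniformly distributed value — can all be illustrated in one dimension with the classical empirical-cdf / inverse-transform approach. This is a standard textbook technique, not necessarily the method the talk derives; the exponential source distribution below is an arbitrary stand-in for the "unknown" one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from an "unknown" distribution (here, secretly exponential).
data = rng.exponential(scale=2.0, size=10_000)

# "Learn" cdf(x) as the empirical cdf over the sorted samples.
xs = np.sort(data)
cdf = np.arange(1, xs.size + 1) / xs.size

# Emulate the distribution: draw u ~ Uniform(0, 1), invert the learned cdf.
u = rng.uniform(size=10_000)
emulated = np.interp(u, cdf, xs)

# Unsupervised-learning view: the learned cdf is itself a function f with
# f(x) uniformly distributed even though x is not (probability integral
# transform), so its mean should be close to 0.5.
flattened = np.interp(data, xs, cdf)

print(data.mean(), emulated.mean(), flattened.mean())
```

A function approximator trained to represent cdf(x) would play the role of the sorted-sample lookup table here; the interesting part of the talk is presumably how to do this for arbitrary approximators and for vector-valued x, where simple inversion is no longer available.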