Homepage

In Spring 2001 I did my Honors project in the area of Reinforcement Learning. More specifically, I developed an algorithm that finds desirable goal states for Options in discrete Reinforcement Learning problems, as well as appropriate initiation sets for these goal states. In summary, my algorithm autonomously develops options that speed up the learning of new problems in the same environment. The source code and write-up are available here.
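For readers unfamiliar with the options framework of Sutton, Precup and Singh: an option consists of an initiation set, an internal policy, and a termination condition. The following minimal Java sketch shows what a discovered option with a single goal state might look like; the class and field names are purely illustrative and not taken from the project's actual code.

```java
import java.util.Map;
import java.util.Set;

// Minimal sketch of the options framework: an option is a triple
// <I, pi, beta>. All names here are illustrative, not from my source code.
public class Option {
    // I: the set of discrete states in which the option may be invoked
    public final Set<Integer> initiationSet;

    // pi: the option's internal policy, mapping states to primitive actions
    public final Map<Integer, Integer> policy;

    // The goal state discovered by the algorithm; the option terminates there.
    public final int goalState;

    public Option(Set<Integer> initiationSet, Map<Integer, Integer> policy, int goalState) {
        this.initiationSet = initiationSet;
        this.policy = policy;
        this.goalState = goalState;
    }

    // beta: termination condition -- here, terminate at the goal state
    // or whenever the agent leaves the initiation set.
    public boolean terminates(int state) {
        return state == goalState || !initiationSet.contains(state);
    }
}
```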

A more advanced version of that algorithm was presented at SARA 2002. You can find the paper here and the presentation slides here.

I am now working on an alternative algorithm that makes weaker assumptions about the environment. If you read the above paper, you will notice that obtaining the initiation set requires an interpolation between "start states". This presupposes that we can interpolate between arbitrary states in the environment, which is quite a strong assumption. The new algorithm alleviates this by simply taking the states visited along the sampled trajectories as the initiation set. Early experimental results are quite promising, and the algorithm even seems to become more robust as the number of random trajectories is reduced.
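As a minimal sketch of that construction (restricting to trajectories that actually reach the goal is an assumption of this illustration, not a claim about the final algorithm):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the weaker-assumption construction: instead of interpolating
// between start states, the initiation set is the union of all states
// visited on sampled trajectories that reached the goal. Names are
// illustrative, not from the actual code.
public class InitiationSetBuilder {
    public static Set<Integer> fromTrajectories(List<List<Integer>> trajectories, int goalState) {
        Set<Integer> initiationSet = new HashSet<>();
        for (List<Integer> trajectory : trajectories) {
            // Assumed here: only trajectories that reach the goal contribute.
            if (trajectory.contains(goalState)) {
                initiationSet.addAll(trajectory);
            }
        }
        // The option should terminate at the goal, not be initiated there.
        initiationSet.remove(goalState);
        return initiationSet;
    }
}
```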

Besides my work in the Reasoning and Learning Lab (formerly known as the Machine Learning Lab), I have also worked on several other projects. Last semester, I worked on a Neural Network that was supposed to recover 3D features from stereo images. It was based on a really simple network structure that struck me as very promising, and I wondered why no one had tried it before. When I ran the experiments to find out whether it worked, I discovered why: the memory requirements were simply too massive (an early version required over 1 terabyte of RAM; I later reduced this considerably, but it was still too much). The write-up can be found here, and the source code here.
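To give a feel for how quickly such requirements explode, here is a purely illustrative back-of-the-envelope calculation, not the actual network from the project: a layer that densely connects every pixel of one image to every pixel of the other needs a weight matrix quadratic in the number of pixels.

```java
// Illustrative calculation only; the project's actual network was different.
// A dense pixel-to-pixel connection between two images needs one weight per
// pixel pair, so the weight count is quadratic in the pixels per image.
public class MemoryEstimate {
    public static void main(String[] args) {
        long width = 512, height = 512;   // assumed image resolution
        long pixels = width * height;     // 262,144 pixels per image
        long weights = pixels * pixels;   // one weight per pixel pair
        long bytes = weights * 4;         // 4-byte floats
        // ~6.9e10 weights, 256 GiB for a single dense layer
        System.out.printf("weights: %d, memory: %.1f GiB%n",
                weights, bytes / (1024.0 * 1024.0 * 1024.0));
    }
}
```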

Furthermore, I have also worked on applying Reinforcement Learning to a mobile robotics application. The goal of the project was to find out whether it is feasible for multiple robots to learn policies for exploring the environment without too much overlap in their exploration. The robots were not allowed to communicate during exploration. On the simple environments on which I tried the algorithm, it worked quite well. The results can be read here, and the source can be found here. There you can also find my implementation of a Java interface to McGill's RoboDaemon, as well as an implementation of the Artificial Potential Fields method for goal finding that uses the simulator.
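For reference, here is a compact sketch of the standard Artificial Potential Fields method, in the spirit of Khatib's formulation: an attractive potential pulls the robot toward the goal, and a repulsive potential pushes it away from nearby obstacles. The gains and influence radius are assumed values; this is not the project's actual implementation.

```java
// Sketch of the standard Artificial Potential Fields method. Gains and the
// influence radius are illustrative; not the project's actual code.
public class PotentialField {
    static final double K_ATT = 1.0;   // attractive gain (assumed)
    static final double ETA = 100.0;   // repulsive gain (assumed)
    static final double RHO_0 = 2.0;   // obstacle influence radius (assumed)

    // Returns the force vector acting on the robot at (x, y).
    public static double[] force(double x, double y,
                                 double goalX, double goalY,
                                 double[][] obstacles) {
        // Attractive component: pulls linearly toward the goal.
        double fx = K_ATT * (goalX - x);
        double fy = K_ATT * (goalY - y);

        // Repulsive component: pushes away from each obstacle within RHO_0.
        for (double[] obs : obstacles) {
            double dx = x - obs[0], dy = y - obs[1];
            double rho = Math.hypot(dx, dy);
            if (rho > 0 && rho < RHO_0) {
                double mag = ETA * (1.0 / rho - 1.0 / RHO_0) / (rho * rho);
                fx += mag * dx / rho;
                fy += mag * dy / rho;
            }
        }
        return new double[] {fx, fy};
    }
}
```

The robot simply follows the resulting force vector toward the goal, with the well-known caveat that it can get stuck in local minima of the potential.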

Starting with the new millennium, in January 2001, I became a research assistant in the Laboratory for Natural and Simulated Cognition of Prof. Shultz at McGill. I worked under the supervision of François Rivest, who developed a great and efficient way of integrating previously learned knowledge into Cascade-Correlation Neural Networks. I learned a lot about Neural Networks (and Java) while programming the Java simulator for Knowledge-Based Cascade-Correlation networks.
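For context, here is a rough sketch of the recruitment step at the heart of cascade-correlation (Fahlman and Lebiere's correlation criterion): when training stalls, candidates are scored by how well their output tracks the network's residual error, and the best one is recruited. In the knowledge-based variant, the candidate pool may contain entire previously trained networks rather than single units. All names here are illustrative, not from the simulator.

```java
import java.util.List;

// Sketch of candidate recruitment in (Knowledge-Based) Cascade-Correlation.
// Candidates are scored by the covariance between their output and the
// network's residual error; the best-scoring candidate is recruited.
public class CandidateRecruitment {
    interface Candidate {
        double output(double[] input);  // single-output candidate for brevity
    }

    // Unnormalized covariance between candidate output and residual error,
    // taken in absolute value as in the cascade-correlation criterion.
    static double score(Candidate c, double[][] inputs, double[] residualErrors) {
        int n = inputs.length;
        double meanOut = 0, meanErr = 0;
        double[] outs = new double[n];
        for (int p = 0; p < n; p++) {
            outs[p] = c.output(inputs[p]);
            meanOut += outs[p] / n;
            meanErr += residualErrors[p] / n;
        }
        double cov = 0;
        for (int p = 0; p < n; p++) {
            cov += (outs[p] - meanOut) * (residualErrors[p] - meanErr);
        }
        return Math.abs(cov);
    }

    // Recruit the candidate whose output best tracks the residual error.
    // In KBCC, a candidate may be a whole previously trained network.
    static Candidate recruit(List<Candidate> pool, double[][] inputs, double[] errs) {
        Candidate best = null;
        double bestScore = -1;
        for (Candidate c : pool) {
            double s = score(c, inputs, errs);
            if (s > bestScore) { bestScore = s; best = c; }
        }
        return best;
    }
}
```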

