46838-s99 Machine Learning for Computational Finance

Assignment 5

Due Before Class Monday April 19th, 1999, IN HARD COPY

  1. Problem 9.3 from the textbook

  2. Problem 9.4 from the textbook

  3. The next problems involve applying genetic algorithms and hillclimbing to currency data. Five files are necessary, which you should download from here. Put these in a directory and start up matlab. First, load in the .dat files (type 'load yen_returns.dat' etc. at the matlab prompt). These files contain 4 variables:

    ga.m is the genetic algorithm program, and is run by the following command at the matlab prompt:

    [train_returns, test_returns] = ga(series, returns, training_proportion, tree_size, population_size, generations)

    Where:

    When run, ga plots the train_returns in blue and the test_returns in red.

    Questions:

  4. Run ga.m ten times for each of the following parameter sets: (Note that matlab plots will be generated automatically when the code executes.)

    The first is a genetic algorithm with a max tree size of 2, population of 10, and 50 generations. The second is a hillclimbing algorithm run for 500 generations.
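One way to keep track of repeated runs is to store the last value of each returned curve. The sketch below assumes that train_returns and test_returns are vectors with one value per generation, which the plotting behaviour suggests but the handout does not state. The name run_once and the stand-in curves are hypothetical, used only so the sketch runs without the course files.

```matlab
% Hypothetical stand-in so this sketch runs without the course files.
% It is NOT the real ga.m. In the assignment, replace it with a call such as
%   run_once = @() ga(series, returns, training_proportion, 2, 10, 50);
% where series, returns and training_proportion come from the handout.
run_once = @() deal(linspace(0, 1, 50), linspace(0, 0.5, 50));

n_trials = 10;                       % the assignment asks for ten runs
final_train = zeros(1, n_trials);    % last training return of each run
final_test  = zeros(1, n_trials);    % last test return of each run

for k = 1:n_trials
    [train_returns, test_returns] = run_once();
    final_train(k) = train_returns(end);
    final_test(k)  = test_returns(end);
end

fprintf('mean final train return: %.3f\n', mean(final_train));
fprintf('mean final test return:  %.3f\n', mean(final_test));
```

Because ga draws a plot on every call, each run in a loop like this may overwrite the previous figure. Save or print the plots you need before starting the next run.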

  5. Would there be an advantage to having a population size of 100? A disadvantage? Run the algorithm as above with a population size of 100, and see if your predictions are correct.

  6. Run ga.m ten times for each of the following parameter sets:

    This is the same as the genetic algorithm of the previous question, except the proportion of data used for training is only 10%.

  7. Run ga.m ten times for each of the following parameter sets (note the change from yen to dm):

    Examine the test_fitness curves as a function of the number of generations. Do you notice a qualitative difference between the curves from the algorithm with tree_size 2 and those from the algorithm with tree_size 1? (Hint: for each trial, compare the maximum test fitness value with the final test fitness value.) Print out one representative plot for each condition that demonstrates this, and speculate about how the difference in tree size causes the differences in curve shapes. (Hint: think about pruning in decision trees.)
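The comparison suggested in the first hint can be computed directly from a returned curve. This is a minimal sketch, assuming that the test fitness curve is the test_returns vector returned by ga, with one value per generation. The numbers are made up for illustration and are not output from ga.m.

```matlab
% Hypothetical test-return curve, one value per generation (made-up numbers).
test_returns = [0.00 0.02 0.05 0.04 0.03 0.01];

% Highest test value reached during the run, and the generation where it occurred.
[peak_value, peak_generation] = max(test_returns);

% Value at the last generation, and how far it sits below the peak.
final_value = test_returns(end);
drop_from_peak = peak_value - final_value;

fprintf('peak %.2f at generation %d, final %.2f, drop %.2f\n', ...
        peak_value, peak_generation, final_value, drop_from_peak);
```

Computing drop_from_peak for each of the ten trials gives one number per trial that can be compared across the two tree sizes.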

Created by James Thomas, maintained by Rosie Jones
Last modified: Mon Apr 12 10:34:54 EDT 1999