46838-s99 Machine Learning for Computational Finance Assignment 1

Due March 15, at the beginning of class

  1. Describe a problem in Finance where you think machine learning techniques can be used to supplement or substitute for expert knowledge.

  2. The PayDividend learning task has the following properties:
    Example  Price  Earnings  Sector         Market  Exchange  Outlook  PayDividend
    1        Up     High      Manufacturing  Bull    NYSE      Strong   Yes
    2        Up     High      Service        Bull    NYSE      Strong   Yes
    3        Down   Low       Service        Bull    NYSE      Weak     No
    4        Up     High      Service        Bull    Nasdaq    Weak     Yes

    Explain why the size of the hypothesis space in the PayDividend learning task is 973. How would the number of possible instances and possible hypotheses increase with the addition of a new attribute that can take on 3 different values? More generally, how do the numbers of possible instances and hypotheses grow with the addition of a new attribute A that takes on k possible values?

  3. Consider the PayDividend learning task discussed in class and the associated hypothesis space H. Define a new hypothesis space H' that consists of all _pairwise_ disjunctions of the hypotheses in H. For example, a typical hypothesis in H' is:

    (?, Low, Srv., ?, ?, ?)  ∨  (Up, ?, Srv., ?, ?, Strong)

  4. Give the sequence of S and G boundary sets computed by the CANDIDATE-ELIMINATION algorithm when it is given the examples from the table in question 2 in reverse order. Although the final version space will be the same regardless of the order of the examples (why?), the S and G sets computed at intermediate stages will, of course, depend on that order. Can you suggest ways of ordering the training examples to minimize the sum of the sizes of these intermediate S and G sets for the hypothesis space H used in the PayDividend example?

  5. Problem 2.4 in the book

  6. Problem 2.7 in the book

  7. Problem 2.8 in the book
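
For question 2, the counting argument can be checked numerically. The sketch below is only an illustration under stated assumptions: it assumes the conjunctive hypothesis representation, in which each attribute constraint is a specific value, "?" (any value), or the empty constraint (no value), and it assumes that one attribute takes three values while the other five take two. The table in question 2 shows only two values per attribute, so the three-valued attribute is inferred from the figure 973 and is not identified here. The function names are hypothetical.

```python
from math import prod


def count_instances(value_counts):
    """Number of distinct instances: one value chosen per attribute."""
    return prod(value_counts)


def count_syntactic_hypotheses(value_counts):
    """Each attribute constraint is a specific value, '?', or empty."""
    return prod(k + 2 for k in value_counts)


def count_semantic_hypotheses(value_counts):
    """Every hypothesis containing an empty constraint classifies all
    instances as negative, so those collapse into a single hypothesis.
    The rest use a specific value or '?' for each attribute."""
    return 1 + prod(k + 1 for k in value_counts)


# Assumption: one three-valued attribute and five two-valued attributes.
value_counts = [3, 2, 2, 2, 2, 2]
print(count_instances(value_counts))             # 96
print(count_syntactic_hypotheses(value_counts))  # 5120
print(count_semantic_hypotheses(value_counts))   # 973

# Adding a new attribute with k = 3 values.
extended = value_counts + [3]
print(count_instances(extended))                 # 288
print(count_semantic_hypotheses(extended))       # 3889
```

Changing the last element of `extended` to another k shows how both counts scale as the new attribute's number of values varies.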

Rosie Jones
Last modified: Sat Mar 13 04:48:36 EST 1999