Notes for March 18, 1998 * Hidden Markov Models * weak-learning a monotone function over unif distrib. ---------------------------------------------------------------------------- HMMs --> see slides. If have time after doing HMMs, here is something cute related to Fourier analysis stuff. (Has nothing to do with HMMs but wanted to mention). Say learning from examples, which are uniformly drawn in {0,1}^n. Say that target function is *monotone*. Monotone means that if you take some example x and flip some bit from 0 to 1, then the value of f(x) can't change from positive to negative. You don't have queries: just observe uniform random examples. Just given that the function is monotone, it turns out there is an easy way to *weak-learn* with respect to uniform distribution. Any guesses? - Say that Pr(f(x) = 1) = 1/2 (at least approximately) since otherwise can just say "everything is positive" or "everything is negative". - Want to argue that some variable has correlation. I.e., for some x_i, Pr(f(x) = 1 | x_i = 1) > Pr(f(x) = 1 | x_i = 0) + epsilon. (If can find such an x_i, then can predict positive when x_i = 1 and negative when x_i = 0) I.e., has some noticeable fourier coefficient out of these n+1. How to prove: look at boolean cube {0,1}^n. Say color example red if positive and blue if negative. Half of each. Want: no matter how you color, a lot of edges crossing from red to blue. Say we can prove that there must be O(2^n) edges with one endpoint of each color. Then must be some direction i with O(2^n / n) edges in that direction. Then x_i has good correlation [[why?]] So, it all boils down to saying that if you want to slice a cube into two equal halves, you need to cut a lot of edges. Can prove this using "canonical path" argument. for each pair of examples (x,y) where x is red and y is blue, define the canonical path from x to y to be: "starting with x, scan down the bits from left to right, flipping where needed". There are (2^{n-1})^2 pairs. But, any given edge can be used by at most 2^n of these paths (why: because if edge is 01010100 -> 01011100, then x must end with 100 and y must begin with 0101). So, must be at least 2^{n-2} edges.