15-859(B) Machine Learning Theory 03/18/04 * Angluin's algorithm for learning DFAs ========================================================================= ANGLUIN'S ALGORITHM FOR LEARNING DFAS ===================================== Now, a neat algorithm by Dana Angluin for learning DFAs from membership queries and counterexamples. - Membership query: propose x, get f(x). - Counterexample: propose h, get x such that h(x) != f(x) if one exists. (Sometimes called "equivalence queries") [Same as MQ+MB model] Goal: we want #queries, #counterexamples, running time to be polynomial in the size of the target DFA. (and, technically, in the size of the largest counterexample) - Passive observation alone is not sufficient (by crypto results). - MQs alone not sufficient either (see slides). "combination-lock automata". Several ways of viewing problem: - standard view: Box contains a DFA. feed in x, get f(x). - finite-state-environment view: box is the world. You propose an action and get observation. Also have a reset button that takes you back to the start. The two views are roughly equivalent. E.g., difference is whether you get just f(x) or you also get the value of f on all prefixes of x. But you can simulate that with more queries. ********************* --> See slides. ********************* Initially get 2-state machine with states "lambda" and "a". Use counterexample "aaba". How to use the counterexample to figure out where we went wrong? Key trick: look at all ways to split the counterexample into a (prefix, suffix). For each split, replace prefix with our predicted string in S and make a membership query. We are guaranteed to find two adjacent pairs (pref_i, suff_{n-i}), (pref_{i+1}, suff_{n-i-1}) whose outputs differ, and this gives us a new experiment (specifically, suff_{n-i-1}) to use. Gives us 3-state machine with states lambda, a, aa. query again for counterexample. Interesting thing: same counterexample works here... End up with correct 4 state machine. Learning algorithm creates up to n rows in top part of table and n*|A| rows in the bottom part. # experiments is at most n since each one distinguishes a new state. -> # EQ's is at most n -> # MQ's is at most n(|A| + 1)*n + n*log(m) first term is the number of rows times the number of experiments (i.e., the table size), and the second is how many queries it takes to find the next experiment given a counterexample * number of counterexamples, if we use binary search. (m = max length of counterexample)