15-859(B) Machine Learning Theory 03/18/04
* Angluin's algorithm for learning DFAs
=========================================================================
ANGLUIN'S ALGORITHM FOR LEARNING DFAS
=====================================
Now, a neat algorithm by Dana Angluin for learning DFAs from
membership queries and counterexamples.
- Membership query: propose x, get f(x).
- Counterexample: propose h, get x such that h(x) != f(x) if one exists.
(Sometimes called "equivalence queries")
[Same as MQ+MB model]
Goal: we want #queries, #counterexamples, running time to be
polynomial in the size of the target DFA. (and, technically, in the
size of the largest counterexample)
- Passive observation alone is not sufficient (by crypto results).
- MQs alone not sufficient either (see slides). "combination-lock automata".
Several ways of viewing problem:
- standard view: Box contains a DFA. feed in x, get f(x).
- finite-state-environment view: box is the world. You propose an
action and get observation. Also have a reset button that takes
you back to the start.
The two views are roughly equivalent. E.g., difference is whether you
get just f(x) or you also get the value of f on all prefixes of x.
But you can simulate that with more queries.
*********************
--> See slides.
*********************
Initially get 2-state machine with states "lambda" and "a".
Use counterexample "aaba".
How to use the counterexample to figure out where we went wrong? Key
trick: look at all ways to split the counterexample into a (prefix,
suffix). For each split, replace prefix with our predicted string in
S and make a membership query. We are guaranteed to find two adjacent
pairs (pref_i, suff_{n-i}), (pref_{i+1}, suff_{n-i-1}) whose outputs
differ, and this gives us a new experiment (specifically,
suff_{n-i-1}) to use.
Gives us 3-state machine with states lambda, a, aa.
query again for counterexample. Interesting thing: same
counterexample works here...
End up with correct 4 state machine.
Learning algorithm creates up to n rows in top part of table and n*|A|
rows in the bottom part. # experiments is at most n since each one
distinguishes a new state.
-> # EQ's is at most n
-> # MQ's is at most n(|A| + 1)*n + n*log(m)
first term is the number of rows times the number of
experiments (i.e., the table size), and the second is how many
queries it takes to find the next experiment given a
counterexample * number of counterexamples, if we use
binary search.
(m = max length of counterexample)