15-859(B) Machine Learning Theory 03/24/10 DFAs * Can learn in Mistake-bound + Membership Queries * But hard to learn in PAC ====================== Learning in MB+MQ -> see slides. Hardness in PAC: based on hardness of learning NC^1 (logarithmic depth) circuits. Idea: given an NC^1 circuit f over {0,1}^n, will create DFA g over {0,1}^m where m = n*2^{depth(f)}. Also will define a simple mapping r: x -> xxx...x. The DFA g will have the property that g(r(x)) = f(x). Also the number of states in g will be polynomial in m. -> first, how would you create a DFA for the function x_2 OR not(x_4)? This is easy. In fact, notice we can do this so that the DFA always ends up in either a specific accept-state A or a specific reject-state R,and we can get rid of self-loops if we only care about inputs of size n. -> Given a DFA for the function f1 and a DFA for the function f2, how to create a DFA for "f1 AND f2", assuming we're allowed to replicate the input? Just make f1's A state be f2's start state, and have f1's R state go through a chain of dummy states to f2's R state. Similarly for OR. So we double the length of the input needed each time we go up 1 level in the circuit. That's it. Notice that even though NC^1 circuits are hard to learn even with queries, we have *not* shown that DFA's are hard to learn with membership queries (good thing too, since we just gave an algorithm that learns DFAs using membership queries). The reason is that only queries of the form "xxx...x" actually map back to legal queries for the original NC^1 circuit. Also, even though NC^1 is hard to learn even over the uniform distribution, we also have *not* shown that DFA's are hard to learn over the uniform distribution on n-bit inputs. (This is a nice open problem).