-------------------------------------------------------------------------- Inductive Program Synthesis: Back to the Roots -------------------------------------------------------------------------- Ute Schmid Visiting Researcher at SCS CMU http://ki.cs.tu-berlin.de/~schmid/ -------------------------------------------------------------------------- Inductive program synthesis (IPS) can be viewed as a special kind of unsupervised learning over structured data: Input to the synthesis system is a (small and sometimes ordered) set of positive examples representing input/output relations of some unknown program; output is a (recursive) program which generalizes over the examples. From I/O relations "([]/0), ([x]/1), ([x,y]/2), ([x,y,z]/3)" we can for example infer the recursive function "length(l) = if empty(l) then 0 else inc(length(tail(l)))". IPS was extensively worked on in the seventies in the context of functional (LISP) programming and is now rediscovered in the frameworks of inductive logic programming (ILP) and genetic programming (GP). I will present an approach to IPS which is an extension of the original framework: IPS is divided into two subproblems: (1) rewriting I/O relations into constructive expressions, i.e. terms which transform an input into an output using predefined operators and (2) generalizing over these "straightforward programs" by identifying unifiable subpatterns. The first subproblem can be seen as a search-intensive rewrite-task which depends on provided background knowledge about data structures and operators. The second subproblem is basically a pattern-matching task which can be dealt with (mostly) by a purely syntactical approach. I believe that the original conceptualization of splitting IPS into two distinct parts has some advantages over ILP and GP approaches. Dividing this complex learning problem can help to make IPS algorithms more transparent and furthermore (and more important) can help to conquer it more easily. I will show that by providing two distinct algorithmic approaches for solving the IPS subproblems the scope of inferable programs can be extended. Another characteristic of our approach is that it is independent of a given programming language. Instead, we rely on the notion of recursive program schemes (another nice concept of the seventies) which are elements of some arbitrary term algebra. Thereby we can address the problem of inferring recursive functions from examples in a more general, language independent way. Final remarks: Why is it interesting to develop algorithms for inductive program synthesis? Because most work in machine learning at CMU is focussed on classification learning, I feel the need to give some reasons why what I am doing might be of interest. A recursive program can be viewed as a representation of action sequences for domains of arbitrary complexity. It is obvious that humans have the ability of inferring such kind of procedural knowledge from experience. If a child can unstack a tower of three blocks it can probably also unstack a tower of seven or twenty blocks, although it may have done this never before. That is, it can generalize a general strategy from experience. In psychological research this human ability of learning from problem solving experience was demonstrated in many studies (cf. learning a general solution strategy for Tower of Hanoi puzzles). Thus, research in IPS can be seen as a contribution to AI and cognitive science by providing algorithms which perform this kind of learning task. IPS is also of interest from a more theoretical point of view: The theory of grammar inference provides us with a formal framework for analyzing the learnability of classes of languages. It can be shown that IPS corresponds to the inference of context-free tree grammars which is in general not (efficiently) learnable. It is a challenging problem to identify efficiently learnable subclasses of context-free tree grammars and provide the according algorithms. -------------------------------------------------------------------------- If you are interested, please refer to the following papers: (1) Our approach to inductive synthesis of recursive program schemes: Schmid, U. & Wysotzki, F. (1998). Induction of Recursive Program Schemes. In Claire Nedellec and Celine Rouveirol (Eds.), Proceedings of the 10th European Conference on Machine Learning ECML-98 (pp. 214-226). Springer, LNAI 1398. PS-FILE: http://ki.cs.tu-berlin.de/~schmid/pub-ps/ecml98f.ps (2) For a formalization of inductive program synthesis in the framework of grammar inference: Schmid, U., Muehlpfordt, M. & Wysotzki, F. (submitted). Inductive program synthesis as induction of context-free tree grammars. PS-FILE: http://ki.cs.tu-berlin.de/~schmid/pub-ps/jml-article-draft.ps (3) Relation of inductive program synthesis to human problem solving and learning: Schmid, U. & Wysotzki, F. (1998). Skill acquisition can be regarded as program synthesis: An integrative approach to learning by doing and learning by analogy. In U. Schmid, J. Krems and F. Wysotzki (Eds.), Mind Modelling - A Cognitive Science Approach to Reasoning, Learning and Discovery. Lengerich: Pabst Science Publishers. PS-FILE: http://ki.cs.tu-berlin.de/~schmid/pub-ps/cm96-book3.ps --------------------------------------------------------------------------