--------------------------------------------------------------------------
Inductive Program Synthesis: Back to the Roots
--------------------------------------------------------------------------
Ute Schmid
Visiting Researcher at SCS CMU
http://ki.cs.tu-berlin.de/~schmid/
--------------------------------------------------------------------------
Inductive program synthesis (IPS) can be viewed as a special kind of
unsupervised learning over structured data: Input to the synthesis system
is a (small and sometimes ordered) set of positive examples representing
input/output relations of some unknown program; output is a (recursive)
program which generalizes over the examples. From I/O relations "([]/0),
([x]/1), ([x,y]/2), ([x,y,z]/3)" we can for example infer the recursive
function "length(l) = if empty(l) then 0 else inc(length(tail(l)))".
IPS was extensively worked on in the seventies in the context of functional
(LISP) programming and is now rediscovered in the frameworks of inductive
logic programming (ILP) and genetic programming (GP). I will present an
approach to IPS which is an extension of the original framework: IPS is
divided into two subproblems: (1) rewriting I/O relations into constructive
expressions, i.e. terms which transform an input into an output using
predefined operators and (2) generalizing over these "straightforward
programs" by identifying unifiable subpatterns. The first subproblem can
be seen as a search-intensive rewrite-task which depends on provided
background knowledge about data structures and operators. The second
subproblem is basically a pattern-matching task which can be dealt with
(mostly) by a purely syntactical approach. I believe that the original
conceptualization of splitting IPS into two distinct parts
has some advantages over ILP and GP approaches. Dividing this complex
learning problem can help to make IPS algorithms more transparent
and furthermore (and more important) can help to conquer it more easily.
I will show that by providing two distinct algorithmic approaches for
solving the IPS subproblems the scope of inferable programs can be extended.
Another characteristic of our approach is that it is independent of a given
programming language. Instead, we rely on the notion of recursive program
schemes (another nice concept of the seventies) which are elements of some
arbitrary term algebra. Thereby we can address the problem of inferring
recursive functions from examples in a more general, language independent way.
Final remarks: Why is it interesting to develop algorithms for inductive
program synthesis?
Because most work in machine learning at CMU is focussed on classification
learning, I feel the need to give some reasons why what I am doing might be
of interest. A recursive program can be viewed as a representation of action
sequences for domains of arbitrary complexity. It is obvious that humans
have the ability of inferring such kind of procedural knowledge from
experience. If a child can unstack a tower of three blocks it can probably
also unstack a tower of seven or twenty blocks, although it may have done this
never before. That is, it can generalize a general strategy from experience.
In psychological research this human ability of learning from problem solving
experience was demonstrated in many studies (cf. learning a general solution
strategy for Tower of Hanoi puzzles). Thus, research in IPS can be seen as a
contribution to AI and cognitive science by providing algorithms
which perform this kind of learning task. IPS is also of interest from a more
theoretical point of view: The theory of grammar inference provides us with
a formal framework for analyzing the learnability of classes of languages.
It can be shown that IPS corresponds to the inference of context-free
tree grammars which is in general not (efficiently) learnable. It is a
challenging problem to identify efficiently learnable subclasses of
context-free tree grammars and provide the according algorithms.
--------------------------------------------------------------------------
If you are interested, please refer to the following papers:
(1) Our approach to inductive synthesis of recursive program schemes:
Schmid, U. & Wysotzki, F. (1998). Induction of Recursive Program
Schemes. In Claire Nedellec and Celine Rouveirol (Eds.), Proceedings
of the 10th European Conference on Machine Learning ECML-98 (pp.
214-226). Springer, LNAI 1398.
PS-FILE: http://ki.cs.tu-berlin.de/~schmid/pub-ps/ecml98f.ps
(2) For a formalization of inductive program synthesis in the framework of
grammar inference:
Schmid, U., Muehlpfordt, M. & Wysotzki, F. (submitted). Inductive program
synthesis as induction of context-free tree grammars.
PS-FILE: http://ki.cs.tu-berlin.de/~schmid/pub-ps/jml-article-draft.ps
(3) Relation of inductive program synthesis to human problem solving
and learning:
Schmid, U. & Wysotzki, F. (1998). Skill acquisition can be regarded
as program synthesis: An integrative approach to learning by doing and
learning by analogy. In U. Schmid, J. Krems and F. Wysotzki (Eds.), Mind
Modelling - A Cognitive Science Approach to Reasoning, Learning and
Discovery. Lengerich: Pabst Science Publishers.
PS-FILE: http://ki.cs.tu-berlin.de/~schmid/pub-ps/cm96-book3.ps
--------------------------------------------------------------------------