Newsgroups: comp.lang.lisp
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!scramble.lm.com!news.math.psu.edu!psuvax1!news.cc.swarthmore.edu!netnews.upenn.edu!msunews!uwm.edu!newsspool.doit.wisc.edu!night.primate.wisc.edu!aplcenmp!hall
From: hall@aplcenmp.apl.jhu.edu (Marty Hall)
Subject: Re: Functional Languages and Caches
Message-ID: <DoMGAx.4tI@aplcenmp.apl.jhu.edu>
Organization: JHU/APL Research Center, Hopkins P/T CS Faculty
References: <4iq667$nkv@csugrad.cs.vt.edu>
Date: Thu, 21 Mar 1996 14:30:33 GMT
Lines: 99

In article <4iq667$nkv@csugrad.cs.vt.edu> jmaxwell@csugrad.cs.vt.edu
(Jon A. Maxwell) writes: 
>In a functional language supposedly there are no side-effects.
>
>I'm thinking that having a cache of function parameters and
>return values could speed up programs many times.  The example in
>my algorithms book for Dynamic Programming is to parenthesize
>matrix multiplications in order to minimize the number of element
>multiplications.  This is done by evaluating subproblems and
>using those results to solve larger problems. (O(n^3) it turns
>out whereas recursion takes O(2^n)?)
>
>Dynamic programming is so much faster because the subproblems are
>solved only once instead of many times.  Wouldn't that happen
>automatically if the (recursive) function had a cache associated
>with it?
[...]
>So would this cache be practical, and could it be used in a
>language such as lisp (which has side effects)?  

The technique of caching previous computations is known as
"memoization" and is widely used. Automatically converting an existing
function into one that caches computations is known as "automatic
memoization". The idea of memoization was first popularized by Donald
Michie at the University of Edinburgh in the late 1960's. Michie and
later David Marsh implemented some libraries in the Pop-2 language and
looked at some usage issues. Michie's idea was more focused on machine
learning than on dynamic programming, with the thought being to have
fuzzy matches against the cache. However, this was done by potentially
examining the whole cache, resulting in O(N) performance where N is the
number of values already cached. Or one can use hashing and give up
on the idea of fuzzy matches. We are currently looking at ways to
have our cake and eat it too in this regard. See references below.
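To make the hash-based approach concrete, here is a minimal sketch of
automatic memoization in Common Lisp. The names MEMO and MEMOIZE are
just illustrative; this is essentially the idea popularized by Abelson
& Sussman and Norvig, not the actual interface of our library:

```lisp
;; A minimal sketch of hash-table-based automatic memoization.
;; Caching on the full argument list with an EQUAL hash table gives
;; O(1) lookup in the number of cached entries (though hashing still
;; examines the whole key, as discussed above).
(defun memo (fn &key (test #'equal))
  "Return a memoized version of FN, caching results by argument list."
  (let ((cache (make-hash-table :test test)))
    #'(lambda (&rest args)
        (multiple-value-bind (value found-p) (gethash args cache)
          (if found-p
              value
              (setf (gethash args cache) (apply fn args)))))))

(defun memoize (fn-name)
  "Destructively replace the global definition of FN-NAME with a
memoized version, so that recursive calls also go through the cache."
  (setf (symbol-function fn-name)
        (memo (symbol-function fn-name))))
```

Note that because MEMOIZE replaces the global definition, recursive
calls hit the cache too; that is what makes the divide-and-conquer
speedups discussed below possible.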

Memoization was then discussed mostly in the functional programming
literature. Reade and Field & Harrison both cover it in their texts on
functional programming, and there are several related papers as well.

In Lisp, Abelson and Sussman suggest the idea of automatic memoization
in Scheme in _Structure and Interpretation of Computer Programs_, and
Peter Norvig gives a much more complete implementation in _Paradigms
of AI Programming_. People also discuss applications to logic
programming (Warren, Dietrich), context-free parsing (Norvig), Dynamic
Programming (Cormen), PDA's (Amtoft), etc.
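To connect this to the matrix-parenthesization example above: here is
the standard Cormen-style recurrence written as a naive recursion plus
a cache. This is my own illustrative sketch, not code from any of the
papers just cited. DIMS is a vector of dimensions, so matrix i is
dims[i-1] x dims[i]; the cache is what cuts the work from exponential
down to O(n^3).

```lisp
;; Minimum scalar multiplications to compute a matrix chain product.
;; Without the cache this recursion revisits the same (i,j)
;; subproblems exponentially often; with it, each of the O(n^2)
;; subproblems is solved once, for O(n^3) total work.
;; Assumes the dimensions in DIMS are positive integers.
(defun chain-cost (dims)
  (let ((cache (make-hash-table :test #'equal)))
    (labels ((cost (i j)
               (if (= i j)
                   0
                   (or (gethash (cons i j) cache)
                       (setf (gethash (cons i j) cache)
                             (loop for k from i below j
                                   minimize
                                   (+ (cost i k)
                                      (cost (+ k 1) j)
                                      (* (aref dims (- i 1))
                                         (aref dims k)
                                         (aref dims j)))))))))
      (cost 1 (- (length dims) 1)))))
```

The same effect falls out "for free" if you write only the naive
recursion and then memoize it automatically, which is the whole appeal.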

Applying memoization in pure functional languages is nice, but not all
that helpful for the 99.9% of people who use impure languages. Applying
it in Lisp is much easier and more flexible than in most other
languages, but can still be quite tricky. At IJCAI-85, Mostow and
Cohen presented a very interesting paper on some of the problems they
had in figuring out when it was safe to use. I have a couple of
papers discussing several of these problems with potential solutions,
suggesting additional applications, and summarizing our experiences
using such a package we developed in Common Lisp. One of these papers
is available at
http://www.apl.jhu.edu/~hall/test/Papers/Monterrey-Memoization.ps. I 
will try to get the others online soon and linked to my Lisp WWW page 
(http://www.apl.jhu.edu/~hall/lisp.html). You can obtain an early
version of our library at http://www.apl.jhu.edu/~hall/lisp/Memoization/
as well as from the Lisp archives at CMU. I will try to get an updated
version and links from my Lisp WWW page relatively soon.

We are currently working on a version in C/C++, and are considering a
Java version sometime later.

> I just wondering how they got emacs so fast.

To my knowledge, emacs does not use memoization.

>If you were doing factorials, say you evaluated 5! + 4!.  After
>calculating 5!, the 4! would already be known and the answer
>could be returned without having to calculate it again.

The problem is that the cache-retrieval time is likely to be
significantly larger than the time to calculate a small factorial from
scratch. By using hashing to store the cache, you can make the access
time O(1) with respect to the number of stored values (O(N) with
respect to the size of the input argument unless you use
object-identity instead of object-equality as the test, which has its
own problems). But this time is unlikely to be faster than that
required to do a simple factorial. For instance, in Common Lisp on my
Sparc10, a computation has to take at least 1/1000th of a second for
memoization to be potentially useful. Note also that if you want the
super-dramatic speedups you referred to above, you need to apply
memoization to a divide and conquer problem with overlapping
subproblems, which is not the case with factorials.
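To illustrate that last point: Fibonacci has the same recursive shape
as factorial but massive overlap in its call tree, so caching collapses
it from exponential to linear time, whereas factorial's subproblems are
all distinct and the cache buys you nothing on the first computation. A
hand-memoized sketch:

```lisp
;; FIB's call tree recomputes the same subproblems exponentially
;; often, so a cache turns an exponential number of calls into O(n).
;; FACT's call chain has no overlap at all, so for it a cache only
;; pays off on *repeated* top-level calls -- and even then the lookup
;; may cost more than the multiply it replaces.
(defvar *fib-cache* (make-hash-table))

(defun fib (n)
  (if (< n 2)
      n
      (or (gethash n *fib-cache*)
          (setf (gethash n *fib-cache*)
                (+ (fib (- n 1)) (fib (- n 2)))))))
```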

But memoization can still be extremely useful on other problems,
just not on such tiny ones. It is usually considered to be a
space-for-time tradeoff, but if the function you are memoizing
generates a lot of temporary memory (garbage) to do its calculations,
memoizing it can save this. In our empirical tests on a large decision
aid in Lisp, we were surprised to find huge reductions in garbage
after our programmers used the memoization library (in addition to
very large speedups).

Cheers-
						- Marty
(proclaim '(inline skates))
