Newsgroups: comp.lang.smalltalk
Path: cantaloupe.srv.cs.cmu.edu!das-news.harvard.edu!news2.near.net!MathWorks.Com!europa.eng.gtefsd.com!howland.reston.ans.net!pipex!harlqn.co.uk!harlequin.co.uk!eliot
From: eliot@harlequin.co.uk (Eliot Miranda;081 519 2769)
Subject: Re: threaded interpreter VM for ST?
Message-ID: <eliot.780239100@newshost>
Sender: usenet@harlequin.co.uk (Usenet Maintainer)
Organization: Harlequin Ltd, Cambridge, UK
References: <1994Sep18.001417.2338@cs.sfu.ca>
Date: Thu, 22 Sep 1994 13:05:00 GMT
Lines: 52

In <1994Sep18.001417.2338@cs.sfu.ca> craig@cs.sfu.ca (Craig Larman) writes:

>Charles Duff, the developer of Neon and Actor (and a man of 
>great talents, ahead of his time, IMHO), wrote an interesting 
>Aug '86 Byte article: "Designing an Efficient (OO) Language", 
>which I really enjoyed at the time.

>He presents an argument for using a threaded interpreter instead 
>of a byte code interpreter, especially for its efficiency when 
>combined with early binding constructs (which Actor 'had').

>I'm curious if anyone has explored creating an ST VM based on a 
>threaded interpreter model, and compared the performance, size 
>of code, etc. Would make an interesting thesis, if it hasn't 
>been done...

>regards, Craig Larman

I've done a dynamic translator to threaded code.  It's about 1.5 times
faster than the best bytecode interpreter I've written, and runs at
about 75% to 80% of the speed of ParcPlace's PS & HPS, which dynamically
translate to native code.  Currently the overhead of translation is about
10% of entire execution time, so one might conclude that compiling directly
to threaded code would not improve performance much, and would come at the
expense of a moderate increase in space.
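
To make the distinction concrete, here is a minimal sketch of a bytecode
interpreter next to a translator to threaded code.  It assumes a toy stack
machine: the opcodes (PUSH, ADD, MUL, HALT) and all function names are
hypothetical and have nothing to do with the Smalltalk-80 bytecode set or
with any real VM.  The point is only that the bytecode loop decodes every
opcode each time it runs, while the threaded version decodes once at
translation time and then just follows a list of routine references.

```python
# Toy stack machine; opcodes and names are hypothetical, for illustration only.
PUSH, ADD, MUL, HALT = range(4)


def run_bytecode(code):
    """Bytecode interpreter: decodes each opcode every time it executes."""
    stack, pc = [], 0
    while True:
        op = code[pc]
        pc += 1
        if op == PUSH:
            stack.append(code[pc])  # operand follows the opcode
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


# Routines that threaded code refers to directly. Each takes the stack, the
# threaded code and the index of the next cell, and returns the new index.
def do_push(stack, threaded, ip):
    stack.append(threaded[ip])  # inline operand cell
    return ip + 1


def do_add(stack, threaded, ip):
    b, a = stack.pop(), stack.pop()
    stack.append(a + b)
    return ip


def do_mul(stack, threaded, ip):
    b, a = stack.pop(), stack.pop()
    stack.append(a * b)
    return ip


def do_halt(stack, threaded, ip):
    return -1  # negative index stops the dispatch loop


ROUTINES = {ADD: do_add, MUL: do_mul, HALT: do_halt}


def translate(code):
    """Decode the bytecodes once, emitting a list of routine references."""
    threaded, pc = [], 0
    while pc < len(code):
        op = code[pc]
        pc += 1
        if op == PUSH:
            threaded += [do_push, code[pc]]
            pc += 1
        else:
            threaded.append(ROUTINES[op])
    return threaded


def run_threaded(threaded):
    """Dispatch loop: no opcode decoding, just call the next routine."""
    stack, ip = [], 0
    while ip >= 0:
        routine = threaded[ip]
        ip = routine(stack, threaded, ip + 1)
    return stack.pop()


program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]  # (2 + 3) * 4
```

Both run_bytecode(program) and run_threaded(translate(program)) compute
the same result.  The threaded form has as many cells as the bytecoded
form has bytes, but each cell is a word-sized reference rather than a
byte, which is one reason threaded methods take more space.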

For example, some simple measurements indicate that on an image with
50k objects, 2581k heap & 405k of object table, replacing bytecoded 
methods with threaded code methods would add 1577k to the heap, a growth
of about 50%.  Depending on the implementation one might also be able to
lose 8k ByteArrays holding the bytecodes, reducing the object table by
64k.
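
The arithmetic behind these figures can be checked with a short sketch.
It uses only the numbers quoted above; the figure of 8 bytes per object
table entry is an inference from 64k saved for 8k ByteArrays, not a
stated property of the implementation, and the variable names are made up
for the example.

```python
# All sizes in "k" as quoted in the post; only ratios matter here.
heap_k = 2581            # heap size
object_table_k = 405     # object table size
threaded_extra_k = 1577  # extra heap needed for threaded code methods
objects_k = 50           # number of objects in the image
bytearrays_k = 8         # ByteArrays that might be dropped
table_saving_k = 64      # object table reduction from dropping them

# Inferred, not stated in the post: size of one object table entry.
bytes_per_entry = table_saving_k / bytearrays_k

# Sanity check: 50k objects at that entry size should be close to 405k.
estimated_table_k = objects_k * bytes_per_entry

# Growth measured two ways, as percentages.
growth_vs_heap = 100 * threaded_extra_k / heap_k
growth_vs_heap_and_table = 100 * threaded_extra_k / (heap_k + object_table_k)

print(f"bytes per object table entry: {bytes_per_entry:.0f}")
print(f"estimated object table size:  {estimated_table_k:.0f}k")
print(f"growth vs heap alone:         {growth_vs_heap:.1f}%")
print(f"growth vs heap + table:       {growth_vs_heap_and_table:.1f}%")
```

On these numbers the growth comes out at roughly 61% of the heap alone
and roughly 53% of heap plus object table, so "about 50%" is closest to
the second reading.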

More interesting is compiling directly to native code.  The performance
increase can be large (more than 4x over PS & HPS).  See Ian Piumarta's
thesis from the University of Manchester:

	"Delayed Code Generation in a Smalltalk-80 Compiler"
	Ian K. Piumarta, University of Manchester, 1993

It's available via anonymous ftp from
	mighp0.cs.man.ac.uk:/pub/theses/*

The file /pub/theses/README explains the various formats for the report
(A4, 2-per-page A4, A5, etc.).

Of course maintaining image portability suddenly becomes a more
difficult issue :)
--
Eliot Miranda                           email:  eliot@harlequin.co.uk
Harlequin, Barrington Hall, Barrington  Tel:    +44 223 87 3887
Cambridgeshire, CB2 5RG, UK             Fax:    +44 223 872 519
