Newsgroups: comp.lang.smalltalk
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!gatech!newsfeed.internetmci.com!howland.reston.ans.net!torn!nott!cunews!dbuck
From: dbuck@superior.carleton.ca (Dave Buck)
Subject: Re: Optimizing Ordered Collections
X-Nntp-Posting-Host: superior.carleton.ca
Message-ID: <DLs7t2.AIK@cunews.carleton.ca>
Sender: news@cunews.carleton.ca (News Administrator)
Organization: Carleton University, Ottawa, Canada
References: <4e9s7n$h4h@harbinger.cc.monash.edu.au>
Date: Fri, 26 Jan 1996 09:31:50 GMT
Lines: 74

In article <4e9s7n$h4h@harbinger.cc.monash.edu.au>,
Danny H Cron <dcron@bruce.cs.monash.edu.au> wrote:
>I realised that an OC suited my needs, but I wanted the speed of an
>array, so I came up with a Dynamic Array which was suppose to suit
>my needs.  And I guess it does, but the speed up was not as great
>as I thought.  25% reduction in time over the Ordered Collection for
>inserting and iterating.

Welcome to the world of strange Smalltalk optimizations.  Arrays are
so fast compared to other collections because at: and at:put: are
compiled to special bytecodes that the VM handles directly.
Suppose you send at: to an Array.  Here's what happens:

(Below, VM stands for Virtual Machine, i.e., the "interpreter".  Let's
ignore dynamic compilation for now.)

  1) You execute the bytecode for at:
  2) The VM pops the index and the receiver from the stack
  3) The VM tests to see if the receiver's class is Array
  4) It is, so the VM checks to see if the index is in bounds
  5) It is, so the VM pushes the indexed variable at that index onto the stack

Notice that no message at: was dispatched to the array.  The entire
operation occurred within the VM's interpreter loop by using a special
case.

Here's what happens for OrderedCollection at:

  1) You execute the bytecode for at:
  2) The VM pops the index and the receiver from the stack
  3) The VM tests to see if the receiver's class is Array
  4) It's not, so the VM dispatches an at: message to the object.
  5) A method lookup is done searching in the method dictionary for
OrderedCollection.
  6) The at: method in OrderedCollection is executed, which does a type
check and a bounds check on the index before calling super at: with
the index plus an offset.
  7) In Object, the at: method is found and executed.
  8) The at: method in Object is primitive 60, so primitive 60 is
called in the VM
  9) Primitive 60 pops the index and receiver and pushes the indexed
instance variable at that index.
 10) Everything returns
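
If you want to see the difference between the two paths for yourself,
a rough workspace timing along these lines should show it.  This is
only a sketch: the element count and the repeat count are arbitrary
numbers I picked, and the ratio you get depends on your VM (and on
dynamic compilation, which I'm ignoring here).

```smalltalk
| n array oc arrayTime ocTime |
"n and the repeat count of 20 are arbitrary values chosen for illustration."
n := 100000.
array := Array new: n.
oc := OrderedCollection new: n.
"Fill both collections with the same elements."
1 to: n do: [:i |
    array at: i put: i.
    oc add: i].
"Time indexed access on the Array (the special-cased path)."
arrayTime := Time millisecondsToRun: [
    20 timesRepeat: [1 to: n do: [:i | array at: i]]].
"Time indexed access on the OrderedCollection (the full message send)."
ocTime := Time millisecondsToRun: [
    20 timesRepeat: [1 to: n do: [:i | oc at: i]]].
Transcript show: 'Array at: ', arrayTime printString, ' ms'; cr.
Transcript show: 'OrderedCollection at: ', ocTime printString, ' ms'; cr.
```

Run it a few times before trusting the numbers, since a garbage
collection in the middle of one loop can skew a single run.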

Your new subclass of Array is following the same execution path as
OrderedCollection except that you don't have an at: method for your
class to perform a bounds check (which you should do, BTW).

I also noticed that your class doesn't give the right answer to the
'size' message.  You should implement size to return the number of
entries used and use basicSize in your implementation to determine how
many indexed instance variables you have.
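
As a sketch of what I mean, here is roughly how the accessors could
look.  I haven't seen the rest of your class, so the class name
DynamicArray, the instance variable tally (the number of entries in
use, assumed to start at 0) and the capacity method are all names I
made up for illustration.  Each method below is entered separately in
the browser.

```smalltalk
size
    "Answer the number of entries actually in use, not the number
    of slots allocated.  tally is a hypothetical instance variable."
    ^tally

capacity
    "Answer how many indexed instance variables are allocated.
    capacity is a hypothetical helper name."
    ^self basicSize

at: index
    "Answer the entry at index.  Reject anything that is not an
    integer within the used part of the receiver."
    (index isInteger and: [index between: 1 and: tally])
        ifFalse: [^self error: 'index out of bounds'].
    ^self basicAt: index
```

With size answering tally, inherited methods that are written in terms
of size only see the used entries instead of running off into the
unused slots.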

As for become:, it's not quite as fast as you may think.  I hear (I
don't know the details) that there may be some housekeeping to do
during a become: depending on where the object lives and where the
object that you're becoming lives.  You don't want to move an object
from New Space into Old Space during a become: (or something to that
effect.)  For information on new space and old space, look in
ObjectMemory class>>spaceDescriptions.

My recommendation is to use OrderedCollections.  Just be careful to
pre-allocate them to a sensible size if you're adding a lot of things
to them.  It's hard to make your own collections that are faster but
still work the way they should.
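
For example, if you expect about ten thousand elements (a number I've
picked purely for illustration), you can ask for that much room up
front:

```smalltalk
| oc |
"Reserve room for 10000 elements; the collection is still empty."
oc := OrderedCollection new: 10000.
"These adds should not need to grow the collection along the way."
1 to: 10000 do: [:i | oc add: i].
"size answers the number of elements added, not the reserved capacity."
oc size
```

Note that new: only sets the initial capacity.  The collection still
grows by itself if you add more than that.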

David Buck
dbuck@ccs.carleton.ca

_________________________________
| David K. Buck                 |
| dbuck@ccs.carleton.ca         |
| The Object People             |
|_______________________________|
