Newsgroups: comp.arch,comp.lang.lisp,comp.lang.scheme
Path: cantaloupe.srv.cs.cmu.edu!das-news.harvard.edu!news2.near.net!MathWorks.Com!europa.eng.gtefsd.com!howland.reston.ans.net!pipex!dircon!rheged!simon
From: simon@rheged.dircon.co.uk (Simon Brooke)
Subject: Re: Tag bits in data
In-Reply-To: thinman@netcom.com's message of Wed, 14 Sep 1994 19:12:52 GMT
Message-ID: <CwCC0o.1p3@rheged.dircon.co.uk>
Organization: none. Disorganization: total.
References: <thinmanCw4w1G.L14@netcom.com>
Date: Sun, 18 Sep 1994 19:41:11 GMT
Lines: 68
Xref: glinda.oz.cs.cmu.edu comp.arch:53047 comp.lang.lisp:14724 comp.lang.scheme:9800


In article <thinmanCw4w1G.L14@netcom.com> thinman@netcom.com (Technically Sweet) writes:

   Why do tag bits have to be in the datum itself?
   Why can't they be stored in the container (cons cells & vectors)
   which point at the datum?  Is this an overhead issue?
   Is it easier to fiddle with the tag in a register than
   to check the pointer-to datum?

I had a go at this once, in my foolish youth. I did worse, because
if I remember rightly I also had reference counters on the pointers
(OK, somebody's got to be that stupid -- and it was a long time ago.)

So why not?

1 Storage efficiency. Why store the same thing twice? OK, ok, caching
is a good plan sometimes: but you surely don't want to cache the type
information of every object many times over, once in every container
that points at it? The inefficiency is *huge*.

2 Concomitant to the first: it breaches one of old Codd's rules (I
forget which one). Basically, if you hold the same information in more
than one place, sometime you're sure to cock up and fix it in one
place but not the other. Leading to a good big total crash, with cons
cells rolling all over the floor and clogging the air conditioner
intakes.

However, I'm still playing with the idea in my current sketch
architecture. This says that Cons Space Objects (CSOs) can't be moved
on garbage collection (so that foreign functions can hold pointers to
them -- sweeping cons space doesn't seem important, since CSOs are all
the same size). But Heap Space must be swept occasionally, otherwise
you get wasteful fragmentation. How to allow foreign functions to
hold pointers to Heap Space Objects (HSOs)? Easy. For each HSO,
allocate one CSO. Indirect all references to the HSO through the CSO.
Now: where to hold the tagging information? To some extent this
depends on how wide your tag is. If you have sufficient tag bits to
allocate one class of reference object to each class of HSO, then the
HSO needs no tag and you have effectively cached the HSO tag on the
reference object. You can then use the CDR of the reference object to
indicate the size of the referred object, and the referred object
becomes just a clean vector of bytes with no necessary internal
structure, which has some advantages (i.e. you don't have to offset
your indirection).
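
A rough sketch of that indirection scheme, in Python for brevity. It
is only an illustration under assumptions: the class name Store, the
tag values, and the choice to keep the heap offset in the CAR are all
made up here (the text above only says the CDR holds the size). Cons
space is modelled as a list of fixed-size cells that never move, and
heap space as a flat bytearray that can be compacted.

```python
# Hypothetical tag values; none are named in the post.
TAG_FREE, TAG_STRING, TAG_BYTEVEC = 0, 1, 2


class Store:
    def __init__(self, heap_size=64):
        self.cons = []                    # cons space: cells never move
        self.heap = bytearray(heap_size)  # heap space: untagged raw bytes
        self.top = 0                      # next free offset in heap space

    def alloc(self, tag, payload):
        """Copy payload into heap space; return the index of its CSO."""
        size = len(payload)
        if self.top + size > len(self.heap):
            raise MemoryError("heap space exhausted")
        self.heap[self.top:self.top + size] = payload
        # The CSO carries the tag. CAR = heap offset (an assumption),
        # CDR = size of the referred object.
        self.cons.append([tag, self.top, size])
        self.top += size
        return len(self.cons) - 1

    def free(self, handle):
        """Mark the CSO as free; its bytes are reclaimed by compact()."""
        self.cons[handle][0] = TAG_FREE

    def tag_of(self, handle):
        """The type is read from the reference object, not the HSO."""
        return self.cons[handle][0]

    def read(self, handle):
        _tag, offset, size = self.cons[handle]
        return bytes(self.heap[offset:offset + size])

    def compact(self):
        """Slide live HSOs down. Only the CAR of each CSO changes, so
        handles held by foreign code stay valid."""
        live = sorted((c for c in self.cons if c[0] != TAG_FREE),
                      key=lambda c: c[1])
        dest = 0
        for cell in live:
            _tag, offset, size = cell
            self.heap[dest:dest + size] = self.heap[offset:offset + size]
            cell[1] = dest
            dest += size
        self.top = dest
```

A caller holds only the integer handle returned by alloc(). After a
free() and a compact() the bytes of the surviving objects may sit at
new offsets, but read() and tag_of() on the old handles still work,
and the heap bytes themselves carry no header at all.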

I suspect, though, that the people who really know about language
design (I don't) don't do things like this anymore. The problem is
having a fixed width tag. If you have a fixed width tag, then either 

(i) the application layer class system is not homogeneous with the
language layer class system; or

(ii) the number of application layer classes is arbitrarily limited.
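
To make limit (ii) concrete, here is a minimal sketch assuming the
tag lives in the low bits of an aligned reference. The 3-bit width,
the low-bit placement, and the names make_ref, tag_of, address_of and
ClassTable are all assumptions for illustration, not anything a
particular Lisp does.

```python
TAG_BITS = 3                      # hypothetical fixed tag width
TAG_MASK = (1 << TAG_BITS) - 1    # 0b111, so at most 8 distinct tags


def make_ref(address, tag):
    """Pack a tag into the low bits of an aligned address."""
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("tag does not fit in %d bits" % TAG_BITS)
    if address & TAG_MASK:
        raise ValueError("address must be %d-byte aligned"
                         % (1 << TAG_BITS))
    return address | tag


def tag_of(ref):
    return ref & TAG_MASK


def address_of(ref):
    return ref & ~TAG_MASK


class ClassTable:
    """Hands out one tag value per class until the tag space is full."""

    def __init__(self):
        self.names = []

    def register(self, name):
        if len(self.names) > TAG_MASK:
            raise OverflowError("no tag values left for class %r" % name)
        self.names.append(name)
        return len(self.names) - 1
```

With TAG_BITS = 3 the ninth call to register() raises OverflowError,
which is limit (ii). One way out is to reserve a tag value meaning
"look elsewhere for the class", but then application classes are
reached differently from built-in ones, which is case (i).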

Presumably someone has solved this problem, probably by having a
dynamically variable tag width: but for the life of me I can't see how
it would work efficiently. My recommendation would be to study the
FEEL code and work out how they do it.

If you know more about this stuff than I do (not hard), sorry!



-- 
    .::;====r==\              simon@rheged.dircon.co.uk (Simon Brooke)
   /  /____||___\____         
  //==\-   ||-  |  /__\(      MS Windows IS an operating environment.
 //____\___||___|_//  \|:     C++ IS an object oriented programming language. 
   \__/ ~~~~~~~~~~ \__/       Citroen 2cv6 IS a four door family saloon.
