Newsgroups: comp.lang.dylan,comp.lang.misc,comp.lang.lisp,comp.object,comp.arch
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!news.mathworks.com!newsfeed.internetmci.com!tank.news.pipex.net!pipex!swrinde!elroy.jpl.nasa.gov!lll-winken.llnl.gov!uop!csus.edu!netcom.com!NewsWatcher!user
From: hbaker@netcom.com (Henry Baker)
Subject: Re: allocator and GC locality (was Re: cost of malloc)
Message-ID: <hbaker-0208950816000001@192.0.2.1>
Sender: hbaker@netcom14.netcom.com
Organization: nil organization
References: <9507261647.AA14556@aruba.apple.com> <3v8g7l$cge@jive.cs.utexas.edu> <3vac07$ptf@info.epfl.ch> <3vb382$dtr@jive.cs.utexas.edu> <3vbl70$bht@fido.asd.sgi.com> <hbaker-3107951026250001@192.0.2.1> <justin-0108951458440001@158.234.26.212> <hbake <jyuynr@bmtech.demon.co.uk>
Date: Wed, 2 Aug 1995 16:16:00 GMT
Lines: 53
Xref: glinda.oz.cs.cmu.edu comp.lang.dylan:4946 comp.lang.misc:22444 comp.lang.lisp:18520 comp.object:36191 comp.arch:60061

In article <jyuynr@bmtech.demon.co.uk>, Scott Wheeler
<scottw@bmtech.demon.co.uk> wrote:

> In Article <hbaker-0108950939380001@192.0.2.1> Henry Baker writes:
> >All you've said, is that if you go along with Stroustrup/Coplien's
> >programming styles, you can get the job done, sort of.  However, I 
> >would argue that following their advice can cost you 3X - 100X in 
> >performance, because you drive the poor allocator/deallocator crazy.
> >Forcing a program to go 2 indirections to access array/string elements 
> >because C++ has no way to store the length of the array/string along 
> >with the array/string is probably the most irritatingly obvious 
> >indication of the problem, but the problem is a good deal more 
> >widespread than this.
> 
> Surely you are not claiming that a Pascal/BASIC-type string arrangement 
> in memory is going to run at least 3x faster than a typical C++ string? 
> This strains credibility.

If you allocate/deallocate these things very often, 3x may be conservative.
Even a plain access costs extra on a machine with long cache lines:
touching the string header takes a cache miss that drags in a whole line
of mostly unused data, and following the pointer to the characters
themselves takes a second miss.
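
To make the two layouts concrete, here is a minimal sketch.  The names
IndirectString, InlineString, make_indirect and make_inline are
hypothetical, invented for illustration; they are not from any
particular string library.  Layout A is the header-plus-pointer
arrangement described above; layout B keeps the length directly in
front of the characters in a single block.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Layout A: a header holding the length and a pointer to separately
// allocated characters.  Creating one takes two allocations, and
// reading a character takes two dependent loads (header, then buffer).
struct IndirectString {
    std::size_t length;
    char* data;
};

IndirectString* make_indirect(const char* s) {
    std::size_t n = std::strlen(s);
    char* buf = new char[n + 1];
    std::memcpy(buf, s, n + 1);
    return new IndirectString{n, buf};
}

void free_indirect(IndirectString* h) {
    if (h) {
        delete[] h->data;
        delete h;
    }
}

// Layout B: the length sits immediately before the characters in one
// block, so one allocation suffices and the header and the first
// characters are adjacent in memory.
struct InlineString {
    std::size_t length;
    // The characters start right after the header (a common idiom,
    // assumed here; C++ has no standard flexible array member).
    char* chars() { return reinterpret_cast<char*>(this + 1); }
};

InlineString* make_inline(const char* s) {
    std::size_t n = std::strlen(s);
    void* block = std::malloc(sizeof(InlineString) + n + 1);
    if (!block) return nullptr;
    InlineString* p = new (block) InlineString{n};
    std::memcpy(p->chars(), s, n + 1);
    return p;
}

void free_inline(InlineString* p) { std::free(p); }
```

Whether layout B actually wins by 3x depends on the allocator, the
cache, and how often strings are created and destroyed; the sketch
only shows where the extra allocation and the extra indirection come
from.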

The C++ beginner is sucked in because 1) strings are 0x00 terminated,
and therefore don't usually require that the programmer keep separate
track of a length, and 2) the actual length of the storage is kept in
a 'hidden' location managed by the allocator itself.  Thus, you get
another example of hidden overhead that you can't even access yourself
-- e.g., to get a rough idea of how big the string is before searching
for the 0x00.
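
A rough illustration of that last point, with hypothetical names
(scan_length, Counted, counted_length) chosen only for this sketch:
the portable way to learn the length of a 0x00-terminated string is to
walk it, whereas a counted string answers with a single load.

```cpp
#include <cstddef>

// Hand-written equivalent of strlen: every byte up to the terminator
// has to be examined before the length is known.
std::size_t scan_length(const char* s) {
    std::size_t n = 0;
    while (s[n] != '\0') ++n;
    return n;
}

// Hypothetical counted string: the length is stored explicitly, so
// asking for it costs one load no matter how long the string is.
struct Counted {
    std::size_t length;
    const char* chars;
};

std::size_t counted_length(const Counted& c) { return c.length; }
```

The size the allocator records for the block is not reachable through
any standard C++ interface, so it cannot stand in for the stored
length here.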

If you have to dynamically allocate arrays of kinds other than C-style
strings, you now have to explicitly deal with a length, and you probably
want to check indices for legality, so an explicit length field becomes
a necessity.
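
As a sketch of what that explicit length looks like in practice, here
is a hypothetical IntArray class (the name and interface are made up
for illustration).  It stores the length next to the pointer and
checks every index against it.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical minimal dynamic array of int that carries its own
// length so that indices can be checked for legality.
class IntArray {
public:
    explicit IntArray(std::size_t n)
        : length_(n), data_(new int[n]()) {}  // elements start at zero
    ~IntArray() { delete[] data_; }

    // Copying is disabled to keep the sketch short.
    IntArray(const IntArray&) = delete;
    IntArray& operator=(const IntArray&) = delete;

    std::size_t length() const { return length_; }

    // Checked access: throws instead of reading out of bounds.
    int& at(std::size_t i) {
        if (i >= length_) throw std::out_of_range("IntArray index");
        return data_[i];
    }

private:
    std::size_t length_;
    int* data_;
};
```

Note that if the IntArray object itself lives on the heap, element
access is back to two indirections: one to reach the object, one to
reach the elements.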

BASIC is slow because of interpreter overhead, not because of string
allocation overhead.

Just read the various C++ forums (fora?) for a while, and you keep getting
this feeling of deja vu all over again, as everyone describes their favorite
'specialized allocator' hack of the week for how to deal with allocator
overhead.

-- 
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html
