Newsgroups: comp.lang.dylan,comp.lang.misc,comp.lang.lisp,comp.object,comp.arch
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!nntp.sei.cmu.edu!cis.ohio-state.edu!math.ohio-state.edu!howland.reston.ans.net!xlink.net!slsv6bt!news
From: kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763)
Subject: Re: allocator and GC locality (was Re: cost of malloc)
In-Reply-To: mikemac@engr.sgi.com's message of 3 Aug 1995 19:31:07 GMT
Message-ID: <KANZE.95Aug9150054@slsvhdt.lts.sel.alcatel.de>
Lines: 70
Sender: news@lts.sel.alcatel.de
Organization: SEL
References: <9507261647.AA14556@aruba.apple.com> <3v8g7l$cge@jive.cs.utexas.edu>
	<3vac07$ptf@info.epfl.ch> <3vb382$dtr@jive.cs.utexas.edu>
	<3vbl70$bht@fido.asd.sgi.com> <hbaker-3107951026250001@192.0.2.1>
	<justin-0108951458440001@158.234.26.212> <hbake
	<jyuynr@bmtech.demon.co.uk> <hbaker-0208950816000001@192.0.2.1>
	<jyvgwh@bmtech.demon.co.uk> <3vr85r$758@fido.asd.sgi.com>
Date: 09 Aug 1995 13:00:53 GMT
Xref: glinda.oz.cs.cmu.edu comp.lang.dylan:5021 comp.lang.misc:22580 comp.lang.lisp:18620 comp.object:36556 comp.arch:60241

In article <3vr85r$758@fido.asd.sgi.com> mikemac@engr.sgi.com (Mike
McDonald) writes:

|> In article <jyvgwh@bmtech.demon.co.uk>, Scott Wheeler <scottw@bmtech.demon.co.uk> writes:


|> |> class str {
|> |>     int iCount;
|> |>     char achData[256];
|> |> };
|> |> 
|> |> with the obvious problems in terms of fixed maximum length of the 
|> |> string.

|>   Yuch! In C, (I don't know how to do it in C++) you'd do something like:

|> typedef struct string {
|>   int length;
|>   char contents[0];
|>   } *String;

Not legal, at least not in C (or C++).  Arrays may not have a length
of 0.

|> String make_string(int n)
|> {
|>   String s;

|>   s = (String) calloc(1, sizeof(struct string) + n);
|>   if (s == NULL)
|>     you_lose();
|>   s->length = n;
|>   return (s);
|> }


|> (If your C compiler doesn't like arrays of length zero, declare contents as
|> length 1 and subtract 1 from the sizeof(struct string).)

But you still cannot *access* the remaining memory.  At least not
easily.  Accessing contents with an index > 0 is undefined behavior;
with any implementation of reasonable quality, it will cause a bounds
check error.  (Of course, I've never seen an implementation of C with
reasonable quality, so you may be safe.  Although I seem to have heard
somewhere that there is one, Centerline, maybe?)

Interestingly enough, there is a legal (and safe) way of implementing
this same idiom in C++.  It's so ugly, though, that I'm not going to
post it.  (I know, if you were worried about ugliness, you wouldn't be
using C++ anyway:-).)
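
For readers who want something concrete, here is a minimal sketch of
one way a single-allocation layout could be written in C++.  It is
not necessarily the version alluded to above, and every name in it
(StrRep, make, destroy, data) is hypothetical.  The idea is to
allocate raw memory for the header plus the characters, construct the
header with placement new, and reach the characters through a pointer
computed from the end of the header rather than through an undersized
array member.  I believe this avoids the out-of-bounds array index,
but the exact status of the pointer arithmetic is something language
lawyers may still argue about.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical single-allocation string representation.
// The character data lives directly after the header in the same block.
class StrRep {
public:
    // Allocate header and n characters (plus a terminating NUL) together.
    static StrRep* make(const char* src, std::size_t n) {
        assert(src != 0 || n == 0);
        void* raw = ::operator new(sizeof(StrRep) + n + 1);
        StrRep* rep = new (raw) StrRep(n);   // construct header in place
        if (n != 0)
            std::memcpy(rep->data(), src, n);
        rep->data()[n] = '\0';
        return rep;
    }

    // Destroy the header and release the whole block.
    static void destroy(StrRep* rep) {
        if (rep == 0) return;
        rep->~StrRep();
        ::operator delete(rep);
    }

    std::size_t length() const { return length_; }

    // The characters start at the first byte past the header.
    char* data() { return reinterpret_cast<char*>(this + 1); }
    const char* data() const {
        return reinterpret_cast<const char*>(this + 1);
    }

private:
    explicit StrRep(std::size_t n) : length_(n) {}
    ~StrRep() {}
    StrRep(const StrRep&);             // not copyable
    StrRep& operator=(const StrRep&);  // not assignable

    std::size_t length_;
};
```

A caller would write something like StrRep* r = StrRep::make("hello", 5),
use r->data() and r->length(), and finish with StrRep::destroy(r).  The
constructor and destructor are private so that the object can only be
created through make, which is the only place that knows how much
memory really sits behind the header.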

|>   This way, both the length and the data are allocated together in the heap. And
|> you haven't limited the max length artificially. (I suspect most BASIC systems do
|> this.)

I once ran benchmarks (with a very simple program, so probably not
indicative of anything) of three reference-counted implementations of
C++ strings: the classical one, the classical one with operator new
overloaded for the fixed-length header class, and one using the above
ugly hack (and so only one allocation per string).  Under Solaris,
using the standard system malloc (called directly by new), there was
no significant difference in runtime.
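
To make the second variant concrete, here is a rough sketch of what
"operator new overloaded for the fixed-length header class" might look
like.  It is an illustration under my own assumptions, not the code
that was benchmarked: the class name StringHeader, its members, and
the simple free list are all hypothetical, and the free list is not
thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size header of a reference-counted string.
// The character buffer (text) would still be allocated separately.
class StringHeader {
public:
    StringHeader() : refs(1), length(0), text(0) {}

    // Class-specific allocation: reuse a previously freed header if
    // one is available, otherwise fall back to the global operator new.
    static void* operator new(std::size_t size) {
        if (size == sizeof(StringHeader) && free_list != 0) {
            FreeNode* node = free_list;
            free_list = node->next;
            return node;
        }
        return ::operator new(size);
    }

    // Class-specific deallocation: keep the block on the free list
    // instead of handing it back to the system allocator.
    static void operator delete(void* p, std::size_t size) {
        if (p == 0) return;
        if (size != sizeof(StringHeader)) {  // e.g. a derived class
            ::operator delete(p);
            return;
        }
        assert(sizeof(StringHeader) >= sizeof(FreeNode));
        FreeNode* node = new (p) FreeNode;   // reuse the dead storage
        node->next = free_list;
        free_list = node;
    }

    int refs;
    std::size_t length;
    char* text;

private:
    struct FreeNode { FreeNode* next; };
    static FreeNode* free_list;
};

StringHeader::FreeNode* StringHeader::free_list = 0;
```

With this in place, a plain new StringHeader or delete on a header
goes through the free list, so a header that has just been deleted is
handed out again by the next allocation without calling malloc.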

-- 
James Kanze         Tel.: (+33) 88 14 49 00        email: kanze@gabi-soft.fr
GABI Software, Sarl., 8 rue des Francs-Bourgeois, F-67000 Strasbourg, France
Conseils en informatique industrielle --
                              -- Beratung in industrieller Datenverarbeitung


