Newsgroups: comp.lang.dylan,comp.lang.misc,comp.lang.lisp,comp.object,comp.arch,comp.lang.c++
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!oitnews.harvard.edu!purdue!lerc.nasa.gov!magnus.acs.ohio-state.edu!math.ohio-state.edu!cs.utexas.edu!swrinde!tank.news.pipex.net!pipex!news.mathworks.com!news.kei.com!world!NewsWatcher!user
From: rpk@world.std.com (Robert P. Krajewski)
Subject: Re: allocator and GC locality (was Re: cost of malloc)
Message-ID: <rpk-1108950023020001@192.0.2.1>
Sender: news@world.std.com (Mr Usenet Himself)
Nntp-Posting-Host: world.std.com
Organization: The World @ Software Tool & Die
References: <9507261647.AA14556@aruba.apple.com> <3v8g7l$cge@jive.cs.utexas.edu> <3vac07$ptf@info.epfl.ch> <3vb382$dtr@jive.cs.utexas.edu> <3vbl70$bht@fido.asd.sgi.com> <hbaker-3107951026250001@192.0.2.1> <justin-0108951458440001@158.234.26.212> <hbake <jyuynr@bmtech.demon.co.uk> <hbaker-0208950816000001@192.0.2.1> <jyvgwh@bmtech.demon.co.uk> <hbaker-0408950815320001@192.0.2.1> <405k8h$emi@news.parc.xerox.com> <hbaker-0708951241390001@192.0.2.1> <40apft$3im@news.parc.xerox.com> <KANZE.95Aug10145551@slsvhdt
Date: Fri, 11 Aug 1995 04:22:24 GMT
Lines: 53
Xref: cantaloupe.srv.cs.cmu.edu comp.lang.dylan:5047 comp.lang.misc:22628 comp.lang.lisp:18658 comp.object:36660 comp.arch:60355 comp.lang.c++:143345

In article <KANZE.95Aug10145551@slsvhdt.lts.sel.alcatel.de>,
kanze@lts.sel.alcatel.de (James Kanze US/ESC 60/3/141 #40763) wrote:

>The problem is, that the operations which need to be optimized are not
>the same for different applications.  A compiler will need different
>string handling than an editor, for example.
>
>With this in mind, I'm not really convinced that the solution is to
>try and create an optimal string class as a standard.  I rather think
>of the standard string class as a facility for people like myself,
>whose programs only do string handling secondarily (formatting error
>messages, and the like).  If I were writing an editor, for example, I
>would not expect the standard string class to be acceptable for my
>text buffers.  In this regard, just about any string class with the
>required functionality will do the trick.  (And it is more important
>that the string class be easier to use than that it be fast.)
>
>This does mean that most text oriented applications will have to
>`re-invent the wheel', in that they will have to write their own
>string class.  But I'm not convinced that there is a string class
>which would be appropriate for all applications, anyway.

OK, but what might be appropriate is a standard string class
*interface*, so that even implementors of specialized string classes can
use standardized routines for the things they need but that don't need to
be optimized. Such a standard also allows them to pass such objects to
external libraries that expect a standard interface for strings.
Otherwise, there's going to be a lot of reimplementation going on. (As if
there isn't already.)

I designed a string class that was actually two string classes. The first
class was SimpleString. All it assumed was the existence of a pointer to
characters, C style (this is C++), and a length count. Even with this
constraint, which rules out the possibility of growing a string or
allocating storage later in its lifetime, the SimpleString class could do
a lot of useful work -- counting spacing characters, searching, copying,
conversion to various character sets, and even in-place modifications that
never require reallocation, such as upper/lower case conversion.
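
A minimal sketch of what such a class might look like, written in
present-day C++ rather than the 1995 dialect. Only the name SimpleString
and the pointer-plus-length layout come from the description above; the
member names (count_spaces, find, to_upper) are hypothetical and chosen
purely for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Hypothetical sketch: a string that only knows a character pointer and a
// length. It never allocates, frees, or grows its storage.
class SimpleString {
public:
    SimpleString(char* chars, std::size_t length)
        : chars_(chars), length_(length) {}

    std::size_t length() const { return length_; }
    const char* data() const { return chars_; }

    // Count whitespace characters.
    std::size_t count_spaces() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < length_; ++i)
            if (std::isspace(static_cast<unsigned char>(chars_[i]))) ++n;
        return n;
    }

    // Index of the first occurrence of c, or length() if c is absent.
    std::size_t find(char c) const {
        for (std::size_t i = 0; i < length_; ++i)
            if (chars_[i] == c) return i;
        return length_;
    }

    // In-place case conversion: the length is unchanged, so no
    // reallocation is ever needed.
    void to_upper() {
        for (std::size_t i = 0; i < length_; ++i)
            chars_[i] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(chars_[i])));
    }

protected:
    char* chars_;         // storage is managed by someone else
    std::size_t length_;
};
```

Because the class does not own its characters, it can wrap any buffer the
caller already has, for example a local char array.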

Because the base class did not assume any particular kind of allocation,
it was possible to derive from it to take care of differing situations.
First, there was the general string class that allocated character storage
from the heap and pretty much rounded out the set of the operations that
were destructive or involved side-effects. But you could also derive
classes that always expected that the storage would be managed for them,
especially in the very common case where a string's length would never
increase once it was created -- the character storage might come from the
stack, or might have been cleverly arranged to come from the same heap
block as the heap allocation of the string object itself (i.e., allocate a
block the size of the string descriptor, plus the storage needed for
characters). So even "clever" strings could be passed to the simpler, more
general operators that didn't care about cleverness.
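
The hierarchy described above could be sketched roughly as follows, again
in present-day C++. This is an illustration under assumptions, not the
original code: HeapString, FixedString, InlineString, append, create,
destroy and shout are all hypothetical names, and the base class is
reduced to the bare minimum so the example compiles on its own.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>
#include <new>

// Minimal base: pointer plus length, no allocation policy of its own.
class SimpleString {
public:
    std::size_t length() const { return length_; }
    const char* data() const { return chars_; }
    void to_upper() {
        for (std::size_t i = 0; i < length_; ++i)
            chars_[i] = static_cast<char>(
                std::toupper(static_cast<unsigned char>(chars_[i])));
    }
protected:
    SimpleString(char* chars, std::size_t length)
        : chars_(chars), length_(length) {}
    char* chars_;
    std::size_t length_;
};

// General string: owns heap storage and supports growing operations.
class HeapString : public SimpleString {
public:
    explicit HeapString(const char* s)
        : SimpleString(new char[std::strlen(s) + 1], std::strlen(s)) {
        std::memcpy(chars_, s, length_ + 1);
    }
    ~HeapString() { delete[] chars_; }
    HeapString(const HeapString&) = delete;
    HeapString& operator=(const HeapString&) = delete;

    void append(const char* s) {
        std::size_t extra = std::strlen(s);
        char* grown = new char[length_ + extra + 1];
        std::memcpy(grown, chars_, length_);
        std::memcpy(grown + length_, s, extra + 1);  // copies the NUL too
        delete[] chars_;
        chars_ = grown;
        length_ += extra;
    }
};

// Fixed-length string over storage the caller manages (e.g. the stack).
class FixedString : public SimpleString {
public:
    FixedString(char* buffer, std::size_t length)
        : SimpleString(buffer, length) {}
};

// Fixed-length string whose characters live in the same heap block as the
// descriptor: one allocation of sizeof(descriptor) + characters.
class InlineString : public SimpleString {
public:
    static InlineString* create(const char* s) {
        std::size_t n = std::strlen(s);
        void* block = ::operator new(sizeof(InlineString) + n + 1);
        char* chars = static_cast<char*>(block) + sizeof(InlineString);
        std::memcpy(chars, s, n + 1);
        return new (block) InlineString(chars, n);
    }
    static void destroy(InlineString* p) {
        p->~InlineString();
        ::operator delete(p);
    }
private:
    InlineString(char* chars, std::size_t n) : SimpleString(chars, n) {}
    ~InlineString() {}
};

// A "simple, general" operation: works on any of the above without
// knowing how the storage was obtained.
void shout(SimpleString& s) { s.to_upper(); }
```

A routine like shout() takes a SimpleString reference, so a HeapString, a
FixedString over a stack buffer, and an InlineString can all be passed to
it unchanged.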
