Newsgroups: comp.lang.smalltalk
Path: cantaloupe.srv.cs.cmu.edu!rochester!cornell!travelers.mail.cornell.edu!news.kei.com!news.mathworks.com!uunet!boole!uri
From: uri@boole.com (Uri Eshkar)
Subject: Re: novice question: shallow and deep copies
Message-ID: <1995Feb15.191104.18144@boole.com>
Organization: Boole & Babbage, Inc.
References: <obe.792377722@BIX.com> <3hn3p1$pnf@infosrv.rz.unibw-muenchen.de> <rick.shafer-1402951609130001@rasmac.gsfc.nasa.gov>
Date: Wed, 15 Feb 1995 19:11:04 GMT
Lines: 145

In <rick.shafer-1402951609130001@rasmac.gsfc.nasa.gov> rick.shafer@gsfc.nasa.gov (Rick Shafer) writes:

>In article <3hn3p1$pnf@infosrv.rz.unibw-muenchen.de>,
>i31ade@applsrv.rz.unibw-muenchen.de (Frank Derichsweiler) wrote:
>> obe@BIX.com (obe on BIX) writes:
>> >I'm struggling with shallow vs deep copies.
>> 
>> The main difference is the following:
>> shallow copy produces a new reference to the same object
>> deep copy creates a new instance 
>> 
>> e.g.
>> b := a shallow copy.
>> a change_some_value.
>> 
>> now b has the same changed value as a
>> 
>> if you perform a deepcopy, the values of b will not change if you change 
>> the values of a

>Ummm...., not being a real ST maven, I don't think so.  The real pros will
>probably jump in, but hey let me try...  So from one newbie to another...

>Shallow copy does NOT give a new reference to the same object.  That is
>merely done with the :=, i.e.
>a := AnObject new init.
>b := a.
>Now b and a are both references to the same object.
>c := a shallowCopy.
>c now points to a new object, BUT c is the same class as a AND all of c's
>instance variables point to the same objects as a.  If these are foo and
>bar which are set with the ever popular setter/getter routines #foo: ,
>#foo, #bar: , and #bar, then:
>a foo: aFoo; bar: aBar.
>c := a shallowCopy.
>a foo: anotherFoo.
>d := a shallowCopy.

>Now a, c, and d all point to different objects.  All three have the bar
>instance variable pointing to the same object (aBar).  However, the foo
>instance variable in c points to aFoo (a's original Foo when c was created
>by the #shallowCopy) and a and d point to anotherFoo.

>So, a shallow copy makes a New object whose instance variables are the
>same as the source object's.

>So what's a deepCopy?  Trouble, if my reading of some past threads is any
>indication, but follow along with me.  In the previous example if you were
>to make a change to the object (aBar), this would affect the behavior of
>all of the objects (a, c, and d) that have it as their bar instance.  How
>could you make a copy of an object that would break this perhaps
>undesirable connection?  This is what deepCopy is supposed to produce, and
>the way it does it is by sending a "copy" message to all the instance
>variables.  That is, if a, c, and d are all members of the Bletch class,
>then Bletch may have defined a method:

>deepCopy
>        | newBletch |
>        newBletch := Bletch new init.
>        newBletch foo: (self foo copy);
>                bar: (self bar copy).
>        ^newBletch.

>This is in contrast to

>shallowCopy
>        | newBletch |
>        newBletch := Bletch new init.
>        newBletch foo: (self foo);
>                bar: (self bar).
>        ^newBletch.

>(I apologize for the bad practices in the above, my actual ST coding is
>pretty rusty).  Now of course, these routines are special code for the
>Bletch class.  In actuality there is support in Object that will
>automagically handle all the named and enumerated instance variables for
>you.

>Were you paying attention to a little problem in the above though?  In the
>deepCopy method what I sent was a #copy message.  I have seen some
>ambiguity over whether this by default should actually be a #deepCopy, or
>#copy (which usually defaults to #shallowCopy) but you can override that
>for the
>particular classes of the *instances* to #deepCopy or whatever your little
>heart desires.  The advantage of ST.


There is a lot of confusion here, so let me try to sort it out:

  A shallow copy gives you a new object that shares the contents of all its
instance variables with the old one: both objects point to the same things.
A deep copy gives you a new object whose instance variables hold copies as
well. This definition is recursive. If the instance variables are themselves
complex objects, their copies might in turn be deep or shallow, so in effect
there aren't just two kinds of copy, but many. It really depends on how deep
you want your copy to be.
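
To make the sharing concrete, here is a small workspace sketch. It uses only
Array, OrderedCollection and shallowCopy, which I would expect in any dialect;
the variable names are just for illustration.

   | a b |
   a := Array with: (OrderedCollection with: 1).
   b := a shallowCopy.
   b == a.                    "false: b is a new Array"
   (b at: 1) == (a at: 1).    "true: the element itself is shared"
   (a at: 1) add: 2.
   (b at: 1) size.            "2: the change made through a shows up in b"

A deep copy of a would also have copied the OrderedCollection, so the last
expression would have answered 1.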

  So far these are just general definitions that apply to any object system.
Now to the implementation. Older versions of PPS Smalltalk implemented both
shallowCopy and deepCopy. In VW deepCopy has been removed (I believe this was
already done in ObjectWorks 4.0, but it does not really matter), and it is up
to the user to implement a deep copy.

If you inspect copy in Object, you'll notice it is implemented as:

copy
   ^self shallowCopy postCopy

shallowCopy is a primitive that answers a new shallow copy of the object.

postCopy is implemented as ^self, but the comment there explains it all: it
is a method intended to be sent to the newly created instance, so that the
instance can finish the job and become a real copy.
Subclasses implement their own copy (fully deep or only partially deep) by
reimplementing postCopy to copy their instance variables. They should never
reimplement copy itself.

You can find many examples of that in the base image. Look at:

CharacterAttributes >> postCopy

	super postCopy.
	attributes := attributes copy.
	defaultQuery := defaultQuery copy

By having each of your objects implement postCopy you get a recursive deep copy,
no matter how complex your object is.
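
As a rough sketch of how the chain works, take the Bletch class from the
quoted article and assume (my assumption, not anything in the image) that its
foo is an instance of a hypothetical class Frob, which has a String instance
variable called label:

   Frob >> postCopy
      super postCopy.
      label := label copy

   Bletch >> postCopy
      super postCopy.
      "foo is a Frob, so this copy runs Frob >> postCopy in turn"
      foo := foo copy
      "bar is deliberately not copied, so it stays shared"

With these two methods, after b := a copy the expression b foo == a foo
answers false, while b bar == a bar answers true. That is a partially deep
copy; add bar := bar copy to Bletch >> postCopy if you want it fully deep.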

I have had only one problem with this approach. Collection classes in the VW
image do not implement postCopy, so if you have collections in your object
structure the chain of postCopy sends is broken: the copied collection still
holds the original elements. To overcome this you must either create your own
subclass of the collection that implements postCopy (adding postCopy to the
base collection class is a bad idea because of side effects), or have the
object that holds the collection in an instance variable iterate over it in
its own postCopy and send copy to each element. This might become messy if
the elements of the collection are themselves collections...
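
The second option could look roughly like this. It is only a sketch: items is
a hypothetical instance variable of Bletch, and I am assuming it holds an
OrderedCollection or an Array, for which collect: answers a new collection of
the same kind.

   Bletch >> postCopy
      super postCopy.
      "build a new collection whose elements are copies of the old ones"
      items := items collect: [:each | each copy]

Because collect: already answers a new collection, there is no need to send
copy to items first. Each element gets an ordinary copy message, so elements
that implement postCopy are copied as deeply as they ask for, and elements
that are plain collections are only copied one level deep.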


---------------------------------
  Uri Eshkar	Boole & Babbage
  Voice:	(408)526-3418
  Fax:		(408)526-3055
  Email:	uri@boole.com


