Newsgroups: comp.lang.smalltalk
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!hookup!swrinde!pipex!news.maz.net!news.ppp.de!news.Hanse.DE!wavehh.hanse.de!cracauer
From: cracauer@wavehh.hanse.de (Martin Cracauer)
Subject: Re: Smalltalk has yet another 7-1 productivity advantage story ...
Message-ID: <1995Jan31.083726.7388@wavehh.hanse.de>
Organization: The Internet
References: <3ge7r6$nti@news1.delphi.com>
Date: Tue, 31 Jan 95 08:37:26 GMT
Lines: 361

jsutherland@BIX.com (Jeff Sutherland) writes:
[...]
>The really amazing thing to me is that the ratio of 7 to 1 improvement in
>productivity keeps coming up.  30 people for 2.5 years is 70 person years
>and 10 people for 12 months is 10 person years.

>INTERSOLV obviously didn't finish the job in C++ so it would have taken
>longer.  But some of the speed in Smalltalk was due to understanding the
>problem better than starting from scratch so let's call it even.

>I wonder how many times people will have to see different examples that are
>7 to 1 improvements in productivity in Smalltalk before they actually try
>Smalltalk long enough to make it work for them?

See below.

>This comment is a good one and belongs on the Internet.  If you don't want
>it posted let me know, otherwise I will put it up over the weekend.

Please post.


As for the productivity advantage of Smalltalk, once more, I can't
resist commenting on this :-)

Although this is a followup to Jeff's posting, I would rather address
it to some of the more careless minds in the Smalltalk community.

The examples about productivity posted here won't take you far in
advocating Smalltalk. You have to name what exactly makes life easier
in some languages. And you should reflect a bit and ask yourself about
the drawbacks of the mechanisms that enable faster implementation.

I might say in advance that I think you are right that Smalltalk (and
other languages) enable greater coding efficiency than C++, but that
doesn't automatically mean I think they are superior.


My argument goes like this:
===========================

OO languages in general offer several things over procedural languages
like C and Pascal. Two important ones are probably:

OOP1) Reuse, mainly by inheritance. This saves coding time when doing
   similar jobs several times or by buying a library that includes
   useful classes.

OOP2) `Designed' interfaces to data types, by encapsulation and by
   polymorphism. This makes programming safer and later changes
   easier, and it supports point OOP1 because there are similar
   interfaces to similar classes.

However, neither of these speeds up development of an application when
you don't have code to reuse and the project doesn't last so long that
you forget your first interfaces.

C++ offers - among others - features to enable the two OO mechanisms
listed above. However, it does not primarily offer features to enable
the programmer to implement a given piece of code faster than in ANSI
C. This is true both for the writer of a library and for the `user' of
a library, the application programmer. C++ makes sense when you have
or plan something to reuse. Then and only then will you save time by
using C++ over C.

Other languages (Smalltalk and Lisp, to name two) enable programmers
to "compress" their code compared to C. That means: A given task
starting without existing code can be implemented in significantly less
time.

Smalltalkers, some C and C++ code is to follow. Please stand by;
without it I cannot explain what I think it is in Smalltalk and Lisp
that enables more efficient coding.

Example of what C++ does:
=========================

I'll try to provide an example to show why I think C++ offers better
reuse and better interfaces to objects, but that doesn't mean you'll
save coding time.

Iteration over an Array is done like this in ANSI C:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
#define SIZE 1000
int i; double sum = 0.0; double foo[SIZE]; int foosize;
dosomething_with_foo_that_sets_foosize_too();
for ( i=0 ; i<foosize ; i++ )
	sum += foo[i];
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

C++ lets you define data types with an interface similar to C's
arrays. That can look like this:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
int i;
aNumType sum = 0;
aCollection <aNumType> foo(SIZE);
// To explain for non-C++ers: this creates a collection `foo' of type
// aCollection that can hold SIZE (1000) instances of `aNumType'.

dosomething_that_fills_foo(); // no foosize needed anymore
for ( i=0 ; i<foo.size() ; i++ )
	sum += foo[i];
// C++ allows you to give aCollection an interface that is similar to
// C's array: you can use `[i]'.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is an example of how C++ supports point OOP2 above, similar
interfaces for similar classes. This solution is safer than that of C
and you don't have to change your way of thinking when changing from
built-in arrays to your own collection types.

However, this can be pushed even further. You may want to define a
collection that doesn't store objects in linear order anymore, a tree
for example.

Currently great effort is being made to provide implementation-independent
iterators for C++. They can look like this:

vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
int i;
aNumType sum = 0;
aCollection <aNumType> foo(SIZE);

dosomething_that_fills_foo(); 
for ( foo.first() ; !foo.end() ; foo.next() )
	sum += foo.current();
// some advanced solutions require the creation and destruction
// of an instance of an iterator class, omitted here.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This hides how exactly the data is stored in aCollection and therefore
pushes point OOP2 even further.
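
The iterator class that was omitted above might look roughly like the
following. This is only a sketch under my own assumptions: the names
NumCollection, Iterator and sum_of are invented for illustration and do
not come from any particular library, and a vector stands in for
whatever storage the collection really uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical collection of doubles that hides its storage from clients.
class NumCollection {
public:
    void add(double value) { items_.push_back(value); }

    // Hypothetical iterator: one instance is created per traversal and
    // destroyed when it goes out of scope.
    class Iterator {
    public:
        explicit Iterator(const NumCollection& c) : c_(c), pos_(0) {}
        bool end() const { return pos_ >= c_.items_.size(); }
        void next() { assert(!end()); ++pos_; }
        double current() const { assert(!end()); return c_.items_[pos_]; }
    private:
        const NumCollection& c_;
        std::size_t pos_;
    };
    friend class Iterator;   // let the iterator see the private storage

private:
    std::vector<double> items_;   // could become a tree; clients would not change
};

// Client code: the loop is no shorter than the indexed version.
double sum_of(const NumCollection& foo) {
    double sum = 0.0;
    for (NumCollection::Iterator it(foo); !it.end(); it.next())
        sum += it.current();
    return sum;
}
```

The client loop still names an iteration object and talks to it three
times per step, so the interface got cleaner but the typing did not get
any less.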

However, and now we come back to the point of how much code you can
write in a given time, *none* of these steps saves typing. The amount
of code that must be written for this task is about the same in all
three cases.

Languages that offer `code compression':
========================================

Other languages allow you to solve this problem in ways that save
typing, in a word, by compressing the code. Compressing the code will
lead to a shorter coding phase when you implement something without
starting from a given library base.

To name some mechanisms:
- Dynamic typing
- anonymous functions that are evaluated in a special scope (not
  that of the file line they're defined in)
- a powerful macro system

The second one is relevant for the summation problem.

In Smalltalk this may look like this:
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
| aCollection sum |
sum := 0.
aCollection := aFoo thatReturnsAnArray.
aCollection do: [ :x | sum := sum + x ].
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is shorter than the C++ solutions. You don't have to provide an
iteration variable in the client code, and you don't have to talk to
the collection more than once. You solve a given problem with less code
and you didn't have to provide anything special anywhere else (in the
class definition of the collection or such).

These collection examples are a bit extreme when comparing C++ and
Smalltalk or Lisp. In most other places where C++ suffers from its
inability to use anonymous functions et al, it doesn't make such a big
difference.

The Smalltalk solution to the summation problem does not only offer
greater (writing) efficiency, it is also better when it comes to point
OOP2. You don't have to tangle with accessing a collection's contents
anymore in Smalltalk. This takes C++'s idea of iteration even
further. So, in this example, I think C++ doesn't even reach its own
goals.

To expand the example a bit further, the `reduce' function in Common
Lisp makes the summation we're doing here even easier.

(setf bla (make-array 3))
(fill-array-somehow)
(setf sum 
      (reduce 
       #'(lambda (p1 p2) (+ p1 p2)) 
       bla))
; The last statement does nothing other than
;      (setf sum (reduce #'+ bla))
; I used the lambda form to show how this is done in general,
; although the Lisp solution is shorter than Smalltalk's only when
; using the short form.

One could complain: "This is unfair, you compare a hand-written
solution in C++ (and Smalltalk) with a Lisp function that is part of
the system library". Nope. There's no built-in function to sum up the
elements of an array in Common Lisp. There is a general operator
(`reduce') that iterates over the elements of a sequence (Common Lisp's
term for collection): a function given as a parameter combines two
elements into one, then that result and the next element, and so on.
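
For readers who don't know Lisp, the folding that `reduce' performs can
be sketched in C++ as follows. This is my own illustration, not library
code: reduce_left, add and larger are hypothetical names, the empty
sequence is simply not handled, and the combining function has to be a
named function defined away from the place where it is used.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: folds a non-empty array from left to right by
// applying `combine` to the running result and the next element.
template <class T, class Combine>
T reduce_left(const T* items, std::size_t count, Combine combine) {
    assert(count > 0);            // sketch only: empty input is not handled
    T result = items[0];
    for (std::size_t i = 1; i < count; ++i)
        result = combine(result, items[i]);
    return result;
}

// Named stand-in for the Lisp form (lambda (p1 p2) (+ p1 p2)).
double add(double p1, double p2) { return p1 + p2; }

// A second combining function, to show that the helper is not tied to sums.
double larger(double p1, double p2) { return p1 > p2 ? p1 : p2; }
```

With an array `bla' of doubles holding 1, 2 and 3, the call
reduce_left(bla, 3, add) folds it to 6 and reduce_left(bla, 3, larger)
folds it to 3. The helper itself is general, but every new combining
step costs a separately written and named function.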

In C++ it is neither possible to define such an operator nor to have
lambda functions with the scope facilities of Common Lisp's. See Henry
G. Baker's excellent paper on OOP iterators
vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
ftp://ftp.netcom.com/pub/hb/hbaker/home.html
"Iterators: Signs of Weakness in Object-Oriented Languages".  ACM OOPS
Messenger 4,3 (July 1993), 18-25.  If your language _requires_
iterators in order to get anything done, your language's control
structures are grossly deficient.  9 pages.  [Iterator.ps.Z]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

C++ strong points:
==================

And now the other side. C++ lacks important features to make these
solutions more general and easier to write. However, the way C++ does
things has advantages, too:

1) Efficiency.

1a) All the C and C++ examples above can be done in constant space. No
generation of objects is required while this summation is running,
not even when starting the summation. If the collection has only 2
elements, but the iteration happens more than once, it may be
significant that Smalltalk creates storage for the block parameter
`x' (at least each time an iteration begins, if not on each single
step).

1b) All function calls may be statically inlined. That means: No
runtime selection of messages is performed and the whole code of the
function may be inlined. In Lisp, static binding in principle is
possible, but not when using anonymous functions (at least no
compiler I know of does this).

2) C++'s way to do this iteration is type-safe: the presence of each
operation for each object it is called on is checked at compile time. OK,
Smalltalk doesn't do this anyway, but imagine a language that is
compile-time typed, but tries to provide blocks as in
Smalltalk. Type-checking for the code inside such a block will be
horrible.



Additionally, I think a 7:1 ratio in productivity is too
high. This will happen when C++ers don't even start with the libraries
that are widely available today. Smalltalkers will always start with a
library of many important types. But today there are libraries for C++
that will enable C++ers to reach the productivity their language
offers. STL and Tools.h++ have been widely available only for a short
time, and the examples posted here are several years old. Those were
the times of Smalltalk libraries ported to C++, which broke C++'s style
of programming. Of course this will be horribly inefficient.

A wrestler doesn't assume his opponent is still 12 years old when he
prepares for a fight :-)


Anyway, we agree that Smalltalk is more productive, leaving it open
for a moment how much.

So why do people use C++ today when they know they will be less
efficient?
================================================================

1) There are places where performance matters very much, and it is
highly application-dependent how great the overhead of Smalltalk or
current Lisp dialects is. GUI programming probably is not such a place
(the GUI overhead is several orders of magnitude greater anyway);
number processing probably is.

2) There are applications where coding is only 1/10 or less of the
total cost. If Smalltalk is 5 times more productive in coding, the end
product will not cost 1/5th of the C++ solution, but still more than
90% of it. That sounds different.
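
The arithmetic behind this point can be written down as a small
function. It is a sketch under one assumption of mine: only the coding
share of the cost gets cheaper, everything else stays as it is. The
name relative_total_cost and its parameters are invented for this
illustration.

```cpp
#include <cassert>

// Total cost after speeding up only the coding part, relative to the
// original total cost (1.0 means "no change").
// coding_fraction: share of the total cost spent on coding (0..1).
// speedup: productivity factor for coding (5 means five times faster).
double relative_total_cost(double coding_fraction, double speedup) {
    assert(coding_fraction >= 0.0 && coding_fraction <= 1.0);
    assert(speedup > 0.0);
    return (1.0 - coding_fraction) + coding_fraction / speedup;
}
```

With coding at 1/10 of the total and a factor of 5, the result is
0.9 + 0.1/5 = 0.92 of the original cost. The total drops to one half
only when coding is about 5/8 of the total cost (0.375 + 0.625/5 = 0.5).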

3) It is my belief that while dynamically typed languages provide
better support for `code compression', statically typed languages may
be more effective with regard to point OOP1 - code reuse. At least in
large teams where people have to be forced as soon as possible to
check their assumptions about the protocols of a given class and where
manpower for typing in straightforward workarounds is not scarce.

Additionally, C++ offers multiple inheritance. You can leave it out in
C++, but you cannot add it to Smalltalk.

4) Syntax, of course. A company might wish to buy a good
library (expensive, because it takes too long to write it) and have some
not-so-bright inhouse developers who don't do more than put
library pieces together. Those people don't have to know all about
C++ and their experience with C can be used. C++ does a good job of
offering OO interfaces that are hidden behind C's old syntax.

5) People don't believe. OK, there are many stupid people and many who
don't care about their own productivity. But there is also a large
number of managers who generally don't trust such claims as `You are 5
times as productive'. They probably heard the same from the next
pizza delivery service that puts some special vegetables on managers'
pizzas :-)

The issue is not so much the good points of a new technique.
Everybody knows that great improvements are being made in the software
industry. But advocates of new techniques generally don't do a good job
when it comes to giving a realistic view of the drawbacks of a new
technique. This is especially true when it comes to the performance of
dynamic languages. I am sorry for all the brave Smalltalkers, but many
Lisp people are guilty of declaring that Lisp can be as fast as C or
whatever, without giving a correct picture of how and when. The result
is that many managers who tried Lisp found the claims to be
untrue. This is the worst that can happen to new technologies, and
people will stick with what they know will work or can understand by
themselves.


Conclusion:
===========

Be careful how you advocate new languages. Of course, you cannot tell a
manager about anonymous functions, parsing macros in Lisp-like
languages and objects sending messages to each other without
artificial constraints. You have to show the results of this. And you
have to be honest about the weak points. You are not as fast as C++.
You use a higher-level language and you have much better performance
than most other higher-level languages; that is the strong point. And
so far you don't use C libraries directly. You integrate them into the
object system of Smalltalk. That sounds better to me, but don't pretend
there is no effort to do this.



P.S. (technical):

C++ is a lower-level language than Smalltalk, from a certain point
of view. This opens C++ up to solutions that implement some
functionality on top of it.

C++'s problem with anonymous functions might be solvable by a
not-too-complex preprocessor that enables a lambda-function mechanism
like that of Lisp in C++. It could make the right scoping possible by
simply moving the source for the function into the right place in the
source code, without losing any of the advantages of C++ (OK, the standard
libraries will use a different style).
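
To make the idea a bit more concrete, here is a hand-written sketch of
what such a preprocessor might emit for a block like the Smalltalk one
in the summation example. Everything in it is an assumption of mine:
FixedCollection, do_each, GeneratedBlock1 and sum_with_block are
invented names, and no such preprocessor is implied to exist.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical collection with internal iteration, similar in spirit to
// Smalltalk's do:.
template <class T, std::size_t N>
class FixedCollection {
public:
    FixedCollection() : size_(0) {}
    void add(const T& value) { assert(size_ < N); items_[size_++] = value; }

    // Calls block(x) for every stored element, in insertion order.
    template <class Block>
    void do_each(Block& block) const {
        for (std::size_t i = 0; i < size_; ++i)
            block(items_[i]);
    }
private:
    T items_[N];
    std::size_t size_;
};

// What the preprocessor might generate for [ :x | sum := sum + x ]:
// a named class, moved out of the enclosing function, that holds a
// reference to the one variable the block body uses from that scope.
class GeneratedBlock1 {
public:
    explicit GeneratedBlock1(double& sum) : sum_(sum) {}
    void operator()(double x) { sum_ += x; }
private:
    double& sum_;
};

// The function in which the programmer wrote the block.
double sum_with_block(const FixedCollection<double, 8>& foo) {
    double sum = 0.0;
    GeneratedBlock1 block(sum);   // stands where the block was written
    foo.do_each(block);
    return sum;
}
```

Since do_each is a template and the block class is known at compile
time, the call to operator() can still be bound statically and inlined,
so the efficiency argument for C++ is not given up.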


P.S. (reading):

There was an article about `code compression' by Richard Gabriel in
Journal of OO Programming Jan 1993. `Code compression' means something
different there. It is worth reading.

In July/August 1993 there was `The last programming language' or
so. It is also very interesting, and can be seen as an optimistic view
for Smalltalkers compared to C++.

And don't forget Henry Baker's paper (see above) for the technical
side. 
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <cracauer@wavehh.hanse.de> Fax +49 40 522 85 36
