Newsgroups: comp.lang.scheme
From: blume@dynamic.cs.princeton.edu (Matthias Blume)
Subject: Scheme top-level [Was: Re: redefinition of builtin procedures]
In-Reply-To: tammet@cs.chalmers.se's message of 29 Nov 1994 14:52:10 GMT
Message-ID: <BLUME.94Nov29110158@dynamic.cs.princeton.edu>
Originator: news@hedgehog.Princeton.EDU
Sender: news@Princeton.EDU (USENET News System)
Nntp-Posting-Host: dynamic.cs.princeton.edu
Organization: Princeton University
References: <BLUME.94Nov28101600@dynamic.cs.princeton.edu> <6q1Jwc2w165w@sytex.com>
	<3bff6q$q9n@nyheter.chalmers.se>
Date: Tue, 29 Nov 1994 16:01:58 GMT
Lines: 156


Let me first say a few words about Tanel's message before I (once again)
describe how I see the world...

In article <3bff6q$q9n@nyheter.chalmers.se> tammet@cs.chalmers.se (Tanel Tammet) writes:

   I am all for having the implementations support top level (preferably
   lispish and not ML-ish top level for my taste), which is a great
   tool for program development and debugging.

   However, as people often indicate, the following three features:

	     top level, eval(optional), load 

   make it impossible for the compiler to analyse source and sort out
   primitives or other procedures which are not redefined, thus losing
   lots of optimizations absolutely crucial for efficiency.

Not only that -- in the presence of macros they can lead to surprising
inconsistencies.

   The eval and load are separate cases in the sense that when
   they are _not_ used in the source code, the analyser can detect
   that.

But unless you type everything directly at the top-level or use some
non-standard feature to link separate parts of the program together
you *have to* use LOAD.  (IMO, you never need EVAL, but that's another
thread.)

   However, even in case they are used, the programmer may know that the
   primitives are still never redefined; this cannot generally be found by
   automatic analysis. 

   Obviously, the analyser cannot detect whether the top level is going to
   be used and whether primitives may be redefined.

   It appears that in case we want efficiency from the compiled code,
   we must either

   1) throw away or redefine (in ML-style) the top level, avoid using
     eval and load.

If you redefine the top-level the right way then LOAD isn't in
conflict.  Witness SML/NJ's `use'.

   2) use a declaration, telling the compiler that no redefinitions 
     will take place, no matter what.

But what if you tell the compiler that no redefinition will occur, and
then - in the next line (or maybe *much* later) - you redefine the
name anyway?  Should this be an error?  Should the code before the
redefinition use the old definition while future code uses the new
one?  Or maybe we should ignore all subsequent redefinitions after
such a declaration? ...

It is precisely this grey zone I'm talking about.

   Overall, the second solution is less restrictive. Essentially,
   we lose nothing (compared to the first solution).

I disagree.  We lose consistency.

   IMHO there is
   also no need to change the language in any way to implement (2),
   since any particular compiler could well implement the declaration
   syntax any way preferred (as they do). 

But this is a bad thing.  It spells `source code incompatibility'.

   The declaration(s) is (are) not really a Scheme language issue at all.
   They concern properties of particular programs.

A programming language is concerned with (and only with) properties of
programs.  This is what a programming language is all about.

   IMHO there is nothing
   wrong (rather I'd say the opposite) with allowing the programmer
   to provide information about properties of the program to the compiler.

Indeed.  And I wish that *all* the information one wants to provide to
the compiler were expressible *in the language (Scheme) itself*.

---

I have to admit that I have a very hard time understanding the strong
resistance against a cleaned-up top-level behavior.

First of all I fail to see what we would lose.
Correct me if I'm wrong -- but isn't it the case that in most
implementations of Scheme it is either illegal to say

	(define and 1)

or this definition does not have any effect on *past* uses of AND.
All I want to do is generalize this behavior -- make it the rule.
This does not take away the ability to change the value of top-level
variables after their respective definitions.  You can still say

	(define (foo ...) ...)
	... use foo, find it broken ...
	(set! foo (lambda (...) ...))
	... test some more ...

SET! doesn't introduce a new definition (binding).  I do not consider
it necessary to declare user-defined top-level procedures or other
values as `constant' or `integrable'.  This sort of stuff can be done
in a lexically scoped entity (module!), where the compiler can figure
it out by itself.  In this sense I agree -- the interactive top-level
is for testing and experimenting -- something which doesn't demand
optimizer heroics.
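
To make the SET!-versus-DEFINE distinction concrete, here is a rough
sketch in which nested LET forms stand in for the proposed top-level
(each LET plays the role of one DEFINE).  The names USE-FOO, WITH-SET!
and WITH-NEW-BINDING and the procedure bodies are made up for
illustration only; this models the idea and is not how any particular
implementation behaves.

```scheme
;; Sketch only: nested LET stands in for the proposed top-level, where
;; every DEFINE would open a fresh binding.  All names are hypothetical.

;; Case 1: SET! changes the existing location, so the earlier user of
;; FOO sees the repaired procedure.
(define with-set!
  (let ((foo (lambda (x) (+ x 1))))
    (let ((use-foo (lambda (x) (foo x))))
      (set! foo (lambda (x) (+ x 2)))
      (use-foo 1))))                      ; with-set! => 3

;; Case 2: a new binding of FOO (what a second DEFINE would mean)
;; leaves the earlier user of FOO untouched.
(define with-new-binding
  (let ((foo (lambda (x) (+ x 1))))
    (let ((use-foo (lambda (x) (foo x))))
      (let ((foo (lambda (x) (+ x 2))))
        (use-foo 1)))))                   ; with-new-binding => 2
```

In the first case the fix reaches existing callers, which is what one
wants while debugging; in the second, old code keeps the binding it
was written against.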

As I pointed out above, most implementations already behave the way I
want them to for macros and built-in syntax.  Let's extend this to
DEFINE!  Every top-level DEFINE introduces a fresh binding to a fresh
location.  Past uses of the same variable use the old binding.  Future
uses use the new binding.  Since we have SET! this doesn't take away
any functionality, but it adds consistency and SAFETY:

It is now possible for a piece of code to use a top-level variable
without worrying about re-definitions and inadvertent re-uses of the
same variable by unrelated pieces of code (presumably written by
somebody else).  It makes it unnecessary to invent peculiar
identifiers for global variables to avoid name clashes (a practice
that must be called dubious at best and downright dangerous on
average, because everybody seems to use the same scheme for making up
peculiar names like *OBJECTS* and so on).  One can simply write

	(define objects '())

	(define (foo ...) ... objects ...)

	(define (bar ...) ... objects ...)

	...

and be sure that both FOO and BAR refer to the correct OBJECTS
variable.  With this one can even close the scope by adding a last
line of

	(define objects 'im-done-with-it)

This way it is not even possible for somebody else to see (and
therefore modify) the OBJECTS variable used by FOO and BAR.  (Note
that I do not advocate this style -- I would always use a `real' module
to achieve protection and information hiding.)
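
The scope-closing trick above can be modeled in today's Scheme with
nested LET, again treating each LET as one top-level DEFINE under the
proposed rule.  The bodies given to FOO and BAR here, and the name
RESULT, are invented for the sake of the example.

```scheme
;; Sketch only: nested LET models the proposed rule that each top-level
;; DEFINE creates a fresh binding.  The bodies of FOO and BAR and the
;; name RESULT are hypothetical.
(define result
  (let ((objects '()))
    (let ((foo (lambda (x) (set! objects (cons x objects))))
          (bar (lambda () (length objects))))
      (let ((objects 'im-done-with-it))   ; the "closing" definition
        (foo 'a)
        (foo 'b)
        ;; FOO and BAR still share the original OBJECTS; code written
        ;; after the closing definition sees only the new binding.
        (list (bar) objects)))))          ; result => (2 im-done-with-it)
```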

The bottom line is:  R4RS-style top-levels are broken.  Most
implementations already more or less behave the way I would like them
to behave for non-variable definitions (constants, primitives, macros,
built-in syntax).  There is nothing we would lose by extending this
behavior to all definitions, thereby making things consistent.  We can
only win, and I believe we would.

--
-Matthias
