Okay, so I've fleshed out and changed the syntax a bit from my earlier
note, and recast examples from the first few chapters of Paulson's book
into this syntax.  Parts of it still seem less than ideal to me, and some
of the examples use syntax that I'm not sure should actually be
implemented.  You should look at

    http://www.cs.cmu.edu/~kw/paulson.txt
    http://www.cs.cmu.edu/~kw/mlnext.txt

for examples, maybe before reading this.

Parsing
~~~~~~~

The basic grammar is the same for terms, types, and patterns, which
simplifies parsing, and (I think) promotes a pleasing regularity, but
maybe that's my Elf bias showing.  Not every term construct is legal as a
pattern or type, of course (and vice versa), and some keywords have
different (but related) meanings at the three levels.

Lexical syntax
..............

The lexical syntax distinguishes "proper" from "common" identifiers.  You
should think (roughly):

    Proper                          Common

    recursive let name              non-recursive let val
    never bound by patterns         always bound by patterns
    stands for possibly             stands for pure value
      effectful computation
    includes datatype and           includes type variables
      constructor names
    can be symbolic                 never symbolic

I usually call a "proper" identifier a NAME and a "common" identifier a
VARIABLE.

The characters allowed in alphabetic identifiers are [A-Za-z0-9.?!_'],
and an alphabetic identifier is "proper" if it starts with [A-Z0-9.?!].

** QUESTION: This is exactly the opposite of the Prolog world (but
** it is the same convention as the Mathematica world).  Is there
** anyone who would argue for the other way?

One exception: identifiers that start with "'" must obey the syntax for
character literals.

A very few tokens that would otherwise be "common" identifiers are
reserved keywords instead:

    of  if  with  where  fn  raise  _

A few symbolic identifiers (see below) are also reserved keywords:

    ( ) { ; } ` , | := == -> => && ||

Numeric, character, and string literals are thought of as proper
identifiers that are present in the initial environment and can never be
shadowed.  If a proper identifier looks like a numeric literal, then it
is; otherwise it's not.  (So "1st" is legal and is not a numeric literal
or anything like that.)  If an identifier looks qualified (e.g.
"Module.Thing") then it is.  But if it uses periods in such a way that it
doesn't look qualified, then it isn't (the most important examples being
the base case "." for lists, and floating point literals like "3.14159").

** QUESTION: how to fit qualified symbolic identifiers into this model?
** They are so rare that they shouldn't influence the rest of the design,
** but in cases where they're unavoidable there should be some hack.

The double quote can only be used to form string literals (see below).
"Special" characters are [][)(}{;,`] and always form a symbolic
identifier (actually all but [ ] are reserved keywords) by themselves.
The hash "#" is for comments and pragmas, in Elf style (except that Elf
uses "%" instead).  Any other printable ASCII character is a symbolic
identifier constituent (so "~&$<>%%" is a legal symbolic identifier), and
symbolic identifiers are always "proper".
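To summarize the classification rule for alphabetic identifiers, here is
a minimal Haskell sketch (the function name, and the decision to classify
by first character only, are mine, not part of the design):

    import Data.Char (isUpper, isDigit)

    -- Hedged sketch of the lexer rule above: an alphabetic identifier
    -- is "proper" iff its first character is in [A-Z0-9.?!].
    isProper :: String -> Bool
    isProper (c:_) = isUpper c || isDigit c || c `elem` ".?!"
    isProper []    = False

    -- isProper "Foo" == True      isProper "foo" == False
    -- isProper "1st" == True      isProper "_x"  == False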
Alphabetic identifiers are never infix or prefix.  Symbolic identifiers
always are.  A use of a symbolic identifier is infix if it looks infix,
or prefix if it looks prefix:

    - 1         Prefix "-"
    1 - 2       Infix "-"

The prefix and infix forms of a symbolic identifier are completely
independent of each other; they can have different types, and different
bindings in the environment (as indeed the two forms of "-" generally do
in all sane programming languages).  If you must destroy the fixity of a
symbolic identifier you can do it by saying e.g.

    (f -> f 1) (#prefix -)      Same as "- 1"
    (f -> f 1 2) (#infix -)     Same as "1 - 2"

There is no way of declaring operator precedences or associativities.
The infix form of a symbolic identifier is right-associative if it ends
with ":", or left-associative otherwise.  The simplest case ":" itself is
the usual right-associative list constructor.

** ASIDE: This can actually be seen as a special case of a convention
** going all the way back to the Principia of Russell and Whitehead (I
** think--I saw it first in Church's paper introducing his simple theory
** of types)

Prefix operators have the highest precedence, followed by left infix
operators, then right infix operators.

** QUESTION: We are giving up the ability to write "1 + 2 * 3" instead of
** "1 + (2 * 3)", and in exchange we don't have to deal with precedence
** declarations--the parsing of an expression is independent of its
** environment.  Is this trade worth it?

Keywords that happen to look like symbolic identifiers (e.g. "->" "&&"
etc.) NEVER have the same precedence as operators; it's always lower.
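To make the consequences concrete, here are a few parses under these
rules (the operators other than ":" are ordinary symbolic identifiers
picked for illustration):

    1 + 2 * 3       parses as   (1 + 2) * 3    # left infix ops share one precedence
    - x + y         parses as   (- x) + y      # prefix binds tightest
    a * b : c       parses as   (a * b) : c    # left infix binds tighter than right
    a : b : c       parses as   a : (b : c)    # ops ending in ":" are right infix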
Any "applicative expression" (see grammar below) can be treated as an
infix identifier by surrounding it with ``:

    x `Elem` set
    foo `Xyzzy 1` bar

These are always left-associative, and have the same precedence as other
left-associative symbolic identifiers.

Grammar
.......

It would be best to read the examples from Paulson and only then look at
the formal grammar (if ever).  The atomic expressions are:

    exp0 ::= VARIABLE
             NAME
             INTEGER
             STRING
             CHAR
             FLOAT
             fn
             _
             ( exp )
             { funrules }       General lambda abstraction
             { exp }            Simple lambda abstraction

The rest of the core expression grammar goes (from tightest to loosest
binding):

    Applicative    exp1 ::= exp1 exp0 | exp0
    Prefix op      exp2 ::= PREFIX_OP exp2 | exp1
    Left op        exp3 ::= exp3 INFIX_LEFT exp2 | exp2
    Right op       exp4 ::= exp3 INFIX_RIGHT exp4 | exp3
    Conjunction    exp5 ::= exp5 && exp4 | exp4
    Disjunction    exp6 ::= exp6 || exp5 | exp5
    Comma          exp7 ::= exp7 , exp6 | exp6
    Simple rule    exp8 ::= exp7 -> exp | exp7 => exp | exp7

Note that the ambiguity in -> and => is resolved by making them extend as
far to the right as possible, as usual.  Finally, some constructs of
least possible precedence:

    exp ::= raise exp                   Exception
            exp8 if exp8 | exp          Conditional
            exp8 with { withrules }     Case analysis
            exp8 with withrule          ditto
            exp8 where { decls }        Local binding
            exp8 where decl             ditto
            exp8

Again where there's ambiguity, these extend as far to the right as
possible.  The "funrules" and "withrules" differ only in how the patterns
on the left-hand sides are interpreted, not syntactically.  Their common
syntax is

    rules ::= ; rules
              rule rules

    rule ::= exp7 -> exp

The "decls" are also what constitute a top-level program.

    decls ::= ; decls
              decl decls

    decl ::= exp
             exp7 of exp
             exp7 == exp
             exp7 := exp
             type exp7 == exp
             data exp7 == { condecs }
             data exp7 == condec

A "condec" differs from a "decl" only in how it's interpreted, not
syntactically.  The words "type" and "data" are not reserved, but this
does not introduce any ambiguity (exercise for the reader: why?)

Informal semantics
~~~~~~~~~~~~~~~~~~

Proper names
............

Proper identifiers stand for possibly effectful computations (though
quite often they'll actually stand for values).  As in Haskell, the scope
of a proper identifier is the entire block of declarations in which it
occurs (so they're implicitly mutually recursive) as well as the scope of
the block itself.  So in the following

    Foo where { Bar == (x -> Foo x); Foo == (x -> Bar x) }

all three Foos are mentions of the same identifier.

The types of proper identifiers are declared with "of", and the
identifiers themselves are defined with "==".  The definition may be by
several cases, e.g.

    Foo Int of Int;
    Foo 1 == 3;
    Foo 2 == Foo 1;
    Foo x == raise Boggle;

but all cases must be contiguous, with no other declarations intervening,
any type declarations for Foo must come before all of them, and all the
patterns on the left of == must have the same number of arguments.

Common names
............

By contrast, common identifiers stand for values.  All common identifiers
are introduced by binding them in a pattern.  The scope of a common
identifier extends to the end of the rule (if the pattern introducing it
is the left-hand side of a rule or ==) or the rest of the block (if the
pattern introducing it is the left-hand side of :=) and the scope of the
block itself.  So in

    Foo x y Zee where { x := Factorial 100; y := x + 1; Zee a == a + x }

all the occurrences of x are mentions of the same identifier.

Anonymous functions
...................

Function definitions can be written in many different ways.  The type of
a function can always be written in a way that makes it look like the
function itself (except that "->" becomes "=>" and "==" becomes "of").

The general form for anonymous functions is exemplified by

    Map == {
      fn f Nil -> Nil;
      fn f (Cons x xs) -> Cons (f x) (fn f xs);
    };

Each rule is one case of the definition.  The pattern on the left, and
the expression on the right, use the keyword "fn" to stand for the
anonymous function itself (and "fn" can be used just like any other
identifier could be), so for instance we allow

    Map == {
      f `fn` Nil -> Nil;
      f `fn` Cons x xs -> Cons (f x) (f `fn` xs);
    };

and anonymous functions can be recursive.  A "fn" on the right-hand side
refers recursively to the nearest enclosing anonymous function.  It is an
error to use "fn" in an expression that's not nested inside an anonymous
function.

When multiple == bindings are used to define a name, the set of bindings
can be transformed into an anonymous binding by replacing each occurrence
of the name, on the left of ==, with "fn", and replacing "==" with "->".
So for instance we transform

    Foo 1 == 3;
    Foo 2 == Foo 1;
    Foo x == raise Boggle;

into

    Foo == {
      fn 1 -> 3;
      fn 2 -> Foo 1;
      fn x -> raise Boggle;
    };

Each rule in an anonymous function must apply "fn" to the same number of
arguments.  If there is just one argument, then the "fn" can be dropped:

    Foo == {
      1 -> 3;
      2 -> Foo 1;
      x -> raise Boggle;
    };

Either all "fn"s are dropped, or none are.  If the function has only one
rule, the outer braces can be dropped.  (More generally, braces are
always used to enclose semicolon-separated things.  If there's only one
thing in the braces, they can be dropped.)

** QUESTION: In the Paulson examples I also used the shorthand form
** {_+1} or {1+_} to stand for sectioning, which breaks this rule
** slightly (we need the braces to see where _ is implicitly bound).
** It's also useful so that we can say {Foo bar baz} where SML would
** force us to say (fn () => Foo bar baz), when playing with lazy
** computations.  Is this special syntax worth it?  (You'll note I did
** include it in the grammar above).
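Since "fn" names the nearest enclosing anonymous function, it is
essentially an anonymous fixpoint.  For comparison, here is a minimal
Haskell sketch of the Map example where the role of "fn" is played by an
explicit fix (the transcription onto ordinary Haskell lists is mine):

    import Data.Function (fix)

    -- "fn" made explicit: fix feeds the whole anonymous function to
    -- itself under the name fn, just as the note's "fn" keyword does.
    mapL :: (a -> b) -> [a] -> [b]
    mapL = fix (\fn f lst -> case lst of
                  []     -> []
                  x : xs -> f x : fn f xs)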
Value bindings and case analysis
................................

Patterns are also used in some other places, where they don't stand for
calls to an anonymous function.  In these cases, the "fn" is NEVER
allowed.

In any context, a "pattern" such as "x y" in

    { x y -> 3 };
    x y := 3;

is bogus.  Only "fn" or a constructor name can be applied to arguments in
a pattern.  (But see the discussion of string constants, below.)
Patterns must match the production "exp7" in the grammar.

Patterns are also used to bind value (common) variables using ":=":

    pi := 3.14159;
    (x, y) := Unzip mylist;

and in the rules of a "with" (case analysis):

    mylist with { Nil -> 0; Cons x xs -> 1 }

(which is "case mylist of Nil => 0 | Cons(x,xs) => 1" in SML)

Function types
..............

The type of a function, in general, is given by a "rule" in which ->
becomes =>:

    Add3 of (fn Int Int Int => Int);
    Add3 == (fn x y z -> x + y + z);

All the same conventions apply.  The "fn" can be dropped if there's only
one argument:

    Increment of (Int => Int);
    Increment == (x -> x + 1);

And you can use the same form for type declarations as for Haskell-style
definitions:

    Add3 Int Int Int of Int;
    Add3 x y z == x + y + z;

Type declarations for values must never use that form.  But the following
is okay:

    x of Float;
    x := 3.14159;

The rule that -> and => extend as far to the right as possible leads to
the natural thing for curried functions (combined with the convention
that the braces surrounding an anonymous function can be dropped if
there's only one rule):

    Add3 of Int => Int => Int => Int;
    Add3 == x -> y -> z -> x + y + z;

Tuples
......

Tuples are given by comma-separated expressions.  The type of a tuple
looks like the tuple itself:

    Diagonal Int of (Int, Int);
    Diagonal x == (x, x);

A tuple need not be surrounded by parentheses.  Commas don't associate at
all, so (1, 2, 3) and ((1, 2), 3) and (1, (2, 3)) are all different.

Type definitions
................

A type (or type constructor) definition is introduced by the keyword
"type":

    type IntPair == (Int, Int);
    type Pair3 a == (a, a, a);
    type CPS a answer == (a => answer) => answer;

Constructor-level functions are given by "->", but an anonymous
constructor-level function must have only one rule, and obviously it
can't use non-trivial pattern matching:

    type Pair3 == a -> (a, a, a);
    type CPS == fn a answer -> ((a => answer) => answer);

All the other conventions for the syntax of functions are the same for
constructor-level functions, though:

    # sick and twisted
    type CPS == a `fn` answer -> ((a => answer) => answer);

Datatypes
.........

A datatype is introduced by the keyword "data":

    data Bool == { True; False };
    data List a == { Nil; Cons a (List a) };
    data Sum a b == { Left a; Right b };

In more advanced systems (which I haven't thought much about), it is
necessary to be able to specify the return type of a constructor:

    data GADT a == { GInt Int of GADT Int; GBool Bool of GADT Bool };
    data NList a n == { Nil of NList a 0; Cons a (NList a n) of NList a (n+1) };

When the return types aren't specified, they default to the type on the
left of the ==.
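As an aside, GHC Haskell's GADT syntax can express the same two
declarations; a hedged transcription (the length index uses a unary Nat
kind in place of the note's n+1, and constructor names are adjusted to
avoid clashes):

    {-# LANGUAGE GADTs, DataKinds, KindSignatures #-}

    data G a where            -- the note's "data GADT a"
      GInt  :: Int  -> G Int
      GBool :: Bool -> G Bool

    data Nat = Z | S Nat      -- unary naturals for the length index

    data NList a (n :: Nat) where
      NNil  :: NList a 'Z
      NCons :: a -> NList a n -> NList a ('S n)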
So the earlier datatype declarations, more explicitly, look like:

    data Bool == { True of Bool; False of Bool };
    data List a == { Nil of List a; Cons a (List a) of List a };
    data Sum a b == { Left a of Sum a b; Right b of Sum a b };

All the same conventions apply for constructor declarations as for any
other declaration, so the following is fine:

    data List a == { Nil of List a; Cons of a => List a => List a };
    data List a == { Nil of List a; Cons of fn a (List a) => List a };

All datatypes are subtypes of a common type Data.  This replaces ML's
"exn".  Once syntax for exception handlers is introduced, it will just
involve case analysis on type Data.

** QUESTION: Obviously this will only make sense if the subtyping for
** datatypes makes sense.  Also there's the question of representation.
** I think having a little flexibility in subtyping datatypes is valuable
** enough that any penalties in the bit-banging of the representations
** are worth it.  But someone who knows should evaluate this.

Existentials in datatypes
.........................

First of all, with the syntax for datatypes discussed above, we already
have existential types, if the binding of type variables is set up the
right way.  I want to think of constructor declarations as always being
in the more explicit form given above.  In that case, each constructor
declaration's free type variables are local to that declaration.  We can
alpha-convert them, and erase the variables on the outside.  For
instance, we could expand

    data List a == { Nil; Cons a (List a) }

into

    data List a == { Nil of List a; Cons a (List a) of List a }

then, since now the full type of each constructor is explicit, erase the
type variable on the outside, and alpha-convert the type variables in the
constructor declarations any way we want:

    data List _ == { Nil of List b; Cons c (List c) of List c }

(We wouldn't have been able to do that, of course, before making the type
of each constructor fully explicit, because

    data List a == { Nil; Cons c (List c) }

obviously doesn't make sense.)

Anyway, if we want an existential type, in the fully explicit form, we
can just introduce more type variables that don't occur in the result
type of a constructor.  For instance, my MinML interpreter represents
functions in CPS style.  The final answer type of the CPS monad is
unknown.  We can capture it with an existential type.  Then the datatype
of values in my MinML interpreter, in fully explicit form, looks like

    data Value == {
      VBool Bool of Value;
      VInt Int of Value;
      VPair Value Value of Value;
      VFun (Value => (Value => a) => a) of Value;
      # et cetera
    };

Since the "a" of the VFun constructor doesn't appear in the constructor's
result type, it's effectively an existential.  Here we're making use of
the equivalence (this is not ML syntax)

    all a. (Foo a => Bar)  =~=  (exists a. Foo a) => Bar

Now that we've seen the fully explicit form, we can go back to the
lighter form, but it still means the same thing:

    data Value == {
      VBool Bool;
      VInt Int;
      VPair Value Value;
      VFun (Value => (Value => a) => a);
      # et cetera
    };
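For comparison, GHC Haskell supports the same idiom, except that the
quantifier has to be written out on the constructor; a hedged sketch:

    {-# LANGUAGE ExistentialQuantification #-}

    -- The note's Value type in GHC Haskell.  The forall on VFun is the
    -- existential: a does not appear in the result type Value.
    data Value
      = VBool Bool
      | VInt Int
      | VPair Value Value
      | forall a. VFun (Value -> (Value -> a) -> a)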
Sometimes there would be a mixture of type variables; this also makes
sense.  For example, we might want to parameterize Value over the
representation of integers.  Then in fully explicit form we'd have

    data Value intrep == {
      VBool Bool of Value intrep;
      VInt intrep of Value intrep;
      VPair (Value intrep) (Value intrep) of Value intrep;
      VFun (Value intrep => (Value intrep => a) => a) of Value intrep;
      # et cetera
    };

In fully explicit form, we can alpha-convert freely, so the above is
exactly the same as

    data Value _ == {
      VBool Bool of Value alice;
      VInt bob of Value bob;
      VPair (Value carol) (Value carol) of Value carol;
      VFun (Value dave => (Value dave => evelyn) => evelyn) of Value dave;
      # et cetera
    };

Or we can go back to the lighter form

    data Value intrep == {
      VBool Bool;
      VInt intrep;
      VPair (Value intrep) (Value intrep);
      VFun (Value intrep => (Value intrep => a) => a);
      # et cetera
    };

but now of course we can't alpha-convert quite as freely, since the name
"intrep" on the left-hand side of "==" is getting copied implicitly into
each constructor declaration as its result type.

Explicit universals and existentials
....................................

I think the syntax for these should be fairly light.  In a lot of type
theories that have universals, we think of them as being given by a
higher-order constructor function (or maybe one of them for each kind).
So the type of the polymorphic identity function looks like

    all a. (a => a)  =~=  All (a -> (a => a))

where

    All of (Type => Type) => Type;

I want to leverage this idea but make it syntactically simpler.  So we
say that when "->" is used at constructor level, sometimes the
constructor variables being abstracted over are implicitly universal or
existential.  The idea is that

    a -> !b -> ?c -> !d -> Foo a !b ?c !d

implicitly means

    a -> All (b -> Exists (c -> All (d -> Foo a b c d)))

or something like that.  (Here my Isabelle history is showing.)  I have
to work out the details.  You could also say

    fn a !b ?c !d -> Foo a !b ?c !d

as usual.  But anyway, the earlier implicit existential in the Value
datatype could then be made explicit by

    data Value == {
      # ...
      VFun (?a -> Value => (Value => ?a) => ?a);
    };

Another example is well-typed self-application:

    SelfApp of (!a -> !a => !a) => (!a -> !a => !a);
    SelfApp x == x x;

(This is in the model of polymorphism based on subtyping, not explicit
instantiation; i.e. Curry style.)

Another example is the type of the Filter function on dependently refined
lists, which needs an existential:

    data List a n == { Nil of List a 0; Cons of a => List a n => List a (n+1) };

    Filter of (a => Bool) => List a n => ?m -> List a ?m suchthat ?m <= n;
    Filter pred == {
      Nil -> Nil;
      Cons x xs -> Cons x (fn xs) if pred x | fn xs;
    };
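For comparison, the self-application example can be written in GHC
Haskell with explicit foralls, wrapping the polymorphic type in a newtype
to sidestep impredicative instantiation (a hedged sketch; the wrapper
name is mine):

    {-# LANGUAGE RankNTypes #-}

    -- all a. (a => a), wrapped so it can be passed around first-class
    newtype Poly = Poly (forall a. a -> a)

    -- well-typed self-application: f's type variable instantiates to Poly
    selfApp :: Poly -> Poly
    selfApp p@(Poly f) = f p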
Lazy datatypes
..............

TODO.  Use Okasaki/Wadler even style.  Declaring a datatype always
implicitly declares its lazy counterpart.  Lazy expressions and functions
must have a lazy datatype for their types.

Lists
.....

The standard list syntax is given by

    data List a == { . ; a : List a };

Strings
.......

The type String is just a synonym for List Char.  A string is just the
list of its Unicode codepoints (Char includes all Unicode codepoints, not
just what the Unicode standard calls a "character"... noncharacters are
useful for sentinel values).

** QUESTION: I currently think that strings should just be lists.  But
** can anyone think of counter-arguments?
**
** As far as representation, I think we can afford, in 2005, to be a
** little thunk-happy.  One simple idea: a List is either a machine
** word uniquely identifying the nil (.) constructor, a pointer to a
** tagged pair for the cons (:) constructor, or a pointer to a thunk
** (tagged as such).  So string constants can be stored in UTF-8, or
** whatever, and the code pointed to by the thunk can actually read
** out the individual characters.

The character "#" is special in string literals.  The type of a string
literal is String if it contains no unescaped "#"s.  If it contains one
"#", then its type is (String => String).  If it contains two "#"s, then
its type is (String => String => String), et cetera.  So we can write
e.g.

    Print ("To # or # to be..." "be" "not")

This would obviously be more powerful in the presence of Haskell-style
type classes, in which case we'd want "To # or # to be..." to have a type
like (Showable => Showable => String) or whatever.

A further generalization would be to allow pattern matching in the same
style:

    Foo ("# + #" s1 s2) == "add # #" s1 s2;

Here I think we'd want to generalize the syntax to allow for regular
expressions.

** QUESTION: If we allow regular expression matching, how do we avoid
** or finesse the issue of multiple possible matches?
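The forward (non-pattern) direction of the "#" rule can be simulated
today with Haskell type classes, which also suggests why the type-class
generalization is natural; a hedged sketch (the names Splice and fill are
mine):

    {-# LANGUAGE FlexibleInstances #-}

    -- fill the first '#' hole in a template with a string
    fill :: String -> String -> String
    fill ('#':rest) s = s ++ rest
    fill (c:rest)   s = c : fill rest s
    fill []         _ = []

    -- a template with n holes acts as an n-argument function
    class Splice r where
      splice :: String -> r

    instance Splice String where
      splice = id

    instance Splice r => Splice (String -> r) where
      splice tmpl s = splice (fill tmpl s)

    -- splice "To # or # to be..." "be" "not" :: String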
Vectors and arrays
..................

** RANT:
**
** In general, there should be no FORCED distinction between "vectors"
** with constant time access and "lists" (you can always use type
** abstraction to CREATE such a distinction if you wish).  Any list can
** be fed to a primitive function
**
**     Pack of List a => List a
**
** that is semantically indistinguishable from the identity function, but
** the list you get back from Pack is "suspiciously fast", in that the
** operation
**
**     Nth of List a => Int => a
**
** will execute in constant time.  The function
**
**     Tabulate of Int => (Int => a) => List a
**
** also creates "suspiciously fast" lists, and in fact Pack can be
** defined in terms of it.
**
** In fact, I might go so far as to eliminate the array types, and go
** with (Ref (List a)) instead.  As long as you use a list in a "linear"
** way, thinking of the primitive
**
**     Update of List a => Int => a => List a
**
** as linear, as long as you never use the first argument again, then the
** underlying vector, if it is a vector, should be destructively
** modified, and the update should happen in constant time.  The
** linearity check could be static or dynamic.  If you're worried that
** the linearity check would be fragile, you can always use type
** abstraction to create an Array a implemented by Ref (List a), where
** all the operations exposed by the abstract type on Array a have
** implementations that are statically linear, and so obviously so that
** even a rather stupid compiler will be able to see that.

Conjunction and disjunction
...........................

As usual, we want these to be short-circuiting, so they need to be
special forms.  As a side benefit, we get to choose the meaning of the
special forms by analogy at the level of types and patterns.  For types,
obviously && should be intersection and || should be union.  For
patterns, && is simultaneous matching and || is alternative matching.

Simultaneous matching generalizes the "as" construct of SML, with the
bonus that you don't have to remember which way it goes (always a problem
for me), because && is commutative:

    Foo (list && Cons x xs) == Cons x list;
    Foo (Cons x xs && list) == Cons x list;

    Primary? (Red || Green || Blue) == True;
    Primary? _ == False;

Numbers
.......

All number types are subtypes of a common type Number.  A Number is one
of the following:

    1. A positive or negative infinite-precision rational.
    2. Positive or negative zero.
    3. Positive or negative infinity.
    4. An IEEE NaN.

This makes it possible to coerce any IEEE floating type to Number without
loss.  (And the infinities are useful anyway, for the same reason that
they are in the IEEE standard.  We can always carve out another subtype
of Number that disallows them, if we want.)

As usual, the basic equality on Number must be transitive, and a
congruence.  So it must distinguish the zeroes, et cetera.  No one is
stopping us from defining other, less sharp equalities.  No one is
stopping us from carving out a subtype of Number not containing the
infinities, in which case the distinction between positive and negative
zero is unobservable (because the special reciprocal operation mapping
this subtype to itself will raise an exception when given a zero rather
than returning an infinity).  No one is stopping us from defining a
subtype Integer of Number containing only the positive zero (and a lossy
operation mapping the Number -0 to the Integer +0).

** QUESTION: Complexes?  (See Kahan's article on the sign of zero.)

... more to come