A function may bind more than one argument by using a pattern, rather than a variable, in the argument position. Function expressions may have the form

`fn`

pat`=>`

exp

where *pat* is a pattern and *exp* is an expression. Application of
such a function proceeds much as before, except that the argument value is matched against
the parameter pattern to determine the bindings of zero or more variables, which are then
used during the evaluation of the body of the function.

For example, we may make the following definition of the Euclidean distance function:

`val dist : real * real -> real = fn (x:real, y:real) => sqrt (x*x + y*y)`

This function may then be applied to a pair (two-tuple!) of arguments to yield the
distance between them. For example, `dist (2.0,3.0)`

evaluates to
(approximately) `4.0`

.

Using `fun`

notation, the distance function may be defined more concisely as
follows:

`fun dist (x:real, y:real):real = sqrt (x*x + y*y)`

The meaning is the same as the more verbose `val`

binding given earlier.

Keyword parameter passing is supported through the use of record patterns. For example, we may define the distance function using keyword parameters as follows:

`fun dist' {x=x:real, y=y:real} = sqrt (x*x + y*y)`

The expression `dist' {x=2.0,y=3.0}`

invokes this function with the
indicated x and y values.

Functions with multiple results may be thought of as functions yielding tuples (or records). For example, we might compute two different notions of distance between two points at once as follows:

`fun dist2 (x:real, y:real):real*real = (sqrt (x*x+y*y), abs(x-y))`

Notice that the result type is a pair, which may be thought of as two results.

These examples illustrate a pleasing regularity in the design of ML. Rather than
introduce *ad hoc* notions such as multiple arguments, multiple results, or keyword
parameters, we make use of the general mechanisms of tuples, records, and pattern
matching.

It is sometimes useful to have a function to select a particular component from a tuple
or record (*e.g.*, the third component or the component labeled `url`

).
Such functions may be easily defined using pattern matching. But since they
arise so frequently, they are pre-defined in ML using *sharp notation*. For
any record type *typ _{1}*

`*`

...`*`

`#i`

of type `*`

...`*`

`->`

typ

`fun #i (_, ..., _, x, _, ..., _) = x`

where `x`

occurs in the *i*th position of the tuple (and there are
underscores in the other *n-1* positions). Thus we may refer to the second
field of a three-tuple *val* by writing `#2 `

*val*. It is
bad style, however, to over-use the sharp notation; code is generally clearer and easier
to maintain if you use patterns wherever possible. Compare, for example, the
following definition of the Euclidean distance function written using sharp notation with
the original.

`fun dist (p:real*real):real = sqrt((#1 p)*(#1 p)+(#2 p)*(#2 p))`

You can easily see that this gets out of hand very quickly, leading to unreadable
code. *Use of the sharp notation is strongly discouraged!*

A similar notation is provided for record field selection. The following function
`#lab`

selects the component of a record with label lab.

`fun #lab {lab=x,...} = x`

Notice the use of ellipsis! Bear in mind the disambiguation requirement: any use
of `#lab`

must be in a context sufficient to determine the full record type of
its argument.

Tuple types have the property that all values of that type have the same form (*n*-tuples,
for some *n* determined by the type); they are said to be *homogeneous*.
For example, all values of type `int*real`

are pairs whose first component is
an integer and whose second component is a real. Any type-correct pattern will match
any value of that type; there is no possibility of failure of pattern matching. The
pattern `(x:int,y:real)`

is of type `int*real`

and hence will match
any value of that type. On the other hand the pattern `(x:int,y:real,z:string)`

is of type `int*real*string`

and cannot be used to match against values of type
`int*real`

; attempting to do so *fails at compile time*.

Other types have values of more than one form; they are said to be *heterogeneous *types.
For example, a value of type `int`

might be `0`

, `1`

, `~1`

,
... or a value of type `char`

might be `#"a"`

or `#"z"`

.
(Other examples of heterogeneous types will arise later on.) Corresponding to each
of the values of these types is a pattern that matches only that value. Attempting
to match any other value against that pattern *fails* *at execution time*.
For the time being we will think of match failure as a fatal run-time error, but later on
we will see that the extent of the failure can be controlled.

Here are some simple examples of pattern-matching against values of a heterogeneous type:

`val 0 = 1-1`

`val (0,x) = (1-1,`

`34`

)

val (0, #"0") = (2-1, #"0")

The first two bindings succeed, the third fails. In the case of the second, the
variable `x`

is bound to

after the match. No
variables are bound in the first or third examples.`34`

The importance of constant patterns becomes clearer once we consider how to define
functions over heterogeneous types. This is achieved in ML using a *clausal
function definition.* The general form of a function is

`fn`

pat_{1}`=>`

exp_{1}`|`

...`|`

pat_{n}`=>`

exp_{n}

where each *pat _{i}* is a pattern and each

`=>`

The typing rules for matches ensure consistency of the clauses. Specifically,

- Each pattern in the match must have the same type
*typ*. - Each expression in the match must have the same type
*typ'*, given the types of the variables in the patterns.

The type of a function whose body is a match satisfying these requirements is *typ*`->`

*typ'.
*Note that there is no requirement that *typ* and *typ'*
coincide!

Application of functions with multiple clauses to a value *val *proceeds by
considering each clause *in the order written.* At stage *i* the
argument value *val* is matched against the pattern *pat _{i}*; if
the pattern match succeeds, evaluation continues with the evaluation of expression

`val recip : int -> int = fn 0 => 0 | n:int => 1 div n`

This defines an integer-valued reciprocal function on the integers, where the
reciprocal of `0`

is arbitrarily defined to be `0`

. The
function has two clauses, one for the argument `0`

, the other for non-zero
arguments `n`

. (Note that `n`

is guaranteed to be non-zero
because the patterns are considered in order: we reach the pattern `n:int`

only
if the argument fails to match the pattern `0`

.)

Using `fun`

notation we may define `recip`

as follows:

`fun recip 0 = 0`

| recip (n:int) = 1 div n

One annoying thing to watch out for is that the "`fun`

" form uses
an equal sign to separate the pattern from the expression in a clause, whereas the "`fn`

"
form uses an arrow.

Heterogeneous types abound. Perhaps the most fundamental one is the type `bool`

of booleans. Its values are `true`

and `false`

.
Functions may be defined on booleans using clausal definitions that dispatch on `true`

and `false`

. For example, the negation function is defined clausally as
follows:

`fun not true = false`

| not false = true

In fact, this function is pre-defined in ML.

Case analysis on the values of a heterogeneous type is performed by application of a clausally-defined function. The notation

`case`

exp`of`

pat_{1}`=>`

exp_{1}`|`

...`|`

pat_{n}`=>`

exp_{n}

is short for the application

`(fn`

pat_{1}`=>`

exp_{1}`|`

...`|`

pat_{n}`=>`

exp_{n}`)`

exp.

Evaluation proceeds by first evaluating *exp*, then matching its value
successively against the patterns in the match until one succeeds, and continuing with
evaluation of the corresponding expression. The `case`

expression fails
if no pattern succeeds to match the value.

The conditional expression

`if`

exp`then`

exp_{1}`else`

exp_{2}

is short-hand for the case analysis

`case`

exp`of`

`true`

`=>`

exp_{1}`|`

`false`

`=>`

exp_{2}

which is itself short-hand for the application

(

`fn`

`true`

`=>`

exp_{1}`|`

`false`

`=>`

exp)_{2}exp.

The "short-circuit" conjunction and disjunction operations are defined as
follows. The expression *exp _{1}*

`andalso`

`if`

`then`

`else`

`false`

and the expression `orelse`

`if`

`then`

`true`

`else`

Conceptually, equality and comparison operations on the types `int`

, `char`

,
and `string`

are defined by infinite (or, at any rate, enormously large)
matches, but in practice they are built into the language as primitives. (The
ordering on `char `

and `string`

are the lexicographic
orderings.) Thus we may write

`fun is_alpha c:char =`

(#"a" <= c andalso c <= #"z") orelse (#"A" <= c andalso c <= #"Z")

to test for alphabetic characters.

All this talk of success and failure of pattern matching brings up the issue of *exhaustiveness
*and *redundancy* in a match. A clause in a match is *redundant*
if any value matching its pattern must have matched the pattern of a preceding clause in
the match. A redundant rule can never be reached during execution. It is *always*
an error to have a redundant clause in a match. Redundant clauses often arise
accidentally. For example, the second clause of the following function definition is
redundant for annoyingly subtle reasons:

`fun not True = false`

| not false = true

The mistake is to have capitalized `True`

so that it is no longer the
boolean-typed constant pattern, but is rather a variable that matches any value of Boolean
type. Hence the second clause is redundant. Reversing the order of clauses can
also lead to redundancy, as in the following mistaken definition of `recip`

:

`fun recip (n:int) = 1 div n`

| recip 0 = 0

The second clause is redundant because the first clause will always match *any*
integer value, including `0`

.

A match (as a whole) is *exhaustive* if every possible value of the domain type
of the match must match some clause of that match. In other words, pattern matching
against an exhaustive pattern cannot fail at run-time. The clauses in the (original)
definition of `recip`

are exhaustive because they cover every possible integer
value. The match comprising the body of the following function is not exhaustive:

`fun is_numeric #"0" = true`

| is_numeric #"1" = true

| is_numeric #"2" = true

| is_numeric #"3" = true

| is_numeric #"4" = true

| is_numeric #"5" = true

| is_numeric #"6" = true

| is_numeric #"7" = true

| is_numeric #"8" = true

| is_numeric #"9" = true

When applied to, say, `#"a"`

, this function fails.

It is *often*, but not always, an error to have an inexhaustive match. The
reason is that the type system cannot record many invariants (such as the property that ```
is_numeric
```

is only called with a character representing a decimal digit), and hence the
compiler will issue a warning about inexhaustive matches. It is a good idea to check
each such warning to ensure that you have not accidentally omitted a clause from the
match.

Any match can be made exhaustive by the inclusion of a *catch-all* clause of the
form

`_ =>`

exp

where *exp* is an expression of the appropriate type. It is a bad idea to
simply stick such a clause at the end of every match in order to eliminate
"inexhaustive pattern" warnings. By doing so you give up the possibility
that the compiler may warn you of a legitimate error (due to a forgotten case) in your
program. The compiler is your friend! Use it to your advantage!