Modules and programming conventions.
------------------------------------


(1) How the source files are structured.
----------------------------------------                 

Although I have not assumed that you will have a module system, I have
structured the source into what would be modules if you did have one. I
have done this by splitting it into different files: one file per
notional module. Each file contains two kinds of predicates: public and
private. Public predicates are those that I know to be called from other
files. Private predicates are those called only from within the file
they are defined in.

Each file looks like this:


    /*  SCRIPT.PL  */


    :- module script.


    :- public next/1,
              again/0,
              script/2,
              section/2.


    /*
    SPECIFICATION
    -------------              

    <general remarks>

    <operator declarations embedded here, if any (see OUTPUT.PL)>

    PUBLIC again:
    <description of again>

    PUBLIC next( Status- ):
    <description of next>

    ...etc...
    */


    /*
    IMPLEMENTATION
    --------------             

    <general remarks>
    */


    :- needs
        bug / 1,
        find_file / 5,
        open_and_reconsult / 2,
        output / 1.


    :- dynamic
        script_position / 1,
        script_name / 2,
        script_text / 2.

Each file starts with a comment giving its name. Note that this may not
be the same as the name on the source disc available from the publisher:
this latter is written on MS-DOS, and its filenames have to conform to
the idiocies of that operating system. The comment is followed by the
directive
    :- module Name.
Name is an atom, usually similar to the filename.

Then follows
    :- public Predicate1/Arity1,
              Predicate2/Arity2,
              ...
The name and arity of every public predicate - i.e. every predicate
called from outside the file (or which it is sensible to call from
outside the file) - appears here.

Following these two directives there may be a :-dynamic directive. This
has the same syntax as public: it declares exported predicates which
are defined by run-time assertions, rather than in the code as written.

After this is the SPECIFICATION comment. This starts with general
information about the module and what it's for, and then gives a
specification of each public predicate individually: argument types and
modes, the predicate's logical meaning, and so on.

The comment may be broken by a part
    */
    <operator declarations>
    /*
The rationale for placing them here is that they will affect what
happens outside the module (by changing how Prolog parses its input),
and so they form part of the interface with other modules.

Each predicate is specified by a line starting
    PUBLIC PredicateName
This facilitates searching for the definition with a text editor.

Now we launch into the implementation, starting with the IMPLEMENTATION
comment. I have made a strict separation between this and the
specification part, following Modula and similar languages. Most
Prolog programmers do not do so, but I think it helps readability.

This is followed by a :-needs directive, listing predicates needed
from elsewhere. You can also quote file names:
    :- needs
        is_letter_or_digit_char / 1,
        file( sdo ),
        file( sdo_output ).
See module LIB.PL for more details. I have put in the directives mainly
to help those with module systems, but LIB also uses them for a loading
system where you needn't specify filenames.

There may also be a :-dynamic directive. This refers to dynamic, non-
exported predicates.

Next comes the source code (and possibly more comments about
implementation). Finally there is an :- endmodule directive.

This structure should be easily adaptable to your module system, if you
have one. Note that I have assumed only two levels of predicate:
*   Predicates private to a module.
*   Predicates potentially callable from anywhere in the program.


(2) The cross-referencer.
-------------------------      

Included with the source is a cross-referencing program from the
public-domain DEC10 library. This program was written by Dave Bowen and
Chris Mellish, with some contribution from others, at the Edinburgh
University Department of Artificial Intelligence. If you give it a list
of the files comprising the Tutor, then it will generate a list of each
module's publics and imports, in the form of public and import
directives. It also generates a conventional cross-reference listing,
showing which predicates are called by which others and in which file
they are defined.

The cross-referencer has to be fed a file defining the arities and names
of all built-in predicates so that it does not, for example, keep on
reporting write/1 as undefined. You will have to change this file to
accord with your system: the cross-referencer output will then enable
you to detect any predicates that I have treated as system predicates
but that you do not have.


(3) Dynamic predicates.
-----------------------        

The student's predicates are not fixed at the start of a run: they are
asserted and retracted dynamically. The same applies to certain internal
assertions used to record details of the Tutor's state (for example,
whether it's in Logic or Prolog mode). Some module systems require you
to be careful about such predicates. For example, Expert Systems
Prolog-2 effectively splits the database into as many disjoint regions
as there are modules. The system state includes a "current output
module", and any asserts and retracts alter that region of the database
belonging to the current output module.

In these systems, you will therefore have to modify my database-update
code, i.e. asserts and retracts, accordingly. It should be clear from
the dynamic directives which module each dynamic predicate belongs to.
For the student's facts, it is probably best to create a new module
(call it student) which holds only those facts and nothing else.
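
As a rough sketch of what such a change might look like, suppose your
Prolog accepts Quintus-style Module:Term qualification in assert,
retract and clause (an assumption: Prolog-2's "current output module"
works differently). The student's facts could then be confined to the
module student by routing every update through a few wrappers. The
names add_student_clause, remove_student_clause and student_clause
are invented here for illustration; they do not occur in the Tutor.

    /*  add_student_clause( Clause+ ):
        add Clause to the end of the student's database.  */
    add_student_clause( Clause ) :-
        assertz( student:Clause ).

    /*  remove_student_clause( Clause? ):
        remove the first student clause that matches Clause.  */
    remove_student_clause( Clause ) :-
        retract( student:Clause ).

    /*  student_clause( Head+, Body? ):
        Head :- Body is a clause in the student's database.  */
    student_clause( Head, Body ) :-
        clause( student:Head, Body ).

After add_student_clause( loves(john,mary) ), the fact is visible as
student:loves(john,mary) but does not appear among the Tutor's own
predicates.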


(4) What to do if you don't have modules.
-----------------------------------------                 

The Tutor needs modules for two reasons:

(a) To prevent name clashes between auxiliary predicates in different
modules: for example, the name "open" could easily be chosen by both the
script-reading module and the saving-facts-to-file module.

(b) To prevent predicates anywhere in the Tutor clashing with those
asserted by the student.

In preparing the source for publication, I knew that some readers would
have module systems, but I could not assume that all would.
Consequently, I have ensured that no two predicates in the source have
the same name. Some may have the same name but different arity; if so,
they will be closely related, and will be defined close to one another.

Making sure that the student's facts don't clash is harder. However, it
helps that most of the predicates in the source have names that are
unlikely to be chosen by the average novice, and that do not resemble
any of the tutorial examples. Students normally choose simple names
derived from everyday words: sells, cost, joins, loves, is_on. In fact,
I have never noticed a clash occur during student use, other than with
the built-in predicate is (see the book for some comments about this).

If you don't have modules, then in the long run, the best solution is to
change your Prolog supplier. In the short term, while you're seeking a
new supplier, you may want to program around the deficiency (though as I
have said, there are no clashes within the Tutor, and clashes with the
student's predicates are unlikely). You can check for clashes with the
cross-referencer. If you want to be extra certain, you could pass the
code through a renamer which sticks a module name onto the name of every
private predicate. Tony Dodd's book "Prolog: A Logical Approach" has a
listing of such a program; there is also one in the DEC-10 library.
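
To give the flavour of such a renamer, here is a minimal sketch of its
central step: renaming one goal. It is not taken from either of the
programs just mentioned. The names rename_goal and is_listed are
invented for illustration, and I assume the ISO built-in atom_concat/3
(older systems would have to use name/2 instead). A complete renamer
would also have to walk through clause heads, bodies, control
constructs and meta-calls.

    /*  rename_goal( Module+, Privates+, Goal+, NewGoal- ):
        Privates is a list of Name/Arity. If Goal calls one of them,
        NewGoal is Goal with Module and an underscore prefixed to
        its name. Otherwise NewGoal is Goal.  */
    rename_goal( Module, Privates, Goal, NewGoal ) :-
        functor( Goal, Name, Arity ),
        (   is_listed( Name/Arity, Privates )
        ->  atom_concat( Module, '_', Prefix ),
            atom_concat( Prefix, Name, NewName ),
            Goal =.. [_|Args],
            NewGoal =.. [NewName|Args]
        ;   NewGoal = Goal
        ).

    /*  is_listed( X+, List+ ): X is identical to a member of List.  */
    is_listed( X, [Y|Rest] ) :-
        (   X == Y
        ->  true
        ;   is_listed( X, Rest )
        ).

With Privates = [open/1], the call
    rename_goal( script, [open/1], open(f), G )
gives G = script_open(f), while write(x) is left unchanged.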


(5) Modes.
----------

The notion of mode is an important one in Prolog. In a call like
    append( [a,b,c], [d,e,f], L )
we say that append is being used in mode (+,+,-), meaning that the first
two arguments are instantiated before the call, and the third is not,
but should be afterwards.

When documenting Prolog, it is important to indicate which modes each
predicate will be called in. There is some evidence that even
experienced Prolog programmers take much longer to comprehend certain
definitions unless they are told the expected use of modes - which
arguments are set before call, and which the predicate is expected to
set - before reading the definition. (This is an interesting counter to
those who state that Prolog can be read completely declaratively. See
also my comments on GRIPS in the book.) Modes are also important because
some predicates cannot be used in all possible modes, possibly because
of side-effects, possibly because they call var, cut and other
non-logical built-ins.
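
The two small predicates below may make the distinction concrete. They
are illustrations only and are not part of the Tutor; my_append and
double are names chosen here to avoid clashing with library
predicates. The first works in any mode; the second works only when
its first argument is instantiated, because is/2 must be able to
evaluate its right-hand side.

    /*  my_append( A?, B?, C? ): C is the list A followed by B.  */
    my_append( [], L, L ).
    my_append( [H|T], L, [H|R] ) :-
        my_append( T, L, R ).

    /*  double( X+, Y? ): Y is twice X.  */
    double( X, Y ) :-
        Y is X * 2.

Called in mode (+,+,-), my_append([a,b],[c],L) sets L to [a,b,c].
Called in mode (-,-,+), my_append(X,Y,[a,b]) enumerates on
backtracking the three ways of splitting [a,b]. By contrast,
double(3,Y) sets Y to 6, but double(X,6) cannot work backwards to
X = 3: on an ISO-conforming system it raises an instantiation error.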

I have tried to be explicit about modes, and most of them are commented.
I use the following notation:
    A+: Means that argument A is instantiated on call, i.e. is an
    input argument.
    A-: Means that A is uninstantiated on call, i.e. is an output
    argument.
    A?: Means that A can be either.

Pragmatically, arguments with mode ? are usually used as output
arguments to be immediately compared with some value. Thus, in module
USEFUL, 'max' is commented as
    PUBLIC max( A+, B+, C? ):
    C is the maximum of A and B.
The call max(X,Y,Max) would, assuming Max is initially uninstantiated,
set it to the greater of X and Y. But max could also be used in a
combined evaluate-and-test: max(X,Y,10). This would work out the greater
of X and Y, compare it with 10, and fail if they differ.
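
One way of writing max so that both kinds of call behave as described
is sketched below. This is an assumption about how such a predicate
could be written, not a copy of the definition in USEFUL. The point to
notice is that C is unified only after the comparison has chosen a
clause, for the reasons discussed in section (6).

    /*  max( A+, B+, C? ): C is the maximum of A and B.  */
    max( A, B, C ) :-
        A >= B, !,
        C = A.
    max( _, B, B ).

    /*
    max( 3, 7, Max )    sets Max to 7.
    max( 3, 7, 10 )     fails, since the maximum is 7.
    max( 7, 3, 3 )      fails, since the maximum is 7.
    */

Had the first clause been written max( A, B, A ) :- A >= B, !. then
the call max(7,3,3) would have skipped it and wrongly succeeded via
the second clause.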


(6) Steadfastness.
------------------

How a predicate deals with mode ? arguments is related to what O'Keefe
in "The Craft of Prolog" calls steadfastness: the property of being
firm, unwavering when given an argument of unexpected mode.

Consider the definition below:

    test_clause( Head, Body, 1 ) :-
        clause( Head, Body ), !.

    test_clause( _, _, 0 ).

The idea is that we should be able to call test_clause(H,B,N), given
head H and body B, and have N set to 1 if the clause H:-B
exists, otherwise to 0. Now, suppose we call
    test_clause( a(1), fred, 0 ).
and we have a clause for
    a(1) :- fred.

The 0 in the third argument of the call will fail to unify with the 1 of
test_clause's first clause. So control will immediately pass to clause
2 of test_clause, implying to the caller that there is no clause for
a, even though we know there is. The reason is that performing a match
in the head of test_clause immediately forces execution down the wrong
path.

We can get round this by writing

    test_clause( Head, Body, Out ) :-
        clause( Head, Body ), !, Out = 1.

    test_clause( _, _, 0 ).

delaying the output unification until after the cut. The 0 in the call
will now unify with Out; the call to clause will succeed; Out will not
be equal to 1, but the cut will prevent backtracking, so test_clause
will fail.

Actually, we may need to be even more careful than this. Suppose we
wanted test_clause to succeed if and only if the first clause for the
predicate being examined is Head:-Body. Suppose further that we have
clauses
    b(2) :- bert.
    b(3) :- tim.
and we call
    test_clause( b(3), tim, Z ).
What will happen is that test_clause will call clause(b(3),tim).
This will test against the first clause for b, fail to match, and so
clause will go on to the next clause, where it will succeed.
And so test_clause will set Z to 1. What has happened here is that
passing instantiated arguments - b(3) and tim - to clause
forced it to backtrack too early. We can prevent this by writing

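    test_clause( Head, Body, Out ) :-
        new_term( Head, NewHead ),
        /*  NewBody is left unbound, so that clause cannot skip the
            first clause just because its body fails to match Body.  */
        clause( NewHead, NewBody ),
        !,
        NewHead = Head,
        NewBody = Body,
        Out = 1.

    test_clause( _, _, 0 ).

    /*  Copy is the most general term with Old's functor and arity.  */
    new_term( Old, Copy ) :-
        functor( Old, F, A ),
        functor( Copy, F, A ).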

This section may appear to be a digression. It becomes relevant because
I have tried to make my predicates steadfast where reasonably possible,
and it helps to know why, and what this means.


(7) Mode- and type-checking.
----------------------------

When writing the book, I had hoped to distribute mode- and
type-checkers, and to include all the Tutor's modes and types in
directives. This would undoubtedly have caught a lot of errors. Though
there are checkers in my library (in the DEC-10 tools), I haven't made
them work with modules or give good diagnostics, and I haven't added
:-mode or :-type declarations to the Tutor's source. Let me know if you
do.


(8) A Lint checker.
-------------------

Amongst C programmers, a "Lint checker" is a program that checks for
possible errors that the compiler, because of the way the language is
defined, cannot itself detect. The name has carried over to Prolog. In
this context, a Lint checker will have as one of its functions a check
for possibly-misspelt variables. Such a check is based on the assumption
that the programmer will use _ to denote variables that occur only once:
    member( X, [X|_] ).
    member( X, [_|T] ) :- member( X, T ).

If this assumption holds, then should a named variable occur once only,
it must be misspelt. The program below performs such a check. It
proved invaluable when I wrote the arrays module, which contains large
numbers of similarly named variables, and in which I made a large number
of typing errors.
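
For readers who want to see the idea in miniature, the check can be
sketched in a few lines on a Prolog that supports the ISO option
singletons in read_term/3. This sketch is an illustration only and is
not the library program referred to above; lint_file, lint_clauses,
report_singletons and write_names are names invented here. It does not
obey operator declarations inside the file being checked, so files
that declare their own operators may give syntax errors.

    /*  lint_file( File+ ):
        report each clause in File containing named variables that
        occur only once.  */
    lint_file( File ) :-
        open( File, read, Stream ),
        lint_clauses( Stream ),
        close( Stream ).

    lint_clauses( Stream ) :-
        read_term( Stream, Term, [singletons(Singles)] ),
        (   Term == end_of_file
        ->  true
        ;   report_singletons( Term, Singles ),
            lint_clauses( Stream )
        ).

    /*  Singles is a list of Name=Variable pairs.  */
    report_singletons( Term, Singles ) :-
        (   Singles == []
        ->  true
        ;   write( 'Variables used only once: ' ),
            write_names( Singles ),
            write( 'in ' ), writeq( Term ), nl
        ).

    write_names( [] ).
    write_names( [Name=_|Rest] ) :-
        write( Name ), write( ' ' ),
        write_names( Rest ).

Given a file containing the clause p( X, Y ) :- q( X ). the call
lint_file would report Y as occurring only once.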


(9) Automatic program testing.
------------------------------

When developing programs, it is useful to have some way of testing
predicates against their specification, and to be able to run these
tests automatically every time you update a module. The entry AUTOTEST
in my library does this, and I have included it with the source of the
Tutor. The idea is that one can write a "script" (not the same as the
Tutor's scripts) containing test calls and an indication of whether
these calls ought to succeed, fail or raise an error.

The table below shows some typical tests.
*   append( [1], [2], [1,2] ) :: s.
        The goal should succeed.
*   member( x, [] ) :: f.
        The goal should fail.
*   see( V ) :: c.
        The goal should crash.
*   see( 'Not a filename' )::crashes(error('-RMS-E-FNF, file not found'))
        The goal should crash with a specified error code.

This program has come in useful during several cycles of modification.
It can of course be used for programs other than the Tutor.
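
The heart of such a tester can be sketched as follows. This is not the
code of AUTOTEST, only a minimal illustration of the idea, and it
assumes a Prolog with the ISO built-in catch/3. The names outcome and
check are invented here, and the sketch handles only the s, f and c
forms shown in the table, not crashes(...).

    :- op( 700, xfx, :: ).

    /*  outcome( Goal+, Outcome- ):
        Outcome is s if Goal succeeds, f if it fails, and c if it
        raises an error. Only Goal's first solution is tried.  */
    outcome( Goal, Outcome ) :-
        catch( ( call( Goal ) -> Result = s ; Result = f ),
               _,
               Result = c ),
        Outcome = Result.

    /*  check( Test+, Verdict- ):
        Test is Goal :: Expected. Verdict is passed if Goal behaved
        as Expected says, and failed otherwise.  */
    check( Goal :: Expected, Verdict ) :-
        outcome( Goal, Actual ),
        (   Actual == Expected
        ->  Verdict = passed
        ;   Verdict = failed
        ).

For example, check( atom_length(abc,3) :: s, V ) gives V = passed,
whereas check( atom_length(abc,3) :: f, V ) gives V = failed.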


(10) Timing goals.
------------------

In writing the Tutor, I have to admit that I have not been unduly
concerned with efficiency, taking the view that if response at the
terminal is too fast to produce a visible delay, it is fast enough.
Nevertheless, it is sometimes necessary to make predicates more
efficient, and attempting to do so can provoke some interesting
exercises in program transformation.

"The Craft of Prolog" contains in chapter 3 predicates for measuring
the amount of CPU time consumed by a goal. I have adapted these to
Poplog, and the result is in CPU_TIME.PL.

When using these on our VAX, it becomes apparent that the values
returned by the Pop procedure systime differ substantially even between
successive calls of the same goal. Although Poplog is unlikely to be in
exactly the same storage state at the start of each goal, it is unlikely
that the differences in state are great enough to account for these
variations. More likely is that the operating system routine called by
systime is counting swap time or other operating system activities as
part of the caller's CPU time: a not terribly helpful way of accounting.

Because of these variations, it's useful to get some idea of the scatter
between adjacent calls of the same goal. The predicates in MEAN_TIME.PL
do this.
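
For those without Poplog, the same two ideas can be sketched using the
call statistics(runtime,...), which Quintus and several later Prologs
provide and which usually reports CPU time in milliseconds. I assume
it here; it is not what CPU_TIME.PL and MEAN_TIME.PL use. The names
cpu_time, sample_times, mean_time and summarise are invented for the
sketch, which also assumes that min and max are available as
arithmetic functions.

    /*  cpu_time( Goal+, Time- ):
        Time is the CPU time used in running Goal once. Goal is
        called inside \+ so that it leaves no bindings behind.  */
    cpu_time( Goal, Time ) :-
        statistics( runtime, [Start|_] ),
        (   \+ call( Goal ) -> true ; true ),
        statistics( runtime, [End|_] ),
        Time is End - Start.

    /*  sample_times( N+, Goal+, Times- ):
        Times is a list of N successive timings of Goal.  */
    sample_times( N, Goal, Times ) :-
        (   N =< 0
        ->  Times = []
        ;   cpu_time( Goal, T ),
            N1 is N - 1,
            sample_times( N1, Goal, Rest ),
            Times = [T|Rest]
        ).

    /*  mean_time( Goal+, N+, Mean-, Min-, Max- ):
        the mean, least and greatest of N timings of Goal.  */
    mean_time( Goal, N, Mean, Min, Max ) :-
        N > 0,
        sample_times( N, Goal, [T|Ts] ),
        summarise( Ts, T, T, T, Sum, Min, Max ),
        Mean is Sum / N.

    summarise( [], Sum, Min, Max, Sum, Min, Max ).
    summarise( [T|Ts], Sum0, Min0, Max0, Sum, Min, Max ) :-
        Sum1 is Sum0 + T,
        Min1 is min( Min0, T ),
        Max1 is max( Max0, T ),
        summarise( Ts, Sum1, Min1, Max1, Sum, Min, Max ).

The difference between Max and Min gives a crude idea of the scatter.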


(11) The use of Pop-11.
-----------------------                    

The first version of the Tutor made extensive use of Pop-11, calling it
from Prolog whenever something could not be done in Prolog (testing for
file existence, reading command-line parameters), or could be done more
efficiently in Pop (editing strings, sending output to character lists).

In writing this book, I have kept to Prolog wherever possible. In a few
places, there was just no portable way to do what I needed to do, and I
have had to retain the calls to Pop-11. Any modules that do still call
Pop are clearly commented, and I have suggested ways of porting to other
Prologs. Poplog is in fact quite poor in Prolog extensions, presumably
because it is possible to do most things in Pop-11; many other
Prologs - Quintus, Prolog-2, C-Prolog for example - are well
supplied with predicates for doing what I've had to do via Pop.

In my code, the use of Pop signals itself in two ways. The predicate
prolog_eval(P) treats P as the name of a Pop-11 routine and calls it,
then succeeds. prolog_eval(P,R) calls P, expecting it to return a
result, and unifies R with that result.

The Pop-11 code itself is in separate files with the extension .P.
These are loaded with the predicate pop_compile, defined in LIB.
If you don't use Pop, you can forget all about these.            
