
				   
			  How 'proof' Works
				   
			     Craig Latta
		Experimental Computing Facility (XCF)
			  20 September 1991
			   (about tea-time)


	The program is based on the concept of left-associative
parsing (Hausser, 1986), which states that humans comprehend speech
causally. The essence of the parser is a set of rule packages, which
are in turn made up of rules. Rules take two "categorizations" as
input, and output a third. Each lexical entry consists of a surface
form and one or more categorizations, which correspond to the
different readings of the word. Categorizations consist of "segments",
which are abbreviations for various possible linguistic attributes
(e.g. "w*-interrogative", "finite verb", etc.).
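
	As a rough illustration of the data involved, a lexical entry
might be modelled as below. This is only a sketch: the class name, the
field names, and the segment abbreviations ("DET", "N", and so on) are
invented here and are not taken from 'proof' itself.

```python
from dataclasses import dataclass

# A categorization is an ordered tuple of segment abbreviations.
Categorization = tuple[str, ...]

@dataclass(frozen=True)
class LexicalEntry:
    surface: str                            # the word as it appears in text
    readings: tuple[Categorization, ...]    # one categorization per reading

# Hypothetical entries; the segment names are made up for illustration.
LEXICON = {
    "the": LexicalEntry("the", (("DET", "N"),)),
    "reading": LexicalEntry("reading", (("V-ING", "NP"), ("N",))),
}
```

	Here "reading" carries two categorizations, one for its use as
a participle and one for its use as a noun.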

	The parser checks the text one sentence at a time, assuming that all
the words in the input text have lexical entries. Given a sentence,
the lexical entry of the first word is designated the "sentence
start".  If the categorizations of the lexical entries for the
sentence start and the next word are compatible (in the eyes of one of
the rules in the initial rule package), then the next word is grouped
into the surface form of the sentence start, and the output of the
active rule becomes the category of the new sentence start. The name
of the next rule package to operate on the sentence start is derived
from the name of the rule which produced the sentence start.
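
	A single combination step of the kind just described could be
sketched as follows. The two rules, the segment names, and the
"after_" naming convention for rule packages are assumptions made for
the example; the text says only that the next package's name is
derived from the rule's name, not how. Each word is given a single
categorization here, so multiple readings are ignored.

```python
from typing import Callable, Optional

Cat = tuple[str, ...]                       # a categorization
Rule = Callable[[Cat, Cat], Optional[Cat]]  # returns None if incompatible

def det_noun(start: Cat, nxt: Cat) -> Optional[Cat]:
    # Hypothetical rule: determiner + noun yields a noun phrase.
    return ("NP",) if start == ("DET",) and nxt == ("N",) else None

def np_verb(start: Cat, nxt: Cat) -> Optional[Cat]:
    # Hypothetical rule: noun phrase + finite verb yields a sentence.
    return ("S",) if start == ("NP",) and nxt == ("V",) else None

# Assumed convention: rule "det_noun" hands over to package "after_det_noun".
PACKAGES: dict[str, list[Rule]] = {
    "initial": [det_noun],
    "after_det_noun": [np_verb],
    "after_np_verb": [],
}

def combine(package, start_surface, start_cat, word, word_cat):
    """Try each rule in the package; on success return the new sentence start."""
    for rule in PACKAGES[package]:
        out = rule(start_cat, word_cat)
        if out is not None:
            next_package = "after_" + rule.__name__
            return next_package, start_surface + " " + word, out
    return None  # no rule in this package accepts the next word

step = combine("initial", "the", ("DET",), "girl", ("N",))
```

	In this sketch `step` holds the name of the next rule package,
the enlarged surface form, and the new category of the sentence start.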

	Succeeding words are combined into the sentence start, adding
and cancelling grammatical "valencies", which are encoded in the
category of the sentence start. [For example, the sentence start "The
girl is reading a" has a valency for a noun phrase, which would be
introduced by the rule which combined "a" into the sentence start.]
This process continues until the end of the sentence is reached. Then
a final rule, which depends on the tone of the sentence (declarative,
interrogative, etc.), checks that there are only ignorable
category segments left in the sentence start's category (i.e. no
"dangling" valencies). If this is the case, the sentence is considered
grammatical, and the next sentence is analyzed. If the analysis of the
sentence fails at some intermediate point, the user is shown that
point in the text and told which types of continuation of the sentence
start are valid. The user may then fix the sentence or proceed with
the analysis from the beginning of the next sentence.
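
	The adding and cancelling of valencies, the final check, and
the reporting of valid continuations can be imitated in a few lines.
This is a deliberately simplified sketch, not the rule system of
'proof': each word cancels exactly one valency and adds zero or more,
valencies are cancelled strictly leftmost-first, and the category is
seeded with the needs of a declarative sentence instead of being taken
from the first word's lexical entry. The lexicon and all segment names
are invented.

```python
# Hypothetical lexicon: (valency the word cancels, valencies it adds).
LEXICON = {
    "the":     ("NP",   ["N"]),
    "a":       ("NP",   ["N"]),
    "girl":    ("N",    []),
    "book":    ("N",    []),
    "is":      ("VERB", ["PART"]),
    "reading": ("PART", ["NP"]),
}
IGNORABLE = {"DECL"}  # segments the final rule may leave behind

def check(sentence):
    """Return (ok, index of the failure point, segment expected there)."""
    category = ["NP", "VERB", "DECL"]  # simplification: seeded category
    words = sentence.lower().split()
    for i, word in enumerate(words):
        cancels, adds = LEXICON[word]
        if category[0] != cancels:
            return False, i, category[0]      # word i cannot continue the start
        category = adds + category[1:]        # cancel one valency, add new ones
    dangling = [seg for seg in category if seg not in IGNORABLE]
    if dangling:
        return False, len(words), dangling[0]  # sentence ended too early
    return True, None, None

def continuations(segment):
    """Words from the lexicon that would cancel the given valency."""
    return sorted(w for w, (cancels, _) in LEXICON.items() if cancels == segment)
```

	With these definitions, `check("The girl is reading a")` fails
at the end of the input with a dangling "N" valency, introduced when
"a" was combined into the sentence start, and `continuations("N")`
lists the nouns that could complete it.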

	If a sentence is grammatically incorrect, the earliest
offending word in the sentence is highlighted, and the user is
presented with several options. (S)he may skip the word, skip the
sentence, or choose a replacement word from a list suggested by 'proof'.

	'proof' accepts input from the command line, from files, and
interactively. It has both terminal and graphical (using X) user
interfaces. It was not as hard to write as it sounds. It
should be rewritten in Smalltalk, as should all other programs which
haven't been already.



