THE UTTERANCE MODEL

General

Before comprehension can begin, NL-Soar needs a way to represent the syntax of a given utterance. Most syntactic theories could supply such a representation. We chose Government and Binding (GB) theory because its explicit principles-and-parameters approach fits naturally into the constraint-based generate-and-test framework of NL-Soar (Lewis, 1993). GB is a theory of syntax that evolved during the 1970s and 1980s; Noam Chomsky is the linguist most closely associated with its development. Interested readers can refer to (Chomsky, Lectures on GB, 1993).

Having chosen a syntactic theory, we must decide how to represent the syntactic structure of an utterance in Soar. The structure could be represented using either a logic or a model (Lewis, section 2.1.1, 1993). Knowledge encoded in a model can be extracted easily using match-like processing, so a model representation is computationally efficient; that is the main reason we chose it. Given an utterance, we therefore create a model of its syntactic structure, which we call the utterance model (henceforth, the u-model).

Explanation of the u-model in NL-Soar

The u-model represents X-bar phrase structure as assumed in GB theory. Below is a basic X-bar schema:

X is called the zero-head, and it ranges over the syntactic categories A (adjective), C (complementizer), I (inflection), N (noun), P (preposition), V (verb) and det (determiner). Two levels of phrasal nodes are projected from a lexical head: X', the level-one projection, and X'' (or XP), the maximal projection. The set of available syntactic relations between nodes is {spec, comp, comp2, head, zero-head, adjoin}, which denote the structural positions of specifier, complements, heads and adjunction. In the picture above, Y'' is the spec in X'', and Z'' is the comp in X'. Comp2 denotes the second complement position of a ditransitive verb such as give. X' is the head in X''.

Adjunction is slightly more complicated, and readers are advised to consult a syntax textbook for the details. In short, a node is labelled adjoin if it carries no other label, it is adjoined to an X (X') position, and its sister is another X (X') node. Thus, in the picture below, W'' is in the adjoin position, since it is adjoined to an X' position and its sister is another X' node.
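The schema described above can be sketched as a small data structure. The following Python is only an illustration, not NL-Soar code: the names Node, attach and project are hypothetical, and the sketch assumes that X' reaches X through the zero-head relation, which the text does not state explicitly. Adjunction and comp2 are left out of the helper for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Categories and relations as listed in the text.
CATEGORIES = {"A", "C", "I", "N", "P", "V", "det"}
RELATIONS = {"spec", "comp", "comp2", "head", "zero-head", "adjoin"}


@dataclass
class Node:
    """One node of an X-bar tree (hypothetical class, for illustration only)."""
    category: str
    bar_level: int  # 0 = X, 1 = X', 2 = X'' (maximal projection)
    links: Dict[str, "Node"] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if self.bar_level not in (0, 1, 2):
            raise ValueError(f"unknown bar level: {self.bar_level}")

    def attach(self, relation: str, child: "Node") -> None:
        """Link a child node under one of the allowed structural relations."""
        if relation not in RELATIONS:
            raise ValueError(f"unknown relation: {relation}")
        self.links[relation] = child

    def label(self) -> str:
        """Return the conventional label, e.g. V, V' or V''."""
        return self.category + "'" * self.bar_level


def project(category: str,
            spec: Optional[Node] = None,
            comp: Optional[Node] = None) -> Node:
    """Build X, X' and X'' for one head and return the maximal projection."""
    zero = Node(category, 0)
    bar1 = Node(category, 1)
    maxp = Node(category, 2)
    bar1.attach("zero-head", zero)  # assumption: X' points to X via zero-head
    if comp is not None:
        bar1.attach("comp", comp)   # the comp is a daughter of X'
    maxp.attach("head", bar1)       # X' is the head in X''
    if spec is not None:
        maxp.attach("spec", spec)   # the spec is a daughter of X''
    return maxp


# Instantiate the schema with X = V and Y = Z = N.
xp = project("V", spec=project("N"), comp=project("N"))
print(xp.label(), xp.links["spec"].label(), xp.links["head"].links["comp"].label())
```

Restricting attach to the fixed relation set mirrors the closed inventory of relations given in the text; any other relation name is rejected.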

Realization of the u-model in NL-Soar

The u-model is realized as an attribute-value structure that hangs off the top-state in Soar. The implementation is straightforward: attributes correspond either to the structural X-bar relations described above or to syntactic features such as category, agreement or case. The values of the attributes are either other nodes in the model or constants representing the values of the syntactic features. Each node in the u-model is represented by a unique identifier generated (gensymed) by Soar. The picture below shows a simplified u-model for [NP the man].
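To make the attribute-value realization concrete, here is a minimal Python sketch that stores a u-model as (identifier, attribute, value) triples and retrieves them by simple pattern matching. It is not Soar code and does not reproduce the original picture. The class WorkingMemory, the helper project, and the attributes bar-level and word are all hypothetical; placing the determiner in the spec of N'' and linking X' to X through zero-head are assumptions made for illustration.

```python
import itertools


class WorkingMemory:
    """Toy stand-in for Soar working memory: a set of (id, attribute, value) triples."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.triples = set()

    def gensym(self, prefix="u"):
        """Return a fresh, unique node identifier such as u1, u2, ..."""
        return f"{prefix}{next(self._counter)}"

    def add(self, identifier, attribute, value):
        self.triples.add((identifier, attribute, value))

    def match(self, identifier=None, attribute=None, value=None):
        """Return all triples that agree with the given fields (None = wildcard)."""
        pattern = (identifier, attribute, value)
        return sorted(
            t for t in self.triples
            if all(p is None or p == x for p, x in zip(pattern, t))
        )


def project(wm, category, word):
    """Add X, X' and X'' nodes for one word; return the id of the maximal projection."""
    zero, bar1, maxp = wm.gensym(), wm.gensym(), wm.gensym()
    for node, level in ((zero, "zero"), (bar1, "one"), (maxp, "max")):
        wm.add(node, "category", category)  # syntactic feature with a constant value
        wm.add(node, "bar-level", level)    # hypothetical attribute
    wm.add(zero, "word", word)              # hypothetical attribute
    wm.add(bar1, "zero-head", zero)         # assumption: X' points to X via zero-head
    wm.add(maxp, "head", bar1)              # structural relation: value is another node
    return maxp


wm = WorkingMemory()
det_max = project(wm, "det", "the")
np_max = project(wm, "N", "man")
wm.add(np_max, "spec", det_max)  # assumption: the determiner sits in the spec of N''

print(wm.match(attribute="spec"))
print(wm.match(identifier=np_max))
```

The match method is a crude analogue of the match-like retrieval that motivates the model representation: a query is a partial triple, and the answer is whatever stored triples agree with it.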

See the A/R set for how this u-model is organized in working memory, and the linked page for how the u-model is incrementally constructed.

This page was written by Han Ming Ong (hanming@cs.cmu.edu).

Note: The idea behind the u-model comes from Rick Lewis's thesis (Lewis, An Architecturally-based Theory of Human Sentence Comprehension, 1993).