Our dictionary language




To streamline the difficult process of writing the dictionary, we have incorporated several other features into the dictionary language. Examples of all of these features can be found in section 3.

It is useful to consider connector matching rules that are more powerful than simply requiring the strings of the connectors to be identical. The most general matching rule is simply a table (part of the link grammar) that specifies all pairs of connectors that match. The resulting link grammar is still context-free.

In the dictionary presented later in this paper, and in our larger on-line dictionary, we use a matching rule that is slightly more sophisticated than simple string matching. We shall now describe this rule.

A connector name begins with one or more upper case letters followed by a sequence of lower case letters or *s. Each lower case letter (or *) is a subscript. To determine if two connectors match, delete the trailing + or -, and append an infinite sequence of *s to both connectors. The connectors match if and only if these two strings match under the proviso that * matches a lower case letter (or *).

For example, S matches both Sp and Ss, but Sp does not match Ss. Similarly, D*u matches Dmu and Dm, but not Dmc. All four of these connectors match Dm.
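The matching rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name connectors_match and the regular expression are ours, not part of the dictionary language, and the sketch ignores link direction (the trailing + or - is simply discarded, as the rule prescribes). Padding with an infinite sequence of *s is simulated by padding the shorter subscript string with *.

```python
import re
from itertools import zip_longest

# Hypothetical helper: upper-case name, then lower-case or '*' subscripts,
# then an optional trailing direction sign.
_CONNECTOR = re.compile(r"([A-Z]+)([a-z*]*)[+-]?")


def connectors_match(a: str, b: str) -> bool:
    """Return True if connectors a and b match under the subscript rule."""
    ma, mb = _CONNECTOR.fullmatch(a), _CONNECTOR.fullmatch(b)
    if ma is None or mb is None:
        raise ValueError("not a well-formed connector name")
    # '*' only stands for a lower-case letter, so the upper-case
    # parts must be identical.
    if ma.group(1) != mb.group(1):
        return False
    # Pad the shorter subscript string with '*' and compare position-wise.
    for x, y in zip_longest(ma.group(2), mb.group(2), fillvalue="*"):
        if x != "*" and y != "*" and x != y:
            return False
    return True


if __name__ == "__main__":
    for pair in [("S", "Sp"), ("Sp", "Ss"), ("D*u", "Dmu"), ("D*u", "Dmc")]:
        print(pair, connectors_match(*pair))
```

Whether two matching connectors can actually be linked also depends on one pointing left and the other right, which this sketch deliberately leaves out.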

The formula ``((A- & B+) or ())'' is satisfied either by using both A- and B+, or by using neither of them. Conceptually, then, the expression ``(A- & B+)'' is optional. Since this occurs frequently, we denote it with curly braces, as follows: {A- & B+}.
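Since the curly braces are pure shorthand, they can be removed by a textual rewrite. The sketch below assumes that braces appear in a formula only as the optional-expression notation; the function name expand_optional is hypothetical.

```python
def expand_optional(formula: str) -> str:
    """Rewrite every {X} in the formula as ((X) or ())."""
    if formula.count("{") != formula.count("}"):
        raise ValueError("unbalanced curly braces")
    # Each brace pair becomes a parenthesized choice between the enclosed
    # expression and the empty formula (); nesting is handled for free
    # because the rewrite is purely local.
    return formula.replace("{", "((").replace("}", ") or ())")


if __name__ == "__main__":
    print(expand_optional("{A- & B+}"))
```

The inner pair of parentheses keeps the enclosed expression grouped, so the result does not depend on the relative precedence of & and or.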

It is useful to allow certain connectors to be able to connect to one or more links. This makes it easy, for example, to allow any number of adjectives to attach to a noun. We denote this by putting an ``@'' before the connector name, and call the result a multi-connector.
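To make the difference concrete, here is a small sketch of how a program might represent a connector and check how many links may attach to it once it is used. The Connector class and both function names are assumptions made for illustration; whether a connector is used at all is decided by the formula it appears in, not by this check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connector:
    name: str        # e.g. "A" or "Dmu"
    direction: str   # "+" or "-"
    multi: bool      # True if written with a leading "@"


def parse_connector(text: str) -> Connector:
    """Split a connector such as '@A-' into its parts."""
    multi = text.startswith("@")
    body = text[1:] if multi else text
    if len(body) < 2 or body[-1] not in "+-":
        raise ValueError("expected a name followed by + or -")
    return Connector(name=body[:-1], direction=body[-1], multi=multi)


def accepts_link_count(connector: Connector, links: int) -> bool:
    """A used connector takes exactly one link; a multi-connector takes one or more."""
    return links >= 1 if connector.multi else links == 1


if __name__ == "__main__":
    adjective = parse_connector("@A-")
    print(adjective, accepts_link_count(adjective, 3))
```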

Our dictionaries consist of a sequence of entries, each of which is a list of words separated by spaces, followed by a colon, followed by the formula defining the words, followed by a semicolon.
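The entry format just described is simple enough to read with a few string operations. The following is a minimal sketch, assuming that neither words nor formulas contain a colon or semicolon; the function name parse_dictionary and the sample entries are invented for illustration and are not taken from our dictionary.

```python
def parse_dictionary(text: str) -> dict[str, str]:
    """Map each word to the formula of the entry that lists it."""
    entries: dict[str, str] = {}
    for raw in text.split(";"):          # every entry ends with a semicolon
        raw = raw.strip()
        if not raw:
            continue
        words, colon, formula = raw.partition(":")
        if not colon:
            raise ValueError(f"entry without a colon: {raw!r}")
        normalized = " ".join(formula.split())   # collapse whitespace
        for word in words.split():               # words are space-separated
            entries[word] = normalized
    return entries


if __name__ == "__main__":
    # Invented sample entries, for illustration only.
    sample = """
    a the: D+;
    cat dog: {@A-} & D- & S+;
    """
    for word, formula in parse_dictionary(sample).items():
        print(word, "->", formula)
```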







Thu Oct 12 13:01:13 EDT 1995