KANTOO Interlingua Description

This document is a synopsis of the evolving KANT Interlingua representation (Mitamura, Nyberg and Carbonell, 1991; Leavitt, Lonsdale, and Franz, 1994; Czuba, Mitamura and Nyberg, 1998) as used in the KANT Knowledge-based Machine Translation system to represent the semantics of expressions written in KANT Controlled English (KCE). It explains the formal structure of the language and describes the types of concepts that are represented.

Structure of the Interlingua

An Interlingua Frame (IF) is a recursive structure that represents a semantic concept. An IF consists of a head, which is a concept, and a number of slots of different types. Heads and slot types are both described below.

When a slot must contain more than one value, the values may be either conjoined by creating a :multiple list, or disjoined by creating a :or list. Each such list has :multiple or :or as its first element and the coordinated values as the remaining elements. These values always appear in the same order as in the source; when the coordinated values are IFs, their order matches their surface realization in the CTE input.
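
As an illustration only, the following Python sketch models a coordinated slot value as a list whose first element is the marker. The list-based modelling, the helper names coordinate and coordinated_values, and the concept *O-TRAILER are assumptions made for this example; they are not part of the KANT system.

```python
# Sketch: a coordinated value is modelled as [marker, value, value, ...].
# This Python modelling is an assumption made for illustration.

MARKERS = (":multiple", ":or")


def coordinate(marker, values):
    """Build a :multiple (conjoined) or :or (disjoined) list.

    The values are kept in the order given, mirroring the rule that
    coordinated values keep their source order.
    """
    if marker not in MARKERS:
        raise ValueError(f"unknown coordination marker: {marker!r}")
    values = list(values)
    if not values:
        raise ValueError("a coordination list needs at least one value")
    return [marker] + values


def coordinated_values(value):
    """Return the coordinated values of a list, or [value] if it is not one."""
    if isinstance(value, list) and value and value[0] in MARKERS:
        return value[1:]
    return [value]


# *O-TRUCK appears in this document; *O-TRAILER is a hypothetical concept.
both = coordinate(":multiple", [["*O-TRUCK"], ["*O-TRAILER"]])
```

Here both is the list [":multiple", ["*O-TRUCK"], ["*O-TRAILER"]], and coordinated_values(both) returns the two frames in their original order.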

The following figure shows a BNF grammar for Interlingua Frames. Note that symbol, string, and number refer to the Common Lisp data types.


<IF>                  ::= (<head> <slot>*)
<head>                ::= symbol 
<slot>                ::= <semantic-role> | <structure-role> |
                          <feature> | <role-pointer> | <variable> 
<semantic-role>       ::= (<role-name> <IF-value>)
<structure-role>      ::= (<role-name> <IF-value>) 
<feature>             ::= (<slot-name> <atom-value>)
<role-pointer>        ::= (<slot-name> <ptr-value>) 
<variable>            ::= (<slot-name> <atom-value>)
<role-name>           ::= symbol 
<slot-name>           ::= symbol 
<IF-value>            ::= <IF> | <multiple-IF-value> | <disjoint-IF-value> 
<multiple-IF-value>   ::= (:multiple { <IF> |
                                 <disjoint-IF-value> }+)
<disjoint-IF-value>   ::= (:or { <IF> |
                                 <multiple-IF-value> }+)
<atom-value>          ::= <atom> | <multiple-atom-value> |
                                 <disjoint-atom-value> 
<multiple-atom-value> ::= (:multiple { <atom> | <disjoint-atom-value> }+)
<disjoint-atom-value> ::= (:or { <atom> | <multiple-atom-value> }+)
<atom>                ::= string | symbol | number
<ptr-value>           ::= <role-name> | <multiple-ptr-value> | <disjoint-ptr-value> 
<multiple-ptr-value>  ::= (:multiple { <role-name> | <disjoint-ptr-value> }+)
<disjoint-ptr-value>  ::= (:or { <role-name> | <multiple-ptr-value> }+) 

BNF Grammar for IFs
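
The grammar can be read as a recursive well-formedness check. The following Python sketch is one possible reading, not the KANT implementation. It assumes that an IF is modelled as a nested Python list, that symbols and strings are both Python str values, and that coordination lists alternate (a :multiple list may nest :or lists and vice versa) for every kind of value. The slot names theme, attribute, manner and tense, and the values present and past, are hypothetical.

```python
# Sketch of a validator for the IF grammar over nested Python lists.
# Assumption: [head, slot, ...] models an IF and [name, value] models a slot.

COORD = (":multiple", ":or")


def is_atom(x):
    """<atom>: string, symbol or number (symbols are modelled as str here)."""
    if isinstance(x, bool):
        return False
    return isinstance(x, (str, int, float)) and x not in COORD


def is_coordination(x, is_member, kind=None):
    """(:multiple ...) or (:or ...) whose members satisfy is_member,
    or are coordination lists of the other kind."""
    if not (isinstance(x, list) and len(x) >= 2 and x[0] in COORD):
        return False
    if kind is not None and x[0] != kind:
        return False
    other = ":or" if x[0] == ":multiple" else ":multiple"
    return all(is_member(v) or is_coordination(v, is_member, other)
               for v in x[1:])


def is_value(v):
    """<IF-value> or <atom-value>; a <ptr-value> is a symbol, so it is
    covered by the atom case in this simplified model."""
    return any(is_member(v) or is_coordination(v, is_member)
               for is_member in (is_if, is_atom))


def is_slot(s):
    """Every slot type has the shape (name value)."""
    return (isinstance(s, list) and len(s) == 2
            and isinstance(s[0], str) and s[0] not in COORD
            and is_value(s[1]))


def is_if(x):
    """<IF> ::= (<head> <slot>*)"""
    if not (isinstance(x, list) and x and isinstance(x[0], str)
            and x[0] not in COORD):
        return False
    return all(is_slot(s) for s in x[1:])


# Heads are taken from this document; slot names and atoms are hypothetical.
frame = ["*A-DRIVE",
         ["theme", ["*O-TRUCK", ["attribute", ["*P-CLEAN"]]]],
         ["manner", ["*M-FREQUENTLY"]],
         ["tense", [":or", "present", "past"]]]
```

With these definitions is_if(frame) holds, while a slot with no value, such as ["tense"], is rejected. Because the grammar gives all slot types the same surface shape, the sketch cannot tell a feature from a variable; doing so would need the slot inventories, which are not modelled here.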


IF Heads

An IF head is a concept, which is represented as a symbol. In general, any symbol in an IF that begins with a * is a concept. There are 16 concept prefixes, which indicate the nature of the concept. These are:
  1. General Concept Heads. These are the most general types of concept:
    *A- Action.
    These heads correspond to verbal concepts. Example: *A-DRIVE.
    *O- Object.
    These heads correspond to nominal concepts. Example: *O-TRUCK.
    *M- Manner.
    These heads correspond to adverbial concepts. Example: *M-FREQUENTLY.
    *P- Property.
    These heads correspond to adjectival concepts. Example: *P-CLEAN.
    *K- Preposition.
    These heads correspond to prepositional concepts. Example: *K-IN.
    *INT- Intensifier.
    These heads correspond to intensifier concepts. Example: *INT-VERY.
    *CONJ- Conjunction.
    These heads correspond to conjunction concepts. Example: *CONJ-SINCE.
    *QUANT- Quantifier.
    These heads correspond to quantifier concepts. Example: *QUANT-ALL.
  2. Specialized Concept Heads. These are concept types for specialized classes of object:
    *Prop- Proper Name.
    These heads correspond to proper nominal concepts. Example: *Prop-December.
    *Sym- Symbol.
    These heads correspond to typographic symbols. Example: *Sym-Ampersand.
    *U- Unit.
    These heads correspond to units of measurement. Example: *U-Meter.
  3. Structure Concept Heads. These concept types represent more complex structured classes of object:
    *C- Crystal.
    These heads correspond to domain-independent structured concepts. Example: *C-Decimal-Number.
    *G- Grammatical Structure.
    These heads correspond to specially structured grammatical constructions. Example: *G-Coordination.
    *Q- Prepositional semantic roles.
    These heads correspond to the semantic information expressed in prepositions. Example: *Q-means_WITH.
    *S- Structured Encoding.
    These heads correspond to SGML-tagged structures. Example: *S-Propname.
    *SP- Special Structure.
    These heads are used when the information being represented is specific to the domain and not generally applicable, or when the information is overtly idiomatic in CTE. Example: *Sp-Vice-Versa.
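
The prefix convention above lends itself to a simple lookup. The following Python sketch classifies a head symbol by its prefix; the function name classify_head and the short category descriptions are assumptions made for this example. It also assumes that prefixes are compared case-insensitively, since this document writes both *SP- and *Sp-Vice-Versa and the Common Lisp reader upcases symbols by default.

```python
# Sketch: map each of the 16 concept prefixes to a short description
# paraphrased from the list above. Keys are stored in upper case.

PREFIXES = {
    "*A-": "verbal concept",
    "*O-": "nominal concept",
    "*M-": "adverbial concept",
    "*P-": "adjectival concept",
    "*K-": "prepositional concept",
    "*INT-": "intensifier concept",
    "*CONJ-": "conjunction concept",
    "*QUANT-": "quantifier concept",
    "*PROP-": "proper nominal concept",
    "*SYM-": "typographic symbol",
    "*U-": "unit of measurement",
    "*C-": "domain-independent structured concept",
    "*G-": "grammatical construction",
    "*Q-": "prepositional semantic role",
    "*S-": "SGML-tagged structure",
    "*SP-": "domain-specific or idiomatic structure",
}


def classify_head(head):
    """Return the description for a head's prefix, or None if the symbol
    does not look like a concept (for example, it lacks the leading *)."""
    prefix, hyphen, rest = head.partition("-")
    if not (head.startswith("*") and hyphen and rest):
        return None
    # Assumption: case-insensitive comparison of the prefix.
    return PREFIXES.get(prefix.upper() + "-")
```

Because each prefix ends at the first hyphen, *S-, *SP- and *Sym- never collide, so no longest-match logic is needed. For example, classify_head("*O-TRUCK") yields "nominal concept" and classify_head("TRUCK") yields None.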

IF Slots

There are five types of slot that may appear within an IF: semantic roles, structure roles, features, role pointers, and variables.


Features

The following is a list of most of the simple interlingua features.


Semantic Roles

The following is a list of the semantic roles used in the interlingua.

References:

Mitamura, Nyberg and Carbonell (1991) "An Efficient Interlingua Translation System for Multi-lingual Document Production" Proceedings of the Third Machine Translation Summit.

Leavitt, Lonsdale, and Franz (1994) "A Reasoned Interlingua for Knowledge-Based Machine Translation" Proceedings of CSCSI-94.

Czuba, Mitamura and Nyberg (1998) "Can Practical Interlinguas Be Used for Difficult Analysis Problems?" Proceedings of AMTA-98 Workshop on Interlinguas.

Contacts:

Teruko Mitamura (teruko@cs.cmu.edu), Eric Nyberg (ehn@cs.cmu.edu)

Copyright © 2004 Carnegie Mellon University