Next: Evaluation
Up: Resolution of NLP Problems
Previous: Evaluation of Zero-Pronoun Resolution
The anaphora-resolution module used in AGIR is based on the module
presented in [Ferrández et al., 1999; Palomar et al., 2001] for the SUPAR system. The algorithm
identifies noun phrase (NP) antecedents of personal, demonstrative, reflexive,
and zero pronouns in Spanish. It identifies both intrasentential and
intersentential antecedents and is applied to the syntactic
analysis generated by SUPAR. It combines different kinds of
knowledge by distinguishing between constraints and preferences:
constraints draw on several kinds of knowledge (lexical,
morphological, and syntactic), whereas preferences are heuristic
rules extracted from a study of different corpora.
A constraint defines a property that a candidate must satisfy in
order to be considered a possible solution of the anaphor.
The constraints used in the algorithm are the following:
morphological agreement (person, gender, and number) and
syntactic conditions on NP-pronoun non-co-reference.
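
The morphological agreement constraint can be illustrated with a minimal sketch. The `Mention` record, its field values, and the function names are hypothetical and do not come from SUPAR. Strict equality on all three features is a simplifying assumption, and the syntactic non-co-reference conditions are not modelled here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mention:
    """Hypothetical record for a pronoun or a candidate NP."""
    text: str
    person: int   # 1, 2, or 3
    gender: str   # "m" or "f"
    number: str   # "sg" or "pl"


def agrees(anaphor: Mention, candidate: Mention) -> bool:
    # Assumption: agreement means exact equality of person, gender, and number.
    return (
        anaphor.person == candidate.person
        and anaphor.gender == candidate.gender
        and anaphor.number == candidate.number
    )


def apply_constraints(anaphor: Mention, candidates: List[Mention]) -> List[Mention]:
    # A constraint discards every candidate that violates it.
    return [c for c in candidates if agrees(anaphor, c)]


pronoun = Mention("ella", 3, "f", "sg")
candidates = [
    Mention("Juan", 3, "m", "sg"),
    Mention("María", 3, "f", "sg"),
    Mention("las casas", 3, "f", "pl"),
]
survivors = apply_constraints(pronoun, candidates)
print([c.text for c in survivors])
```

In this toy input only "María" matches the pronoun in all three features, so it is the only candidate passed on to the preference stage.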
A preference is a characteristic that is not always satisfied by
the solution of an anaphor. The application of preferences
usually involves the use of heuristic rules in order to obtain a
ranked list of candidates. Some examples of preferences used in
our system are the following: (a) antecedents that
are in the same sentence as the anaphor, (b) antecedents that have
been repeated more than once in the text, (c) antecedents that
appear before their verbs (i.e., the verb of the clause in which
the antecedent appears), (d) antecedents that are proper nouns,
(e) antecedents that are indefinite NPs, and so on.
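
One way to turn preferences (a) to (e) into a ranked list is sketched below. The text does not state how the preferences are weighted or combined, so the equal weights, the `Candidate` fields, and the reading of (b) as "more than one mention" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate antecedent with precomputed features."""
    text: str
    same_sentence: bool   # (a) in the same sentence as the anaphor
    mentions: int         # (b) how many times the NP occurs in the text
    before_verb: bool     # (c) appears before the verb of its clause
    proper_noun: bool     # (d) is a proper noun
    indefinite: bool      # (e) is an indefinite NP


# Assumption: each preference contributes one point when it holds.
PREFERENCES: List[Callable[[Candidate], bool]] = [
    lambda c: c.same_sentence,
    lambda c: c.mentions > 1,
    lambda c: c.before_verb,
    lambda c: c.proper_noun,
    lambda c: c.indefinite,
]


def score(candidate: Candidate) -> int:
    return sum(1 for holds in PREFERENCES if holds(candidate))


def rank(candidates: List[Candidate]) -> List[Candidate]:
    # Preferences only reorder candidates; none is discarded.
    # sorted() is stable, so ties keep their input order.
    return sorted(candidates, key=score, reverse=True)


candidates = [
    Candidate("un libro", False, 1, False, False, True),
    Candidate("María", True, 3, True, True, False),
    Candidate("la mesa", True, 1, False, False, False),
]
ranked = rank(candidates)
print([(c.text, score(c)) for c in ranked])
```

Unlike a constraint, a preference that fails only lowers a candidate's position in the list, which matches the description of preferences as characteristics that are not always satisfied.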
To resolve pronominal anaphors, the system must first locate
them in the text (anaphora detection) and then resolve them
(anaphora resolution):
- Anaphora detection. In the algorithm, all the types of
anaphors are identified from left to right as they appear in the
sentence's slot structure obtained after the partial parsing.
Each type of pronoun is identified from the information stored in
its POS tag. Zero pronouns are a special case: they have already
been detected in an earlier stage, as previously described.
- Anaphora resolution. Once an anaphor has been detected,
the corresponding method, based on constraints and preferences, is applied to
resolve it. Each type of anaphor has its own set of
constraints and preferences, although they all follow the same
general algorithm: constraints are applied first, followed by
preferences. Constraints discard some of the candidates, whereas
preferences simply sort the remaining candidates.
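
The two steps above can be sketched as a small pipeline: a left-to-right scan that detects anaphors from POS tags, followed by a resolver that applies constraints first and preferences second. The tag names, the dictionary-based mentions, and the `RULES` table are hypothetical and are not SUPAR's actual tagset or rule sets. Zero pronouns are assumed to have been marked by the earlier stage.

```python
from typing import Callable, Dict, List, Optional, Tuple

Token = Tuple[str, str]                       # (word, POS tag)
Mention = Dict[str, object]
Rule = Callable[[Mention, Mention], bool]     # rule(anaphor, candidate)

# Hypothetical POS tags mapped to the four anaphor types handled by the algorithm.
PRONOUN_TAGS = {
    "PRON_PERS": "personal",
    "PRON_DEM": "demonstrative",
    "PRON_REFL": "reflexive",
    "PRON_ZERO": "zero",   # assumed to be inserted by the earlier detection stage
}


def detect_anaphors(tokens: List[Token]) -> List[Tuple[int, str]]:
    # Left-to-right scan; returns (position, anaphor type) pairs.
    return [(i, PRONOUN_TAGS[tag]) for i, (_, tag) in enumerate(tokens)
            if tag in PRONOUN_TAGS]


def resolve(anaphor: Mention, candidates: List[Mention],
            constraints: List[Rule], preferences: List[Rule]) -> Optional[Mention]:
    # Step 1: constraints discard candidates.
    remaining = [c for c in candidates
                 if all(ok(anaphor, c) for ok in constraints)]
    # Step 2: preferences sort the survivors (more preferences met ranks higher).
    ranked = sorted(remaining,
                    key=lambda c: sum(bool(p(anaphor, c)) for p in preferences),
                    reverse=True)
    return ranked[0] if ranked else None


def same_gender_number(anaphor: Mention, cand: Mention) -> bool:
    return (anaphor["gender"] == cand["gender"]
            and anaphor["number"] == cand["number"])


def is_proper_noun(anaphor: Mention, cand: Mention) -> bool:
    return bool(cand.get("proper"))


# Each anaphor type has its own (constraints, preferences) pair; one toy entry here.
RULES: Dict[str, Tuple[List[Rule], List[Rule]]] = {
    "personal": ([same_gender_number], [is_proper_noun]),
}

tokens = [("María", "NP"), ("dijo", "V"), ("que", "CONJ"),
          ("ella", "PRON_PERS"), ("vendría", "V")]
anaphors = detect_anaphors(tokens)

ella = {"text": "ella", "gender": "f", "number": "sg"}
candidates = [
    {"text": "una amiga", "gender": "f", "number": "sg", "proper": False},
    {"text": "Juan", "gender": "m", "number": "sg", "proper": True},
    {"text": "María", "gender": "f", "number": "sg", "proper": True},
]
constraints, preferences = RULES[anaphors[0][1]]
antecedent = resolve(ella, candidates, constraints, preferences)
print(anaphors, antecedent["text"] if antecedent else None)
```

Keeping the rule sets in a table keyed by anaphor type reflects the point that every type follows the same general algorithm while using its own constraints and preferences.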
Jesus Peral
2002-12-13