

The anaphora resolution algorithm (ARDi)

This section gives an intuitive presentation of the anaphora resolution algorithm for spoken dialogue systems (ARDi). ARDi operates on the syntactic information provided by the SUPP partial parser [Ferrández et al. 1998]. SUPP is based on a partial representation of slot unification grammar analysis [Ferrández et al. 1999]. This partial representation identifies some of the utterance constituents, such as NPs, PPs, and verbal chunks, and gives partial information about subordinate clauses. ARDi thus combines two kinds of knowledge about dialogues: (1) linguistic knowledge (lexical, morphological, and syntactic); and (2) knowledge about the structure of the dialogue itself, which is based on the annotation of adjacency pairs (see footnote 5) and on the manually annotated topic of the dialogue. Figure 2 shows the anaphora resolution procedure.
Procedure RESOLUTION (A, L(AAS))
Let ANAPHOR = the anaphor A
Let AAS = the anaphoric accessibility space of A
Let LIST = the list L(AAS) of all NPs (antecedent candidates) from AAS
For each NP in LIST, apply the constraints of morphological
               agreement between NP and ANAPHOR to obtain LIST1
end for
For each NP in LIST1, apply the constraints of syntactic conditions
               between NP and ANAPHOR to obtain LIST2
end for
For all NPs in LIST2, apply linguistic and discourse-structural
               preferences (in the order described in step 3 below)
               until |LIST2| = 1
end for
Return LIST2
end procedure
Figure 2: Anaphora resolution procedure
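The control flow of Figure 2 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `NP` fields, the `agrees` and `near` predicates, and the choice to treat each preference as a filter that narrows the list only when at least one candidate satisfies it are all assumptions made for the example. The Spanish noun phrases are invented sample data.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class NP:
    """Hypothetical noun-phrase record; field names are illustrative only."""
    text: str
    gender: str
    number: str
    person: int
    position: int  # assumed token offset within the dialogue


# A constraint or preference takes (candidate, anaphor) and returns True to keep.
Predicate = Callable[[NP, NP], bool]


def agrees(candidate: NP, anaphor: NP) -> bool:
    """Morphological agreement in gender, number, and person."""
    return (candidate.gender, candidate.number, candidate.person) == (
        anaphor.gender, anaphor.number, anaphor.person)


def resolution(anaphor: NP,
               candidates: Sequence[NP],
               constraints: Sequence[Predicate],
               preferences: Sequence[Predicate]) -> List[NP]:
    """Filter candidates by constraints, then narrow them with preferences."""
    remaining = list(candidates)
    # Constraints are hard filters: a failing candidate is discarded (LIST1, LIST2).
    for constraint in constraints:
        remaining = [np for np in remaining if constraint(np, anaphor)]
    # Preferences are applied in order until a single candidate is left.
    # Assumption: a preference that no candidate satisfies is skipped.
    for prefer in preferences:
        if len(remaining) <= 1:
            break
        preferred = [np for np in remaining if prefer(np, anaphor)]
        if preferred:
            remaining = preferred
    return remaining


def near(candidate: NP, anaphor: NP) -> bool:
    """Hypothetical preference: candidate is at most 3 tokens before the anaphor."""
    return 0 <= anaphor.position - candidate.position <= 3


anaphor = NP("la", "fem", "sg", 3, 10)
candidates = [
    NP("el billete", "masc", "sg", 3, 2),
    NP("la tarjeta", "fem", "sg", 3, 5),
    NP("la reserva", "fem", "sg", 3, 8),
]
result = resolution(anaphor, candidates, [agrees], [near])
print([np.text for np in result])
```

In this sample run the agreement constraint removes the masculine candidate, and the single preference then selects between the two remaining ones. Step 3 below describes the preferences as weighted, so a faithful implementation would score candidates rather than filter them one preference at a time.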
ARDi is based, intuitively, on the following three steps:
  1. Obtain all possible antecedents from dialogue structure and topic as follows:
    1. take those NPs that are included in the same adjacency pair (AP) as the anaphor, and
    2. take those NPs that are included in the AP preceding the one that contains the anaphor, and
    3. take those NPs that are included in the most recent unclosed AP that encloses the AP containing the anaphor, and
    4. take the topic of the dialogue
  2. Discard incompatible antecedents by applying linguistic constraints, as follows:
    1. for pronominal anaphora:
      1. discard those antecedents that do not agree in gender, number, and person
      2. discard the antecedents that are non-co-referential according to the following rule:
         A pronoun P is non-co-referential with a (non-reflexive or non-reciprocal) noun phrase N if any of the following conditions (see footnote 6) holds:
        • P and N are in the same utterance and clause, and P and N modify the head of the same NP
        • P and N are in the same utterance and clause, and P does not modify the head of any NP
    2. for adjectival anaphora:
      1. discard those antecedents that do not agree in gender
      2. discard those antecedents whose head noun is not of the lexical category ``COMMON''
  3. If more than one antecedent is left, filter the remaining antecedents by applying the following weighted preferences:
    1. for pronominal anaphora:
      1. antecedents that are in the same AP as the anaphor (weight = 35)
      2. antecedents that are in the AP preceding the one that contains the anaphor (weight = 20)
      3. antecedents that are in the most recent unclosed AP (weight = 30)
      4. antecedents in the topic (weight = 15)
      5. antecedents that appear with the verb of the anaphor more than once (weight = 5)
      6. antecedents that are in the same position with reference to the verb as the anaphor (before or after) (weight = 5)
      7. antecedents that are in the same position with reference to the utterance as the anaphor (weight = 5)
      8. the nearest antecedent to the anaphor (used when more than one candidate obtains the highest value)
    2. for adjectival anaphora:
      1. antecedents that are in the same AP as the anaphor (weight = 35)
      2. antecedents that are in the AP preceding the one that contains the anaphor (weight = 10)
      3. antecedents that are in the most recent unclosed AP (weight = 10)
      4. antecedents in the topic (weight = 35)
      5. antecedents that share the same kind of modifiers (e.g., prepositional phrases, adjectives, and so on) (weight = 5)
      6. antecedents with exactly the same modifiers (e.g., the same adjective 'red') (weight = 5)
      7. antecedents that agree in number (weight = 5)
      8. the nearest antecedent to the anaphor (used when more than one candidate obtains the highest value)
These preferences were developed as a result of the empirical study explained in the following section.
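The weighted preferences of step 3 can be illustrated with a short Python sketch. The weights are taken from the lists above. Everything else is an assumption made for the example: the feature names, the `Candidate` record, the `distance` field, and in particular the decision to sum the weights of all preferences a candidate satisfies. The text only states that the nearest antecedent is chosen when more than one candidate obtains the highest value.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Sequence

# Weights copied from step 3; the dictionary keys are hypothetical feature names.
PRONOMINAL_WEIGHTS: Dict[str, int] = {
    "same_ap": 35,
    "previous_ap": 20,
    "unclosed_ap": 30,
    "topic": 15,
    "repeated_with_verb": 5,
    "same_position_wrt_verb": 5,
    "same_position_in_utterance": 5,
}

ADJECTIVAL_WEIGHTS: Dict[str, int] = {
    "same_ap": 35,
    "previous_ap": 10,
    "unclosed_ap": 10,
    "topic": 35,
    "same_kind_of_modifiers": 5,
    "identical_modifiers": 5,
    "number_agreement": 5,
}


@dataclass(frozen=True)
class Candidate:
    """Hypothetical antecedent candidate that survived the constraints."""
    text: str
    distance: int              # assumed distance from the anaphor (smaller = nearer)
    features: FrozenSet[str]   # names of the preferences this candidate satisfies


def score(candidate: Candidate, weights: Dict[str, int]) -> int:
    """Assumption: a candidate's value is the sum of the weights it satisfies."""
    return sum(w for name, w in weights.items() if name in candidate.features)


def select_antecedent(candidates: Sequence[Candidate],
                      weights: Dict[str, int]) -> Candidate:
    """Return the highest-valued candidate; ties go to the nearest one."""
    if not candidates:
        raise ValueError("no candidates left after applying the constraints")
    return min(candidates, key=lambda c: (-score(c, weights), c.distance))


# Invented sample data for illustration.
tren = Candidate("el tren", 12, frozenset({"previous_ap", "topic"}))
billete = Candidate("el billete", 4, frozenset({"same_ap"}))

best_pronominal = select_antecedent([tren, billete], PRONOMINAL_WEIGHTS)
best_adjectival = select_antecedent([tren, billete], ADJECTIVAL_WEIGHTS)
print(best_pronominal.text, best_adjectival.text)
```

With the pronominal weights both sample candidates reach 35, so the nearer one is selected. With the adjectival weights the topic preference carries more weight, and the candidate that is in the topic wins outright.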


patricio 2001-10-17