The anaphora resolution algorithm (ARDi)



This section presents the intuitive algorithm for anaphora resolution in spoken dialogue systems (ARDi). ARDi operates on syntactic information provided by the SUPP partial parser [Ferrández et al. 1998]. SUPP is based on a partial representation of slot unification grammar analysis [Ferrández et al. 1999]. This partial representation yields some of the utterance constituents, such as NPs, PPs, and verbal chunks, together with partial information about subordinate clauses. ARDi thus combines two kinds of knowledge about dialogues: (1) linguistic knowledge, such as lexical, morphological, and syntactic knowledge; and (2) knowledge about the structure of the dialogue itself, which is based on the annotation of adjacency pairs⁵ and on knowledge about the topic of the dialogue (manually annotated). Figure 2 shows the anaphora resolution procedure.

**Figure 2:** Anaphora resolution procedure
    Procedure RESOLUTION (A, L(AAS))
      Let ANAPHOR = the anaphor A
      Let AAS = anaphoric accessibility space from A
      Let LIST = a list L(AAS) of all NPs (antecedent candidates) from AAS
      For each NP in LIST,
        apply constraints of morphological agreement between NP and ANAPHOR to obtain LIST1
      end for
      For each NP in LIST1,
        apply constraints of syntactic conditions between NP and ANAPHOR to obtain LIST2
      end for
      For all NP in LIST2,
        apply linguistic and discourse structural preferences
        (in the order described in step 3 below) until |LIST2| = 1
      end for
      Return LIST2
    end procedure
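The control flow of Figure 2 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Mention` record, its fields, the `score` callback, and the Spanish example NPs are all hypothetical, and the preference stage is simplified to "keep the best-scoring candidate, then break ties by distance to the anaphor".

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Mention:
    """Hypothetical record for an anaphor or a candidate NP."""
    text: str
    gender: str    # e.g. "f" or "m"
    number: str    # e.g. "sg" or "pl"
    person: int
    position: int  # assumed token offset within the dialogue


# A constraint keeps (True) or discards (False) a candidate for a given anaphor.
Constraint = Callable[[Mention, Mention], bool]
# A scorer returns the total preference weight a candidate earns (assumption).
Scorer = Callable[[Mention, Mention], int]


def morphological_agreement(np: Mention, anaphor: Mention) -> bool:
    """Agreement in gender, number, and person (pronominal case)."""
    return (np.gender, np.number, np.person) == (
        anaphor.gender, anaphor.number, anaphor.person)


def resolution(anaphor: Mention,
               candidates: Sequence[Mention],
               constraints: Sequence[Constraint],
               score: Scorer) -> List[Mention]:
    """Filter candidates with constraints, then rank survivors by preference."""
    remaining = list(candidates)
    # LIST -> LIST1 -> LIST2: each constraint discards incompatible NPs.
    for constraint in constraints:
        remaining = [np for np in remaining if constraint(np, anaphor)]
    if len(remaining) <= 1:
        return remaining
    # Preference stage: keep the top-scoring NPs ...
    best = max(score(np, anaphor) for np in remaining)
    top = [np for np in remaining if score(np, anaphor) == best]
    # ... and, if several tie, the one nearest to the anaphor.
    return [min(top, key=lambda np: abs(anaphor.position - np.position))]


# Made-up example data, for illustration only.
anaphor = Mention("la", "f", "sg", 3, position=12)
candidates = [
    Mention("el billete", "m", "sg", 3, position=3),
    Mention("la reserva", "f", "sg", 3, position=6),
    Mention("la tarjeta", "f", "sg", 3, position=9),
]
result = resolution(anaphor, candidates,
                    [morphological_agreement], lambda np, a: 0)
print([m.text for m in result])  # ['la tarjeta']
```

In this toy run the masculine NP is removed by the agreement constraint, the two remaining NPs tie on score, and the nearest one is returned.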

ARDi is based, intuitively, on the following three steps:

1. Obtain all possible antecedents from the dialogue structure and topic, as follows:

   - take the NPs that are in the same adjacency pair (AP) as the anaphor, and
   - take the NPs that are in the AP preceding the one that contains the anaphor, and
   - take the NPs that are in the most recent unclosed AP that contains the anaphor's AP, and
   - take the topic of the dialogue

2. Discard incompatible antecedents by applying linguistic constraints, as follows:

   - for pronominal anaphora:
     - discard those antecedents that do not agree in gender, number, and person
     - discard the antecedents that are non-co-referent according to the following rule,⁶ where P is the pronoun and N is the candidate NP, and either condition marks them as non-co-referent:
       - P and N are in the same utterance and clause, and P and N modify the head of the same NP
       - P and N are in the same utterance and clause, and P does not modify the head of any NP

   - for adjectival anaphora:
     - discard those antecedents that do not agree in gender
     - discard those antecedents whose head noun is not of the lexical category "COMMON"

3. If more than one antecedent is left, filter the remaining antecedents by applying the following weighted preferences:

   - for pronominal anaphora:
     - antecedents that are in the same AP as the anaphor (weight = 35)
     - antecedents that are in the AP preceding the one that contains the anaphor (weight = 20)
     - antecedents that are in the most recent unclosed AP (weight = 30)
     - antecedents in the topic (weight = 15)
     - antecedents that appear with the verb of the anaphor more than once (weight = 5)
     - antecedents that are in the same position with reference to the verb as the anaphor (before or after) (weight = 5)
     - antecedents that are in the same position with reference to the utterance as the anaphor (weight = 5)
     - the nearest antecedent to the anaphor (used when more than one candidate obtains the highest value)

   - for adjectival anaphora:
     - antecedents that are in the same AP as the anaphor (weight = 35)
     - antecedents that are in the AP preceding the one that contains the anaphor (weight = 10)
     - antecedents that are in the most recent unclosed AP (weight = 10)
     - antecedents in the topic (weight = 35)
     - antecedents that share the same kind of modifiers (e.g., prepositional phrases, adjectives, and so on) (weight = 5)
     - antecedents with exactly the same modifiers (e.g., the same adjective 'red') (weight = 5)
     - antecedents that agree in number (weight = 5)
     - the nearest antecedent to the anaphor (used when more than one candidate obtains the highest value)

These preferences were developed as a result of the empirical study explained in the following section.
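To make the effect of the two weight tables concrete, the following Python sketch scores candidates under each table. The weights are those listed above; everything else is an assumption for illustration: the feature names, the representation of a candidate as a set of satisfied preferences, and in particular the choice to sum the weights of all satisfied preferences (the text speaks of a "highest value" but does not spell out how values are combined).

```python
from typing import Dict, List, Set, Tuple

# Weights copied from the preference lists; the key names are hypothetical.
PRONOMINAL_WEIGHTS: Dict[str, int] = {
    "same_ap": 35,
    "previous_ap": 20,
    "unclosed_ap": 30,
    "topic": 15,
    "repeated_with_verb": 5,
    "same_position_verb": 5,
    "same_position_utterance": 5,
}

ADJECTIVAL_WEIGHTS: Dict[str, int] = {
    "same_ap": 35,
    "previous_ap": 10,
    "unclosed_ap": 10,
    "topic": 35,
    "same_modifier_kind": 5,
    "same_modifiers": 5,
    "number_agreement": 5,
}

# (label, names of satisfied preferences, distance to the anaphor)
Candidate = Tuple[str, Set[str], int]


def score(features: Set[str], weights: Dict[str, int]) -> int:
    """Sum the weights of the satisfied preferences (assumed combination rule)."""
    return sum(weights[name] for name in features)


def pick_antecedent(candidates: List[Candidate], weights: Dict[str, int]) -> str:
    """Highest score wins; among equal scores, the nearest candidate wins."""
    best = min(candidates, key=lambda c: (-score(c[1], weights), c[2]))
    return best[0]


# Made-up candidates: one from the preceding AP, one that is the dialogue topic.
candidates: List[Candidate] = [
    ("NP_prev", {"previous_ap"}, 8),
    ("NP_topic", {"topic"}, 15),
]

print(pick_antecedent(candidates, PRONOMINAL_WEIGHTS))  # NP_prev  (20 vs 15)
print(pick_antecedent(candidates, ADJECTIVAL_WEIGHTS))  # NP_topic (10 vs 35)
```

The same two candidates are ranked differently depending on the type of anaphor, which reflects the different weights given to the preceding AP and to the topic in the two lists.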


patricio 2001-10-17