Pronominal Anaphora Translation into English

In this experiment, we evaluated the translation into English of Spanish third-person personal pronouns and zero pronouns (excluding reflexive pronouns). We tested the method on the portion of the LEXESP corpus that had previously been used in the anaphora resolution process.

To apply the number and gender rules, we needed to know the semantic category and the grammatical gender of the pronoun's antecedent. Because the LEXESP corpus lacks semantic information, a set of heuristics was used to determine the antecedent's semantic category. The antecedent's gender, in contrast, was provided by the POS tag of the antecedent's head. We conducted a blind test over the entire test corpus; the results appear in Table 12.

Table 12: Translation of pronominal anaphora into English, evaluation phase

Corpus	Subject	Complement	Correct	Total	P(%)
LEXESP	630	145	657	775	84.8
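
The precision figure in Table 12 can be rechecked from the counts in the same row. The following is only a small arithmetic sketch; the variable names are illustrative and do not come from the original system.

```python
# Counts taken from the LEXESP row of Table 12.
subject_pronouns = 630      # pronouns in subject position
complement_pronouns = 145   # pronouns in complement position
correct = 657               # correctly translated pronouns

# The total is the sum of both syntactic positions.
total = subject_pronouns + complement_pronouns

# Precision as a percentage of correctly translated pronouns.
precision = 100 * correct / total

print(total)                # 775
print(round(precision, 1))  # 84.8
```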

Discussion. Translating Spanish third-person personal pronouns into English yielded an overall precision of 84.8% (657 out of 775). From these results, we drew the following conclusions:

All instances of the Spanish plural pronouns (ellos, ellas, les, los, las, and the plural zero pronouns, which correspond to the English pronouns they and them) were correctly translated into English. There are two reasons for this:
- The semantic roles of these pronouns were correctly identified in all of the cases.
- The equivalent English pronouns (they and them) carry no gender information; that is, they are valid for both masculine and feminine antecedents. Therefore, the antecedent's gender did not influence the translation of these pronouns.
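
The two reasons above can be illustrated with a minimal sketch. It assumes that the only information needed for a plural pronoun is its role, encoded here with the hypothetical labels "subject" and "complement"; the function name and labels are illustrative and are not taken from the AGIR system.

```python
# Spanish plural third-person forms listed in the text (zero pronouns have no
# surface form, so they are represented only by their role).
PLURAL_FORMS = {"ellos", "ellas", "les", "los", "las"}


def translate_plural(role):
    """Return the English plural pronoun for a given role.

    `role` is a hypothetical label: "subject" or "complement".
    No gender argument is needed, because "they" and "them" are
    valid for both masculine and feminine antecedents.
    """
    if role == "subject":
        return "they"
    if role == "complement":
        return "them"
    raise ValueError(f"unknown role: {role!r}")


print(translate_plural("subject"))     # they
print(translate_plural("complement"))  # them
```

Because the antecedent's gender never enters the function, an antecedent proposed with the wrong gender cannot change the output, which is consistent with the absence of errors for plural pronouns.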

The errors occurred in the translation of the Spanish singular pronouns (él, ella, le, lo, la, and the singular zero pronouns, which correspond to the English pronouns he, she, it, him, and her). These errors had two causes:
- Mistakes in the anaphora-resolution stage (79.7% of all errors), which caused an incorrect translation into English, mainly because the proposed antecedent and the correct one had different grammatical genders. Sometimes both had the same gender but different semantic categories.
- Mistakes in the application of the heuristic used to identify the antecedent's semantic category (20.3% of all errors). These involved the application of an incorrect morphological rule.
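
The following sketch shows why both kinds of mistake surface only in singular pronouns. It is an assumption-laden simplification, not the rules of the original system: it assumes the hypothetical category labels "person" and "object", the gender labels "masculine" and "feminine", and that every non-person antecedent is rendered as "it".

```python
def translate_singular(role, category, gender):
    """Choose an English singular pronoun from antecedent features.

    All labels are hypothetical: role is "subject" or "complement",
    category is "person" or anything else, gender is "masculine"
    or "feminine".
    """
    # Assumption: non-person antecedents are always translated as "it".
    if category != "person":
        return "it"
    if role == "subject":
        return "he" if gender == "masculine" else "she"
    return "him" if gender == "masculine" else "her"


# Correct antecedent: a feminine person in subject position.
correct = translate_singular("subject", "person", "feminine")

# Resolution error: the proposed antecedent has a different gender.
wrong_gender = translate_singular("subject", "person", "masculine")

# Same gender, but a different semantic category.
wrong_category = translate_singular("subject", "object", "feminine")

print(correct, wrong_gender, wrong_category)  # she he it
```

Under these assumptions, any error in the antecedent's gender or semantic category propagates directly to the English pronoun, unlike in the plural case.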

Our proposal was compared with the SYSTRANLinks output. As shown in Table 13, the precision obtained by the AGIR system was approximately 28 percentage points higher than that obtained by Systran (84.8% versus 56.9%).

Table 13: Translation of pronominal anaphora into English, SYSTRANLinks and AGIR

Corpus	SYSTRANLinks P(%)	AGIR P(%)
LEXESP	56.9	84.8

Systran's low precision is mainly due to errors in the translation of Spanish zero pronouns. Specifically, 334 of the 775 Spanish pronouns were translated incorrectly; 293 of these errors (87.7% of all errors) originated in the translation of zero pronouns, whereas the remaining 41 (12.3%) originated in the translation of overt pronouns. The errors in the translation of zero pronouns mainly stemmed from their incorrect resolution.
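
The Systran figures can be cross-checked against Table 13 with a few lines of arithmetic. This is only a consistency check on the numbers reported in the text; the variable names are illustrative.

```python
# Figures reported in the text for the Systran output.
total_pronouns = 775
systran_errors = 334
zero_pronoun_errors = 293

# Errors on pronouns that were not omitted in the Spanish source.
overt_pronoun_errors = systran_errors - zero_pronoun_errors

# Precision implied by the error count (Table 13 reports 56.9).
systran_precision = 100 * (total_pronouns - systran_errors) / total_pronouns

# Share of each error source among all Systran errors.
zero_share = 100 * zero_pronoun_errors / systran_errors
overt_share = 100 * overt_pronoun_errors / systran_errors

# Gap between the two systems in percentage points (values from Table 13).
agir_precision = 84.8
gap_points = agir_precision - round(systran_precision, 1)

print(overt_pronoun_errors)         # 41
print(round(systran_precision, 1))  # 56.9
print(round(zero_share, 1))         # 87.7
print(round(overt_share, 1))        # 12.3
print(round(gap_points, 1))         # 27.9
```

The gap of 27.9 is a difference in percentage points; it is not a relative improvement over Systran's precision.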

Jesus Peral 2002-12-13