Several works on automatic topic detection have been published (Reynar
1999; Youmans 1991; Hearst 1994). Martínez-Barco et al. (1999) present
an automatic topic detection algorithm applied to anaphora resolution.
This algorithm selects noun phrases (NP) occurring before an anaphor.
These NPs are included in a list that is then weighted. Each time the NP
appears in a new turn (frequency), its weight is increased, and each time
the NP does not appear in a new turn (infrequency), its weight is decreased.
According to this algorithm, the dialogue topic may be determined by
salience, i.e., by finding the NP with the heaviest weight (high frequency
within a short distance) among those occurring before an anaphor. To compute
this weight, the algorithm uses the following two coefficients:

C_{f}: coefficient of frequency

C_{i}: coefficient of infrequency

C_{f} increases the salience of a referring expression when the
entity appears in the current intervention turn. C_{i} decreases
the salience of expressions that appeared in previous intervention turns
but not in the current one, indicating a loss of importance. Together, the
two coefficients make the salience of an expression reflect both its frequency
and its distance from the current intervention turn, where the anaphor
was found. The expression with the highest salience is the most
favored candidate antecedent on the whole list and therefore the most relevant
topic for the current intervention turn.
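
The weighting scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the original implementation: the names `update_salience` and `ranked_candidates` are hypothetical, NPs are matched by exact string equality, an NP counts at most once per turn, a new NP starts from a weight of 0, and weights are allowed to fall below zero. The defaults of 10 and 1 are the initial values reported for C_{f} and C_{i} in this section.

```python
def update_salience(weights, turn_nps, c_f=10, c_i=1):
    """Update NP weights in place for one intervention turn.

    weights  -- dict mapping each NP seen so far to its current weight
    turn_nps -- NPs occurring in the current intervention turn
    c_f, c_i -- coefficients of frequency and infrequency
    """
    current = set(turn_nps)  # assumption: one count per NP per turn
    # NPs seen in earlier turns but absent from this one lose salience.
    for np_ in weights:
        if np_ not in current:
            weights[np_] -= c_i
    # NPs present in this turn gain salience (assumption: new NPs start at 0).
    for np_ in current:
        weights[np_] = weights.get(np_, 0) + c_f
    return weights


def ranked_candidates(weights):
    """Return the NPs ordered from most to least salient."""
    return sorted(weights, key=lambda np_: weights[np_], reverse=True)


# Toy dialogue: each inner list holds the NPs of one intervention turn.
turns = [
    ["the ticket", "the train"],
    ["the ticket"],
    ["the ticket", "the price"],
]
weights = {}
for turn in turns:
    update_salience(weights, turn)

print(ranked_candidates(weights))
# ['the ticket', 'the price', 'the train']
```

In this toy run "the ticket" reaches a weight of 30 because it appears in every turn, while "the train" decays from 10 to 8 and falls below the newly introduced "the price" (10), which shows how both frequency and distance shape the ranking.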
This automatic topic detection method has the following advantage over
other methods: it does not obtain a single topic, but rather a list of
topic candidates ordered by salience. That is important for our anaphora
resolution system because, if the highest-ranked candidate does not fulfill
the relevant constraints, then the next highest candidate can be tested.
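
The fallback through the ordered candidate list might look like the following minimal sketch. The function `first_valid_antecedent` and the predicate `is_singular` are hypothetical names introduced for illustration; the constraints actually applied by the anaphora resolution system are not specified in this section, so a toy number-agreement check stands in for them.

```python
def first_valid_antecedent(ranked_nps, satisfies_constraints):
    """Return the most salient NP that passes the constraint check, or None."""
    for np_ in ranked_nps:
        if satisfies_constraints(np_):
            return np_
    return None


# Candidates already ordered by salience, most salient first.
ranked = ["the tickets", "the price", "the train"]


def is_singular(np_):
    # Toy stand-in constraint: a singular anaphor such as "it" rejects
    # candidates that look plural (crudely, those ending in "s").
    return not np_.endswith("s")


chosen = first_valid_antecedent(ranked, is_singular)
print(chosen)
# the price
```

A method that returned a single topic would have nothing to fall back on once "the tickets" was rejected; the ordered list lets the next candidate be tested.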
Initially, values of 10 units and 1 unit, respectively, were assigned
to C_{f} and C_{i}. These values were arrived at experimentally,
but further study could lead to more precise values.
