
Character Identification in a Children's Story

It is not enough to merely identify the pieces of speech in a story. In order to model each piece of speech with an appropriately different voice, it is vital that we also identify the characters in the story, all of whom are potential speakers. This is, in a sense, a named-entity extraction task: we first need to identify all the proper names in a particular story and then extract, from these named entities, only those that are names of characters in the story. For this purpose we evaluated one of the most commonly used named-entity extraction systems, the BBN IdentiFinder [5], which scans through a body of text, locates the names of people, places, and other named entities of interest to the user, and outputs these entities in a markup format. We tested the BBN IdentiFinder on two manually labeled children's stories selected for their contrasting styles, each containing between 14 and 16 characters. Results are shown in Tables 3 and 4.

However, it is not sufficient to confine the scope of character identification to proper names alone. For instance, the works of Hans Christian Andersen contain a considerable number of characters who are not named but merely referred to by descriptions, for example "the peasant's wife" or "the man with the sheep". In these cases we need additional linguistic information to make the proper identification.

To this end, we have created a character identification module within ESPER. This module uses pattern matching to extract proper names from the story, similar to the functionality of the BBN IdentiFinder. In addition, it uses part-of-speech information derived from the probabilistic POS tagger in the Festival speech synthesis system to extract non-proper names, such as definite noun phrases, as potential character names. Like the BBN IdentiFinder, our character identification module encapsulates the extracted entities in the text using a markup language, specifically CSML, in which each character is assigned an ID for reference purposes as well as a CLASS representing its type (i.e., whether the character name is a proper name, a definite NP, etc.), which is useful in the speaker identification stage. An example of CSML markup:

<CHARACTER ID="LITTLE_TUK" CLASS="properName">Little Tuk</CHARACTER>
sprang out of bed quickly and read over his lesson in the book...
<CHARACTER ID="THE_OLD_WASHERWOMAN" CLASS="defNP">The old washerwoman</CHARACTER>
put her head in at the door, and nodded to him quite kindly...
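The following is a minimal sketch of how such markup might be produced. It is not the actual ESPER implementation: ESPER combines pattern matching with POS tags from the Festival tagger, whereas the regular expressions, stopword list, and helper names below (markup, char_id, tag) are illustrative assumptions only.

import re

# Illustrative heuristics only; ESPER itself uses POS information from the
# Festival tagger rather than regular expressions alone.
STOPWORDS = {"The", "A", "An", "He", "She", "It", "They", "But", "And", "Then"}

PROPER_NAME = re.compile(r"[A-Z][a-z]+(?: [A-Z][a-z]+)*")
DEFINITE_NP = re.compile(r"[Tt]he (?:[a-z]+ )?[a-z]+")

def char_id(name):
    """'Little Tuk' -> 'LITTLE_TUK' (used as the CSML ID attribute)."""
    return name.upper().replace(" ", "_")

def tag(name, cls):
    return '<CHARACTER ID="%s" CLASS="%s">%s</CHARACTER>' % (char_id(name), cls, name)

def markup(text):
    """Wrap candidate character mentions in CSML CHARACTER elements."""
    spans = []  # (start, end, class) of each candidate mention
    for m in PROPER_NAME.finditer(text):
        if m.group(0) not in STOPWORDS:
            spans.append((m.start(), m.end(), "properName"))
    for m in DEFINITE_NP.finditer(text):
        # Skip definite NPs that overlap an already-found proper name.
        if not any(s < m.end() and m.start() < e for s, e, _ in spans):
            spans.append((m.start(), m.end(), "defNP"))
    out, last = [], 0
    for start, end, cls in sorted(spans):
        out.append(text[last:start])
        out.append(tag(text[start:end], cls))
        last = end
    out.append(text[last:])
    return "".join(out)

if __name__ == "__main__":
    sample = ("Little Tuk sprang out of bed quickly and read over his lesson "
              "in the book. The old washerwoman put her head in at the door.")
    print(markup(sample))

On the sample text this tags "Little Tuk" as a proper name and "The old washerwoman" as a definite NP, but it also tags phrases such as "the book" and "the door", illustrating the over-generation (high recall, lower precision) discussed below.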
We tested the performance of both the BBN IdentiFinder and ESPER's character identification module; the results are as follows:


Table 3: Performance of the BBN IdentiFinder and the ESPER Character-Markup Module on Alice In Wonderland, Chapter 3.

        Recall   Precision
BBN     77.8%    73.7%
ESPER   88.9%    53.2%


Table 4: Performance of the BBN IdentiFinder and the ESPER Character-Markup Module on Little Tuk.

        Recall   Precision
BBN     61.5%    53.3%
ESPER   76.9%    38.5%

From these results we observe that ESPER achieves higher recall than the BBN IdentiFinder, since it is able to identify more of the characters in the story. At the same time, it also retrieves more non-relevant entities as character names, and hence suffers from lower precision than the BBN IdentiFinder. In our case, however, it is more important to have high recall (i.e., not to overlook potential character names). If the actual set of story characters is a subset of what we retrieve, we can apply discrimination factors or relevance rankings in later processing steps to filter the non-relevant entity names out of our retrieval set. On the other hand, if an actual character name is not included in our retrieval set, there is no way to recover it in later processing steps.
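As a concrete illustration of this trade-off, recall and precision over the extracted character sets can be computed as below; the character names used here are hypothetical placeholders, not the annotated data from the two test stories.

def recall_precision(retrieved, relevant):
    """Recall = |retrieved and relevant| / |relevant|;
    precision = |retrieved and relevant| / |retrieved|."""
    hits = len(retrieved & relevant)
    return hits / len(relevant), hits / len(retrieved)

# Hypothetical example: gold-standard characters vs. extracted entities.
gold = {"LITTLE_TUK", "THE_OLD_WASHERWOMAN", "AUGUSTA", "GUSTAVA"}
extracted = {"LITTLE_TUK", "THE_OLD_WASHERWOMAN", "AUGUSTA", "THE_BOOK", "THE_DOOR"}

r, p = recall_precision(extracted, gold)
print("recall = %.1f%%, precision = %.1f%%" % (100 * r, 100 * p))
# -> recall = 75.0%, precision = 60.0%: the extra non-character entities lower
#    precision and must be filtered out in later processing, whereas a missed
#    character (here GUSTAVA) can never be recovered downstream.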

