
3. Discourse Strategies for Generating Captions

Explanations about informational graphics can be classified into at least three categories based on the structural properties of the picture, the structure of the underlying data attributes, and their mapping to spaces and graphemes. These explanation strategies reflect the overall structure of the graphic presentation: whether the spaces are aligned along a common axis, and whether the graphic is organized around the functionally independent attribute (FIA). An attribute is functionally independent if it uniquely determines the values of all other attributes. For example, in one of our current datasets about house sales, the house's street address has been specified as the FIA; it uniquely determines the asking price, selling price, and the other attributes in the database. In contrast, the listing agency does not uniquely determine any of the other attributes in the house-sales relation.
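The definition of functional independence can be illustrated with a small check over a relation. The following is a minimal sketch, not part of the system described here: in the text the FIA is specified for the dataset rather than inferred, the function name `is_functionally_independent` is hypothetical, and the sample rows are invented for illustration (they are not taken from PGH-23). Note that a check of this kind only tests one instance of the data.

```python
from typing import Dict, Hashable, List

Row = Dict[str, Hashable]


def is_functionally_independent(rows: List[Row], attribute: str) -> bool:
    """Return True if each value of `attribute` occurs with exactly one
    combination of the other attributes in `rows`."""
    seen: Dict[Hashable, Row] = {}
    for row in rows:
        key = row[attribute]
        if key in seen and seen[key] != row:
            # Same value, different tuple: the attribute is not an identifier.
            return False
        seen[key] = row
    return True


# Hypothetical rows; names and values are illustrative assumptions only.
houses = [
    {"address": "12 Elm St", "agency": "Agency A", "asking_price": 90000},
    {"address": "34 Oak Ave", "agency": "Agency A", "asking_price": 120000},
    {"address": "56 Pine Rd", "agency": "Agency B", "asking_price": 75000},
]

address_is_fia = is_functionally_independent(houses, "address")
agency_is_fia = is_functionally_independent(houses, "agency")
```

In this toy relation the address identifies every row, whereas "Agency A" appears with two different houses, which mirrors the contrast between the street address and the listing agency drawn above.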

In addition to the factors mentioned above, which are used to select the overarching discourse strategy, the system also uses information about the symbols in the display and their mappings to select and organize the information presented in the caption. For instance, the system uses graphical information to determine the order in which information is presented. This reasoning can occur at various levels of the picture representation: at the space level (all objects in a space are described before objects in another space), at the grapheme cluster level (all objects in a cluster are described together), and at the encoder level (all objects that map the same attribute type are described together). SAGE's representation of the graphical display thus provides additional information that can be considered when text explanations are generated.
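One way to picture this ordering is as a sort over the mappings in the display. The sketch below is an assumption made for illustration: the `Mapping` record, its fields, and the decision to nest the three levels into a single sort key are hypothetical and do not describe SAGE's actual picture representation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mapping:
    space: int           # index of the space (chart or table) in the display
    cluster: int         # index of the grapheme cluster within that space
    attribute_type: str  # type of the encoded attribute, e.g. "price"
    attribute: str       # the data attribute being encoded


def presentation_order(mappings: List[Mapping]) -> List[Mapping]:
    """Order mappings so that each space is finished before the next one,
    clusters stay together within a space, and mappings of the same
    attribute type stay together within a cluster."""
    return sorted(mappings, key=lambda m: (m.space, m.cluster, m.attribute_type))


# Hypothetical mappings, loosely modeled on the house-sales charts.
mappings = [
    Mapping(1, 0, "date", "date-on-market"),
    Mapping(0, 1, "price", "agency-estimate"),
    Mapping(0, 0, "price", "asking-price"),
    Mapping(0, 0, "price", "selling-price"),
]

ordered_attributes = [m.attribute for m in presentation_order(mappings)]
```

Because Python's `sorted` is stable, mappings that tie on all three levels keep their original relative order.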


These three charts show information about houses from data set PGH-23. Each chart has two axes. The Y-axis identifies the houses in the three charts. The data set contains 18 items. The X-axis in the first chart indicates house prices. The origin is at zero and there are five ticks on the axis, with the minimum value being $44,000. The difference between each tick is $110,000. The values mapped to the axis range from $55,000 to $399,000. The left edge of the bar shows the asking price of a house. Selling prices shown range from $55,000 to $387,000. Asking prices range from $61,000 to $399,000. The horizontal position of the square mark shows the agency estimate. These range from $55,000 to $387,000. For example…

Figure 5

A fragment of one possible verbose caption for the graphic in Figure 6.



The process of generating natural language explanations can be divided into three conceptual stages: (i) select a discourse strategy that provides the overall organization of the explanation, based on the structural properties of the graphical presentation, the relations expressed in the dataset, and the data-to-grapheme mappings; (ii) within each space of the presentation, use the complexity metric to determine the amount of detail to include in the explanation; and (iii) reason about the tactical decisions in sentence planning.



Figure 6

Graphic with the caption generated using strategy 1.

These three charts show information about houses from data set PGH-23. The Y-axis identifies the houses in the three charts. In the first chart, house prices are shown by the X-axis. The house’s selling price is shown by the left edge of the bar, whereas the asking price is shown by the right edge. The horizontal position of the mark shows the agency estimate. For example, as shown in the highlighted tuple, the asking price of 3237 Beechwood is $82K, its selling price is $75K, and the agency estimate is $81K. In the second chart, the house’s date on the market is shown by the left edge of a bar, whereas the date sold is shown by the right edge. Color indicates the neighborhood. The third chart shows the listing agency.



In our current application, content selection mainly consists of determining the complex or ambiguous aspects of a graphic presentation. In general, knowledge-based systems cannot afford to generate a paraphrase of the entire knowledge base. As illustrated by Figure 5, an explanation that includes all the facts in the underlying picture representation or data set, for even a simple graphic in SAGE, would be extremely verbose. Most of the facts expressed in such a caption would be both obvious and unnecessary for the average user. Studies have shown that approximately three-fourths of the time users spend interpreting a graphic goes to understanding the data-to-grapheme mappings (Shah, 1995; Cleveland and McGill, 1987). Therefore, our initial goal was to generate captions describing only those mappings that might be either complex or ambiguous for the average user. The system can currently analyze a picture representation for five different types of complexity and ambiguity; these are discussed in greater detail in Section 4. The present section discusses the three strategies used by the system to structure the content during text planning. The sentence planning phase is discussed in Section 5, where the individual components implementing the tactical decisions in the micro-planner are described in detail.

3.1 Strategy 1: Graphic organized around the functionally independent attribute

As mentioned earlier, the three strategies used by the caption generator depend upon both the structure of the graphic presentation and the relations in the data set presented in the graphic. The first strategy can be applied when the data set contains a functionally independent attribute (FIA) that is used as an organizing device or "anchor" for the entire graphic. This occurs either when the graphic has only one space and the FIA is mapped to one of the axes, or when there are multiple spaces and the FIA is mapped to the axis of alignment.



This chart and table show information about house sales from data set PGH-23. The Y-axis identifies the houses in the two spaces. In the chart, dates are shown along the X-axis. The house’s date on the market is shown by the left edge of a bar, whereas the date sold is shown by the right edge. Color indicates the listing agency. The label to the left of a bar indicates the asking price, whereas the label to the right indicates the selling price. The table shows the agency estimate.

Figure 7

Caption for an alternative presentation of the dataset used in Figure 6.


In such cases, the strategy attempts to reinforce the organizing role of the functionally independent attribute. The explanation first identifies the anchor and the independent attribute, and then describes each space in the picture relative to the anchor. Domain attributes mapped in the graphic are also mentioned in the context of the FIA and the type of relationship defined between them (one:one or one:many). Two SAGE-generated graphics and the associated explanations that illustrate this organizing principle are shown in Figures 6 and 7. These two figures illustrate the importance of a caption generator in this application. Both present the same data set about house sales, but the presentations generated by SAGE differ: they use different mappings and give rise to different perceptual complexities. Consequently, the content of the generated captions also differs. In both captions, however, the overall discourse strategy is the same: to emphasize the aligning Y-axis and the functionally independent attribute mapped to it (the house address), and to structure the description of the other attributes in terms of the FIA.
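The ordering imposed by strategy 1 (anchor first, then each space relative to the anchor) can be sketched as a simple outline builder. This is an illustrative assumption rather than the system's text planner: the function `strategy_one_outline`, its parameters, and the sentence templates are all hypothetical, and the real captions are produced by the planning stages described in this paper.

```python
from typing import Dict, List


def strategy_one_outline(
    fia: str, anchor_axis: str, spaces: Dict[str, List[str]]
) -> List[str]:
    """Build a caption outline that names the anchor first and then
    describes each space in terms of the FIA (hypothetical templates)."""
    outline = [f"The {anchor_axis} identifies the {fia} in all spaces."]
    # Dicts preserve insertion order, so spaces are described in display order.
    for space_name, attributes in spaces.items():
        listed = ", ".join(attributes)
        outline.append(f"For each {fia}, the {space_name} shows: {listed}.")
    return outline


# Attribute names follow the Figure 6 caption; the structure is an assumption.
outline = strategy_one_outline(
    fia="house",
    anchor_axis="Y-axis",
    spaces={
        "first chart": ["selling price", "asking price", "agency estimate"],
        "second chart": ["date on the market", "date sold", "neighborhood"],
        "third chart": ["listing agency"],
    },
)
```

The same outline builder applied to a different set of spaces would change the content of the caption but not its organization, which is the point the comparison of Figures 6 and 7 makes.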


 



This chart and table show information about house sales from data set PGH-23. It emphasizes the relationship between house prices and the number of days on the market. The X-axis shows the house prices, whereas the Y-axis shows the house’s number of days on the market. The house’s listing agency is indicated by color. The selling price is shown by the left edge of the bar, whereas the asking price is shown by the right edge. The position of the mark shows the agency estimate.

Figure 8

Graphic with caption generated using strategy 2.


3.2 Strategy 2: Single space organized around dependent attributes

When the graphic is organized around dependent attributes, the explanation cannot be structured around any of them. This is because a dependent attribute may participate in one:many or many:many relationships in the dataset and therefore cannot be used as an identifier. This is the case in Figures 8 and 9. In these two figures, the attributes that are mapped to the axes of the charts are dependent attributes such as days-on-market, number-of-rooms, and lot-size. None of these can be used to refer to other attributes unambiguously. Thus, the discourse strategy cannot be the same as in the case where an FIA is mapped along one of the axes. Instead, the explanation emphasizes the relation between the dependent attribute(s) that serve as organizer(s). Two strategies apply, depending on whether the figure consists of multiple spaces. If there is only a single space in the graphic, the explanation emphasizes the relation between the attributes encoded against the two axes. A SAGE-generated graphic and the associated explanation that illustrate this organizing principle are shown in Figure 8. The caption generated for the figure illustrates how the strategy emphasizes the relationship between the attributes mapped along the axes: Figure 8 shows the relationship between the variation in house prices and the number of days a house is on the market in the data set.




These charts show information about house sales from data set PGH-23. In the two charts, the X-axis shows the selling prices. The top chart emphasizes the relationship between the number of rooms and the selling price. The bottom chart emphasizes the relationship between the lot size and the selling price.

Figure 9

Graphic with caption generated using strategy 3.


3.3 Strategy 3: Multiple spaces aligned along an axis with dependent attributes

The second strategy discussed above is applicable only if there is a single space in the presentation. However, SAGE is capable of designing presentations with multiple spaces that are aligned along dependent attributes in the data set. In such cases, the explanation generator cannot describe all the concepts in the presentation using strategy #2. If one of the spaces happens to have the FIA mapped to its non-aligned axis, a description such as "this space shows the (one:one) relationship between the <FIA> and <attribute-2>" would not be natural; it is more natural to use strategy #1 to describe the mappings in that space. Strategy #3 therefore allows the system to organize the caption for each space individually, depending upon whether the FIA is mapped along that space's non-aligned axis. Figure 9 shows such a graphic and the corresponding caption. The two charts in Figure 9 are aligned along the X-axis, which is used to encode house-price. In generating the caption, the system describes each chart independently, using either strategy #1 or #2, as appropriate. It describes the top chart first (following the structure of the graphic) and then the bottom one. In this case, each chart is described using strategy #2 because both have dependent attributes mapped along their axes.
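The conditions under which each of the three strategies applies can be summarized as a small decision procedure. The sketch below is an assumption-laden reading of this section, not the system's implementation: the `Space` record, the `plan_caption` function, and the simplification that every space has exactly two axes (tables are ignored) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Space:
    name: str
    aligned_axis: str  # attribute on the axis shared with the other spaces
    other_axis: str    # attribute on the space's non-aligned axis


def plan_caption(spaces: List[Space], fia: str) -> Tuple[int, Dict[str, int]]:
    """Return the overall strategy number and the strategy used per space."""
    if len(spaces) == 1:
        only = spaces[0]
        # With one space the aligned/other distinction is immaterial.
        on_axis = fia in (only.aligned_axis, only.other_axis)
        strategy = 1 if on_axis else 2
        return strategy, {only.name: strategy}
    if all(s.aligned_axis == fia for s in spaces):
        # Multiple spaces anchored on the FIA: strategy 1 throughout.
        return 1, {s.name: 1 for s in spaces}
    # Strategy 3: decide per space, based on its non-aligned axis.
    return 3, {s.name: (1 if s.other_axis == fia else 2) for s in spaces}


# Figure 9 as described in the text: two charts aligned along house-price.
figure_9 = [
    Space("top chart", "house-price", "number-of-rooms"),
    Space("bottom chart", "house-price", "lot-size"),
]
overall, per_space = plan_caption(figure_9, fia="house-address")
```

For the Figure 9 layout this yields strategy 3 overall with strategy 2 for both charts, in agreement with the description above; a layout whose spaces are all aligned on the house address would instead fall under strategy 1.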




 
   

 