
Hand-labelled Features

Row 7 in Table 5 gives the results using hand-labelled and automatic features, including both SLU-success and auto-SLU-success. Comparing rows 6 and 7 shows that little is gained by adding the other hand-labelled features given in Figure 7 to the hand-labelled and SLU-success feature set. Only the increase for Exchange 1, from 75.6% to 77.1%, is significant (df=866, t=2.3, p=0.024). For the whole utterance, results actually degrade from 92.9% to 91.7%.
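As a rough plausibility check on the reported significance level, the sketch below converts the t statistic into a two-sided p-value. It assumes a standard normal approximation to the t distribution, which is reasonable at df=866 but is not necessarily the procedure used to obtain the reported figure; the function name `two_sided_p_normal` is hypothetical. Because the reported t is rounded to one decimal place, the result should only be expected to land near the reported p, not to match it exactly.

```python
import math


def two_sided_p_normal(t: float) -> float:
    """Two-sided p-value for a test statistic t, assuming a standard
    normal approximation (an assumption; adequate when df is large)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Reported values for the Exchange 1 comparison: t = 2.3 with df = 866.
p_value = two_sided_p_normal(2.3)
print(f"approximate two-sided p-value: {p_value:.3f}")
```

The approximation comes out in the region of 0.02, the same order as the reported p=0.024 and below the conventional 0.05 threshold.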


 
Table 6: Precision and Recall with Exchange 1 Automatic Features

Class         Occurred   Predicted   Recall   Precision
TASKSUCCESS   67.0%      81.7%       88.1%    72.5%
PROBLEMATIC   33.0%      18.3%       31.6%    56.6%
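The four columns of Table 6 are not independent: if Occurred and Predicted are both read as fractions of all dialogues, then precision should equal recall times Occurred divided by Predicted. The sketch below checks that relationship against the table values. That reading of the columns is an assumption, and the helper name `implied_precision` is hypothetical.

```python
def implied_precision(occurred: float, predicted: float, recall: float) -> float:
    """Precision implied by the other three columns.

    Assumes `occurred` and `predicted` are fractions of all dialogues,
    so recall * occurred is the fraction correctly assigned to the class.
    """
    correctly_assigned = recall * occurred
    return correctly_assigned / predicted


# (occurred, predicted, recall, reported precision), taken from Table 6.
table6 = {
    "TASKSUCCESS": (0.670, 0.817, 0.881, 0.725),
    "PROBLEMATIC": (0.330, 0.183, 0.316, 0.566),
}

differences = {}
for name, (occurred, predicted, recall, reported) in table6.items():
    implied = implied_precision(occurred, predicted, recall)
    differences[name] = abs(implied - reported)
    print(f"{name}: implied {implied:.3f}, reported {reported:.3f}")
```

Under this reading, the implied and reported precision values agree to within about half a percentage point for both classes, which is what rounding in the published figures would produce.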

 



Helen Hastie
2002-05-09