Speech talk     3/12/98
Topic  :  Confidence
Speaker : Dhananjay  Bansal

You can get the PostScript version of the slides here.
  
About me:  (slide 1)

Name : Dhananjay Bansal
                       Master's student, LTI, SCS, CMU.

Research interests :  Developing robust confidence measures and using them to improve
                                 recognition accuracy.

Advisors :   Raj Reddy,  Ravi Mosur.
 
 

What is confidence all about:  (slide 2)

            Is each decoded word the truth or a lie?
 

Some existing measures  (slide 3)

            - Percentage of frames where the decoded sentence's phones match a phone-only decoding. (5.1)
            - Total score and duration (5.3)
            - N-Best list homogeneity  (8.7)
            - Language model score  (5.6)
            - Phone only distance (5.6)
            - best (15)
 
 

An example of misrecognition  (slide 4)

                " In nineteen ninety was  invaded kuwait  this park in  the gulf war so a lot of activity we'll
           have to keep on top of  an old they won't  certainly a number of planes are going to be in place"                 " In nineteen ninety he    invaded kuwait  sparking      the gulf war so a lot of activity we'll
           have to keep on top of  it all day                certainly a number of planes are going to be in place"
 

Idea : (slide 5)

This is how it's done:  (slide 6)
 
            Here is the PostScript version of the slide (it depicts the process with a diagram).

Formal definition of the measures:  (slide 7)
 
     S0 = the original decoded sentence.
     Si = the decoded sentence after word Wi is removed from the lattice.

                If Wj is removed from the lattice and the new decoded sentence Sj no longer contains Wi, then
           Wj is an affecting word for Wi.

                    M1(Wi) = 0
                    for( j=1; j <= total # words in S0 ; j++)
                            if  Wi  not found in  Sj
                                    M1(Wi) = M1(Wi) + 1

                  M2(Wi)  =  T(S0) - T(Si),  where T(S) is the total decoder score of sentence S.

                  The N-Best homogeneity score of a word is the fraction of N-Best hypotheses in which the word
         appears in the same time frames.
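The definitions above can be sketched in a few lines; this is a minimal illustration, not the talk's actual implementation. It assumes hypothetical inputs: each sentence is a list of word strings, `variants[j]` is the sentence re-decoded after removing word j from the lattice, `scores[i]` is the total decoder score T(Si), and the time-frame constraint on N-best homogeneity is ignored for simplicity.

```python
def m1(word, variants):
    """M1(Wi): count of removals whose re-decoded sentence drops Wi,
    i.e. the number of affecting words for Wi."""
    return sum(1 for sj in variants if word not in sj)

def m2(i, s0_score, scores):
    """M2(Wi): score difference T(S0) - T(Si)."""
    return s0_score - scores[i]

def nbh(word, nbest):
    """Simplified N-best homogeneity: fraction of N-best hypotheses
    containing the word (the real measure also matches time frames)."""
    return sum(1 for hyp in nbest if word in hyp) / len(nbest)
```

For example, `m1("was", variants)` counts how many of the re-decoded variants no longer contain "was".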
 

Measures at work as seen through the keyhole.  (slide 8)

                    Word        Category    NBH       M1    M2
                    in          c           1.000     1     268333
                    nineteen    c           1.000     1     379158
                    ninety      c           1.000     1     301205
                    was         s           0.544     4     3534
                    invaded     c           1.000     1     243547
                    kuwait      c           1.000     0     197954
                    this        i           0.812     6     26259
                    park        i           0.812     6     26259
                    in          s           0.812     7     26259
                    the         c           1.000     1     194811
                    gulf        c           1.000     2     267237
                    war         c           1.000     2     267237
                    so          c           1.000     3     184748
                    a_lot_of    c           1.000     1     335727
                    activity    c           1.000     1     294961
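The pattern in the table (reading c, s, and i as correct, substituted, and inserted words, which is my interpretation, not stated on the slide) suggests a simple thresholding rule: misrecognized words show low NBH, high M1, and low M2. A hypothetical sketch with illustrative thresholds, not the talk's actual operating point:

```python
def tag(nbh_score, m1_score, m2_score,
        nbh_thresh=0.9, m1_thresh=3, m2_thresh=100000):
    """Flag a word as a likely error when all three measures agree:
    low N-best homogeneity, many affecting words, small score drop."""
    if (nbh_score < nbh_thresh
            and m1_score >= m1_thresh
            and m2_score < m2_thresh):
        return "incorrect"
    return "correct"
```

On the table's rows this rule would flag "was", "this", "park", and the second "in", while leaving the correct words untouched.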
             
 

Plots :  (slide 9)

                Fraction of correct words removed    = (# correct words removed) / (# initial correct words)
                Fraction of incorrect words removed = (# incorrect words removed) / (# initial incorrect words)
                False alarm = (# correct words removed) / (total # words removed)

                [M1]  [M2]  [NBH]  [M1-M2]  [M1-NBH]  [M2-NBH]  [M1-M2-NBH]

                Error is the fraction of incorrect words among the words remaining after removing the words tagged incorrect.
                Drop is the fraction of the initial words tagged incorrect.

            [M1]          (1-D, correct/incorrect in the same plot)
            [M2]          (1-D, correct/incorrect in the same plot)
            [NBH] [zoom1] [zoom2]   (1-D, correct/incorrect in the same plot)
            [M1-M2]       (2-D, for correct words)
            [M1-M2]       (2-D, for incorrect words)
            [M1-NBH]      (2-D, for correct words)
            [M1-NBH]      (2-D, for incorrect words)
            [M2-NBH]      (2-D, for correct words)
            [M2-NBH]      (2-D, for incorrect words)

            [M1-M2]       correct / incorrect / superimposed
            [M1-NBH]      correct / incorrect / superimposed
            [M2-NBH]      correct / incorrect / superimposed
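The plotted quantities above can be computed by one small function; this is an illustrative sketch where the per-word gold labels and incorrect-tags are hypothetical inputs, not the talk's actual data format.

```python
def plot_metrics(words):
    """words: list of (gold, tagged) pairs, where gold is
    'correct'/'incorrect' and tagged is True if the confidence
    measure tagged the word incorrect (i.e. it gets removed)."""
    n_correct = sum(1 for g, _ in words if g == "correct")
    n_incorrect = len(words) - n_correct
    removed = [(g, t) for g, t in words if t]
    rem_correct = sum(1 for g, _ in removed if g == "correct")
    rem_incorrect = len(removed) - rem_correct
    remaining = len(words) - len(removed)
    remaining_incorrect = n_incorrect - rem_incorrect
    return {
        "frac_correct_removed": rem_correct / n_correct,
        "frac_incorrect_removed": rem_incorrect / n_incorrect,
        "false_alarm": rem_correct / len(removed) if removed else 0.0,
        "error": remaining_incorrect / remaining if remaining else 0.0,
        "drop": len(removed) / len(words),
    }
```

Sweeping a threshold on M1, M2, or NBH and calling this at each setting would trace out the curves in the plots.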

 Questions yet to be answered : (slide 10)

                - Errors caused by bad input : the measures are probably of no help.
                - Errors that are the decoder's fault : they may be of some help.

Final words : (slide 11)

                      54 correct words were tagged incorrect and removed from the lattices.
                    178 incorrect words were tagged incorrect and removed from the lattices.
                    Absolute error increased by 0.3%.