DistribSet method: cdcn (Codeword Dependent Cepstral Normalisation)
Syntax:
<distrib set> cdcn <distrib1> <distrib2> <feature> [optional flags]
Example:
CDCNdss cdcn CDCN SIL(|) LOGSPEC -itcount 30
The CDCN algorithm is run on <feature>. The <distrib set> must contain the combined CDCN distribution <distrib1>. Although <distrib1> holds the information for both silence and speech, <distrib2> is still needed to determine the number of codebook vectors for silence. The channel-compensated version of <feature> is written to the feature named in the codebook-set description file.
Optional flags:
-itcount <number>
-n <feature FMatrix>
-q <feature FMatrix>
-f <feature FMatrix>
Strictly speaking, the implemented version should be called CDLSN (Codeword Dependent Log-Spectral Normalisation), but it differs very little from the original CDCN, so we kept the name CDCN. The basic difference is that we work in the log-spectral domain rather than the cepstral domain. This can be regarded as an implementation detail, because the two domains are related by a linear transform (the Fourier transform).
For more information see:
[1] Acero, Alejandro: "Acoustical and Environmental Robustness in Automatic Speech Recognition", Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, 13 September 1990
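The linearity argument can be checked numerically. The following is a minimal sketch in Python, outside the toolkit; the DCT-II basis, the 8-band frame and the offset vector q are assumptions made for illustration and are not taken from the implementation. It shows that a fixed additive offset in the log-spectral domain is still a fixed additive offset after the cosine transform, which is why the normalisation can be done in either domain.

```python
import math

def dct_matrix(cep_n, n):
    """Cosine (DCT-II style) basis with cep_n rows and n columns (assumed form)."""
    return [[math.cos(math.pi * k * (j + 0.5) / n) for j in range(n)]
            for k in range(cep_n)]

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Hypothetical 8-band log-spectral frame and a fixed channel offset q
# (both invented for this illustration).
log_spec = [math.log(e) for e in (1.0, 2.5, 4.0, 3.0, 1.5, 0.8, 0.5, 0.3)]
q = [0.2, 0.1, 0.0, -0.1, -0.3, -0.2, 0.1, 0.4]

C = dct_matrix(5, 8)

# Cepstrum of the distorted frame (log-spectrum plus channel offset).
cep_of_shifted = matvec(C, [x + h for x, h in zip(log_spec, q)])

# Cepstrum of the clean frame plus the cepstrum of the offset.
cep_plus_cep_q = [a + b for a, b in zip(matvec(C, log_spec), matvec(C, q))]

# The two agree up to floating-point rounding.
max_diff = max(abs(a - b) for a, b in zip(cep_of_shifted, cep_plus_cep_q))
print(max_diff < 1e-12)
```

The same holds for any linear transform, so nothing in the sketch depends on the particular cosine basis chosen here.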
First, a number of description files are needed.
They are used to train the two codebooks.
Note: The files shown are only examples. The feature description file in particular differs from system to system.
Codebook-set description file:
SIL    CDCNFEA  50 30 DIAGONAL
SPEECH CDCNFEA 200 30 DIAGONAL
Distrib-set description file:
SIL(|)    SIL
SPEECH(|) SPEECH
Distribution-tree description file:
ROOT-b  {0=SIL} LSPEECH LSIL - -
ROOT-m  {0=SIL} LSPEECH LSIL - -
ROOT-e  {0=SIL} LSPEECH LSIL - -
LSPEECH {}      -       -    - SPEECH(|)
LSIL    {}      -       -    - SIL(|)
Feature description file for codebook training:
#--------------------------------------------------------------------------
#fes command name source parameter
#--------------------------------------------------------------------------
$fes readADC ADC $arg(ADCFILE) -h $arg(ADCHEADER) \
    -v 0 -offset mean
#----------------- mel filter bank ----------------------------------------
set melN 30
$fes spectrum FFT ADC 16ms
if { [llength [objects FBMatrix matrixMEL]] != 1 } {
    set points [$fes:FFT configure -coeffN]
    set rate   [expr 1000 * [$fes:FFT configure -samplingRate]]
    [FBMatrix matrixMEL] mel -N $melN -p $points -rate $rate
}
$fes filterbank MEL FFT matrixMEL
$fes log CDCNFEA MEL 1.0 1.0
$fes meansub CDCNFEA CDCNFEA -a 0
# -----------------------------------------------------------------------
#initialise CDCN -> basic Object = CDCNdss
# -----------------------------------------------------------------------
source ../cdcn_desc/cdcn.tcl
cdcnInit $SID -dssdesc ../cdcn_desc/cdcnDistribSet -dssparam ../cdcn_create_melv/3i.dss.gz \
    -cbsdesc ../cdcn_desc/cdcnCodebookSetMel -cbsparam ../cdcn_create_melv/3i.cbs.gz
# -----------------------------------------------------------------------
Feature description file with the CDCN step:
#--------------------------------------------------------------------------
#fes command name source parameter
#--------------------------------------------------------------------------
$fes readADC ADC $arg(ADCFILE) -h $arg(ADCHEADER) \
    -v 0 -offset mean
#----------------- mel filter bank ----------------------------------------
set melN 30
$fes spectrum FFT ADC 16ms
if { [llength [objects FBMatrix matrixMEL]] != 1 } {
    set points [$fes:FFT configure -coeffN]
    set rate   [expr 1000 * [$fes:FFT configure -samplingRate]]
    [FBMatrix matrixMEL] mel -N $melN -p $points -rate $rate
}
$fes filterbank CDCNFEA FFT matrixMEL
$fes log MCEP CDCNFEA 1.0 1.0
#----------------- CDCN ----------------------------------------
global CDCNdss
CDCNdss cdcn CDCN SIL(|) MCEP -itcount 30
#----------------- cepstrum ----------------------------------------
set cepN 13
if { [llength [objects FMatrix matrixCOS]] != 1 } {
    set n [$fes:CDCNFEA configure -coeffN]
    [FMatrix matrixCOS] cosine $cepN $n -type 1
}
$fes matmul MCEP CDCNFEA matrixCOS
#----------------- context -----------------------------------------------
Other transforms, such as the cube root, could be used in place of the logarithm, but not without internal changes to the "cdcn" method.
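A likely reason why the transform cannot simply be swapped (an assumption, not taken from the implementation) is that only the logarithm turns a fixed channel gain into a fixed additive offset. The minimal Python sketch below, outside the toolkit, compares the two transforms; the band energies and the gain h are invented for illustration.

```python
import math

# Hypothetical band energies for three frames and a fixed channel gain h
# (values invented for this illustration).
energies = [0.5, 2.0, 8.0]
h = 1.5

# Logarithm: the channel adds the same constant, log(h), to every frame.
log_shift = [math.log(h * x) - math.log(x) for x in energies]

# Cube root: the channel multiplies each value by h ** (1/3), so the
# difference between distorted and clean value depends on the frame.
root_shift = [(h * x) ** (1 / 3) - x ** (1 / 3) for x in energies]

# Spread of the shift across frames: essentially zero for the logarithm,
# clearly non-zero for the cube root.
log_spread = max(log_shift) - min(log_shift)
root_spread = max(root_shift) - min(root_shift)
print(log_spread, root_spread)
```

With a cube-root feature the compensation would therefore have to model a frame-dependent shift, which a method built around a constant additive offset does not do as it stands.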
References:
[2] Baumgärtner, Rainer: Diplomarbeit "Kanalkompensation in der Spracherkennung", Universität Karlsruhe, Institut für Logik, Komplexität und Deduktionssysteme, 1996