cmds2/run_DNN_SAT.py  --  Training SAT Models for DNNs
---------------------------------------------------------------------------------------------------------------------
Refer to this webpage for more information about SAT for DNNs.

Arguments

--train-data
    Training data specification. Required.

--valid-data
    Validation data specification. Required.

--si-nnet-spec
    --si-nnet-spec="dF:h(1):h(2):...:h(n):s", e.g. 250:1024:1024:1024:1024:1920.
    Required. Specifies the structure of the SI model: dF is the feature
    dimension, h(i) the size of the i-th hidden layer, and s the number of
    targets.

--adapt-nnet-spec
    --adapt-nnet-spec="dI:ha(1):ha(2):...:ha(m)", e.g. 100:512:512.
    Required. Specifies the structure of the Adaptation model: dI is the
    i-vector dimension, and ha(i) the size of the i-th adaptation layer.

--init-model
    Path to the initial DNN model. Required. A well-trained DNN model that
    serves as the initialization of the SI model.

--wdir
    Working directory. Required.

--param-output-file
    (Prefix) path to save model parameters in the PDNN format. By default "":
    no PDNN-formatted model is output. Filenames for the SI and Adaptation
    models are appended with the suffixes ".si" and ".adapt" respectively.

--cfg-output-file
    (Prefix) path to save the model config. By default "": no model config is
    output. Filenames for the SI and Adaptation models are appended with the
    suffixes ".si" and ".adapt" respectively.

--kaldi-output-file
    (Prefix) path to save the Kaldi-formatted model. By default "": no
    Kaldi-formatted model is output. Filenames for the SI and Adaptation
    models are appended with the suffixes ".si" and ".adapt" respectively.

--model-save-step
    Number of epochs between model saves. By default 1: save the tmp model
    after each epoch.

--ptr-file
    Pre-trained model file. By default "": no pre-training.

--ptr-layer-number
    How many layers to initialize from the pre-trained model. Required if
    --ptr-file is provided.

--lrate
    Learning-rate schedule. By default D:0.08:0.5:0.05,0.05:15.

--batch-size
    Mini-batch size for SGD. By default 256.

--momentum
    The momentum. By default 0.5.

--activation
    The same as for run_DNN. By default sigmoid.

--input-dropout-factor
    The same as for run_DNN. By default 0: no dropout is applied to the
    input features.

--dropout-factor
    The same as for run_DNN. By default "": no dropout is applied.

--l1-reg
    L1-norm regularization weight. By default 0.
    train_objective = cross_entropy + l1_reg * [L1 norm of all weight matrices]

--l2-reg
    L2-norm regularization weight. By default 0.
    train_objective = cross_entropy + l2_reg * [L2 norm of all weight matrices]

--max-col-norm
    The max value of the norm of the gradients; usually used with dropout
    and maxout. By default none: not applied.
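
The regularized training objective controlled by --l1-reg and --l2-reg can be
sketched as below. This is an illustration only: the helper name is
hypothetical, and the L2 term is assumed (as is conventional) to be the sum of
squared weights.

```python
import numpy as np

def regularized_objective(cross_entropy, weights, l1_reg=0.0, l2_reg=0.0):
    """Sketch of train_objective = cross_entropy + l1_reg * [L1 norm]
    + l2_reg * [L2 norm], summed over all weight matrices.
    Assumption: the L2 term is the sum of squared entries."""
    l1_term = sum(np.abs(W).sum() for W in weights)
    l2_term = sum((W ** 2).sum() for W in weights)
    return cross_entropy + l1_reg * l1_term + l2_reg * l2_term
```

With both weights at their default of 0, the objective reduces to plain
cross-entropy.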



Example

python pdnn/cmds2/run_DNN_SAT.py  --train-data "train.pfile.gz,partition=2000m,stream=true,random=true" \
                                  --valid-data "valid.pfile.gz,partition=600m,stream=true,random=true" \
                                  --si-nnet-spec "330:1024:1024:1024:1024:1901" \
                                  --adapt-nnet-spec "100:512:512" \
                                  --init-model mdl.init --wdir ./ \
                                  --param-output-file nnet.mdl \
                                  --cfg-output-file nnet.cfg


In this example, the SI model has the architecture 330:1024:1024:1024:1024:1901. The Adaptation network has the architecture 100:512:512:330; that is, an additional layer, whose size equals the SI model's input dimension, is automatically appended to the Adaptation network. This additional layer uses the linear activation function. After training finishes, you will find the model files nnet.mdl.si and nnet.cfg.si for the SI model, and nnet.mdl.adapt and nnet.cfg.adapt for the Adaptation model.
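
The two-network arrangement can be sketched with plain numpy as below. This is
a minimal illustration with random weights, not PDNN's implementation: biases
are omitted, and the way the adaptation output is combined with the SI input
(here, an additive feature shift, a common i-vector SAT formulation) is an
assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layers(sizes):
    # Random weight matrices for a feed-forward net (biases omitted for brevity).
    return [rng.standard_normal((m, n)) * 0.01 for m, n in zip(sizes[:-1], sizes[1:])]

def forward(x, layers, output="sigmoid"):
    # Sigmoid hidden layers; the output layer is linear or softmax as requested.
    for i, W in enumerate(layers):
        x = x @ W
        if i == len(layers) - 1 and output == "linear":
            pass                                # linear output layer
        elif i == len(layers) - 1 and output == "softmax":
            e = np.exp(x - x.max())             # numerically stable softmax
            x = e / e.sum()
        else:
            x = 1.0 / (1.0 + np.exp(-x))        # sigmoid activation
    return x

# Dimensions taken from the example above.
si_layers = init_layers([330, 1024, 1024, 1024, 1024, 1901])
adapt_layers = init_layers([100, 512, 512, 330])   # linear 330-dim layer appended

ivector = rng.standard_normal(100)   # speaker i-vector
features = rng.standard_normal(330)  # one frame of SI input features

# The Adaptation network maps the i-vector to a 330-dim linear output, which
# is assumed here to be added to the SI input features before the SI DNN runs.
shift = forward(ivector, adapt_layers, output="linear")
posteriors = forward(features + shift, si_layers, output="softmax")
```

Because the appended adaptation layer is linear and matches the SI input
dimension (330), its output can be combined directly with the input features.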