cmds/run_MTL.sh  --  Multi-Task Learning
---------------------------------------------------------------------------------------------------------------------
Arguments

Each entry below gives the argument, its meaning/value, and its default value / comments.

--train-data
    Training data specification. Data paths for the different tasks are separated by "|".
    Required.

--valid-data
    Validation data specification. Data paths for the different tasks are separated by "|".
    Required.

--task-number
    The number of tasks you are running (used for verification).
    Required. Its value must equal the number of tasks given in --train-data and --valid-data.

--shared-nnet-spec
    --shared-nnet-spec="d:h(1):h(2):...:h(m)", e.g. 250:1024:1024:1024.
    Required. Specifies the structure of the lower layers shared across all tasks.
    d is the input dimension; h(i) is the size of the i-th hidden layer.

--indiv-nnet-spec
    --indiv-nnet-spec="h(1)(n):s(1)|...|h(T)(n):s(T)", e.g. 1024:1920|1024:1887|1024:1790.
    Required. Specifies the task-specific upper layers, separated by "|". Although only one
    hidden layer h(t)(n) is shown here, each task can have an arbitrary upper-layer
    architecture. h(t)(n) is the size of the n-th hidden layer for task t; s(t) is the
    number of targets for task t. See the first sketch after this table for how the shared
    and individual specs compose.

--wdir
    Working directory.
    Required.

--param-output-file
    (Prefix) path to save model parameters in the PDNN format.
    By default "": no PDNN-formatted model is output. The filename for each task gets the
    suffix ".task#".

--cfg-output-file
    (Prefix) path to save the model config.
    By default "": no model config is output. The filename for each task gets the suffix
    ".task#".

--kaldi-output-file
    (Prefix) path to save the Kaldi-formatted model.
    By default "": no Kaldi-formatted model is output. The filename for each task gets the
    suffix ".task#".

--model-save-step
    Number of epochs between model saves.
    By default 1: save the intermediate model after each epoch.
 
--ptr-file
    Pre-trained model file.
    By default "": no pre-training.

--ptr-layer-number
    How many layers to initialize from the pre-trained model.
    Required if --ptr-file is provided.

--lrate
    Learning-rate schedule.
    By default D:0.08:0.5:0.05,0.05:15. See the learning-rate sketch after this table.

--batch-size
    Mini-batch size for SGD.
    By default 256.

--momentum
    The momentum.
    By default 0.5.

--activation
    The same as for run_DNN.sh.
    By default sigmoid.

--input-dropout-factor
    The same as for run_DNN.sh.
    By default 0: no dropout is applied to the input features.

--dropout-factor
    The same as for run_DNN.sh.
    By default "": no dropout is applied.
   
--l1-reg
    L1-norm regularization weight:
    train_objective = cross_entropy + l1_reg * [L1 norm of all weight matrices]
    By default 0. See the regularization sketch after this table.

--l2-reg
    L2-norm regularization weight:
    train_objective = cross_entropy + l2_reg * [L2 norm of all weight matrices]
    By default 0.

--max-col-norm
    The maximum allowed L2 norm of the columns of the weight matrices (the max-norm
    constraint); usually used together with dropout and maxout. See the last sketch after
    this table.
    By default none: not applied.
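
Below is a minimal sketch (plain Python, not part of PDNN) of how --shared-nnet-spec
and --indiv-nnet-spec compose into each task's full network, as referenced in the
--indiv-nnet-spec entry above:

    # Sketch: compose the shared lower layers with each task's upper layers.
    def compose_architectures(shared_spec, indiv_spec):
        shared = [int(x) for x in shared_spec.split(":")]    # e.g. [330, 1024, 1024, 1024]
        uppers = [[int(x) for x in t.split(":")]             # e.g. [[1024, 1920], [1024, 1887]]
                  for t in indiv_spec.split("|")]
        return [shared + upper for upper in uppers]          # one full layer list per task

    print(compose_architectures("330:1024:1024:1024", "1024:1920|1024:1887"))
    # [[330, 1024, 1024, 1024, 1024, 1920], [330, 1024, 1024, 1024, 1024, 1887]]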

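The default --lrate value uses PDNN's "D:..." (exponential-decay) schedule string. Our
reading of D:0.08:0.5:0.05,0.05:15 -- hedged; the authoritative semantics are in the PDNN
source -- is: start at a rate of 0.08; once the validation-error improvement between
consecutive epochs drops below 0.05 (and at least 15 epochs have run), multiply the rate
by 0.5 after every epoch; stop when the improvement again falls below 0.05. A parser
sketch (the field names are our own labels, not PDNN identifiers):

    # Sketch: split the "D:..." schedule string into its fields.
    def parse_exp_decay(spec="D:0.08:0.5:0.05,0.05:15"):
        _, start, scale, derrors, min_epoch = spec.split(":")
        decay_start, stop = (float(x) for x in derrors.split(","))
        return {"start_rate": float(start), "scale_by": float(scale),
                "min_derror_decay_start": decay_start, "min_derror_stop": stop,
                "min_epoch_decay_start": int(min_epoch)}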

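The --l1-reg / --l2-reg objective can be written down directly from the formulas in the
table. A numpy sketch (PDNN itself builds the cost in Theano, and some implementations use
the squared L2 norm; this follows the table's wording):

    import numpy as np

    # Sketch: the regularized training objective from the table above.
    def train_objective(cross_entropy, weights, l1_reg=0.0, l2_reg=0.0):
        l1 = sum(np.abs(W).sum() for W in weights)            # L1 norm of all weights
        l2 = sum(np.sqrt((W ** 2).sum()) for W in weights)    # L2 norm of all weights
        return cross_entropy + l1_reg * l1 + l2_reg * l2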

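Finally, a hedged numpy sketch of the max-norm constraint behind --max-col-norm: after a
gradient update, any weight-matrix column whose L2 norm exceeds the limit is rescaled back
to it (this is our reading of the flag, not code taken from PDNN):

    import numpy as np

    # Sketch: clip each weight column back to the maximum allowed L2 norm.
    def apply_max_col_norm(W, max_col_norm):
        col_norms = np.sqrt((W ** 2).sum(axis=0))             # one L2 norm per column
        scale = np.minimum(1.0, max_col_norm / np.maximum(col_norms, 1e-12))
        return W * scale                                      # broadcasts across rows
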
Example

python pdnn/cmds/run_MTL.py --train-data "train.pickle.T1.gz|train.pickle.T2.gz,partition=600m,random=true" \
                            --valid-data "valid.pickle.T1.gz|valid.pickle.T2.gz,partition=600m,random=true" \
                            --task-number 2 --wdir ./ \
                            --shared-nnet-spec "330:1024:1024:1024" \
                            --indiv-nnet-spec "1024:1920|1024:1887" \
                            --activation sigmoid \
                            --param-output-file nnet.mdl --cfg-output-file nnet.cfg

In this example, we run multi-task learning on two tasks. Their full DNN architectures are
330:1024:1024:1024:1024:1920 and 330:1024:1024:1024:1024:1887, respectively; the lower three
hidden layers (330:1024:1024:1024) are shared between the two tasks. After training finishes,
you will find the model files nnet.mdl.task1 & nnet.cfg.task1 for Task 1, and nnet.mdl.task2 &
nnet.cfg.task2 for Task 2.