cmds/run_CNN.py  --  Training Convolutional Neural Networks
---------------------------------------------------------------------------------------------------------------------
Arguments

--train-data
    Training data specification. Required.

--valid-data
    Validation data specification. Required.
--conv-nnet-spec
    Network specification for the convolutional layers. Required.

    Format: --conv-nnet-spec="txnxm:a,bxc,pdxe,f"
    E.g., "1x29x29:64,4x4,p2x2:128,5x5,p3x3,f" stacks two convolutional layers.

    "txnxm": the inputs are t feature maps, each with the dimension n x m.
    "a,bxc,pdxe,f" describes one convolutional layer:
        a   -- number of feature maps
        bxc -- size of the local filters (kernels)
        dxe -- pooling size
        f   -- if it appears, the outputs of this layer are flattened
    You can stack more convolutional layers by appending further ":"-separated groups.
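As a sanity check on a spec string, the feature-map sizes can be traced layer by layer. The helper below is a sketch, not part of PDNN; it assumes "valid" (no-padding) convolutions and non-overlapping pooling, which is consistent with the sizes produced by the example spec above.

```python
# Hypothetical helper (not part of PDNN): trace the feature-map sizes
# produced by a conv-nnet-spec string, assuming "valid" convolutions
# and non-overlapping pooling.
def conv_output_dims(spec):
    layers = spec.split(":")
    t, n, m = (int(x) for x in layers[0].split("x"))   # txnxm: input maps
    for layer in layers[1:]:
        parts = layer.split(",")
        t = int(parts[0])                               # a: number of feature maps
        fh, fw = (int(x) for x in parts[1].split("x"))  # bxc: filter size
        n, m = n - fh + 1, m - fw + 1                   # "valid" convolution
        ph, pw = (int(x) for x in parts[2].lstrip("p").split("x"))  # pdxe: pooling
        n, m = n // ph, m // pw                         # non-overlapping pooling
    return t, n, m, t * n * m                           # last value: flattened size

print(conv_output_dims("1x29x29:64,4x4,p2x2:128,5x5,p3x3,f"))
```

For the example spec this gives 128 maps of 3 x 3, i.e. 1152 flattened values, which is the dimension fed into the first FC layer.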
--nnet-spec
    Network specification for the fully-connected (FC) layers. Required.

    Format: --nnet-spec="h(1):h(2):...:h(n):s"
    E.g., 1024:1024:1024:1920
    h(i) -- size of the i-th FC hidden layer; s -- number of targets
--wdir
    Working directory. Required.

--param-output-file
    Path to save the model parameters in the PDNN format.
    By default "": doesn't output the PDNN-formatted model.

--cfg-output-file
    Path to save the model config.
    By default "": doesn't output the model config.

--kaldi-output-file
    Path to save the Kaldi-formatted model.
    By default "": doesn't output the Kaldi-formatted model.

--model-save-step
    Number of epochs between model saving.
    By default 1: save the tmp model after each epoch.
 
--ptr-file
    Pre-trained model file.
    By default "": no pre-training.

--ptr-layer-number
    How many layers to initialize with the pre-trained model.
    Required if --ptr-file is provided.
--lrate
    Learning rate.
    By default: D:0.08:0.5:0.05,0.05:15

--batch-size
    Mini-batch size for SGD.
    By default: 256

--momentum
    The momentum.
    By default: 0.5

--use-fast
    Whether to use the fast version of CNN.
    By default: false. More details at the bottom of this page.

--conv-activation
    Activation function for the convolutional layers; more details on the DNN webpage.
    By default: sigmoid

--activation
    Activation function for the FC layers; more details on the DNN webpage.
    By default: sigmoid
  
--input-dropout-factor
    Dropout factor for the input layer (features).
    By default 0: no dropout is applied to the input features.

--dropout-factor
    Comma-delimited dropout factors for the *hidden layers*. The number of
    factors must match the network structure (nnet-spec).
    E.g., --dropout-factor 0.2,0.2,0.2,0.2
    By default "": no dropout is applied. This is equivalent to setting all
    dropout factors to 0; however, the latter case runs slower, so
    "--dropout-factor 0,0,0,0" is NOT recommended.
   
--l1-reg
    L1-norm regularization weight.
    train_objective = cross_entropy + l1_reg * [L1 norm of all weight matrices]
    By default: 0

--l2-reg
    L2-norm regularization weight.
    train_objective = cross_entropy + l2_reg * [L2 norm of all weight matrices]
    By default: 0

--max-col-norm
    The max value of the norm of the gradients; usually used with dropout and maxout.
    By default none: not applied.
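The regularized training objective for --l1-reg and --l2-reg can be illustrated with a small numpy sketch. This is illustrative only, not PDNN code; it uses the conventional penalties (sum of absolute values for L1, sum of squares for L2) over all weight matrices.

```python
import numpy as np

# Illustrative sketch (not PDNN code): how the L1/L2 penalties enter
# the training objective described above.
def regularized_objective(cross_entropy, weight_matrices, l1_reg=0.0, l2_reg=0.0):
    l1_term = sum(np.abs(W).sum() for W in weight_matrices)     # L1 penalty
    l2_term = sum(np.square(W).sum() for W in weight_matrices)  # squared-L2 penalty
    return cross_entropy + l1_reg * l1_term + l2_reg * l2_term
```

With both weights left at their default of 0, the objective reduces to the plain cross-entropy.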


Example

python pdnn/cmds/run_CNN.py --train-data "train.pickle.gz,partition=600m,stream=true,random=true" \
                            --valid-data "valid.pickle.gz,partition=600m,stream=true,random=true" \
                            --conv-nnet-spec "1x29x29:64,4x4,p2x2:128,5x5,p3x3,f" \
                            --nnet-spec "1024:1024:1024:1901" \
                            --wdir ./ --activation sigmoid --conv-activation sigmoid \
                            --param-output-file nnet.mdl --cfg-output-file nnet.cfg



Fast Version of CNN Training

If you want to speed up CNN training, you can switch to the fast implementation, which is based on the pylearn2 wrappers for the cuda-convnet library. Depending on your CNN architecture, you can get a 2x to 3x speedup.

1. download pylearn2:  git clone git://github.com/lisa-lab/pylearn2.git

2. add the pylearn2 directory to your PYTHONPATH:  export PYTHONPATH=$PYTHONPATH:/path/to/pylearn2

3. call run_CNN.py with "--use-fast true"

However, adopting this fast version imposes restrictions on your CNN architecture. Check the PDNN documentation page on the fast CNN implementation for these restrictions.