cmds/run_RBM.py -- Training Stacked Restricted Boltzmann Machines
-----------------------------------------------------------------------------------------------------------------------

Arguments

--train-data            training data specification. Required.

--nnet-spec             network architecture, --nnet-spec="d:h(1):h(2):...:h(n):s",
                        e.g. "250:1024:1024:1024:1024:1920". Required.
                        d -- input dimension; h(i) -- size of the i-th hidden
                        layer; s -- number of targets.

--wdir                  working directory. Required.

--param-output-file     path to save model parameters in the PDNN format.
                        By default "": doesn't output the PDNN-formatted model.

--cfg-output-file       path to save the model config. By default "":
                        doesn't output the model config.

--kaldi-output-file     path to save the Kaldi-formatted model. By default "":
                        doesn't output the Kaldi-formatted model.

--learning-rate         learning rate for Bernoulli-Bernoulli RBMs.
                        By default 0.08.

--gbrbm-learning-rate   learning rate for the Gaussian-Bernoulli RBM.
                        By default 0.005.

--epoch-number          number of training epochs. By default 10.

--batch-size            mini-batch size during training. By default 128.

--momentum              momentum schedule, a string with the format
                        "init:final:init_epochs", e.g. "0.5:0.9:5" (the
                        default). Its 3 fields:
                        init -- initial momentum;
                        final -- final momentum;
                        init_epochs -- for how many epochs the initial
                        momentum is used.

--ptr-layer-number      number of hidden layers to pre-train.
                        By default all the hidden layers are trained.

--first-layer-type      type of the first layer; either "bb"
                        (Bernoulli-Bernoulli) or "gb" (Gaussian-Bernoulli).
                        By default "gb".
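
The momentum schedule above can be illustrated with a minimal sketch. This is not PDNN's actual code; the helper name `momentum_for_epoch` is hypothetical and only shows how an "init:final:init_epochs" string maps epochs to momentum values:

```python
def momentum_for_epoch(epoch, schedule="0.5:0.9:5"):
    """Return the momentum used at a given (0-based) epoch.

    The schedule string has the format "init:final:init_epochs":
    the initial momentum is used for the first init_epochs epochs,
    the final momentum thereafter. (Illustrative sketch only, not
    the actual PDNN implementation.)
    """
    init, final, init_epochs = schedule.split(":")
    return float(init) if epoch < int(init_epochs) else float(final)
```

With the default schedule "0.5:0.9:5", epochs 0-4 use momentum 0.5 and epoch 5 onward uses 0.9.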


Example

python pdnn/cmds/run_RBM.py --train-data "train.pickle.gz,partition=600m,stream=true,random=true" \
                            --nnet-spec "330:1024:1024:1024:1024:1901" \
                            --wdir ./ \
                            --ptr-layer-number 4 \
                            --epoch-number 10 \
                            --batch-size 128 \
                            --learning-rate 0.08 --gbrbm-learning-rate 0.005 \
                            --momentum 0.5:0.9:5 --first-layer-type gb \
                            --param-output-file rbm.mdl

Your application may not have targets, for example in unsupervised training. In this case, you still need to specify a target number in --nnet-spec (1901 in this example), but it can be a fake value. Which layers are trained is decided by --ptr-layer-number: this example only trains the first 4 hidden layers and ignores the final softmax layer.
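
The layer selection described above can be sketched in a few lines. This is a hypothetical helper (`rbm_layers` is not part of PDNN) that only illustrates how --nnet-spec, --ptr-layer-number, and --first-layer-type determine which RBMs get pre-trained:

```python
def rbm_layers(nnet_spec="330:1024:1024:1024:1024:1901",
               ptr_layer_number=4, first_layer_type="gb"):
    """List the RBMs pre-trained for a given --nnet-spec.

    Sizes follow "d:h(1):...:h(n):s". Only the first ptr_layer_number
    hidden layers are trained, so the final softmax layer (s) is
    ignored. The first RBM uses first_layer_type ("gb" here); the
    rest are Bernoulli-Bernoulli ("bb").
    (Illustrative sketch only, not the actual PDNN implementation.)
    """
    sizes = [int(x) for x in nnet_spec.split(":")]
    layers = []
    for i in range(ptr_layer_number):
        rbm_type = first_layer_type if i == 0 else "bb"
        layers.append((sizes[i], sizes[i + 1], rbm_type))
    return layers
```

For the example command above this yields a 330x1024 Gaussian-Bernoulli RBM followed by three 1024x1024 Bernoulli-Bernoulli RBMs; the 1024x1901 softmax layer is never touched.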