Rule Transduction Toolkit

Synchronous grammar rule learning for syntax-based machine translation.

Download:
Please email me for a jar file of the code.

To run:
java -jar rulelearner.jar RuleLearningDriver <configfile>

Config file:

Depending on the mode of operation (T2S or T2T), the TPARSE_FILE entry may be optional. Everything else is required. A sample config file is given below.

###ROOT LEVEL
VRULES_ROOT=C:/rulelearner/

#### RULE LEARNING MODES ####
INPUT_MODE=T2T
OUTPUT_MODE=T2T

##### Parallel Treebank
CORPUS_FILE=C:/rulelearner/ger/ec.txt
SPARSE_FILE=C:/rulelearner/ger/en1.parsed
TPARSE_FILE=C:/rulelearner/ger/de1.parsed
GRA_FILE=C:/rulelearner/ger/grammar.gra
PTABLE_FILE=C:/rulelearner/ger/phrases.phr
LEXICON_FILE=C:/rulelearner/ger/lexicon.lex
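The sample above uses plain KEY=VALUE lines with '#' comment lines, which happens to match what java.util.Properties can read. The following is a minimal sketch of loading such a file and checking for missing entries before launching the tool. It is not the toolkit's own loader: the class name ConfigCheck and its methods are hypothetical, and the list of required keys is simply copied from the sample, with TPARSE_FILE left out because it may be optional.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not part of the toolkit.
public class ConfigCheck {
    // Keys copied from the sample config. TPARSE_FILE is omitted on purpose,
    // since it may be optional depending on the mode of operation.
    static final String[] REQUIRED_KEYS = {
        "VRULES_ROOT", "INPUT_MODE", "OUTPUT_MODE", "CORPUS_FILE",
        "SPARSE_FILE", "GRA_FILE", "PTABLE_FILE", "LEXICON_FILE"
    };

    /** Reads KEY=VALUE pairs; lines starting with '#' are treated as comments. */
    static Properties load(Reader in) {
        Properties props = new Properties();
        try {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the required keys that are absent or have an empty value. */
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // A shortened copy of the sample config, used only for demonstration.
        String sample = String.join("\n",
            "###ROOT LEVEL",
            "VRULES_ROOT=C:/rulelearner/",
            "#### RULE LEARNING MODES ####",
            "INPUT_MODE=T2T",
            "OUTPUT_MODE=T2T",
            "CORPUS_FILE=C:/rulelearner/ger/ec.txt");
        Properties props = load(new StringReader(sample));
        System.out.println("INPUT_MODE = " + props.getProperty("INPUT_MODE"));
        System.out.println("Missing keys: " + missingKeys(props));
    }
}
```

Note that java.util.Properties treats a backslash as an escape character, so this sketch assumes forward slashes in paths, as in the sample.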

CORPUS_FILE format:
Any number of sentences can be given as input in this format. Each sentence should be separated by a new line. Anything starting with a semicolon is a comment. A sample entry for one sentence is given below.
;; This is a comment
SentenceIndex:1
SL: Resumption of the session
TL: reprise de la session
Alignment:((1,1),(2,2),(3,3),(4,4))
Type: S
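To check a corpus file before running the learner, an entry like the one above can be read field by field. The following is a rough sketch under a few assumptions: the class CorpusEntryParser and its Entry fields are hypothetical names, each alignment pair is taken to be (source position, target position), and unknown fields are silently skipped. The toolkit's actual reader may behave differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the toolkit.
public class CorpusEntryParser {
    /** One sentence pair; field names are illustrative only. */
    static class Entry {
        int index;
        String source;   // SL line
        String target;   // TL line
        String type;     // Type line, e.g. "S"
        List<int[]> alignment = new ArrayList<>();
    }

    // Matches a single "(i,j)" pair inside the Alignment value.
    private static final Pattern PAIR =
        Pattern.compile("\\((\\d+)\\s*,\\s*(\\d+)\\)");

    /** Parses "((1,1),(2,2))" into a list of two-element int arrays. */
    static List<int[]> parseAlignment(String text) {
        List<int[]> pairs = new ArrayList<>();
        Matcher m = PAIR.matcher(text);
        while (m.find()) {
            pairs.add(new int[] {
                Integer.parseInt(m.group(1)),
                Integer.parseInt(m.group(2))
            });
        }
        return pairs;
    }

    /** Parses the lines of one entry, skipping blank lines and ';' comments. */
    static Entry parseEntry(List<String> lines) {
        Entry e = new Entry();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith(";")) continue;
            int colon = line.indexOf(':');
            if (colon < 0) continue; // not a "Field: value" line
            String field = line.substring(0, colon).trim();
            String value = line.substring(colon + 1).trim();
            switch (field) {
                case "SentenceIndex": e.index = Integer.parseInt(value); break;
                case "SL": e.source = value; break;
                case "TL": e.target = value; break;
                case "Alignment": e.alignment = parseAlignment(value); break;
                case "Type": e.type = value; break;
                default: break; // unknown fields are ignored in this sketch
            }
        }
        return e;
    }

    public static void main(String[] args) {
        Entry e = parseEntry(Arrays.asList(
            ";; This is a comment",
            "SentenceIndex:1",
            "SL: Resumption of the session",
            "TL: reprise de la session",
            "Alignment:((1,1),(2,2),(3,3),(4,4))",
            "Type: S"));
        System.out.println(e.index + " | " + e.source + " | " + e.target
            + " | " + e.alignment.size() + " links | " + e.type);
    }
}
```

A simple sanity check that could follow from this is to compare each alignment position against the number of tokens in the SL and TL lines.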
INPUT_MODE
OUTPUT_MODE