* Transfer Engine

The transfer engine is written in C++. It uses the Genkit feature unification package. Genkit uses templates heavily and so is sensitive to the g++ version. I can only get it to compile with g++ 3.3, so that is what I use to compile all the code and the support libraries. I don't recommend recompiling any of the support libraries unless you really have to.

Avenue/Transfer - main code files

transfer.hpp
  Class declarations for all of the transfer engine classes.
transfer.cpp
  Code for the older version of the transfer process. Deprecated but still functional.
transfer3.cpp
  Code for the modern bottom-up, simultaneous parse and transfer system.
transfer-support.cpp
  Helper functions to translate files and produce lattices.
transfer-decoder.cpp
  Contains the main decoder function, decodeFast, which relies on data stored in rangeBeams.
transfermain.cpp
  Simple front end to the transfer engine; its main function reads the command line arguments.
UnicodeTools.{hpp,cpp}
  Some simple code I wrote to handle upper and lower casing for UTF-8. Probably better to convert to ICU at some point.
language.cpp/hpp, chinese.cpp/hpp, english.cpp/hpp
  Some files I wrote to do number handling for English and Chinese. This is now handled by the morph servers.
Makefile
  Locations of libraries and code, for Avenue and Barrow.

* ANTLR files (first converted to C++)

Avenue/Transfer

transfer.g
  Reads in the transfer grammar and lexicons and converts them to the internal format.
  Produces TransferGrammarLexer.{hpp,cpp}, TransferGrammarParser.{hpp,cpp}
fstruct-cpp.g
  Reads in feature structures and converts them to the internal format.
  Produces FStructLexer.{hpp,cpp}, FStructParser.{hpp,cpp}
parsetree-cpp.g
  Reads in a parse tree as output by the Stanford parser (or a similar treebank format) and extracts constituent boundaries and types.
  Produces ParseTreeLexer.{hpp,cpp}, ParseTreeParser.{hpp,cpp}

* Support libraries:

These support libraries are compiled on both Avenue and Barrow in their 32-bit and 64-bit versions, respectively.
Some of the pathnames may vary because of this difference; check the Makefile.

GENKIT
  Unification library, written by Ben Han; includes UKernel and Toolkit.
  Very sensitive to g++ version, currently requires 3.3.
  /shared/code/genkit
ANTLR
  Similar to LEX and YACC; used for reading in grammars, lexicons, and feature structures.
  /shared/code/antlr-2.7.6/lib/cpp/src
SRILM
  SRI language modeling library.
  /shared/code/srilm1.5/
SALM
  Suffix Array Language Modeling toolkit.
  /shared/code/SALM
STTK
  Calculates translation probabilities.

* Scoring Scripts

I've put all the common code needed for BLEU and METEOR scoring in barrow:/usr8/eepeter/decoder/scoreutils.pl. Just require this file and call scoreFile($hypfile, $reference).

* Morphology

English
  Avenue/Transfer/English
  anamorphEnglish.pl - Relies on morpha from University of Sheffield
  genmorphEnglish.pl - Relies on morphg from University of Sheffield
Chinese
  barrow:/shared/code/segmenter/segserver5.pl

* Weights

Avenue/Transfer/weights/optimizenbest
  - extracthyps.pl: Extracts a training file from n-best output
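
* Example: constituent spans from a treebank-style parse tree

The ANTLR files section says parsetree-cpp.g extracts constituent boundaries and types from a treebank-format parse tree. The sketch below illustrates that idea only. It is a hand-written scanner, not the ANTLR grammar and not code from Avenue/Transfer; the names Constituent and extractConstituents are hypothetical, and the half-open word-index spans are an assumption about what "boundaries" means here. It sticks to C++98 features.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// One constituent: its label and the half-open word span [start, end).
// Hypothetical type, used only for this illustration.
struct Constituent {
    std::string label;
    std::size_t start;
    std::size_t end;
};

// True for characters that end a label or a word.
static bool isDelimiter(char c) {
    return c == '(' || c == ')' ||
           std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Hypothetical helper: scans a bracketing such as
// "(S (NP (DT the) (NN dog)) (VP (VBZ barks)))" and returns one entry
// per node, in the order the nodes are closed.
std::vector<Constituent> extractConstituents(const std::string& tree) {
    std::vector<Constituent> done;  // nodes whose ')' has been seen
    std::vector<Constituent> open;  // stack of nodes still being read
    std::size_t words = 0;          // number of words consumed so far
    std::size_t i = 0;

    while (i < tree.size()) {
        char c = tree[i];
        if (c == '(') {
            // The label is the text directly after the opening bracket.
            ++i;
            std::size_t begin = i;
            while (i < tree.size() && !isDelimiter(tree[i])) ++i;
            Constituent node = { tree.substr(begin, i - begin), words, words };
            open.push_back(node);
        } else if (c == ')') {
            // Close the innermost open node; unmatched ')' is ignored.
            if (!open.empty()) {
                open.back().end = words;
                done.push_back(open.back());
                open.pop_back();
            }
            ++i;
        } else if (std::isspace(static_cast<unsigned char>(c)) != 0) {
            ++i;
        } else {
            // Any other token inside a node counts as one word.
            while (i < tree.size() && !isDelimiter(tree[i])) ++i;
            ++words;
        }
    }
    return done;
}
```

Calling extractConstituents on the bracketing shown in the comment yields six entries, with part-of-speech nodes such as DT covering a single word and the S node covering all three words.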