# Scripts and commands to create the StemLex and the SuffixLex required by the segmenter
# Also see 00-README.txt

0. To extract just the segmentations, to be used as a guide:

   ./ExtractSegmentation.pl < ../quechua2spa/lexicons/FreqWordList_00001-00100-irene-corr2.csv > ! ../quechua2spa/lexicons/SegmentationGuide

1. Perl script to parse the CSV file from FreqWordList_00001-00100-irene-corr(2).xls:

   ./CreateStemLexicon.pl < ../quechua2spa/lexicons/FreqWordList_00001-00100-irene-corr2.csv >! ../lexicons/StemLexicon4Seg

2. To create an initial suffix lexicon for all the suffixes that appear in the Freq file, just save SuffixLexicon_v3.xls as .CSV:

   SuffixLexicon_v3-seg.csv

3. Look at 00-REAME-Christian.txt for further massaging of these files' format.
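
Note on shells: the ">!" redirection used above is csh/tcsh syntax (overwrite the target even when noclobber is set). Under sh or bash the equivalent operator is ">|". The following is only a minimal sketch of how steps 0 and 1 could be wrapped for a POSIX shell; the helper name run_step is hypothetical and is not part of these scripts, and the sketch assumes each Perl script reads the CSV on stdin and writes its result to stdout, as the commands above suggest.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original scripts):
# run a filter script with stdin taken from INPUT and stdout written to OUTPUT.
# ">|" overwrites OUTPUT even if noclobber is set, like ">!" in csh/tcsh.
run_step() {
    script=$1
    input=$2
    output=$3
    # Refuse to run if the input file is missing or unreadable.
    if [ ! -r "$input" ]; then
        echo "missing input: $input" >&2
        return 1
    fi
    "$script" < "$input" >| "$output"
}

# Example calls, mirroring steps 0 and 1 (left commented out so that
# sourcing this file does not touch any lexicon):
# run_step ./ExtractSegmentation.pl ../quechua2spa/lexicons/FreqWordList_00001-00100-irene-corr2.csv ../quechua2spa/lexicons/SegmentationGuide
# run_step ./CreateStemLexicon.pl ../quechua2spa/lexicons/FreqWordList_00001-00100-irene-corr2.csv ../lexicons/StemLexicon4Seg
```

The input check is there so that a mistyped CSV path does not silently replace an existing lexicon with an empty file.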