This application builds an FP passage index for a collection of documents. Each document is segmented into passages of passageSize terms, and consecutive passages overlap by passageSize/2 terms.
To use it, follow the general steps for running a Lemur application.
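The segmentation described above can be pictured as a sliding window over a document's terms. The following is a minimal Python sketch, not the Lemur implementation: the function name `segment_passages` is hypothetical, and the handling of the final, possibly shorter passage is an assumption, since the text does not say how the tail of a document is treated.

```python
def segment_passages(terms, passage_size):
    """Split a list of terms into windows of passage_size terms.

    Consecutive windows share passage_size // 2 terms. Keeping a shorter
    final window is an assumption of this sketch, not documented behavior.
    """
    if passage_size < 2:
        raise ValueError("passage_size must be at least 2")
    overlap = passage_size // 2
    step = passage_size - overlap  # how far the window advances each time
    passages = []
    start = 0
    while start < len(terms):
        passages.append(terms[start:start + passage_size])
        if start + passage_size >= len(terms):
            break  # this window already reaches the end of the document
        start += step
    return passages


if __name__ == "__main__":
    doc = "a b c d e f g h i j".split()
    for passage in segment_passages(doc, 4):
        print(" ".join(passage))
```

With passage_size set to 4, each window starts two terms after the previous one, so every term except those at the very start and end of the document appears in two passages.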
The parameters are:
index: name of the index table-of-content file without the .ifp extension.
memory: memory (in bytes) of InvFPPushIndex (def = 96000000).
stopwords: name of file containing the stopword list.
acronyms: name of file containing the acronym list.
countStopWords: If true, count stopwords in document length.
docFormat:
stemmer:
KstemmerDir: Path to directory of data files used by Krovetz's stemmer.
arabicStemDir: Path to directory of data files used by the Arabic stemmers.
arabicStemFunc: Which stemming algorithm to apply, one of:
dataFiles: name of file containing list of datafiles to index.
passageSize: Number of terms per passage.
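As an illustration of how these parameters might be collected for a run, the Python sketch below renders a small parameter file. It assumes the `name = value;` line format commonly used for Lemur parameter files; that format is not stated on this page, so check it against your Lemur version. The helper `render_params` and all values except the documented memory default are hypothetical. docFormat, stemmer, and arabicStemFunc are left out because their accepted values are not listed here.

```python
def render_params(params):
    """Render parameters as 'name = value;' lines (assumed file format)."""
    lines = [f"{name} = {value};" for name, value in params.items()]
    return "\n".join(lines) + "\n"


# Hypothetical values for illustration only.
example_params = {
    "index": "myindex",           # hypothetical index name, no .ifp extension
    "memory": 96000000,           # default given in the parameter list
    "dataFiles": "filelist.txt",  # hypothetical file listing the data files
    "passageSize": 50,            # hypothetical passage length in terms
}

if __name__ == "__main__":
    print(render_params(example_params), end="")
```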