---------------------------------------- ----------------------------------------

CorpusBuilder Slovenian Corpus

---------------------------------------- ----------------------------------------

The documents in this corpus were collected in January 2001 by the CorpusBuilder system. All of them were filtered using van Noord's TextCat language filter: a document is included if TextCat assigned Slovenian as the most probable language. No manual filtering has been performed on these documents, and some may contain small amounts of English or other languages.
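
The inclusion rule described above can be sketched roughly as follows. This is only an illustration, not the code CorpusBuilder or TextCat actually ran: the per-language score dictionaries, the "slovenian" label, the convention that a higher score means a more probable language, and the function names are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical label; the name TextCat actually reports may differ.
TARGET_LANGUAGE = "slovenian"


def most_probable_language(scores: Dict[str, float]) -> str:
    """Return the language with the highest score.

    Assumes higher means more probable; this is a convention of the
    sketch, not a statement about TextCat's output format.
    """
    return max(scores, key=lambda lang: scores[lang])


def filter_corpus(docs: Dict[str, Dict[str, float]]) -> List[str]:
    """Return ids of documents whose top-ranked language is the target."""
    kept = []
    for doc_id, scores in docs.items():
        if not scores:
            # No language guess at all: leave the document out.
            continue
        if most_probable_language(scores) == TARGET_LANGUAGE:
            kept.append(doc_id)
    return kept


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    example = {
        "page-1": {"slovenian": 0.81, "english": 0.12},
        "page-2": {"slovenian": 0.30, "english": 0.65},
        "page-3": {},
    }
    print(filter_corpus(example))
```

Because only the top-ranked language is checked, a page that is mostly Slovenian but quotes some English still passes, which matches the note above about small amounts of other languages.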

CorpusBuilder, by Ghani, Jones and Mladenic
Dunja Mladenic (dunja AT cs.cmu.edu)
http://www.cs.cmu.edu/~TextLearning/corpusbuilder/
Last modified: Tue Jun 26 20:55:07 EDT 2001