CorpusBuilder Italian Corpus

The documents in this corpus were collected in February 2001 by the CorpusBuilder system. They were all filtered using van Noord's TextCat language filter. A document is included if TextCat assigned Italian as the most probable language. Some documents may contain small amounts of English or other languages. No manual filtering has been performed on these pages. For copyright reasons, we include here only the URLs of the pages.

CorpusBuilder, by Ghani, Jones and Mladenic
Rosie Jones (rosie AT cs.cmu.edu)
http://www.cs.cmu.edu/~TextLearning/corpusbuilder/
Last modified: Sun Mar 11 00:12:50 EST 2001
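
The inclusion rule described above (keep a page only when the language filter ranks Italian first, and record only its URL) can be sketched roughly as follows. This is an illustration under stated assumptions, not the CorpusBuilder code: the names filter_urls_by_language and toy_ranker, the example.org URLs, and the word lists are all hypothetical, and toy_ranker is a crude stand-in that does not reproduce how TextCat scores languages.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

# Hypothetical interface for a language identifier such as TextCat:
# it takes page text and returns language names, most probable first.
LanguageRanker = Callable[[str], Sequence[str]]


def filter_urls_by_language(
    pages: Iterable[Tuple[str, str]],
    rank_languages: LanguageRanker,
    target: str = "italian",
) -> List[str]:
    """Return the URLs of pages whose top-ranked language is `target`.

    Only URLs are returned, mirroring a corpus that lists URLs
    rather than page contents.
    """
    kept = []
    for url, text in pages:
        ranking = rank_languages(text)
        # Keep the page only if the target language is ranked first.
        if ranking and ranking[0].lower() == target:
            kept.append(url)
    return kept


def toy_ranker(text: str) -> List[str]:
    """Toy stand-in for a real language filter (assumption, demo only).

    Counts a handful of common function words and ranks the language
    with more hits first; ties are ranked as English.
    """
    words = text.lower().split()
    italian_hits = sum(w in {"il", "la", "di", "che", "e", "per"} for w in words)
    english_hits = sum(w in {"the", "of", "and", "to", "is", "for"} for w in words)
    if italian_hits > english_hits:
        return ["italian", "english"]
    return ["english", "italian"]


if __name__ == "__main__":
    sample_pages = [
        ("http://example.org/a", "il gatto e la casa di Maria"),
        ("http://example.org/b", "the cat is in the house of Mary"),
    ]
    for url in filter_urls_by_language(sample_pages, toy_ranker):
        print(url)
```

Because the decision depends only on the top-ranked language, a page that is mostly Italian but quotes some English would still pass, which is consistent with the note that some documents may contain small amounts of other languages.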