info.ephyra.indexing
Class AQUAINTPreprocessor

java.lang.Object
  extended by info.ephyra.indexing.AQUAINTPreprocessor

public class AQUAINTPreprocessor
extends java.lang.Object

A preprocessor for the AQUAINT corpus.

Version:
2006-04-30
Author:
Nico Schlaefer

Field Summary
private static java.lang.String dir
          Directory of the AQUAINT corpus
 
Constructor Summary
AQUAINTPreprocessor()
           
 
Method Summary
private static boolean addParagraphTags()
          Adds paragraph tags if missing.
static void main(java.lang.String[] args)
          Entry point of the program.
private static boolean splitParagraphs()
          Splits paragraphs, e.g. to separate publisher details.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

dir

private static java.lang.String dir
Directory of the AQUAINT corpus

Constructor Detail

AQUAINTPreprocessor

public AQUAINTPreprocessor()
Method Detail

addParagraphTags

private static boolean addParagraphTags()
Adds paragraph tags if missing.

Returns:
true if and only if the preprocessing was successful
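
The source of this private method is not shown on this page. As a minimal sketch of what "adds paragraph tags if missing" could look like, the following assumes that a document body is delimited by <TEXT> and </TEXT> and that paragraphs are marked with <P> and </P>. The class name ParagraphTagSketch and the String-to-String signature are hypothetical and are not part of info.ephyra.indexing; the documented method takes no arguments and returns a boolean.

```java
public class ParagraphTagSketch {
    private static final String OPEN = "<TEXT>";
    private static final String CLOSE = "</TEXT>";

    /**
     * Hypothetical helper: wraps the body of the first TEXT element in a
     * single P element if the body contains no P tag yet. Documents without
     * a TEXT element, or with existing P tags, are returned unchanged.
     */
    static String addParagraphTags(String doc) {
        int start = doc.indexOf(OPEN);
        int end = doc.indexOf(CLOSE);
        if (start < 0 || end < start) {
            return doc; // no well-formed TEXT element found
        }
        int bodyStart = start + OPEN.length();
        String body = doc.substring(bodyStart, end);
        if (body.contains("<P>")) {
            return doc; // paragraph tags already present
        }
        return doc.substring(0, bodyStart)
                + "\n<P>\n" + body.trim() + "\n</P>\n"
                + doc.substring(end);
    }
}
```

Because tagged documents are returned unchanged, applying the helper twice gives the same result as applying it once, which is convenient if the preprocessor is run again over a partially processed corpus.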

splitParagraphs

private static boolean splitParagraphs()
Splits paragraphs, e.g. to separate publisher details.

Returns:
true if and only if the preprocessing was successful
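
This page does not state the rule used to detect publisher details. The sketch below only illustrates the general idea of cutting one paragraph into two at a marker supplied by the caller. The class ParagraphSplitSketch, the method signature, and the "[PUBLISHER]" marker in the usage note are all hypothetical and are not taken from the actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class ParagraphSplitSketch {
    /**
     * Hypothetical helper: splits a paragraph at the first occurrence of
     * the given marker. The marker stays at the start of the second part.
     * If the marker is absent or already at position 0, the paragraph is
     * returned as a single trimmed element.
     */
    static List<String> splitParagraph(String paragraph, String marker) {
        List<String> parts = new ArrayList<>();
        int idx = paragraph.indexOf(marker);
        if (idx <= 0) {
            parts.add(paragraph.trim());
            return parts;
        }
        parts.add(paragraph.substring(0, idx).trim());
        parts.add(paragraph.substring(idx).trim());
        return parts;
    }
}
```

For example, splitParagraph("The talks ended Friday. [PUBLISHER] Example Wire", "[PUBLISHER]") yields two parts, so each part can then be written out as its own paragraph.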

main

public static void main(java.lang.String[] args)

Entry point of the program.

Preprocesses the AQUAINT corpus.

Parameters:
args - argument 1: directory of the AQUAINT corpus
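
Since the two processing methods are private, the class is driven through main with the corpus directory as its first argument. The following sketch shows one way such an entry point could validate that argument before storing it in a static field like dir. The class PreprocessorMainSketch, the method checkArgs, and the messages are assumptions made for illustration; the actual main may behave differently, for example in how it reports errors.

```java
import java.io.File;

public class PreprocessorMainSketch {
    /** Directory of the corpus, mirroring the documented static field. */
    private static String dir;

    /**
     * Hypothetical argument check: returns true if and only if a first
     * argument is present and names an existing directory.
     */
    static boolean checkArgs(String[] args) {
        if (args == null || args.length < 1) {
            System.err.println("Usage: java PreprocessorMainSketch corpus_directory");
            return false;
        }
        if (!new File(args[0]).isDirectory()) {
            System.err.println("Not a directory: " + args[0]);
            return false;
        }
        dir = args[0];
        return true;
    }

    public static void main(String[] args) {
        if (!checkArgs(args)) {
            System.exit(1);
        }
        // The real class would run its preprocessing steps here.
        System.out.println("Corpus directory: " + dir);
    }
}
```

Assuming the compiled Ephyra classes are on the classpath, the documented class itself would be started in the same way, with the corpus directory as the only argument to info.ephyra.indexing.AQUAINTPreprocessor.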