info.ephyra.questionanalysis
Class QuestionNormalizer

java.lang.Object
  extended by info.ephyra.questionanalysis.QuestionNormalizer

public class QuestionNormalizer
extends java.lang.Object

This class provides methods that modify a question to facilitate pattern matching and to anticipate the format of text passages that answer the question.

Version:
2006-06-18
Author:
Nico Schlaefer

Constructor Summary
QuestionNormalizer()
           
 
Method Summary
private static java.lang.String dropFillers(java.lang.String question)
          Drops filler words from the question string.
private static java.lang.String dropPunctuationMarks(java.lang.String question)
          Removes the final punctuation mark and quotation marks from the question string.
private static java.lang.String[] handleAuxCanMay(java.lang.String question, java.lang.String tagged)
          Modifies the question string by applying the following rule: can/could/will/would/shall/should/may/might/must [...] infinitive -> can/could/will/would/shall/should/may/might/must infinitive
private static java.lang.String[] handleAuxDid(java.lang.String question, java.lang.String tagged)
          Modifies the question string by applying the following rule: did [...] infinitive -> simple_past
private static java.lang.String[] handleAuxDo(java.lang.String question, java.lang.String tagged)
          Modifies the question string by applying the following rule: do [...] infinitive -> infinitive
private static java.lang.String[] handleAuxDoes(java.lang.String question, java.lang.String tagged)
          Modifies the question string by applying the following rule: does [...] infinitive -> 3rd person singular
private static java.lang.String[] handleAuxHasHad(java.lang.String question, java.lang.String tagged)
          Modifies the question string by applying the following rule: have/has/had [...] past_participle -> have/has/had past_participle / simple_past
static java.lang.String[] handleAuxiliaries(java.lang.String qn)
          Handles auxiliary verbs by applying the rules specified in the documentation of the handleAux...() methods.
private static java.lang.String[] handleAuxIs(java.lang.String question, java.lang.String tagged)
          Modifies the question string by applying the following rule: is/are/was/were [...] gerund / past participle -> is/are/was/were gerund / past participle
static java.lang.String normalize(java.lang.String question)
          Normalizes a question string by removing redundant whitespace, replacing short forms and dropping filler words.
private static java.lang.String replaceShortForms(java.lang.String question)
          Replaces short forms of "is" and "are" that occur in combination with interrogatives.
static java.lang.String stemVerbsAndNouns(java.lang.String qn)
          Converts the verbs to infinitive and the nouns to their singular forms.
static java.lang.String transformList(java.lang.String question)
          Replaces certain expressions in a list question to transform it into a factoid question.
static java.lang.String unstem(java.lang.String sub, java.lang.String stemmed, java.lang.String qn)
          Unstems a substring of the stemmed question string by mapping it to the normalized question string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

QuestionNormalizer

public QuestionNormalizer()

Method Detail

replaceShortForms

private static java.lang.String replaceShortForms(java.lang.String question)
Replaces short forms of "is" and "are" that occur in combination with interrogatives.

Parameters:
question - the question string
Returns:
modified question string
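
The replacement described above can be illustrated with a minimal sketch. It is not the Ephyra implementation: the class name ShortFormSketch, the list of interrogatives and the regular expressions are assumptions made for this example, and "'s" is always expanded to "is" even where it could stand for "has".

```java
import java.util.regex.Pattern;

final class ShortFormSketch {
    // Hypothetical interrogative list; the real method may cover a different set.
    private static final String WH = "(who|what|when|where|why|how)";
    private static final Pattern SHORT_IS =
            Pattern.compile("\\b" + WH + "'s\\b", Pattern.CASE_INSENSITIVE);
    private static final Pattern SHORT_ARE =
            Pattern.compile("\\b" + WH + "'re\\b", Pattern.CASE_INSENSITIVE);

    /** Expands "<interrogative>'s" to "... is" and "<interrogative>'re" to "... are". */
    static String replaceShortForms(String question) {
        String expanded = SHORT_IS.matcher(question).replaceAll("$1 is");
        return SHORT_ARE.matcher(expanded).replaceAll("$1 are");
    }

    public static void main(String[] args) {
        // prints "Who is the president of France?"
        System.out.println(replaceShortForms("Who's the president of France?"));
    }
}
```

The capture group keeps the original capitalization of the interrogative, so "Who's" becomes "Who is" rather than "who is".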

dropFillers

private static java.lang.String dropFillers(java.lang.String question)
Drops filler words from the question string.

Parameters:
question - the question string
Returns:
modified question string
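
This page does not list the filler words, so the following sketch uses a hypothetical list (please, actually, exactly) purely for illustration. The class name FillerSketch and the regular expression are likewise assumptions, not the Ephyra implementation.

```java
import java.util.regex.Pattern;

final class FillerSketch {
    // Hypothetical filler list; the words actually dropped by Ephyra are not documented here.
    private static final Pattern FILLERS =
            Pattern.compile("\\b(please|actually|exactly)\\b\\s*", Pattern.CASE_INSENSITIVE);

    /** Removes each filler word together with the whitespace that follows it. */
    static String dropFillers(String question) {
        return FILLERS.matcher(question).replaceAll("").trim();
    }

    public static void main(String[] args) {
        // prints "What is a quark?"
        System.out.println(dropFillers("What exactly is a quark?"));
    }
}
```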

handleAuxIs

private static java.lang.String[] handleAuxIs(java.lang.String question,
                                              java.lang.String tagged)

Modifies the question string by applying the following rule:

is/are/was/were [...] gerund / past participle -> is/are/was/were gerund / past participle

Parameters:
question - question string
tagged - tagged question
Returns:
modified question strings
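
One plausible reading of the rule above is that the auxiliary is moved from its question position to directly in front of the gerund or past participle. The sketch below illustrates that reading only. The class AuxIsSketch, the parallel token and tag arrays, and the Penn Treebank tags VBG and VBN are assumptions; the format of the tagged parameter is not specified on this page, and the real method returns an array of strings rather than a single string.

```java
import java.util.ArrayList;
import java.util.List;

final class AuxIsSketch {
    /**
     * Moves the first is/are/was/were next to the following gerund (VBG)
     * or past participle (VBN). Returns the input unchanged if either is missing.
     */
    static String moveAuxToVerb(String[] tokens, String[] tags) {
        int aux = -1;
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].matches("(?i)(is|are|was|were)")) { aux = i; break; }
        }
        if (aux < 0) return String.join(" ", tokens);

        int verb = -1;
        for (int i = aux + 1; i < tokens.length; i++) {
            if (tags[i].equals("VBG") || tags[i].equals("VBN")) { verb = i; break; }
        }
        if (verb < 0) return String.join(" ", tokens);

        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (i == aux) continue;               // skip the auxiliary at its old position
            if (i == verb) out.add(tokens[aux]);  // re-insert it directly before the verb
            out.add(tokens[i]);
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        String[] tokens = {"when", "was", "the", "telephone", "invented"};
        String[] tags = {"WRB", "VBD", "DT", "NN", "VBN"};
        // prints "when the telephone was invented"
        System.out.println(moveAuxToVerb(tokens, tags));
    }
}
```

The reordered string resembles the word order of a declarative sentence, which is how an answering passage would typically phrase the fact.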

handleAuxCanMay

private static java.lang.String[] handleAuxCanMay(java.lang.String question,
                                                  java.lang.String tagged)

Modifies the question string by applying the following rule:

can/could/will/would/shall/should/may/might/must [...] infinitive -> can/could/will/would/shall/should/may/might/must infinitive

Parameters:
question - question string
tagged - tagged question
Returns:
modified question strings

handleAuxHasHad

private static java.lang.String[] handleAuxHasHad(java.lang.String question,
                                                  java.lang.String tagged)

Modifies the question string by applying the following rule:

have/has/had [...] past_participle -> have/has/had past_participle / simple_past

Parameters:
question - question string
tagged - tagged question
Returns:
modified question strings

handleAuxDo

private static java.lang.String[] handleAuxDo(java.lang.String question,
                                              java.lang.String tagged)

Modifies the question string by applying the following rule:

do [...] infinitive -> infinitive

Parameters:
question - question string
tagged - tagged question
Returns:
modified question strings

handleAuxDoes

private static java.lang.String[] handleAuxDoes(java.lang.String question,
                                                java.lang.String tagged)

Modifies the question string by applying the following rule:

does [...] infinitive -> 3rd person singular

Parameters:
question - question string
tagged - tagged question
Returns:
modified question strings

handleAuxDid

private static java.lang.String[] handleAuxDid(java.lang.String question,
                                               java.lang.String tagged)

Modifies the question string by applying the following rule:

did [...] infinitive -> simple_past

Parameters:
question - question string
tagged - tagged question
Returns:
modified question strings
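
As an illustration of the rule above, the following sketch drops "did" and puts the following infinitive into the simple past. It is a simplified stand-in, not the Ephyra code: the class AuxDidSketch, the parallel token and tag arrays, the Penn Treebank tag VB for infinitives and the tiny table of irregular verbs are all assumptions, and the naive regular inflection ignores cases such as consonant doubling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class AuxDidSketch {
    // Hypothetical irregular forms; a real system would consult a morphological lexicon.
    private static final Map<String, String> IRREGULAR =
            Map.of("write", "wrote", "win", "won", "go", "went");

    static String simplePast(String infinitive) {
        String irregular = IRREGULAR.get(infinitive);
        if (irregular != null) return irregular;
        return infinitive.endsWith("e") ? infinitive + "d" : infinitive + "ed";
    }

    /** Removes the first "did" and inflects the first infinitive (VB) after it. */
    static String applyDidRule(String[] tokens, String[] tags) {
        int did = -1;
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].equalsIgnoreCase("did")) { did = i; break; }
        }
        if (did < 0) return String.join(" ", tokens);

        int verb = -1;
        for (int i = did + 1; i < tokens.length; i++) {
            if (tags[i].equals("VB")) { verb = i; break; }
        }
        if (verb < 0) return String.join(" ", tokens);

        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            if (i == did) continue;  // the auxiliary is dropped
            out.add(i == verb ? simplePast(tokens[i]) : tokens[i]);
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        String[] tokens = {"when", "did", "Mozart", "die"};
        String[] tags = {"WRB", "VBD", "NNP", "VB"};
        // prints "when Mozart died"
        System.out.println(applyDidRule(tokens, tags));
    }
}
```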

dropPunctuationMarks

private static java.lang.String dropPunctuationMarks(java.lang.String question)
Removes the final punctuation mark and quotation marks from the question string.

Parameters:
question - the question string
Returns:
modified question string
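
A minimal sketch of this behavior follows. It assumes that "quotation marks" means straight double quotes and that the final punctuation mark is one of '.', '?' or '!'; the class name PunctuationSketch is hypothetical and the real method may handle additional characters.

```java
final class PunctuationSketch {
    /** Removes all double quotes, then a single trailing '.', '?' or '!'. */
    static String dropPunctuationMarks(String question) {
        String withoutQuotes = question.replace("\"", "");
        // The '$' anchor restricts the match to the end of the string.
        return withoutQuotes.replaceFirst("\\s*[.?!]\\s*$", "");
    }

    public static void main(String[] args) {
        // prints "Who wrote Hamlet"
        System.out.println(dropPunctuationMarks("Who wrote \"Hamlet\"?"));
    }
}
```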

normalize

public static java.lang.String normalize(java.lang.String question)
Normalizes a question string by removing redundant whitespace, replacing short forms and dropping filler words.

Parameters:
question - question string
Returns:
normalized question string
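
The three steps named in the description can be sketched as a small pipeline. The order of the steps, the interrogatives, the filler words and the class name NormalizeSketch are assumptions made for this example; the actual method may apply further steps or a different order.

```java
final class NormalizeSketch {
    static String normalize(String question) {
        // Step 1: trim and collapse runs of whitespace into single spaces.
        String q = question.trim().replaceAll("\\s+", " ");
        // Step 2: expand short forms after interrogatives (hypothetical list).
        q = q.replaceAll("(?i)\\b(who|what|where|when|how)'s\\b", "$1 is");
        // Step 3: drop filler words (hypothetical list) and one following space.
        q = q.replaceAll("(?i)\\b(please|exactly)\\b ?", "");
        return q.trim();
    }

    public static void main(String[] args) {
        // prints "What is the capital of Peru?"
        System.out.println(normalize("  What's   exactly the   capital of Peru? "));
    }
}
```

Collapsing whitespace first keeps the later patterns simple, because they can rely on single spaces between tokens.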

stemVerbsAndNouns

public static java.lang.String stemVerbsAndNouns(java.lang.String qn)
Converts the verbs to infinitive and the nouns to their singular forms.

Parameters:
qn - normalized question string
Returns:
stemmed question string

unstem

public static java.lang.String unstem(java.lang.String sub,
                                      java.lang.String stemmed,
                                      java.lang.String qn)
Unstems a substring of the stemmed question string by mapping it to the normalized question string.

Parameters:
sub - a substring of the stemmed question string
stemmed - the stemmed question string
qn - the normalized question string
Returns:
the unstemmed string, or sub if sub is not a substring of stemmed
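
The mapping can be pictured as a token alignment between the two question strings. The sketch below assumes that stemmed and qn contain the same number of space-separated tokens in the same order, which this page does not guarantee; the class name UnstemSketch and the matching strategy are illustrative, not the Ephyra implementation.

```java
import java.util.Arrays;

final class UnstemSketch {
    /**
     * Finds sub as a token sequence in stemmed and returns the tokens of qn
     * at the same positions. Falls back to sub if no aligned match exists.
     */
    static String unstem(String sub, String stemmed, String qn) {
        String[] subTokens = sub.split(" ");
        String[] stemTokens = stemmed.split(" ");
        String[] qnTokens = qn.split(" ");
        if (stemTokens.length != qnTokens.length) return sub;  // alignment assumption violated

        for (int start = 0; start + subTokens.length <= stemTokens.length; start++) {
            String[] window = Arrays.copyOfRange(stemTokens, start, start + subTokens.length);
            if (Arrays.equals(window, subTokens)) {
                String[] original = Arrays.copyOfRange(qnTokens, start, start + subTokens.length);
                return String.join(" ", original);
            }
        }
        return sub;
    }

    public static void main(String[] args) {
        String stemmed = "which city be locate in France";
        String qn = "which cities are located in France";
        // prints "cities are"
        System.out.println(unstem("city be", stemmed, qn));
    }
}
```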

handleAuxiliaries

public static java.lang.String[] handleAuxiliaries(java.lang.String qn)

Handles auxiliary verbs by applying the rules specified in the documentation of the handleAux...() methods.

Parameters:
qn - normalized question string
Returns:
question strings with modified verbs

transformList

public static java.lang.String transformList(java.lang.String question)
Replaces certain expressions in a list question to transform it into a factoid question.

Parameters:
question - a list question
Returns:
transformed question