info.ephyra.questionanalysis.atype.extractor
Class EnglishFeatureExtractor

java.lang.Object
  extended by info.ephyra.questionanalysis.atype.extractor.FeatureExtractor
      extended by info.ephyra.questionanalysis.atype.extractor.EnglishFeatureExtractor

public class EnglishFeatureExtractor
extends FeatureExtractor

Feature extractor for English answer type classification.

See initialize for a description of the input properties required by this class.

See createInstance for a description of the features extracted by this class.

See loadFile for a specification of the input file format.

Version:
2008-02-10
Author:
Justin Betteridge

Field Summary
private static java.lang.String HOW_MANY_PTRN
           
private static java.lang.String HOW_MUCH_OF_PTRN
           
private static java.lang.String HOW_MUCH_PTRN
           
private static java.lang.String HOW_PTRN
           
private static org.apache.log4j.Logger log
           
private static java.lang.String[] OF_HEAD_WORDS
           
private static java.lang.String REST_PTRN
           
private static java.lang.String SPACE_PTRN
           
private static java.lang.String WHAT_ANYWHERE_PTRN
           
private static java.lang.String WHAT_PTRN
           
private static java.lang.String WHEN_PTRN
           
private static java.lang.String WHERE_PTRN
           
private static java.lang.String WHICH_ANYWHERE_PTRN
           
private static java.lang.String WHICH_PTRN
           
private static java.lang.String WHO_PTRN
           
private static java.lang.String WHOM_PTRN
           
private static java.lang.String WHOSE_PTRN
           
private static java.util.List<java.lang.String> whPtrns
           
private static java.lang.String WHY_PTRN
           
 
Fields inherited from class info.ephyra.questionanalysis.atype.extractor.FeatureExtractor
classLevels, datasetExamplePattern, isInitialized, labelPosition, numLoaded, parsePosition, questionPosition, useClassLevels
 
Constructor Summary
EnglishFeatureExtractor()
           
 
Method Summary
private static void addSemanticFeatures(edu.cmu.minorthird.classify.MutableInstance instance, edu.cmu.lti.javelin.qa.Term focusTerm)
           
private static void addSyntacticFeatures(edu.cmu.minorthird.classify.MutableInstance instance, java.util.List<edu.cmu.lti.javelin.qa.Term> terms, java.lang.String parseTree, edu.cmu.lti.javelin.qa.Term focusTerm)
           
private static void addWordLevelFeatures(edu.cmu.minorthird.classify.MutableInstance instance, java.util.List<edu.cmu.lti.javelin.qa.Term> terms, edu.cmu.lti.javelin.qa.Term focus)
           
 edu.cmu.minorthird.classify.Instance createInstance(java.util.List<edu.cmu.lti.javelin.qa.Term> terms, java.lang.String parseTree)
          Creates and populates an Instance from the Terms and syntactic parse tree of an analyzed question.
 edu.cmu.minorthird.classify.Instance createInstance(java.lang.String question)
          Creates an Instance for question classification when nothing but the original question is available for feature extraction.
 void initialize()
          Initializes static resources.
 
Methods inherited from class info.ephyra.questionanalysis.atype.extractor.FeatureExtractor
createExample, createInstance, getClassLevels, getDatasetExamplePattern, getLabelPosition, getNumLoaded, getParsePosition, getQuestionPosition, isInitialized, isUsingClassLevels, loadFile, printFeatures, printFeaturesFromQuestions, setClassLevels, setDatasetExamplePattern, setInitialized, setLabelPosition, setParsePosition, setQuestionPosition, setUseClassLevels
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

private static final org.apache.log4j.Logger log

HOW_MUCH_PTRN

private static java.lang.String HOW_MUCH_PTRN

HOW_MUCH_OF_PTRN

private static java.lang.String HOW_MUCH_OF_PTRN

HOW_MANY_PTRN

private static java.lang.String HOW_MANY_PTRN

WHOSE_PTRN

private static java.lang.String WHOSE_PTRN

WHO_PTRN

private static java.lang.String WHO_PTRN

WHOM_PTRN

private static java.lang.String WHOM_PTRN

WHAT_PTRN

private static java.lang.String WHAT_PTRN

WHEN_PTRN

private static java.lang.String WHEN_PTRN

WHERE_PTRN

private static java.lang.String WHERE_PTRN

WHY_PTRN

private static java.lang.String WHY_PTRN

HOW_PTRN

private static java.lang.String HOW_PTRN

WHICH_PTRN

private static java.lang.String WHICH_PTRN

WHICH_ANYWHERE_PTRN

private static java.lang.String WHICH_ANYWHERE_PTRN

WHAT_ANYWHERE_PTRN

private static java.lang.String WHAT_ANYWHERE_PTRN

REST_PTRN

private static java.lang.String REST_PTRN

SPACE_PTRN

private static java.lang.String SPACE_PTRN

OF_HEAD_WORDS

private static java.lang.String[] OF_HEAD_WORDS

whPtrns

private static java.util.List<java.lang.String> whPtrns

Constructor Detail

EnglishFeatureExtractor

public EnglishFeatureExtractor()

Method Detail

initialize

public void initialize()
                throws java.lang.Exception
Initializes static resources.

Overrides:
initialize in class FeatureExtractor
Throws:
java.lang.Exception - if one of the required properties is not defined.

addWordLevelFeatures

private static void addWordLevelFeatures(edu.cmu.minorthird.classify.MutableInstance instance,
                                         java.util.List<edu.cmu.lti.javelin.qa.Term> terms,
                                         edu.cmu.lti.javelin.qa.Term focus)

addSyntacticFeatures

private static void addSyntacticFeatures(edu.cmu.minorthird.classify.MutableInstance instance,
                                         java.util.List<edu.cmu.lti.javelin.qa.Term> terms,
                                         java.lang.String parseTree,
                                         edu.cmu.lti.javelin.qa.Term focusTerm)

addSemanticFeatures

private static void addSemanticFeatures(edu.cmu.minorthird.classify.MutableInstance instance,
                                        edu.cmu.lti.javelin.qa.Term focusTerm)

createInstance

public edu.cmu.minorthird.classify.Instance createInstance(java.util.List<edu.cmu.lti.javelin.qa.Term> terms,
                                                           java.lang.String parseTree)
                                                    throws java.lang.Exception
Creates and populates an Instance from the Terms and syntactic parse tree of an analyzed question. All features are binary and belong to one of the following types:

Word-level features
Syntactic features
Semantic features

Specified by:
createInstance in class FeatureExtractor
Parameters:
terms - the Terms of the question
parseTree - the syntactic parse tree of the question
Returns:
an Instance which can be used for question classification
Throws:
java.lang.Exception
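
As a rough illustration of what "binary features" means here, the sketch below records each active feature with weight 1.0 and simply leaves absent features out. It does not use the MinorThird Instance API. The class BinaryFeatureSketch, the WORD/SYN/SEM prefixes, and the example feature names are all assumptions made for illustration, since the concrete features are not listed on this page.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a binary-feature instance; not the MinorThird API.
class BinaryFeatureSketch {

    // Stores every active feature with weight 1.0; absent features are not stored.
    // The prefixes only keep the three feature types apart (an assumed convention).
    static Map<String, Double> build(List<String> wordLevel,
                                     List<String> syntactic,
                                     List<String> semantic) {
        Map<String, Double> features = new LinkedHashMap<>();
        for (String f : wordLevel) {
            features.put("WORD." + f, 1.0);
        }
        for (String f : syntactic) {
            features.put("SYN." + f, 1.0);
        }
        for (String f : semantic) {
            features.put("SEM." + f, 1.0);
        }
        return features;
    }

    public static void main(String[] args) {
        // Made-up features for the question "What river flows through Paris?"
        Map<String, Double> features = build(
                Arrays.asList("what", "river"),
                Arrays.asList("focus_pos=NN"),
                Arrays.asList("focus_class=body_of_water"));
        System.out.println(features);
    }
}
```

Keeping only the active features mirrors how sparse binary feature vectors are usually handled: a feature that is not present in the map is treated as 0.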

createInstance

public edu.cmu.minorthird.classify.Instance createInstance(java.lang.String question)
Description copied from class: FeatureExtractor
Creates an Instance for question classification when nothing but the original question is available for feature extraction. Assumes words in the input question are separated by whitespace.

Specified by:
createInstance in class FeatureExtractor
Parameters:
question - the input question
Returns:
the Instance object
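
A minimal sketch of question-only extraction under the whitespace assumption stated above: the question is split on whitespace, each token becomes a word feature, and the first matching wh-pattern adds one more feature. The class QuestionOnlySketch, the regular expressions, and the feature names are hypothetical. The actual values of the *_PTRN fields and the order of whPtrns are not shown in this documentation.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch; not the EnglishFeatureExtractor implementation.
class QuestionOnlySketch {

    // Assumed wh-patterns, ordered so that more specific patterns are tried first
    // ("how many" before plain "how").
    private static final Map<String, Pattern> WH_PATTERNS = new LinkedHashMap<>();
    static {
        String[][] specs = {
            {"how_much", "^how\\s+much\\b"},
            {"how_many", "^how\\s+many\\b"},
            {"whose", "^whose\\b"},
            {"whom", "^whom\\b"},
            {"who", "^who\\b"},
            {"what", "^what\\b"},
            {"when", "^when\\b"},
            {"where", "^where\\b"},
            {"why", "^why\\b"},
            {"which", "^which\\b"},
            {"how", "^how\\b"},
        };
        for (String[] spec : specs) {
            WH_PATTERNS.put(spec[0], Pattern.compile(spec[1], Pattern.CASE_INSENSITIVE));
        }
    }

    // Returns the set of active (binary) features for a raw question string.
    static Set<String> extract(String question) {
        Set<String> features = new LinkedHashSet<>();
        String trimmed = question.trim();

        // Word features: split on whitespace, lowercase, strip surrounding punctuation.
        for (String token : trimmed.split("\\s+")) {
            String word = token.toLowerCase().replaceAll("^\\W+|\\W+$", "");
            if (!word.isEmpty()) {
                features.add("word=" + word);
            }
        }

        // Wh-word feature: only the first pattern that matches is recorded.
        for (Map.Entry<String, Pattern> entry : WH_PATTERNS.entrySet()) {
            if (entry.getValue().matcher(trimmed).find()) {
                features.add("wh=" + entry.getKey());
                break;
            }
        }
        return features;
    }

    public static void main(String[] args) {
        System.out.println(extract("How many moons does Mars have?"));
    }
}
```

Trying the patterns in a fixed order is one way to keep a general pattern from shadowing a more specific one. Whether the real class relies on ordering in the same way is an assumption.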