java.lang.Object
  info.ephyra.answerselection.filters.Filter
    info.ephyra.answerselection.filters.ScoreNormalizationFilter
public class ScoreNormalizationFilter extends Filter
A filter that normalizes the scores of the answer candidates by applying a trained classifier. The weight of the positive class ("answer correct") is used as the normalized score.
The main method can be used to evaluate different combinations of features and models and to train a classifier with the best combination.
The filter is applied to factoid answers only.
This class extends the class Filter.
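
The normalization step described above can be illustrated with a minimal, self-contained sketch. It does not use the Ephyra or MinorThird classes: `Candidate`, `NormalizationSketch` and the `positiveClassWeight` function are hypothetical stand-ins, and the real filter obtains the weight from its trained classifier. The sketch only shows the documented behaviour that factoid answers receive the positive-class weight as their new score while other answers keep their original score.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical stand-in for an answer candidate; not Ephyra's Result class. */
final class Candidate {
    final String answer;
    final boolean factoid;
    final double score;

    Candidate(String answer, boolean factoid, double score) {
        this.answer = answer;
        this.factoid = factoid;
        this.score = score;
    }
}

final class NormalizationSketch {
    /**
     * Replaces the score of every factoid candidate with the weight that the
     * given function assigns to the positive class ("answer correct").
     * Non-factoid candidates are copied with their score unchanged.
     */
    static List<Candidate> normalize(List<Candidate> candidates,
                                     ToDoubleFunction<Candidate> positiveClassWeight) {
        List<Candidate> out = new ArrayList<>();
        for (Candidate c : candidates) {
            double score = c.factoid ? positiveClassWeight.applyAsDouble(c) : c.score;
            out.add(new Candidate(c.answer, c.factoid, score));
        }
        return out;
    }
}
```

Any function from a candidate to a weight can be passed in, for example a logistic squashing of the raw score such as `c -> 1.0 / (1.0 + Math.exp(-c.score))`. That function is an illustrative placeholder, not the trained model.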
| Field Summary | |
|---|---|
| private static java.lang.String | ADA_BOOST_10_M - Identifier for the Ada Boost model (boosts a decision tree learner 10 times). |
| private static java.lang.String | ADA_BOOST_100_M - Identifier for the Ada Boost model (boosts a decision tree learner 100 times). |
| private static java.lang.String | ADA_BOOST_L_M - Identifier for the Ada Boost model (Logistic Regression version). |
| private static java.lang.String | ADA_BOOST_N_M - Identifier for the Ada Boost model (boosts a decision tree learner NUM_BOOSTS times). |
| private static java.lang.String[] | ALL_FEATURES - All feature identifiers. |
| private static java.lang.String[] | ALL_MODELS - All model identifiers. |
| private static java.lang.String | ANSWER_TYPES_F - Identifier for the answer type features. |
| private static java.lang.String | BALANCED_WINNOW_M - Identifier for the Balanced Winnow model. |
| private static edu.cmu.minorthird.classify.Classifier | classifier - Classifier for score normalization. |
| private static java.lang.String | DECISION_TREE_M - Identifier for the Decision Tree model. |
| private static java.lang.String | EXTRACTORS_F - Identifier for the extractor features. |
| private static java.lang.String | KNN_M - Identifier for the K-Nearest-Neighbor model. |
| private static java.lang.String | KWAY_MIXTURE_M - Identifier for the K-Way Mixture model. |
| private static java.lang.String | MARGIN_PERCEPTRON_M - Identifier for the Margin Perceptron model. |
| private static java.lang.String | MAX_ENT_M - Identifier for the Maximum Entropy model. |
| private static java.lang.String | MAX_SCORE_F - Identifier for the maximum score feature. |
| private static java.lang.String | MEAN_SCORE_F - Identifier for the mean score feature. |
| private static java.lang.String | MIN_SCORE_F - Identifier for the minimum score feature. |
| private static java.lang.String | NAIVE_BAYES_M - Identifier for the Naive Bayes model. |
| private static java.lang.String | NEGATIVE_BINOMIAL_M - Identifier for the Negative Binomial model. |
| private static java.lang.String | NUM_ANSWERS_F - Identifier for the number of answers feature. |
| private static int | NUM_BOOSTS - The N in ADA_BOOST_N_M. |
| private static int | NUM_FOLDS - Number of folds for cross validation. |
| private static java.lang.String | SCORE_F - Identifier for the score feature. |
| private static java.lang.String[] | SELECTED_FEATURES - Subset of the features used to train the classifier. |
| private static java.lang.String | SELECTED_MODEL - Model used for the classifier. |
| private static java.lang.String | SVM_M - Identifier for the SVM model. |
| private static java.lang.String | VOTED_PERCEPTRON_M - Identifier for the Voted Perceptron model. |

| Constructor Summary |
|---|
| ScoreNormalizationFilter(java.lang.String classifierFilename) - Creates the filter and loads a serialized classifier from a file. |

| Method Summary | |
|---|---|
| private static void | addAnswerTypeFeatures(edu.cmu.minorthird.classify.MutableInstance instance, Result result) - Adds the answer types of the question as features to the instance. |
| private static void | addExtractorFeature(edu.cmu.minorthird.classify.MutableInstance instance, Result result) - Adds the extractor used to obtain the answer candidate as a feature to the instance. |
| private static void | addMaxScoreFeature(edu.cmu.minorthird.classify.MutableInstance instance, Result result, Result[] results) - Adds the maximum score of all factoid answers from the same extractor as a feature to the instance. |
| private static void | addMeanScoreFeature(edu.cmu.minorthird.classify.MutableInstance instance, Result result, Result[] results) - Adds the mean score of all factoid answers from the same extractor as a feature to the instance. |
| private static void | addMinScoreFeature(edu.cmu.minorthird.classify.MutableInstance instance, Result result, Result[] results) - Adds the minimum score of all factoid answers from the same extractor as a feature to the instance. |
| private static void | addNumAnswersFeature(edu.cmu.minorthird.classify.MutableInstance instance, Result result, Result[] results) - Adds the number of factoid answers from the same extractor as a feature to the instance. |
| private static void | addScoreFeature(edu.cmu.minorthird.classify.MutableInstance instance, Result result) - Adds the score of the answer candidate as a feature to the instance. |
| private static void | addSelectedFeatures(edu.cmu.minorthird.classify.MutableInstance instance, java.lang.String[] features, Result result, Result[] results) - Adds the selected features to the instance. |
| Result[] | apply(Result[] results) - Normalizes the scores of the factoid answers, using the features specified in SELECTED_FEATURES and the classifier specified in classifier. |
| private static edu.cmu.minorthird.classify.Dataset | createDataset(java.lang.String[] features, java.lang.String serializedDir) - Creates a training/evaluation set from serialized judged Result objects. |
| private static edu.cmu.minorthird.classify.Example | createExample(java.lang.String[] features, Result result, Result[] results, java.lang.String qid) - Creates a training/evaluation example from a judged answer candidate. |
| private static edu.cmu.minorthird.classify.Instance | createInstance(java.lang.String[] features, Result result, Result[] results) - Creates an instance for training/evaluation or classification from an answer candidate. |
| private static edu.cmu.minorthird.classify.Instance | createInstance(java.lang.String[] features, Result result, Result[] results, java.lang.String qid) - Creates an instance for training/evaluation or classification from an answer candidate, using the question ID as a subpopulation ID. |
| private static edu.cmu.minorthird.classify.ClassifierLearner | createLearner(java.lang.String model) - Creates a classifier learner for the given model. |
| private static java.lang.String | createReport(java.lang.String[] dataSets, java.lang.String[] features, java.lang.String model, edu.cmu.minorthird.classify.experiments.Evaluation eval, long runTime) - Builds a report comprising the selected parameters (data sets, features and model) and evaluation statistics. |
| static edu.cmu.minorthird.classify.experiments.Evaluation | evaluate(java.lang.String serializedDir, java.lang.String[] features, java.lang.String model) - Performs a cross-validation on the given data set for the given features and model. |
| static java.lang.String[][] | evaluateAll(java.lang.String serializedDir, java.lang.String reportDir) - Performs a cross-validation on the given data set for all combinations of features and models and writes a report for each evaluation. |
| static void | loadClassifier(java.lang.String classifierFilename) - Loads a serialized classifier for score normalization from a file. |
| static void | main(java.lang.String[] args) - Evaluates all combinations of features and models and trains a classifier using the best combination. |
| Result[] | preserveOrderAveraging(Result[] results) - Calculates the average normalization factor for each extraction technique and normalizes the scores with this factor to ensure that the order suggested by the original scores is preserved. |
| Result[] | preserveOrderResorting(Result[] results) - Reassigns the normalized scores for each extraction technique to ensure that the order suggested by the original scores is preserved. |
| Result[] | preserveOrderTop(Result[] results) - Calculates the normalization factor of the top answer for each extraction technique and normalizes the scores with this factor to ensure that the order suggested by the original scores is preserved. |
| private static Result[] | readSerializedResults(java.io.File input) - Reads serialized results from a file. |
| static edu.cmu.minorthird.classify.Classifier | train(java.lang.String serializedDir) - Trains a classifier using the given training data, the features specified in SELECTED_FEATURES and the model specified in SELECTED_MODEL. |
| static edu.cmu.minorthird.classify.Classifier | train(java.lang.String serializedDir, java.lang.String[] features, java.lang.String model) - Trains a classifier using the given training data, features and model. |
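
The four extractor-level features in the summary above (number of answers, mean, maximum and minimum score of all factoid answers from the same extractor) can be sketched as a single grouping pass. This is only an illustration under stated assumptions: `ScoredAnswer`, `ExtractorFeaturesSketch` and the feature names used as map keys are hypothetical, the list is assumed to contain factoid answers only, and the candidate is assumed to be an element of that list. The real methods write into a MinorThird MutableInstance instead of a map.

```java
import java.util.DoubleSummaryStatistics;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a factoid answer; not Ephyra's Result class. */
final class ScoredAnswer {
    final String extractor;
    final double score;

    ScoredAnswer(String extractor, double score) {
        this.extractor = extractor;
        this.score = score;
    }
}

final class ExtractorFeaturesSketch {
    /** Groups the scores by extractor and collects count, mean, max and min per group. */
    static Map<String, DoubleSummaryStatistics> statsByExtractor(List<ScoredAnswer> answers) {
        Map<String, DoubleSummaryStatistics> stats = new HashMap<>();
        for (ScoredAnswer a : answers) {
            stats.computeIfAbsent(a.extractor, k -> new DoubleSummaryStatistics())
                 .accept(a.score);
        }
        return stats;
    }

    /** Builds the four aggregate features for one candidate (which must be in 'all'). */
    static Map<String, Double> aggregateFeatures(ScoredAnswer candidate, List<ScoredAnswer> all) {
        DoubleSummaryStatistics s = statsByExtractor(all).get(candidate.extractor);
        Map<String, Double> features = new HashMap<>();
        features.put("NumAnswers", (double) s.getCount()); // hypothetical feature names
        features.put("MeanScore", s.getAverage());
        features.put("MaxScore", s.getMax());
        features.put("MinScore", s.getMin());
        return features;
    }
}
```

Computing the statistics once per question and reusing them for every candidate would avoid regrouping the list for each call. The sketch regroups on every call only to stay short.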

| Methods inherited from class info.ephyra.answerselection.filters.Filter |
|---|
| apply |

| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Field Detail |
|---|
private static final java.lang.String SCORE_F
private static final java.lang.String EXTRACTORS_F
private static final java.lang.String ANSWER_TYPES_F
private static final java.lang.String NUM_ANSWERS_F
private static final java.lang.String MEAN_SCORE_F
private static final java.lang.String MAX_SCORE_F
private static final java.lang.String MIN_SCORE_F
private static final java.lang.String[] ALL_FEATURES
private static final java.lang.String[] SELECTED_FEATURES
private static final java.lang.String ADA_BOOST_10_M
private static final java.lang.String ADA_BOOST_100_M
private static int NUM_BOOSTS
The N in ADA_BOOST_N_M.
private static java.lang.String ADA_BOOST_N_M
Identifier for the Ada Boost model (boosts a decision tree learner NUM_BOOSTS times).
private static final java.lang.String ADA_BOOST_L_M
private static final java.lang.String BALANCED_WINNOW_M
private static final java.lang.String DECISION_TREE_M
private static final java.lang.String KNN_M
private static final java.lang.String KWAY_MIXTURE_M
private static final java.lang.String MARGIN_PERCEPTRON_M
private static final java.lang.String MAX_ENT_M
private static final java.lang.String NAIVE_BAYES_M
private static final java.lang.String NEGATIVE_BINOMIAL_M
private static final java.lang.String SVM_M
private static final java.lang.String VOTED_PERCEPTRON_M
private static final java.lang.String[] ALL_MODELS
private static final java.lang.String SELECTED_MODEL
private static final int NUM_FOLDS
private static edu.cmu.minorthird.classify.Classifier classifier
| Constructor Detail |
|---|
public ScoreNormalizationFilter(java.lang.String classifierFilename)
classifierFilename - filename of a serialized classifier

| Method Detail |
|---|
private static Result[] readSerializedResults(java.io.File input)
input - input file
private static void addScoreFeature(edu.cmu.minorthird.classify.MutableInstance instance,
Result result)
private static void addExtractorFeature(edu.cmu.minorthird.classify.MutableInstance instance,
Result result)
private static void addAnswerTypeFeatures(edu.cmu.minorthird.classify.MutableInstance instance,
Result result)
private static void addNumAnswersFeature(edu.cmu.minorthird.classify.MutableInstance instance,
Result result,
Result[] results)
private static void addMeanScoreFeature(edu.cmu.minorthird.classify.MutableInstance instance,
Result result,
Result[] results)
private static void addMaxScoreFeature(edu.cmu.minorthird.classify.MutableInstance instance,
Result result,
Result[] results)
private static void addMinScoreFeature(edu.cmu.minorthird.classify.MutableInstance instance,
Result result,
Result[] results)
private static void addSelectedFeatures(edu.cmu.minorthird.classify.MutableInstance instance,
java.lang.String[] features,
Result result,
Result[] results)
private static edu.cmu.minorthird.classify.Instance createInstance(java.lang.String[] features,
Result result,
Result[] results)
features - selected features
result - answer candidate
results - all answers to the question
private static edu.cmu.minorthird.classify.Instance createInstance(java.lang.String[] features,
Result result,
Result[] results,
java.lang.String qid)
features - selected features
result - answer candidate
results - all answers to the question
qid - question ID
private static edu.cmu.minorthird.classify.Example createExample(java.lang.String[] features,
Result result,
Result[] results,
java.lang.String qid)
features - selected features
result - judged answer candidate
results - all answers to the question
qid - question ID
private static edu.cmu.minorthird.classify.Dataset createDataset(java.lang.String[] features,
java.lang.String serializedDir)
Creates a training/evaluation set from serialized judged Result objects.
features - selected features
serializedDir - directory containing serialized results
private static edu.cmu.minorthird.classify.ClassifierLearner createLearner(java.lang.String model)
model - selected model
private static java.lang.String createReport(java.lang.String[] dataSets,
java.lang.String[] features,
java.lang.String model,
edu.cmu.minorthird.classify.experiments.Evaluation eval,
long runTime)
dataSets - used data sets
features - selected features
model - selected model
eval - evaluation statistics
runTime - run time of the evaluation
public static edu.cmu.minorthird.classify.Classifier train(java.lang.String serializedDir)
Trains a classifier using the given training data, the features specified in SELECTED_FEATURES and the model specified in SELECTED_MODEL.
serializedDir - directory containing serialized results
public static edu.cmu.minorthird.classify.Classifier train(java.lang.String serializedDir,
java.lang.String[] features,
java.lang.String model)
serializedDir - directory containing serialized results
features - selected features
model - selected model
public static edu.cmu.minorthird.classify.experiments.Evaluation evaluate(java.lang.String serializedDir,
java.lang.String[] features,
java.lang.String model)
serializedDir - directory containing serialized results
features - selected features
model - selected model
public static java.lang.String[][] evaluateAll(java.lang.String serializedDir,
java.lang.String reportDir)
serializedDir - directory containing serialized results
reportDir - output directory for evaluation reports
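
The phrase "all combinations of features and models" is not spelled out further in this documentation. One plausible reading, shown here purely as an assumption, is that every non-empty subset of the feature identifiers is paired with every model identifier, and each pair is evaluated once. The class and method names in this sketch (`CombinationSketch`, `nonEmptySubsets`, `evaluationRuns`) are hypothetical and the cross-validation itself is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class CombinationSketch {
    /** Returns every non-empty subset of the given features, in bitmask order. */
    static List<List<String>> nonEmptySubsets(String[] features) {
        List<List<String>> subsets = new ArrayList<>();
        for (int mask = 1; mask < (1 << features.length); mask++) {
            List<String> subset = new ArrayList<>();
            for (int i = 0; i < features.length; i++) {
                if ((mask & (1 << i)) != 0) {
                    subset.add(features[i]);
                }
            }
            subsets.add(subset);
        }
        return subsets;
    }

    /** Pairs every feature subset with every model: one {features, model} entry per run. */
    static List<String[]> evaluationRuns(String[] features, String[] models) {
        List<String[]> runs = new ArrayList<>();
        for (List<String> subset : nonEmptySubsets(features)) {
            for (String model : models) {
                runs.add(new String[] {String.join("+", subset), model});
            }
        }
        return runs;
    }
}
```

Under this reading the number of runs grows as (2^n - 1) times the number of models for n features, so seven feature identifiers would already give 127 subsets per model.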
public static void main(java.lang.String[] args)
args - {directory containing serialized results, output directory for evaluation reports and classifier}
public static void loadClassifier(java.lang.String classifierFilename)
classifierFilename - filename of a serialized classifier
public Result[] preserveOrderResorting(Result[] results)
results - array of Result objects
Returns: Result objects with new normalized scores
public Result[] preserveOrderAveraging(Result[] results)
results - array of Result objects
Returns: Result objects with new normalized scores
public Result[] preserveOrderTop(Result[] results)
results - array of Result objects
Returns: Result objects with new normalized scores
public Result[] apply(Result[] results)
Normalizes the scores of the factoid answers, using the features specified in SELECTED_FEATURES and the classifier specified in classifier.
apply in class Filter
results - array of Result objects
Returns: Result objects with normalized scores
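
The three order-preserving methods (preserveOrderResorting, preserveOrderAveraging, preserveOrderTop) can be compared with a small sketch that works on the candidates of a single extraction technique. It rests on assumptions that this page does not confirm: the "normalization factor" of an answer is taken to be its normalized score divided by its original score, original scores are assumed to be non-zero, the arrays are assumed to be non-empty and of equal length, and `PreserveOrderSketch` with its method names is hypothetical. The real methods operate on Result arrays and handle every extraction technique in turn.

```java
import java.util.Arrays;
import java.util.Comparator;

final class PreserveOrderSketch {
    /** Indices of the candidates, ordered by descending original score. */
    private static Integer[] byOriginalDesc(double[] original) {
        Integer[] idx = new Integer[original.length];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        Arrays.sort(idx, Comparator.comparingDouble((Integer i) -> -original[i]));
        return idx;
    }

    /** Multiplies every original score by the same factor. */
    private static double[] scale(double[] original, double factor) {
        double[] out = new double[original.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = original[i] * factor;
        }
        return out;
    }

    /** Resorting: hands out the normalized scores, largest first, in original rank order. */
    static double[] resorting(double[] original, double[] normalized) {
        double[] sorted = normalized.clone();
        Arrays.sort(sorted); // ascending
        Integer[] idx = byOriginalDesc(original);
        double[] out = new double[original.length];
        for (int rank = 0; rank < idx.length; rank++) {
            out[idx[rank]] = sorted[sorted.length - 1 - rank];
        }
        return out;
    }

    /** Averaging: scales the original scores by the mean of normalized/original. */
    static double[] averaging(double[] original, double[] normalized) {
        double sum = 0;
        for (int i = 0; i < original.length; i++) {
            sum += normalized[i] / original[i];
        }
        return scale(original, sum / original.length);
    }

    /** Top: scales the original scores by the factor of the top-ranked answer. */
    static double[] top(double[] original, double[] normalized) {
        int best = byOriginalDesc(original)[0];
        return scale(original, normalized[best] / original[best]);
    }
}
```

Resorting keeps the set of normalized scores and only permutes them, whereas averaging and top rescale the original scores by one shared factor. With a shared positive factor the original ranking is preserved by construction.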