info.ephyra.answerselection.filters
Class WikipediaGoogleTermImportanceFilter

java.lang.Object
  extended by info.ephyra.answerselection.filters.Filter
      extended by info.ephyra.answerselection.filters.WebTermImportanceFilter
          extended by info.ephyra.answerselection.filters.WikipediaGoogleTermImportanceFilter

public class WikipediaGoogleTermImportanceFilter
extends WebTermImportanceFilter

A web term importance filter that counts term frequencies in a Wikipedia article on the target of the question. If no Wikipedia article is found, text snippets retrieved with the Google search engine are used instead.

This class extends the class WebTermImportanceFilter.
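
The fallback described above can be illustrated with a minimal sketch. It is not the class's actual implementation: the class name FallbackTermCountSketch, the two lookup functions, and the simple tokenization are assumptions made for illustration, and no real Wikipedia or Google call is made.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch: count terms in a primary text source and fall back
// to search snippets when the primary source has nothing for the target.
class FallbackTermCountSketch {

    // Counts lower-cased word tokens over all given texts.
    static Map<String, Integer> countTerms(List<String> texts) {
        Map<String, Integer> counts = new HashMap<>();
        for (String text : texts) {
            for (String token : text.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    // articleLookup stands in for a Wikipedia lookup (assumption),
    // snippetSearch stands in for a web search (assumption).
    static Map<String, Integer> termCounts(
            String target,
            Function<String, Optional<String>> articleLookup,
            Function<String, List<String>> snippetSearch) {
        Optional<String> article = articleLookup.apply(target);
        if (article.isPresent()) {
            return countTerms(Collections.singletonList(article.get()));
        }
        // No article found: use the search snippets instead.
        return countTerms(snippetSearch.apply(target));
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = termCounts(
                "Mars",
                t -> Optional.empty(),
                t -> Arrays.asList("Mars is the red planet", "The planet Mars"));
        System.out.println(counts);
    }
}
```

In this sketch the snippet source is only consulted when the article lookup returns an empty Optional, which mirrors the "if no Wikipedia article is found" condition in the description.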

Version:
2008-02-15
Author:
Guido Sautter

Nested Class Summary
 
Nested classes/interfaces inherited from class info.ephyra.answerselection.filters.WebTermImportanceFilter
WebTermImportanceFilter.TermCounter
 
Field Summary
private static java.lang.String GOOGLE_KEY
          Google license key.
private static int MAX_RESULTS_PERQUERY
          Maximum number of search results per query.
private static int MAX_RESULTS_TOTAL
          Maximum total number of search results.
private static java.util.HashMap<java.lang.String,WebTermImportanceFilter.TermCounter> missPageTermCounters
           
private static int RETRIES
          Number of retries if search fails.
 
Fields inherited from class info.ephyra.answerselection.filters.WebTermImportanceFilter
event, LINEAR_LENGTH_NORMALIZATION, location, LOG_10_LENGTH_NORMALIZATION, LOG_LENGTH_NORMALIZATION, NO_NORMALIZATION, organization, person, SQUARE_ROOT_LENGTH_NORMALIZATION, TEST_TARGET_GENERATION
 
Constructor Summary
WikipediaGoogleTermImportanceFilter(int normalizationMode, int tfNormalizationMode, boolean isCombined)
           
 
Method Summary
private  java.util.HashMap<java.lang.String,WebTermImportanceFilter.TermCounter> getGoogleTermCounters(java.lang.String target)
           
 java.util.HashMap<java.lang.String,WebTermImportanceFilter.TermCounter> getTermCounters(java.lang.String[] targets)
          Fetches the term frequencies in the top X result snippets of a web search for the given targets.
private  void initMissTerms()
           
static void main(java.lang.String[] args)
           
 
Methods inherited from class info.ephyra.answerselection.filters.WebTermImportanceFilter
addTermCounters, apply, getCountSum, getMaxCount, getTargets, sumDiff
 
Methods inherited from class info.ephyra.answerselection.filters.Filter
apply
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

GOOGLE_KEY

private static final java.lang.String GOOGLE_KEY
Google license key.

See Also:
Constant Field Values

MAX_RESULTS_TOTAL

private static final int MAX_RESULTS_TOTAL
Maximum total number of search results.

See Also:
Constant Field Values

MAX_RESULTS_PERQUERY

private static final int MAX_RESULTS_PERQUERY
Maximum number of search results per query.

See Also:
Constant Field Values

RETRIES

private static final int RETRIES
Number of retries if search fails.

See Also:
Constant Field Values
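
How the three limits above might interact can be sketched as follows. This is an assumption-laden illustration, not the class's code: the constant values, the SearchBackend interface, and the reading of RETRIES as "retries after the first attempt" are all hypothetical, since this page documents neither the values nor the retrieval loop.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of paged retrieval with a per-query retry limit.
class PagedSearchSketch {
    static final int MAX_RESULTS_TOTAL = 100;   // assumed value
    static final int MAX_RESULTS_PERQUERY = 10; // assumed value
    static final int RETRIES = 3;               // assumed value

    // Stand-in for a search engine client (assumption, not a real API).
    interface SearchBackend {
        List<String> search(String query, int start, int maxResults) throws Exception;
    }

    static List<String> fetchSnippets(String query, SearchBackend backend) {
        List<String> snippets = new ArrayList<>();
        for (int start = 0; start < MAX_RESULTS_TOTAL; start += MAX_RESULTS_PERQUERY) {
            int pageSize = Math.min(MAX_RESULTS_PERQUERY, MAX_RESULTS_TOTAL - start);
            List<String> page = null;
            // One initial attempt plus up to RETRIES further attempts.
            for (int attempt = 0; attempt <= RETRIES && page == null; attempt++) {
                try {
                    page = backend.search(query, start, pageSize);
                } catch (Exception e) {
                    // Search failed; the loop tries again until attempts run out.
                }
            }
            if (page == null || page.isEmpty()) {
                break; // give up on failure, or stop when no more results exist
            }
            snippets.addAll(page);
        }
        return snippets;
    }

    public static void main(String[] args) {
        List<String> result = fetchSnippets("Mars", (q, start, max) -> {
            List<String> page = new ArrayList<>();
            for (int i = start; i < Math.min(start + max, 25); i++) {
                page.add(q + " snippet " + i);
            }
            return page;
        });
        System.out.println(result.size());
    }
}
```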

missPageTermCounters

private static java.util.HashMap<java.lang.String,WebTermImportanceFilter.TermCounter> missPageTermCounters

Constructor Detail

WikipediaGoogleTermImportanceFilter

public WikipediaGoogleTermImportanceFilter(int normalizationMode,
                                           int tfNormalizationMode,
                                           boolean isCombined)
Parameters:
normalizationMode -
tfNormalizationMode -
isCombined -

Method Detail

initMissTerms

private void initMissTerms()

getTermCounters

public java.util.HashMap<java.lang.String,WebTermImportanceFilter.TermCounter> getTermCounters(java.lang.String[] targets)
Description copied from class: WebTermImportanceFilter
Fetches the term frequencies in the top X result snippets of a web search for the given targets.

Specified by:
getTermCounters in class WebTermImportanceFilter
Parameters:
targets - an array of strings containing the targets
Returns:
a HashMap mapping the terms in the web search results to their frequency in the snippets
See Also:
WebTermImportanceFilter.getTermCounters(java.lang.String[])
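
The shape of the returned map, one counter per term accumulated over all targets, can be sketched as below. The Counter class is a hypothetical stand-in: the members of WebTermImportanceFilter.TermCounter are not shown on this page, and the snippet map replaces any real web search.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of building a term -> counter map for several targets.
class TermCounterSketch {

    // Minimal mutable counter (assumption; not the real TermCounter API).
    static class Counter {
        private int value = 0;
        void increment() { value++; }
        int getValue() { return value; }
    }

    // snippetsByTarget stands in for the search results of each target.
    static HashMap<String, Counter> countTerms(
            String[] targets, Map<String, String[]> snippetsByTarget) {
        HashMap<String, Counter> counters = new HashMap<>();
        for (String target : targets) {
            String[] snippets = snippetsByTarget.getOrDefault(target, new String[0]);
            for (String snippet : snippets) {
                for (String term : snippet.toLowerCase().split("\\W+")) {
                    if (term.isEmpty()) {
                        continue;
                    }
                    // Create the counter on first sight of a term, then bump it.
                    counters.computeIfAbsent(term, k -> new Counter()).increment();
                }
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        Map<String, String[]> snippets = new HashMap<>();
        snippets.put("Mars", new String[] {"Mars is a red planet"});
        snippets.put("Venus", new String[] {"Venus is a hot planet"});
        HashMap<String, Counter> counters =
                countTerms(new String[] {"Mars", "Venus"}, snippets);
        System.out.println(counters.get("planet").getValue());
    }
}
```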

getGoogleTermCounters

private java.util.HashMap<java.lang.String,WebTermImportanceFilter.TermCounter> getGoogleTermCounters(java.lang.String target)

main

public static void main(java.lang.String[] args)