info.ephyra.answerselection
Class AnswerPattern

java.lang.Object
  extended by info.ephyra.answerselection.AnswerPattern
All Implemented Interfaces:
java.lang.Comparable<AnswerPattern>

public class AnswerPattern
extends java.lang.Object
implements java.lang.Comparable<AnswerPattern>

An AnswerPattern is applied to a sentence to extract a PROPERTY object of the type specified in the property field. The sentence must contain a TARGET tag that marks the object whose PROPERTY is to be extracted.

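This class implements the interface Comparable. Note: its natural ordering is inconsistent with equals(). compareTo orders patterns by their number of correct applications, whereas equals() compares the pattern descriptors, so two patterns with different descriptors can compare as equal.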

Version:
2008-01-29
Author:
Nico Schlaefer

Field Summary
private  int correct
          Counter for the number of correct applications of the pattern.
private  java.lang.String desc
          The pattern descriptor from which the pattern is built.
private  int distID
          ID of the group that covers the string between TARGET and PROPERTY.
private static int MAX_DIST
          Maximum distance between TARGET and PROPERTY in tokens.
private static int MAX_PROP
          Maximum length of a PROPERTY object in tokens.
private  java.util.regex.Pattern pattern
          The Pattern that is applied to a sentence.
private  java.lang.String property
          The type of PROPERTY that is extracted with this pattern.
private  int propertyID
          ID of the group that represents the PROPERTY to be extracted.
private  int wrong
          Counter for the number of wrong applications of the pattern.
 
Constructor Summary
AnswerPattern(java.lang.String expr, java.lang.String prop)
          Creates an AnswerPattern from a descriptor that is a regular expression but additionally contains the following tags:
          <TO> - exactly one TARGET tag
          <CO> - an arbitrary number of CONTEXT tags
          <PO(_NExyz)*> - exactly one PROPERTY tag, optionally combined with NE types
          <NExyz(_NExyz)*> - an arbitrary number of NE tags, which are combinations of one or more NE types
AnswerPattern(java.lang.String expr, java.lang.String prop, int correct, int wrong)
          Creates an AnswerPattern from a descriptor by applying the constructor AnswerPattern(String expr, String prop).
 
Method Summary
private  java.lang.String addDistGroup(java.lang.String expr)
          Adds a capturing group that covers the string between the TARGET and the PROPERTY and sets the distID field.
 java.lang.String[] apply(java.lang.String sentence)
          Applies the pattern to a sentence of space-delimited tokens containing a TARGET tag and optionally a number of CONTEXT and NE tags.
 int compareTo(AnswerPattern ap)
          Compares two AnswerPattern objects by comparing the number of correct applications.
 boolean equals(java.lang.Object o)
          Compares this object to another AnswerPattern.
 float getConfidence()
          Calculates a confidence measure for the pattern by applying the formula confidence = correct / (correct + wrong).
 int getCorrect()
          Returns the number of correct applications of the pattern.
 java.lang.String getDesc()
          Returns the pattern descriptor.
 java.lang.String getProperty()
          Returns the type of PROPERTY that is extracted with this pattern.
 java.lang.String[] getPropertyTypes()
          Returns the NE types that are allowed for a PROPERTY object to match the pattern.
 int getWrong()
          Returns the number of wrong applications of the pattern.
 int hashCode()
          The hashcode of an AnswerPattern is the hashcode of its descriptor.
 void incCorrect()
          Increments the number of correct applications by 1.
 void incWrong()
          Increments the number of wrong applications by 1.
private  java.lang.String optimizePattern(java.lang.String expr)
          Optimizes the pattern to improve its runtime performance.
private  java.lang.String replaceContextTags(java.lang.String expr)
          Replaces CONTEXT tags by regular expressions that match CONTEXT tags with tag IDs.
private  java.lang.String replaceNeTags(java.lang.String expr)
          Replaces NE tags by regular expressions that match NE tags with at least one of the NE types.
private  java.lang.String replacePropertyTag(java.lang.String expr)
          Sets the propertyID field and replaces the PROPERTY tag by a capturing group.
private  java.lang.String replaceTargetTag(java.lang.String expr)
          Replaces the TARGET tag by a regular expression that matches TARGET tags with tag IDs.
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MAX_DIST

private static final int MAX_DIST
Maximum distance between TARGET and PROPERTY in tokens.


MAX_PROP

private static final int MAX_PROP
Maximum length of a PROPERTY object in tokens.


desc

private java.lang.String desc
The pattern descriptor from which the pattern is built.


pattern

private java.util.regex.Pattern pattern
The Pattern that is applied to a sentence.


property

private java.lang.String property
The type of PROPERTY that is extracted with this pattern.


propertyID

private int propertyID
ID of the group that represents the PROPERTY to be extracted.


distID

private int distID
ID of the group that covers the string between TARGET and PROPERTY.


correct

private int correct
Counter for the number of correct applications of the pattern.


wrong

private int wrong
Counter for the number of wrong applications of the pattern.

Constructor Detail

AnswerPattern

public AnswerPattern(java.lang.String expr,
                     java.lang.String prop)
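Creates an AnswerPattern from a descriptor that is a regular expression but additionally contains the following tags:
<TO> - exactly one TARGET tag
<CO> - an arbitrary number of CONTEXT tags
<PO(_NExyz)*> - exactly one PROPERTY tag, optionally combined with NE types
<NExyz(_NExyz)*> - an arbitrary number of NE tags, which are combinations of one or more NE types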

Parameters:
expr - pattern descriptor
prop - PROPERTY that the pattern extracts
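
The tag rules for a descriptor can be illustrated with a small check. This is only a sketch and not the constructor's actual logic: the class DescriptorTagsSketch, its methods, the example descriptors and the assumption that NE type names consist of "NE" followed by letters are all hypothetical. It counts the TARGET and PROPERTY tags and reads the NE types attached to the PROPERTY tag.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that inspects the tags of a pattern descriptor. */
final class DescriptorTagsSketch {
    // Assumption: NE type names look like "NE" followed by letters, e.g. NEdate.
    private static final Pattern TARGET = Pattern.compile("<TO>");
    private static final Pattern PROPERTY = Pattern.compile("<PO((?:_NE[A-Za-z]+)*)>");

    /** Counts the non-overlapping occurrences of a tag in the descriptor. */
    static int count(Pattern tag, String expr) {
        Matcher m = tag.matcher(expr);
        int n = 0;
        while (m.find()) n++;
        return n;
    }

    /** True if the descriptor has exactly one TARGET and exactly one PROPERTY tag. */
    static boolean hasRequiredTags(String expr) {
        return count(TARGET, expr) == 1 && count(PROPERTY, expr) == 1;
    }

    /** NE types attached to the PROPERTY tag, or null if the tag names none. */
    static String[] propertyTypes(String expr) {
        Matcher m = PROPERTY.matcher(expr);
        if (!m.find() || m.group(1).isEmpty()) return null;
        // group(1) looks like "_NEdate_NEyear": drop the leading '_' and split.
        return m.group(1).substring(1).split("_");
    }

    public static void main(String[] args) {
        String desc = "<TO> was born in <PO_NEdate_NEyear>"; // made-up descriptor
        System.out.println(hasRequiredTags(desc));           // true
        System.out.println(String.join(",", propertyTypes(desc))); // NEdate,NEyear
    }
}
```

Returning null when the PROPERTY tag names no NE types mirrors the contract documented for getPropertyTypes().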

AnswerPattern

public AnswerPattern(java.lang.String expr,
                     java.lang.String prop,
                     int correct,
                     int wrong)

Creates an AnswerPattern from a descriptor by applying the constructor AnswerPattern(String expr, String prop).

In addition, it sets the counters for the number of correct/wrong applications of the pattern.

Parameters:
expr - pattern descriptor
prop - PROPERTY that the pattern extracts
correct - number of correct applications
wrong - number of wrong applications
Method Detail

addDistGroup

private java.lang.String addDistGroup(java.lang.String expr)
Adds a capturing group that covers the string between the TARGET and the PROPERTY and sets the distID field. The group is required to measure the distance between TARGET and PROPERTY in tokens.

Parameters:
expr - pattern descriptor
Returns:
descriptor with capturing group
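
How a capturing group between TARGET and PROPERTY might be used to measure the distance in tokens can be sketched as follows. The class DistGroupSketch, the example regular expression, the plain <TO> tag in the sentence and the maxDist parameter (a stand-in for MAX_DIST, whose value is not given here) are assumptions for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of measuring the TARGET-PROPERTY distance. */
final class DistGroupSketch {
    /** Number of space-delimited tokens in the captured span. */
    static int tokenCount(String span) {
        String trimmed = span.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    /** True if the first match keeps TARGET and PROPERTY within maxDist tokens. */
    static boolean withinDistance(Pattern pattern, int distID, String sentence, int maxDist) {
        Matcher m = pattern.matcher(sentence);
        return m.find() && tokenCount(m.group(distID)) <= maxDist;
    }

    public static void main(String[] args) {
        // Group 1 covers the text between TARGET and PROPERTY, group 2 is the PROPERTY.
        Pattern p = Pattern.compile("<TO>( .*? in )(\\S+)");
        System.out.println(tokenCount(" was born in "));                       // 3
        System.out.println(withinDistance(p, 1, "<TO> was born in Ulm", 5));   // true
        System.out.println(withinDistance(p, 1, "<TO> was born in Ulm", 2));   // false
    }
}
```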

replaceTargetTag

private java.lang.String replaceTargetTag(java.lang.String expr)
Replaces the TARGET tag by a regular expression that matches TARGET tags with tag IDs.

Parameters:
expr - pattern descriptor
Returns:
descriptor with a regular expression for TARGET tags

replaceContextTags

private java.lang.String replaceContextTags(java.lang.String expr)
Replaces CONTEXT tags by regular expressions that match CONTEXT tags with tag IDs.

Parameters:
expr - pattern descriptor
Returns:
descriptor with regular expressions for CONTEXT tags

replacePropertyTag

private java.lang.String replacePropertyTag(java.lang.String expr)
Sets the propertyID field and replaces the PROPERTY tag by a capturing group.

Parameters:
expr - pattern descriptor
Returns:
descriptor without PROPERTY tag

replaceNeTags

private java.lang.String replaceNeTags(java.lang.String expr)
Replaces NE tags by regular expressions that match NE tags with at least one of the NE types.

Parameters:
expr - pattern descriptor
Returns:
descriptor with regular expressions for NE tags

optimizePattern

private java.lang.String optimizePattern(java.lang.String expr)
Optimizes the pattern to improve its runtime performance.

Parameters:
expr - pattern descriptor
Returns:
optimized pattern

equals

public boolean equals(java.lang.Object o)
Compares this object to another AnswerPattern. Two AnswerPattern objects are equal, iff the pattern descriptors are equal.

Overrides:
equals in class java.lang.Object
Parameters:
o - the reference object with which to compare
Returns:
true, iff o is an AnswerPattern whose pattern descriptor equals the descriptor of this object

compareTo

public int compareTo(AnswerPattern ap)
Compares two AnswerPattern objects by comparing the number of correct applications.

Specified by:
compareTo in interface java.lang.Comparable<AnswerPattern>
Parameters:
ap - the AnswerPattern to be compared
Returns:
a negative integer, zero or a positive integer as this AnswerPattern is less than, equal to or greater than the specified AnswerPattern
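
Because equals() and hashCode() look at the descriptor while compareTo looks at the number of correct applications, sorted collections can behave unexpectedly. The following sketch uses a hypothetical stand-in class, PatternSketch, with made-up descriptors; the ascending order by correct applications is an assumption based on the return value described above.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in that mimics the documented equals/hashCode/compareTo contract. */
final class PatternSketch implements Comparable<PatternSketch> {
    final String desc;
    final int correct;

    PatternSketch(String desc, int correct) {
        this.desc = desc;
        this.correct = correct;
    }

    @Override
    public boolean equals(Object o) {
        // Equal iff the descriptors are equal.
        return o instanceof PatternSketch && desc.equals(((PatternSketch) o).desc);
    }

    @Override
    public int hashCode() {
        return desc.hashCode();
    }

    @Override
    public int compareTo(PatternSketch other) {
        // Ordered by the number of correct applications only.
        return Integer.compare(correct, other.correct);
    }

    public static void main(String[] args) {
        PatternSketch a = new PatternSketch("<TO> was born in <PO>", 5);
        PatternSketch b = new PatternSketch("<PO> , birthplace of <TO>", 5);
        System.out.println(a.equals(b));    // false
        System.out.println(a.compareTo(b)); // 0
        Set<PatternSketch> sorted = new TreeSet<>();
        sorted.add(a);
        sorted.add(b);
        // A TreeSet treats compareTo == 0 as a duplicate, so b is dropped.
        System.out.println(sorted.size());  // 1
    }
}
```

For this reason, sorting a List is a safer choice than a TreeSet or TreeMap when patterns with equal counts must all be kept.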

hashCode

public int hashCode()
The hashcode of an AnswerPattern is the hashcode of its descriptor.

Overrides:
hashCode in class java.lang.Object
Returns:
hashcode

getDesc

public java.lang.String getDesc()
Returns the pattern descriptor.

Returns:
pattern descriptor

getProperty

public java.lang.String getProperty()
Returns the type of PROPERTY that is extracted with this pattern.

Returns:
the PROPERTY

getCorrect

public int getCorrect()
Returns the number of correct applications of the pattern.

Returns:
number of correct applications

getWrong

public int getWrong()
Returns the number of wrong applications of the pattern.

Returns:
number of wrong applications

getConfidence

public float getConfidence()
Calculates a confidence measure for the pattern by applying the formula confidence = correct / (correct + wrong).

Returns:
confidence in the pattern
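
The formula can be sketched with a hypothetical stand-in class, ConfidenceSketch. Two points are assumptions rather than documented behaviour: the division is carried out in floating point (integer division would truncate the result to 0 or 1), and a pattern that has never been applied gets a confidence of 0.

```java
/** Hypothetical stand-in for the correct/wrong counters of a pattern. */
final class ConfidenceSketch {
    private int correct;
    private int wrong;

    ConfidenceSketch(int correct, int wrong) {
        this.correct = correct;
        this.wrong = wrong;
    }

    void incCorrect() { correct++; }

    void incWrong() { wrong++; }

    /** confidence = correct / (correct + wrong) */
    float getConfidence() {
        int total = correct + wrong;
        // Assumption: avoid dividing by zero for a pattern that was never applied.
        if (total == 0) return 0f;
        // The cast forces floating-point division.
        return (float) correct / total;
    }

    public static void main(String[] args) {
        ConfidenceSketch p = new ConfidenceSketch(3, 1);
        System.out.println(p.getConfidence()); // 0.75
        p.incWrong();
        System.out.println(p.getConfidence()); // 0.6
    }
}
```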

incCorrect

public void incCorrect()
Increments the number of correct applications by 1.


incWrong

public void incWrong()
Increments the number of wrong applications by 1.


getPropertyTypes

public java.lang.String[] getPropertyTypes()
Returns the NE types that are allowed for a PROPERTY object to match the pattern.

Returns:
NE types or null iff no specific types are expected

apply

public java.lang.String[] apply(java.lang.String sentence)
Applies the pattern to a sentence of space-delimited tokens containing a TARGET tag and optionally a number of CONTEXT and NE tags. For each match, a PROPERTY object is extracted.

Parameters:
sentence - a sentence
Returns:
array of PROPERTY objects or an empty array, if the sentence does not match the pattern
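
The extraction step can be pictured as collecting one capturing group per match. This is a minimal sketch under stated assumptions, not the class's implementation: ApplySketch, its extract method, the example regular expression and the plain <TO> tag in the sentence are hypothetical, and the real tags carry tag IDs whose format is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical illustration of extracting PROPERTY objects from a sentence. */
final class ApplySketch {
    /** Returns the PROPERTY group of every match, or an empty array if there is none. */
    static String[] extract(Pattern pattern, int propertyID, String sentence) {
        List<String> results = new ArrayList<>();
        Matcher m = pattern.matcher(sentence);
        while (m.find()) {
            results.add(m.group(propertyID));
        }
        return results.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Made-up compiled pattern: TARGET tag, fixed words, one token as PROPERTY.
        Pattern p = Pattern.compile("<TO> was born in (\\S+)");
        String[] found = extract(p, 1, "<TO> was born in Ulm .");
        System.out.println(found.length + " " + found[0]);        // 1 Ulm
        System.out.println(extract(p, 1, "no tags here").length); // 0
    }
}
```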