info.ephyra.nlp
Class StanfordParser

java.lang.Object
  extended by info.ephyra.nlp.StanfordParser

public class StanfordParser
extends java.lang.Object

Wrapper for the Stanford parser.

Version:
2007-10-30
Author:
Justin Betteridge, Nico Schlaefer

Nested Class Summary
protected static class StanfordParser.MutableInteger
           
 
Field Summary
static java.lang.String BEGIN_KEY
           
protected static java.util.regex.Pattern bracket_label_pattern
           
protected static java.util.regex.Pattern double_quote_lable_pattern
           
static java.lang.String END_KEY
           
protected static java.util.regex.Pattern escaped_char_pattern
           
protected static org.apache.log4j.Logger log
           
protected static edu.stanford.nlp.parser.lexparser.LexicalizedParser parser
           
protected static edu.stanford.nlp.trees.TreebankLanguagePack tlp
           
protected static java.util.regex.Pattern whitespace_pattern
           
 
Constructor Summary
protected StanfordParser()
          Hides the default constructor.
 
Method Summary
protected static java.util.List<edu.cmu.lti.javelin.util.RangeMap> createMapping(java.lang.String sentence)
           
static void destroy()
          Unloads static resources.
static double getPCFGScore(java.lang.String sentence)
          Parses a sentence and returns the PCFG score as a confidence measure.
static void initialize()
          Initializes static resources.
static void main(java.lang.String[] args)
           
protected static void mapOffsets(edu.stanford.nlp.trees.Tree tree, java.util.List<edu.cmu.lti.javelin.util.RangeMap> mapping)
          Maps Tree node offsets using provided mapping.
static java.lang.String parse(java.lang.String sentence)
          Parses a sentence and returns a string representation of the parse tree.
protected static void updateTreeLabels(edu.stanford.nlp.trees.Tree root, edu.stanford.nlp.trees.Tree tree, StanfordParser.MutableInteger offset, StanfordParser.MutableInteger leafIndex)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final org.apache.log4j.Logger log

whitespace_pattern

protected static final java.util.regex.Pattern whitespace_pattern

escaped_char_pattern

protected static final java.util.regex.Pattern escaped_char_pattern

double_quote_lable_pattern

protected static final java.util.regex.Pattern double_quote_lable_pattern

bracket_label_pattern

protected static final java.util.regex.Pattern bracket_label_pattern

BEGIN_KEY

public static final java.lang.String BEGIN_KEY
See Also:
Constant Field Values

END_KEY

public static final java.lang.String END_KEY
See Also:
Constant Field Values

tlp

protected static edu.stanford.nlp.trees.TreebankLanguagePack tlp

parser

protected static edu.stanford.nlp.parser.lexparser.LexicalizedParser parser

Constructor Detail

StanfordParser

protected StanfordParser()
Hides the default constructor.

Method Detail

initialize

public static void initialize()
                       throws java.lang.Exception
Initializes static resources.

Throws:
java.lang.Exception

destroy

public static void destroy()
                    throws java.lang.Exception
Unloads static resources.

Throws:
java.lang.Exception

parse

public static java.lang.String parse(java.lang.String sentence)
Parses a sentence and returns a string representation of the parse tree.

Parameters:
sentence - a sentence
Returns:
a string representation of the parse tree. The underlying Tree's Label is a MapLabel containing the correct begin and end character offsets under the keys BEGIN_KEY and END_KEY
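
Because initialize() and destroy() both declare java.lang.Exception, a caller will typically want to make sure destroy() still runs when parsing fails. The following is a minimal sketch of that pattern, not code from Ephyra: ParserLifecycle, Step and run are hypothetical names, and the main method uses stand-in steps so that it runs without the parser on the classpath.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper (not part of Ephyra): runs work between a setup and a cleanup step. */
final class ParserLifecycle {

    /** A no-argument action that may throw, matching the shape of initialize() and destroy(). */
    @FunctionalInterface
    interface Step {
        void run() throws Exception;
    }

    /**
     * Runs init, then the work, and always runs cleanup after the work,
     * even if the work throws. If init itself throws, cleanup is not run.
     */
    static <T> T run(Step init, Step cleanup, Callable<T> work) throws Exception {
        init.run();
        try {
            return work.call();
        } finally {
            cleanup.run();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in steps that only record the order in which they are called.
        StringBuilder trace = new StringBuilder();
        String result = run(
                () -> trace.append("init;"),
                () -> trace.append("destroy;"),
                () -> {
                    trace.append("parse;");
                    return "(ROOT (NP (NN test)))";
                });
        System.out.println(trace + " -> " + result);
    }
}
```

With Ephyra on the classpath, the stand-ins would presumably be replaced by method references to the members documented here, for example ParserLifecycle.run(StanfordParser::initialize, StanfordParser::destroy, () -> StanfordParser.parse("A sentence.")).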

getPCFGScore

public static double getPCFGScore(java.lang.String sentence)
Parses a sentence and returns the PCFG score as a confidence measure.

Parameters:
sentence - a sentence
Returns:
PCFG score
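
One way to use such a score as a confidence measure is to rank alternative sentences and keep the best-scoring one. The sketch below is illustrative only: ConfidenceRanking and mostConfident are hypothetical names, and it assumes that a higher score means a more confident parse, which this page does not state. The scorer in main is a stand-in so that the example runs without the parser.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

/** Hypothetical helper (not part of Ephyra): picks the best-scoring candidate sentence. */
final class ConfidenceRanking {

    /**
     * Returns the candidate with the highest score, or null for an empty list.
     * Assumption: higher scores indicate more confident parses.
     */
    static String mostConfident(List<String> candidates, ToDoubleFunction<String> scorer) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String candidate : candidates) {
            double score = scorer.applyAsDouble(candidate);
            if (best == null || score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Stand-in scorer: longer strings get lower scores.
        ToDoubleFunction<String> standIn = s -> -s.length();
        List<String> candidates = Arrays.asList("the cat sat on the mat", "the cat sat");
        System.out.println(mostConfident(candidates, standIn));
    }
}
```

Since getPCFGScore takes a String, returns a double and declares no checked exception, StanfordParser::getPCFGScore should fit the scorer parameter directly once initialize() has been called.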

updateTreeLabels

protected static void updateTreeLabels(edu.stanford.nlp.trees.Tree root,
                                       edu.stanford.nlp.trees.Tree tree,
                                       StanfordParser.MutableInteger offset,
                                       StanfordParser.MutableInteger leafIndex)

createMapping

protected static java.util.List<edu.cmu.lti.javelin.util.RangeMap> createMapping(java.lang.String sentence)
Parameters:
sentence - the input string for which the offset mapping is created
Returns:
a list of RangeMap objects that map character offsets in a whitespace-stripped version of the input string back to offsets in the original input string.
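
The idea of such a mapping can be sketched as follows. This is not the Ephyra implementation: the RangeMap API is not documented on this page, so the sketch uses a hypothetical Range class instead, and OffsetMappingSketch and toOriginal are likewise hypothetical names. Each run of non-whitespace characters in the input becomes one range.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; Range is a hypothetical stand-in for RangeMap. */
final class OffsetMappingSketch {

    /** Stripped offsets in [from, to) correspond to input offsets starting at origFrom. */
    static final class Range {
        final int from;
        final int to;
        final int origFrom;

        Range(int from, int to, int origFrom) {
            this.from = from;
            this.to = to;
            this.origFrom = origFrom;
        }
    }

    /** Builds one range per maximal run of non-whitespace characters in the input. */
    static List<Range> createMapping(String sentence) {
        List<Range> mapping = new ArrayList<>();
        int stripped = 0; // current offset in the whitespace-free string
        int i = 0;        // current offset in the input string
        while (i < sentence.length()) {
            if (Character.isWhitespace(sentence.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < sentence.length() && !Character.isWhitespace(sentence.charAt(i))) {
                i++;
            }
            int length = i - start;
            mapping.add(new Range(stripped, stripped + length, start));
            stripped += length;
        }
        return mapping;
    }

    /** Maps an offset in the whitespace-free string back to an offset in the input. */
    static int toOriginal(List<Range> mapping, int offset) {
        for (Range r : mapping) {
            if (offset >= r.from && offset < r.to) {
                return r.origFrom + (offset - r.from);
            }
        }
        throw new IllegalArgumentException("offset not covered by mapping: " + offset);
    }

    public static void main(String[] args) {
        // "The  cat sat" without whitespace is "Thecatsat".
        List<Range> mapping = createMapping("The  cat sat");
        System.out.println(toOriginal(mapping, 3)); // 'c' is at offset 5 in the input
    }
}
```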

mapOffsets

protected static void mapOffsets(edu.stanford.nlp.trees.Tree tree,
                                 java.util.List<edu.cmu.lti.javelin.util.RangeMap> mapping)
Maps Tree node offsets using provided mapping.

Parameters:
tree - the Tree whose begin and end extents should be mapped.
mapping - the list of RangeMap objects which defines the mapping.
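
Applying a mapping to a tree amounts to a recursive walk that rewrites each node's extents. The sketch below is an assumption-laden illustration, not the Ephyra code: Node and Range are hypothetical stand-ins for Tree and RangeMap, begin offsets are assumed to be inclusive and end offsets exclusive, and every node is assumed to cover at least one character.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch only; Node and Range are hypothetical stand-ins for Tree and RangeMap. */
final class MapOffsetsSketch {

    /** Stripped offsets in [from, to) correspond to input offsets starting at origFrom. */
    static final class Range {
        final int from;
        final int to;
        final int origFrom;

        Range(int from, int to, int origFrom) {
            this.from = from;
            this.to = to;
            this.origFrom = origFrom;
        }
    }

    /** A tree node with a begin (inclusive) and end (exclusive) character offset. */
    static final class Node {
        final String label;
        int begin;
        int end;
        final List<Node> children;

        Node(String label, int begin, int end, Node... children) {
            this.label = label;
            this.begin = begin;
            this.end = end;
            this.children = Arrays.asList(children);
        }
    }

    /** Maps a single offset through the first range that covers it. */
    static int toOriginal(List<Range> mapping, int offset) {
        for (Range r : mapping) {
            if (offset >= r.from && offset < r.to) {
                return r.origFrom + (offset - r.from);
            }
        }
        throw new IllegalArgumentException("offset not covered by mapping: " + offset);
    }

    /** Rewrites the extents of the node and all of its descendants in place. */
    static void mapOffsets(Node tree, List<Range> mapping) {
        // Map the last covered character, then add one to get the exclusive end.
        int lastChar = toOriginal(mapping, tree.end - 1);
        tree.begin = toOriginal(mapping, tree.begin);
        tree.end = lastChar + 1;
        for (Node child : tree.children) {
            mapOffsets(child, mapping);
        }
    }

    public static void main(String[] args) {
        // Mapping for the input "The  cat sat", whose whitespace-free form is "Thecatsat".
        List<Range> mapping = Arrays.asList(
                new Range(0, 3, 0), new Range(3, 6, 5), new Range(6, 9, 9));
        Node cat = new Node("cat", 3, 6);
        Node root = new Node("S", 0, 9, new Node("The", 0, 3), cat, new Node("sat", 6, 9));
        mapOffsets(root, mapping);
        System.out.println(cat.label + " [" + cat.begin + ", " + cat.end + ")");
    }
}
```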

main

public static void main(java.lang.String[] args)
                 throws java.lang.Exception
Throws:
java.lang.Exception