java.lang.Object jangada.SigFilePredictor
Signature file extraction algorithm. It follows the description in "Learning to Extract Signature and Reply Lines from Email", V.R. Carvalho and W.W. Cohen, CEAS (Conference on Email and Anti-Spam), 2004.
Nested Class Summary | |
static class |
SigFilePredictor.WindowRepresentation
Nested class that represents the message as a sequence of features, using window features (features of neighboring lines). |
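The window idea can be illustrated with a minimal sketch: each line is described by its own features plus the features of its neighbors, tagged with their relative offset. Everything below is hypothetical and not taken from `SigFilePredictor.WindowRepresentation`: the class name `WindowFeatureSketch`, the method names, and the small feature set (`blank`, `sigMarker`, `hasAt`, `phone`) are assumptions chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual SigFilePredictor.WindowRepresentation code.
final class WindowFeatureSketch {

    // Assumed per-line features; the real feature set is not listed on this page.
    static List<String> lineFeatures(String line) {
        List<String> feats = new ArrayList<>();
        String t = line.trim();
        if (t.isEmpty()) feats.add("blank");
        if (t.startsWith("--")) feats.add("sigMarker");
        if (t.contains("@")) feats.add("hasAt");
        if (t.matches(".*\\d{3}[-. ]\\d{4}.*")) feats.add("phone");
        return feats;
    }

    // For each line, combine its own features with those of the lines within
    // `window` positions before and after it, tagged with the relative offset.
    static List<List<String>> windowFeatures(String[] lines, int window) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < lines.length; i++) {
            List<String> feats = new ArrayList<>();
            for (int off = -window; off <= window; off++) {
                int j = i + off;
                if (j < 0 || j >= lines.length) continue; // no neighbor at this offset
                String prefix;
                if (off == 0) prefix = "cur:";
                else if (off < 0) prefix = "prev" + (-off) + ":";
                else prefix = "next" + off + ":";
                for (String f : lineFeatures(lines[j])) feats.add(prefix + f);
            }
            out.add(feats);
        }
        return out;
    }

    public static void main(String[] args) {
        String[] lines = {"See you tomorrow.", "--", "Jane Doe", "jane@example.org"};
        List<List<String>> feats = windowFeatures(lines, 1);
        for (int i = 0; i < lines.length; i++) {
            System.out.println(lines[i] + " -> " + feats.get(i));
        }
    }
}
```

With a window of 1, the line "Jane Doe" has no features of its own in this sketch, but it inherits `prev1:sigMarker` and `next1:hasAt` from its neighbors, which is the kind of context a per-line classifier could use.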
Field Summary | |
int |
CURRENT_VERSION_NUMBER
|
static long |
serialVersionUID
|
Constructor Summary | |
SigFilePredictor()
|
Method Summary | |
static void |
createModel(java.lang.String[] args,
java.lang.String linetag)
|
java.util.ArrayList |
DetectAndPredict(java.lang.String wholeMessage)
Detects whether the email message contains a signature and predicts (extracts) the signature lines. |
static boolean |
detectFromName(java.lang.String tmp,
java.lang.String testLine)
From Line feature function: extracts a "name" from the fromLine of an email message and attempts to match any of its components with the words in the target line. In other words, it returns true if a piece of the sender's name is detected in this line. |
java.lang.String |
getMsgWithoutSignatureLines(java.lang.String doc)
Returns the original message without the signature lines. |
java.lang.String |
getSignatureLines(java.lang.String doc)
Returns the signature file lines (usually the last lines of the message), if any signature is found. |
static void |
main(java.lang.String[] args)
|
java.util.ArrayList |
Predict(java.lang.String wholeMessage)
Predicts the signature file lines in the email message. |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
public static final long serialVersionUID
public final int CURRENT_VERSION_NUMBER
Constructor Detail |
public SigFilePredictor()
Method Detail |
public java.util.ArrayList Predict(java.lang.String wholeMessage)
public java.util.ArrayList DetectAndPredict(java.lang.String wholeMessage)
public java.lang.String getMsgWithoutSignatureLines(java.lang.String doc)
public java.lang.String getSignatureLines(java.lang.String doc)
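The relationship between `getSignatureLines` and `getMsgWithoutSignatureLines` can be sketched as two complementary views of one per-line prediction. The code below does not call the jangada library: `SignatureSplitSketch` and its methods are hypothetical names, and the predictor is a trivial stand-in (everything from the last `--` line onward) rather than the learned model described in the paper. How the real methods join lines or handle a missing signature is not stated on this page, so those details are assumptions too.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; the real class uses a trained model, not this rule.
final class SignatureSplitSketch {

    // Stand-in predictor: flags every line from the last "--" line onward.
    static Set<Integer> predictSignatureLineIndexes(String[] lines) {
        Set<Integer> idx = new TreeSet<>();
        int start = -1;
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].trim().equals("--")) start = i;
        }
        if (start >= 0) {
            for (int i = start; i < lines.length; i++) idx.add(i);
        }
        return idx;
    }

    // Lines predicted to belong to the signature (empty string if none).
    static String signatureLines(String doc) {
        return select(doc, true);
    }

    // The message with the predicted signature lines removed.
    static String withoutSignatureLines(String doc) {
        return select(doc, false);
    }

    private static String select(String doc, boolean wantSignature) {
        String[] lines = doc.split("\n", -1);
        Set<Integer> sig = predictSignatureLineIndexes(lines);
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < lines.length; i++) {
            if (sig.contains(i) == wantSignature) kept.add(lines[i]);
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        String doc = "Hi Bob,\nSee you tomorrow.\n--\nJane Doe\njane@example.org";
        System.out.println("[signature]\n" + signatureLines(doc));
        System.out.println("[body]\n" + withoutSignatureLines(doc));
    }
}
```

Deriving both results from the same set of predicted line indexes keeps them consistent: every line of the input appears in exactly one of the two outputs.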
public static boolean detectFromName(java.lang.String tmp, java.lang.String testLine)
Parameters:
testLine - in String format
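A minimal sketch of the From Line feature described for `detectFromName`, written without the jangada library. It assumes the sender line has the form `From: First Last <user@host>` and that matching is case-insensitive on whole words; the class `FromNameSketch`, its method names, and the three-letter minimum for name components are all hypothetical choices, not the library's documented behavior.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch; the real detectFromName may parse and match differently.
final class FromNameSketch {

    // Splits the display-name part of a From line into lowercase components.
    // Assumes the form "From: First Last <user@host>".
    static Set<String> nameParts(String fromLine) {
        String s = fromLine.replaceFirst("(?i)^\\s*from:\\s*", "");
        s = s.replaceAll("<[^>]*>", " "); // drop the <address> part
        Set<String> parts = new HashSet<>();
        for (String tok : s.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (tok.length() >= 3) parts.add(tok); // skip initials and short noise
        }
        return parts;
    }

    // True if any word of testLine equals a component of the sender's name.
    static boolean matchesSenderName(String fromLine, String testLine) {
        Set<String> parts = nameParts(fromLine);
        for (String word : testLine.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (parts.contains(word)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        String from = "From: Jane Doe <jane.doe@example.org>";
        System.out.println(matchesSenderName(from, "-- Jane"));
        System.out.println(matchesSenderName(from, "Thanks for the update."));
    }
}
```

A line that repeats part of the sender's name is a plausible signal for a signature line, which is why a feature of this kind is useful as one input to the classifier rather than as a decision rule on its own.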
public static void createModel(java.lang.String[] args, java.lang.String linetag) throws java.io.IOException
Throws:
java.io.IOException
public static void main(java.lang.String[] args)