#include <Parser.hpp>
Public Methods | |
Parser () | |
virtual | ~Parser () |
virtual void | parse (const string &filename) |
virtual void | parseFile (const string &filename)=0 |
Parse a file. | |
virtual void | parseBuffer (char *buf, int len)=0 |
Parse a buffer. | |
virtual void | setAcroList (const WordSet *acronyms) |
virtual void | setAcroList (string filename) |
Set the acronym list from this file. | |
virtual long | fileTell ()=0 |
Return the current byte position of the file being parsed. | |
virtual long | getDocBytePos () |
Return the byte position at the beginning of the current document. | |
Static Public Attributes | |
const string | category = "Parser" |
const string | identifier = "parser" |
Protected Methods | |
bool | isAcronym (const char *word) |
void | clearAcros () |
Clears the internal acronym list. | |
Protected Attributes | |
long | docpos |
void clearAcros ()
Clears the internal acronym list.
virtual long fileTell ()=0
Return the current byte position of the file being parsed.
Implemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.
virtual long getDocBytePos ()
Return the byte position at the beginning of the current document.
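The difference between the two positions can be illustrated with a small, self-contained sketch. It does not use the Lemur classes: `DocPosTracker`, the `<DOC>` marker, and the in-memory buffer are assumptions made for illustration only. A concrete parser would presumably record its `docpos` from whatever its own `fileTell()` reports when a document starts.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in, not part of the Parser API: scans an in-memory
// buffer and remembers where the current document began.
class DocPosTracker {
public:
  explicit DocPosTracker(const std::string &text) : text_(text) {}

  // Advance to the next "<DOC>" marker (an assumed document delimiter).
  // Returns false when no further document is found.
  bool nextDocument() {
    const std::size_t found = text_.find("<DOC>", cursor_);
    if (found == std::string::npos) return false;
    docpos_ = static_cast<long>(found);  // start of this document
    cursor_ = found + 5;                 // continue just past the marker
    return true;
  }

  // Analogue of fileTell(): current read position in the input.
  long fileTell() const { return static_cast<long>(cursor_); }

  // Analogue of getDocBytePos(): start of the current document.
  long getDocBytePos() const { return docpos_; }

private:
  std::string text_;
  std::size_t cursor_ = 0;
  long docpos_ = 0;
};
```

In this sketch the read position keeps moving as input is consumed, while the document position changes only when a new document begins.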
bool isAcronym (const char *word)
Checks to see if the word is in the acronym list. Returns false if the list is not set.
virtual void parse (const string &filename)
Parse a file. Use parseFile instead; this method will be deprecated in a future release.
virtual void parseBuffer (char *buf, int len)=0
Parse a buffer.
Implemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.
virtual void parseFile (const string &filename)=0
Parse a file.
Implemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.
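As a rough illustration of what a concrete parser has to supply, the sketch below implements the two pure virtual parsing methods on a reduced stand-in interface. `ParserSketch` and `WhitespaceParser` are hypothetical names, not Lemur classes, and the forwarding from `parse` to `parseFile` is an assumption based on the deprecation note, not something this page confirms.

```cpp
#include <cctype>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Stand-in for the abstract interface, reduced to the parsing members
// listed on this page. It is NOT the Lemur class.
class ParserSketch {
public:
  virtual ~ParserSketch() {}
  // Assumed behavior: the legacy entry point simply forwards to parseFile.
  virtual void parse(const std::string &filename) { parseFile(filename); }
  virtual void parseFile(const std::string &filename) = 0;
  virtual void parseBuffer(char *buf, int len) = 0;
};

// Hypothetical concrete parser: splits its input on whitespace.
class WhitespaceParser : public ParserSketch {
public:
  void parseFile(const std::string &filename) override {
    std::ifstream in(filename.c_str(), std::ios::binary);
    if (!in) return;  // unreadable file: nothing to parse
    std::vector<char> data((std::istreambuf_iterator<char>(in)),
                           std::istreambuf_iterator<char>());
    if (!data.empty())
      parseBuffer(data.data(), static_cast<int>(data.size()));
  }

  void parseBuffer(char *buf, int len) override {
    std::string token;
    for (int i = 0; i < len; ++i) {
      if (std::isspace(static_cast<unsigned char>(buf[i]))) {
        if (!token.empty()) tokens.push_back(token);
        token.clear();
      } else {
        token += buf[i];
      }
    }
    if (!token.empty()) tokens.push_back(token);  // flush the last token
  }

  std::vector<std::string> tokens;  // collected tokens, in input order
};
```

Routing `parseFile` through `parseBuffer` keeps the tokenizing logic in one place. Whether the listed Lemur parsers are organized this way is not stated here.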
virtual void setAcroList (string filename)
Set the acronym list from this file.
virtual void setAcroList (const WordSet *acronyms)
Set the acronym list. The implementation can be empty if the parser is not designed to handle acronyms through a list. The WordSet still belongs to the caller.
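The ownership rule and the "false if the list is not set" behavior of isAcronym can be sketched as follows. This is a minimal illustration under stated assumptions: `WordSetSketch` (a `std::set`) stands in for the real WordSet, `AcronymHolder` is a hypothetical class, and its methods are public only so the sketch can be exercised directly, whereas isAcronym and clearAcros are protected in Parser.

```cpp
#include <set>
#include <string>

// Stand-in for WordSet; the real class is not reproduced here.
typedef std::set<std::string> WordSetSketch;

// Hypothetical holder showing the non-owning pointer convention.
class AcronymHolder {
public:
  AcronymHolder() : acros_(nullptr) {}

  // Stores the pointer only. The caller keeps ownership and must keep
  // the set alive for as long as it is installed here.
  void setAcroList(const WordSetSketch *acronyms) { acros_ = acronyms; }

  // Returns false when no list is set, mirroring the documented behavior.
  bool isAcronym(const char *word) const {
    if (acros_ == nullptr || word == nullptr) return false;
    return acros_->count(word) > 0;
  }

  // Forgets the list without deleting it (assumed for a caller-owned set).
  void clearAcros() { acros_ = nullptr; }

private:
  const WordSetSketch *acros_;  // non-owning
};
```

Because the holder never deletes the set, the caller can reuse the same list across several parsers and release it once none of them refers to it.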
const string category = "Parser"
Reimplemented from TextHandler.
long docpos
const string identifier = "parser"
Reimplemented from TextHandler. Reimplemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.