#include <Parser.hpp>

Public Methods

| Type | Member | Description |
|---|---|---|
| | Parser () | |
| virtual | ~Parser () | |
| virtual void | parse (const string &filename) | |
| virtual void | parseFile (const string &filename)=0 | Parse a file. |
| virtual void | parseBuffer (char *buf, int len)=0 | Parse a buffer. |
| virtual void | setAcroList (const WordSet *acronyms) | |
| virtual void | setAcroList (string filename) | Set the acronym list from this file. |
| virtual long | fileTell ()=0 | Return the current byte position of the file being parsed. |
| virtual long | getDocBytePos () | Return the byte position at the beginning of the current document. |

Static Public Attributes

| Type | Member | Description |
|---|---|---|
| const string | category = "Parser" | |
| const string | identifier = "parser" | |

Protected Methods

| Type | Member | Description |
|---|---|---|
| bool | isAcronym (const char *word) | |
| void | clearAcros () | Clear the internal acronym list. |

Protected Attributes

| Type | Member | Description |
|---|---|---|
| long | docpos | |

void Parser::clearAcros () [protected]

Clears the internal acronym list.

virtual long Parser::fileTell () [pure virtual]

Returns the current byte position of the file being parsed.

Implemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.

virtual long Parser::getDocBytePos () [virtual]

Returns the byte position at the beginning of the current document.

bool Parser::isAcronym (const char *word) [protected]

Checks whether the word is in the acronym list. Returns false if the list is not set.
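
The lookup contract described for isAcronym can be sketched as follows. This is a minimal illustration, not the toolkit's implementation: `WordSetSketch` and `AcronymCheckSketch` are hypothetical stand-ins, the check is made public and const here only so it can be exercised directly, and a case-sensitive match is an assumption.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for WordSet; only membership lookup is assumed.
typedef std::set<std::string> WordSetSketch;

// Hypothetical fragment showing only the acronym check.
class AcronymCheckSketch {
public:
  AcronymCheckSketch() : acros_(nullptr) {}

  void setAcroList(const WordSetSketch *acronyms) { acros_ = acronyms; }

  // Mirrors the documented contract: false when no list has been set.
  bool isAcronym(const char *word) const {
    if (acros_ == nullptr) return false;
    return acros_->count(word) > 0;  // exact, case-sensitive match (assumption)
  }

private:
  const WordSetSketch *acros_;  // borrowed pointer, never deleted here
};

// Exercises both branches of the check.
inline void acronymCheckDemo() {
  AcronymCheckSketch parser;
  assert(!parser.isAcronym("USA"));  // list not set yet

  WordSetSketch acronyms;
  acronyms.insert("USA");
  parser.setAcroList(&acronyms);
  assert(parser.isAcronym("USA"));
  assert(!parser.isAcronym("cat"));
}
```

Returning false for an unset list lets tokenizing code call the check unconditionally, without first testing whether an acronym list was supplied.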

virtual void Parser::parse (const string &filename) [virtual]

Parses a file. Use parseFile instead; this method will be deprecated in the future.

virtual void Parser::parseBuffer (char *buf, int len) [pure virtual]

Parses a buffer.

Implemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.

virtual void Parser::parseFile (const string &filename) [pure virtual]

Parses a file.

Implemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.
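
Because parseFile, parseBuffer, and fileTell are pure virtual, a concrete parser has to supply all three. The sketch below shows one way such a subclass might look. `ParserSketch` and `LineParserSketch` are hypothetical stand-ins rather than toolkit classes, treating each line as one document is an arbitrary choice for illustration, and having getDocBytePos return `docpos` is an assumption based on the member names.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for the abstract interface described on this page.
class ParserSketch {
public:
  ParserSketch() : docpos(0) {}
  virtual ~ParserSketch() {}
  virtual void parseFile(const std::string &filename) = 0;
  virtual void parseBuffer(char *buf, int len) = 0;
  virtual long fileTell() = 0;
  virtual long getDocBytePos() { return docpos; }  // assumption: returns docpos

protected:
  long docpos;  // byte offset where the current document starts
};

// Hypothetical subclass: every newline-terminated line is one document.
class LineParserSketch : public ParserSketch {
public:
  LineParserSketch() : pos_(0) {}

  void parseFile(const std::string &filename) override {
    // Read the whole file into memory, then reuse the buffer parser.
    std::ifstream in(filename.c_str(), std::ios::binary);
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    parseBuffer(data.empty() ? nullptr : &data[0],
                static_cast<int>(data.size()));
  }

  void parseBuffer(char *buf, int len) override {
    assert(buf != nullptr || len == 0);
    pos_ = 0;
    docpos = 0;
    docStarts_.clear();
    while (pos_ < len) {
      docpos = pos_;  // remember where this document begins
      docStarts_.push_back(docpos);
      while (pos_ < len && buf[pos_] != '\n') ++pos_;
      if (pos_ < len) ++pos_;  // step over the newline
    }
  }

  long fileTell() override { return pos_; }

  const std::vector<long> &docStarts() const { return docStarts_; }

private:
  long pos_;                     // current byte position in the input
  std::vector<long> docStarts_;  // start offset of every document seen
};
```

In this sketch, after parsing the 13-byte buffer `"one\ntwo\nthree"`, fileTell reports 13 (end of input) while getDocBytePos reports 8, the offset at which the last document began.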

virtual void Parser::setAcroList (string filename) [virtual]

Sets the acronym list from this file.

virtual void Parser::setAcroList (const WordSet *acronyms) [virtual]

Sets the acronym list. The implementation can be empty if the parser is not designed to handle acronyms through a list. The WordSet still belongs to the caller.
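
The two points above, that the caller keeps ownership of the list and that a parser may ignore it, can be illustrated with a short sketch. All names here (`WordSetSketch`, `AcroParserBase`, `NoAcroParser`, `hasAcroList`) are hypothetical and exist only for this example; the actual classes may store the list differently.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for WordSet.
typedef std::set<std::string> WordSetSketch;

// Hypothetical base: keeps a borrowed pointer to the caller's list.
class AcroParserBase {
public:
  AcroParserBase() : acros_(nullptr) {}
  virtual ~AcroParserBase() {}  // deliberately does not delete acros_
  virtual void setAcroList(const WordSetSketch *acronyms) { acros_ = acronyms; }
  bool hasAcroList() const { return acros_ != nullptr; }

protected:
  const WordSetSketch *acros_;  // owned by the caller
};

// A parser that does not use an acronym list can override with an empty body.
class NoAcroParser : public AcroParserBase {
public:
  void setAcroList(const WordSetSketch *) override {}
};

// The list outlives the parser and is still usable afterwards.
inline void acroOwnershipDemo() {
  WordSetSketch acronyms;
  acronyms.insert("NASA");
  {
    AcroParserBase parser;
    parser.setAcroList(&acronyms);
    assert(parser.hasAcroList());
  }  // parser destroyed here; acronyms is untouched
  assert(acronyms.count("NASA") == 1);
}
```

Under this reading of the ownership rule, the caller must keep the list alive for as long as the parser may consult it, and remains responsible for destroying it.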

const string Parser::category = "Parser" [static]

Reimplemented from TextHandler.

long Parser::docpos [protected]

const string Parser::identifier = "parser" [static]

Reimplemented from TextHandler. Reimplemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.