
Parser Class Reference

#include <Parser.hpp>

Inherits TextHandler.

Inherited by ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.

Public Methods

 Parser ()
virtual ~Parser ()
virtual void parse (const string &filename)
virtual void parseFile (const string &filename)=0
 Parse a file.

virtual void parseBuffer (char *buf, int len)=0
 Parse a buffer.

virtual void setAcroList (const WordSet *acronyms)
virtual void setAcroList (string filename)
 Set the acronym list from this file.

virtual long fileTell ()=0
 Return the current byte position of the file being parsed.

virtual long getDocBytePos ()
 Return the byte position at the beginning of the current document.


Static Public Attributes

const string category = "Parser"
const string identifier = "parser"

Protected Methods

bool isAcronym (const char *word)
void clearAcros ()
 Clear the internal acronym list.


Protected Attributes

long docpos

Detailed Description

Provides a generic parser interface. The interface assumes that a parser uses an acronym list. If the parser you are developing does not use one, provide an empty implementation of the setAcroList function.
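A minimal sketch of such a parser follows. It does not include Parser.hpp: ParserSketch is a hypothetical stand-in that mirrors the members listed on this page so the example compiles without Lemur, and WhitespaceParser, tokens, consumed, and tokenizeSketch are illustrative names only. That getDocBytePos simply returns docpos is an assumption.

```cpp
#include <cctype>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

class WordSet;  // opaque in this sketch; only pointers to it are passed

// Stand-in mirroring the members listed on this page so that the sketch
// compiles without Lemur. The real Parser also derives from TextHandler.
class ParserSketch {
public:
  ParserSketch() : docpos(0) {}
  virtual ~ParserSketch() {}
  virtual void parseFile(const std::string &filename) = 0;
  virtual void parseBuffer(char *buf, int len) = 0;
  virtual void setAcroList(const WordSet *) {}
  virtual void setAcroList(std::string) {}
  virtual long fileTell() = 0;
  // Assumption: the inline accessor simply returns docpos.
  virtual long getDocBytePos() { return docpos; }
protected:
  long docpos;
};

// A parser that splits its input on whitespace and ignores acronyms.
class WhitespaceParser : public ParserSketch {
public:
  WhitespaceParser() : consumed(0) {}

  // Empty implementations: this parser does not use an acronym list.
  virtual void setAcroList(const WordSet *) {}
  virtual void setAcroList(std::string) {}

  virtual void parseBuffer(char *buf, int len) {
    docpos = consumed;  // treat each buffer as one document
    std::string token;
    for (int i = 0; i < len; ++i) {
      if (std::isspace(static_cast<unsigned char>(buf[i]))) {
        if (!token.empty()) tokens.push_back(token);
        token.clear();
      } else {
        token += buf[i];
      }
    }
    if (!token.empty()) tokens.push_back(token);
    consumed += len;
  }

  virtual void parseFile(const std::string &filename) {
    // Read the whole file, then reuse parseBuffer.
    std::ifstream in(filename.c_str(), std::ios::binary);
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    if (!data.empty()) parseBuffer(&data[0], static_cast<int>(data.size()));
  }

  virtual long fileTell() { return consumed; }

  std::vector<std::string> tokens;  // hypothetical: collected for illustration only
private:
  long consumed;  // bytes handed to parseBuffer so far
};

// Convenience wrapper used to exercise the sketch.
std::vector<std::string> tokenizeSketch(std::string text) {
  WhitespaceParser parser;
  if (!text.empty()) parser.parseBuffer(&text[0], static_cast<int>(text.size()));
  return parser.tokens;
}
```

Both setAcroList overloads are overridden because declaring only one of them in the derived class would hide the other.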


Constructor & Destructor Documentation

Parser::Parser ()


Parser::~Parser () [virtual]
 


Member Function Documentation

void Parser::clearAcros () [protected]

Clears the internal acronym list.

virtual long Parser::fileTell () [pure virtual]

Returns the current byte position of the file being parsed.

Implemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.

virtual long Parser::getDocBytePos () [inline, virtual]

Returns the byte position at the beginning of the current document.
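Because fileTell and getDocBytePos both report absolute byte positions in the file being parsed, their difference gives the read position relative to the start of the current document. A minimal sketch, where offsetInDocument is a hypothetical helper that is not part of the Parser class:

```cpp
#include <cassert>

// Byte offset of the current read position relative to the start of the
// current document. Both arguments are absolute positions in the file,
// for example the values returned by fileTell() and getDocBytePos().
long offsetInDocument(long filePos, long docStart) {
  assert(filePos >= docStart);  // the read position cannot precede the document start
  return filePos - docStart;
}
```

For example, a document starting at byte 1024 with the read position at byte 1536 yields an offset of 512 bytes.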

bool Parser::isAcronym (const char *word) [protected]
 

Checks to see if the word is in the acronym list. Returns false if the list is not set.

virtual void Parser::parse (const string &filename) [inline, virtual]

Parse a file. Use parseFile instead; this method will be deprecated in the future.

virtual void Parser::parseBuffer (char *buf, int len) [pure virtual]
 

Parse a buffer.

Implemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.

virtual void Parser::parseFile (const string &filename) [pure virtual]
 

Parse a file.

Implemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.

void Parser::setAcroList (string filename) [virtual]
 

Set the acronym list from this file.

void Parser::setAcroList (const WordSet *acronyms) [virtual]

Set the acronym list. The implementation can be empty if the parser is not designed to handle acronyms by using a list. The caller retains ownership of the WordSet.
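The acronym-list contract described in the isAcronym and setAcroList entries (lookup returns false when no list is set, and the list stays owned by the caller) can be sketched as follows. This is an illustration under assumptions, not the Lemur implementation: WordSetSketch is a hypothetical stand-in for WordSet, whose interface is not shown on this page, and AcronymLookupSketch and acronymDemo are illustrative names. The members are public here only so the demo can call them; in Parser, isAcronym and clearAcros are protected.

```cpp
#include <set>
#include <string>

// Hypothetical stand-in for WordSet.
typedef std::set<std::string> WordSetSketch;

class AcronymLookupSketch {
public:
  AcronymLookupSketch() : acros(0) {}

  // Stores the pointer only; the list is neither copied nor deleted here,
  // so it remains owned by the caller and must outlive this object's use of it.
  void setAcroList(const WordSetSketch *acronyms) { acros = acronyms; }

  // Assumption: clearing just forgets the list.
  void clearAcros() { acros = 0; }

  // Returns false when no list has been set.
  bool isAcronym(const char *word) const {
    if (acros == 0 || word == 0) return false;
    return acros->count(word) > 0;
  }

private:
  const WordSetSketch *acros;  // not owned
};

// Looks up a word with or without an acronym list installed.
bool acronymDemo(const char *word, bool withList) {
  WordSetSketch list;  // owned by the caller, as the documentation requires
  list.insert("TREC");
  list.insert("NIST");
  AcronymLookupSketch lookup;
  if (withList) lookup.setAcroList(&list);
  return lookup.isAcronym(word);
}
```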


Member Data Documentation

const string Parser::category = "Parser" [static]
 

Reimplemented from TextHandler.

long Parser::docpos [protected]
 

const string Parser::identifier = "parser" [static]
 

Reimplemented from TextHandler.

Reimplemented in ArabicParser, BrillPOSParser, ChineseCharParser, ChineseParser, IdentifinderParser, InqArabicParser, InQueryOpParser, ReutersParser, TrecParser, and WebParser.


The documentation for this class was generated from the following files:
Generated on Fri Jul 2 16:25:43 2004 for Lemur Toolkit by doxygen 1.2.18