
TextHandler Class Reference

#include <TextHandler.hpp>

Inheritance diagram for TextHandler:

BrillPOSTokenizer, CtfIndexer, DocFreqIndexer, DocOffsetParser, FlattextDocMgr, FreqCounter, InvFPTextHandler, KeyfileDocMgr, MemParser, Parser, PropIndexTH, QueryDocument, QueryTextHandler, Stemmer, Stopper, WriterInQueryHandler, WriterTextHandler

Public Types

enum  TokenType {
  BEGINDOC = 1, ENDDOC = 2, WORDTOK = 3, BEGINTAG = 4,
  ENDTAG = 5, SYMBOLTOK = 6
}

Public Methods

 TextHandler ()
virtual ~TextHandler ()
virtual void setTextHandler (TextHandler *th)
 Set the TextHandler that this TextHandler will pass information on to.

virtual TextHandler * getTextHandler ()
 Get the TextHandler that this TextHandler passes information on to.

virtual void foundToken (TokenType type, char *token=NULL, char *orig=NULL, PropertyList *properties=NULL)
virtual char * handleBeginDoc (char *docno, char *original, PropertyList *list)
virtual char * handleEndDoc (char *token, char *original, PropertyList *list)
virtual char * handleWord (char *word, char *original, PropertyList *list)
virtual char * handleBeginTag (char *tag, char *original, PropertyList *list)
 Handle a begin tag.

virtual char * handleEndTag (char *tag, char *original, PropertyList *list)
 Handle an end tag.

virtual char * handleSymbol (char *symbol, char *original, PropertyList *list)
virtual void foundDoc (char *docno)
 Found a document with the given document number.

virtual void foundDoc (char *docno, char *original)
virtual void foundWord (char *word)
 Found a word.

virtual void foundWord (char *word, char *original)
virtual void foundEndDoc ()
 Found end of doc.

virtual void foundSymbol (char *sym)
 Found a symbol.

virtual char * handleDoc (char *docno)
 Handle a doc.

virtual char * handleWord (char *word)
 Handle a word, possibly transforming it.

virtual void handleEndDoc ()
 Handle the end of the doc.

virtual char * handleSymbol (char *sym)
 Handle a symbol, possibly transforming it.


Protected Attributes

TextHandler * textHandler
 The next textHandler in the chain.

char buffer [MAXWORDSIZE]

Detailed Description

Each TextHandler has its own internal buffer in which it can modify a string. The foundWord function copies the word into the buffer and then calls handleWord with the copy. The handleWord function may modify the string and returns a pointer to it. The foundDoc and handleDoc functions follow the same pattern.
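
The copy-then-handle flow described above can be sketched as follows. This is a minimal illustration, not the Lemur source: SketchHandler, LowercaseHandler, CollectingHandler, setNext, runBufferSketch, and SKETCH_MAXWORDSIZE are hypothetical names, the const qualifier on the foundWord argument is added for the sketch, and the choice to stop forwarding when handleWord returns NULL is an assumption.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the library's MAXWORDSIZE constant.
const int SKETCH_MAXWORDSIZE = 64;

// Minimal sketch of the found/handle pattern (assumed, not the Lemur code).
class SketchHandler {
public:
  SketchHandler() : next(NULL) {}
  virtual ~SketchHandler() {}
  void setNext(SketchHandler *h) { next = h; }

  // Copy the word into the private buffer, let handleWord work on the copy,
  // then pass the result to the next handler in the chain, if there is one.
  virtual void foundWord(const char *word) {
    if (word == NULL) return;
    std::strncpy(buffer, word, SKETCH_MAXWORDSIZE - 1);
    buffer[SKETCH_MAXWORDSIZE - 1] = '\0';
    char *result = handleWord(buffer);
    // Assumption of this sketch: a NULL result drops the word.
    if (result != NULL && next != NULL) next->foundWord(result);
  }

  // Default behaviour: return the word unchanged.
  virtual char *handleWord(char *word) { return word; }

protected:
  SketchHandler *next;              // next handler in the chain
  char buffer[SKETCH_MAXWORDSIZE];  // private working copy of the token
};

// Transforms the buffered copy in place.
class LowercaseHandler : public SketchHandler {
public:
  virtual char *handleWord(char *word) {
    for (char *p = word; *p != '\0'; ++p)
      *p = static_cast<char>(std::tolower(static_cast<unsigned char>(*p)));
    return word;
  }
};

// Records every word that reaches the end of the chain.
class CollectingHandler : public SketchHandler {
public:
  std::vector<std::string> words;
  virtual char *handleWord(char *word) {
    words.push_back(word);
    return word;
  }
};

// Chains the two handlers and feeds one word through them.
std::vector<std::string> runBufferSketch() {
  LowercaseHandler lower;
  CollectingHandler sink;
  lower.setNext(&sink);
  char input[] = "Hello";
  lower.foundWord(input);
  // The caller's string is untouched because only the copy was modified.
  assert(std::strcmp(input, "Hello") == 0);
  return sink.words;
}
```

Because each handler works on its own copy, a transformation made by one stage never alters the string held by the stage that called it.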


Member Enumeration Documentation

enum TextHandler::TokenType
 

Enumeration values:
BEGINDOC 
ENDDOC 
WORDTOK 
BEGINTAG 
ENDTAG 
SYMBOLTOK 


Constructor & Destructor Documentation

TextHandler::TextHandler   [inline]
 

virtual TextHandler::~TextHandler   [inline, virtual]
 


Member Function Documentation

virtual void TextHandler::foundDoc char *    docno,
char *    original
[inline, virtual]
 

virtual void TextHandler::foundDoc char *    docno [inline, virtual]
 

Found a document with the given document number.

virtual void TextHandler::foundEndDoc   [inline, virtual]
 

Found end of doc.

virtual void TextHandler::foundSymbol char *    sym [inline, virtual]
 

Found a symbol.

virtual void TextHandler::foundToken TokenType    type,
char *    token = NULL,
char *    orig = NULL,
PropertyList *    properties = NULL
[inline, virtual]
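
The signature suggests that foundToken routes each token to the handler matching its TokenType. The page does not state this, so the following is only a sketch of one plausible dispatch. TokenRouterSketch, record, and runRouterSketch are hypothetical names, and the handlers are reduced to log entries rather than the real three-argument functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Values copied from TextHandler::TokenType as listed on this page.
enum TokenType {
  BEGINDOC = 1, ENDDOC = 2, WORDTOK = 3, BEGINTAG = 4,
  ENDTAG = 5, SYMBOLTOK = 6
};

// Hypothetical router: logs which handler each token type would reach.
class TokenRouterSketch {
public:
  std::vector<std::string> log;

  void foundToken(TokenType type, const char *token = NULL) {
    const std::string text = (token != NULL) ? token : "";
    switch (type) {
      case BEGINDOC:  record("handleBeginDoc", text); break;
      case ENDDOC:    record("handleEndDoc", text); break;
      case WORDTOK:   record("handleWord", text); break;
      case BEGINTAG:  record("handleBeginTag", text); break;
      case ENDTAG:    record("handleEndTag", text); break;
      case SYMBOLTOK: record("handleSymbol", text); break;
    }
  }

private:
  void record(const std::string &handler, const std::string &text) {
    log.push_back(handler + "(" + text + ")");
  }
};

// Feeds a tiny document (begin, one word, end) through the router.
std::vector<std::string> runRouterSketch() {
  TokenRouterSketch router;
  router.foundToken(BEGINDOC, "doc1");
  router.foundToken(WORDTOK, "apple");
  router.foundToken(ENDDOC);
  assert(router.log.size() == 3);
  return router.log;
}
```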
 

virtual void TextHandler::foundWord char *    word,
char *    original
[inline, virtual]
 

virtual void TextHandler::foundWord char *    word [inline, virtual]
 

Found a word.

virtual TextHandler* TextHandler::getTextHandler   [inline, virtual]
 

Get the TextHandler that this TextHandler passes information on to.

Reimplemented in FlattextDocMgr.

virtual char* TextHandler::handleBeginDoc char *    docno,
char *    original,
PropertyList *    list
[inline, virtual]
 

Handle the beginning of a doc. The default implementation calls handleDoc for backward compatibility.

virtual char* TextHandler::handleBeginTag char *    tag,
char *    original,
PropertyList *    list
[inline, virtual]
 

Handle a begin tag.

Reimplemented in ElemDocMgr.

virtual char* TextHandler::handleDoc char *    docno [inline, virtual]
 

Handle a doc.

Reimplemented in DocFreqIndexer, FreqCounter, InvFPTextHandler, PropIndexTH, FlattextDocMgr, KeyfileDocMgr, WriterInQueryHandler, and WriterTextHandler.

virtual void TextHandler::handleEndDoc   [inline, virtual]
 

Handle the end of the doc.

Reimplemented in DocFreqIndexer, FlattextDocMgr, and KeyfileDocMgr.

virtual char* TextHandler::handleEndDoc char *    token,
char *    original,
PropertyList *    list
[inline, virtual]
 

Handle the end of a doc. The default implementation calls the older no-argument handleEndDoc for backward compatibility.

virtual char* TextHandler::handleEndTag char *    tag,
char *    original,
PropertyList *    list
[inline, virtual]
 

Handle an end tag.

Reimplemented in ElemDocMgr.

virtual char* TextHandler::handleSymbol char *    sym [inline, virtual]
 

Handle a symbol, possibly transforming it.

Reimplemented in QueryDocument and WriterInQueryHandler.

virtual char* TextHandler::handleSymbol char *    symbol,
char *    original,
PropertyList *    list
[inline, virtual]
 

Handle a symbol. The default implementation calls the older one-argument handleSymbol for backward compatibility.

virtual char* TextHandler::handleWord char *    word [inline, virtual]
 

Handle a word, possibly transforming it.

Reimplemented in CtfIndexer, DocFreqIndexer, FreqCounter, InvFPTextHandler, QueryTextHandler, DocOffsetParser, KeyfileDocMgr, QueryDocument, Stemmer, Stopper, WriterInQueryHandler, and WriterTextHandler.

virtual char* TextHandler::handleWord char *    word,
char *    original,
PropertyList *    list
[inline, virtual]
 

Handle a word. The default implementation calls the older one-argument handleWord for backward compatibility.

Reimplemented in PropIndexTH and BrillPOSTokenizer.
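
The backward-compatibility forwarding noted for the three-argument handlers can be illustrated with a small sketch. It shows a handler that overrides only the older one-argument handleWord and still works when called through the newer three-argument form. BaseSketch, LegacyUpper, PropertyListSketch, and runCompatSketch are hypothetical names, and the const qualifier on original is added for the sketch; none of this is taken from the Lemur source.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical placeholder for the library's PropertyList type.
struct PropertyListSketch {};

class BaseSketch {
public:
  virtual ~BaseSketch() {}

  // Newer interface: by default, forward to the older one-argument form.
  virtual char *handleWord(char *word, const char *original,
                           PropertyListSketch *list) {
    (void)original;  // unused by the default implementation
    (void)list;
    return handleWord(word);
  }

  // Older interface: return the word unchanged.
  virtual char *handleWord(char *word) { return word; }
};

// A handler written against the older interface only.
class LegacyUpper : public BaseSketch {
public:
  // Keep the three-argument overload visible despite C++ name hiding.
  using BaseSketch::handleWord;

  virtual char *handleWord(char *word) {
    for (char *p = word; *p != '\0'; ++p)
      *p = static_cast<char>(std::toupper(static_cast<unsigned char>(*p)));
    return word;
  }
};

// Calls the newer interface; virtual dispatch reaches the legacy override.
std::string runCompatSketch() {
  LegacyUpper legacy;
  BaseSketch *handler = &legacy;
  char word[] = "index";
  char *result = handler->handleWord(word, "index", NULL);
  assert(result == word);  // the handler transformed the string in place
  return result;
}
```

The using-declaration matters only when the three-argument form is called on the derived type directly; calls through a base pointer resolve to the base overload set either way.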

virtual void TextHandler::setTextHandler TextHandler *    th [inline, virtual]
 

Set the TextHandler that this TextHandler will pass information on to.


Member Data Documentation

char TextHandler::buffer[MAXWORDSIZE] [protected]
 

TextHandler* TextHandler::textHandler [protected]
 

The next textHandler in the chain.


The documentation for this class was generated from the following file:
Generated on Fri Feb 6 07:12:08 2004 for LEMUR by doxygen 1.2.16