
ChineseCharParser Class Reference

#include <ChineseCharParser.hpp>

Inheritance diagram for ChineseCharParser (image not reproduced): Parser, TextHandler

Public Methods

 ChineseCharParser ()
void parseFile (char *filename)
 Parse a file.

void parseBuffer (char *buf, int len)
 Parse a buffer of length len.

long fileTell ()

Private Methods

void doParse ()
 Performs the actual parsing.


Private Attributes

int state
 The state of the parser.


Detailed Description

Parses unsegmented Chinese documents in NIST's TREC format (GB encoding), producing one token per character. The following fields are parsed: TEXT, HL, HEAD, HEADLINE, LP, TTL.
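
The character-at-a-time tokenization described above can be sketched in a few lines. This is a minimal illustration, not the LEMUR implementation: the helper names gbCharTokens, isGbByte and isParsedField are hypothetical, and the sketch assumes GB2312-style two-byte characters whose bytes both fall in the range 0xA1 to 0xFE. How the real parser treats ASCII text and GB punctuation is not stated here; the sketch simply skips single bytes and keeps every two-byte character.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of LEMUR): true if the byte lies in the
// range used by both bytes of a two-byte GB2312 character (assumption).
static bool isGbByte(unsigned char c) {
    return c >= 0xA1 && c <= 0xFE;
}

// Hypothetical helper: split a GB-encoded buffer into one token per
// two-byte character. Single bytes (ASCII, whitespace, a truncated
// trailing byte) are skipped in this sketch.
std::vector<std::string> gbCharTokens(const char *buf, int len) {
    assert(len >= 0);
    std::vector<std::string> tokens;
    int i = 0;
    while (i < len) {
        unsigned char lead = static_cast<unsigned char>(buf[i]);
        if (isGbByte(lead) && i + 1 < len &&
            isGbByte(static_cast<unsigned char>(buf[i + 1]))) {
            tokens.push_back(std::string(buf + i, 2));  // one character, one token
            i += 2;
        } else {
            ++i;  // not the start of a two-byte character: skip one byte
        }
    }
    return tokens;
}

// Hypothetical helper: true if the tag name is one of the fields the
// class description lists as parsed.
bool isParsedField(const std::string &tag) {
    static const char *const fields[] = {"TEXT", "HL", "HEAD",
                                         "HEADLINE", "LP", "TTL"};
    for (const char *f : fields) {
        if (tag == f) return true;
    }
    return false;
}
```

Calling gbCharTokens on a buffer that holds two GB characters followed by ASCII text yields two tokens of two bytes each; isParsedField("DOCNO") is false because DOCNO is not in the list above.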


Constructor & Destructor Documentation

ChineseCharParser::ChineseCharParser ()
 


Member Function Documentation

void ChineseCharParser::doParse () [private]
 

Performs the actual parsing.

long ChineseCharParser::fileTell () [virtual]
 

Returns the current byte offset into the file being parsed. Do not use with parseBuffer.

Implements Parser.
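
To illustrate what a byte offset into the file being parsed can be used for, the sketch below records the offset at which each document begins. It uses only the standard stream library rather than the LEMUR API; the helper name docStartOffsets is hypothetical, and the assumption that documents open with a line starting with `<DOC>` comes from the general TREC layout, not from this page. An in-memory buffer has no underlying file position, which is consistent with the warning against combining fileTell with parseBuffer.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of LEMUR): return the byte offset at
// which each line beginning with "<DOC>" starts in the stream.
std::vector<long> docStartOffsets(std::istream &in) {
    std::vector<long> offsets;
    std::string line;
    for (;;) {
        // Offset of the line about to be read.
        long pos = static_cast<long>(in.tellg());
        if (!std::getline(in, line)) break;
        if (line.compare(0, 5, "<DOC>") == 0) offsets.push_back(pos);
    }
    return offsets;
}
```

A caller could store these offsets and later seek back to a document without rescanning the file.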

void ChineseCharParser::parseBuffer (char * buf, int len) [virtual]
 

Parse a buffer of length len.

Implements Parser.

void ChineseCharParser::parseFile (char * filename) [virtual]
 

Parse a file.

Implements Parser.


Member Data Documentation

int ChineseCharParser::state [private]
 

The state of the parser.


The documentation for this class was generated from the following file:
Generated on Fri Feb 6 07:11:58 2004 for LEMUR by doxygen 1.2.16