DocStream Class Reference

Abstract interface for a collection of documents.

#include <DocStream.hpp>

Inherited by BasicDocStream.

Public Methods

virtual ~DocStream ()
Document Iteration
virtual void startDocIteration ()=0
 start document iteration

virtual bool hasMore ()=0
virtual Document * nextDoc ()=0
 Return a pointer to the next document. The pointer refers to static memory, so do not delete the returned instance. Call hasMore() before calling nextDoc().


Detailed Description

DocStream is an abstract interface for a collection of documents. A given implementation can have its own tokenization, document header format, etc., and returns a specialized Document instance to reflect this.

The following is an example of supporting an index with position information:

   
    // A DocStream that handles position information
    class PosDocStream : public DocStream {
        ...
        Document *nextDoc() {
            return (new PosDocument(...)); // returns a special Document
        }
        ...
    };

    // A Document that has position information
    class PosDocument : public Document {
        ...
        TokenTerm *nextTerm() {
            return (new PosTerm(...)); // returns a special Term
        }
    };

    // A Term that has a position
    class PosTerm : public TokenTerm {
    public:
        int getPosition() {
            ...
        }
    };

    // An indexer that records term positions
    class PosIndex : public Index {
        ...
        PosDocStream *db;

        ... // when indexing

        db->startDocIteration();
        while (db->hasMore()) {
            Document *doc = db->nextDoc(); // we actually get a PosDocument
            doc->startTermIteration();
            PosTerm *term;
            while (doc->hasMore()) {
                term = (PosTerm *)doc->nextTerm(); // note the down-cast!
                term->getPosition();
                term->spelling();
                ...
            }
        }
        ...
    };
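
The example above down-casts with a C-style cast, which is only safe when the stream is known to produce PosTerm objects. As a hedged sketch of a checked alternative, the following uses dynamic_cast and falls back when a term carries no position. TokenTerm and PosTerm here are simplified stand-ins written for this illustration, not the toolkit's real classes, and positionOf() is a hypothetical helper.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the toolkit's TokenTerm (assumption: the real
// class has a different, richer interface).
class TokenTerm {
public:
  explicit TokenTerm(const std::string &s) : spelling_(s) {}
  virtual ~TokenTerm() {}  // virtual destructor makes the type polymorphic
  const std::string &spelling() const { return spelling_; }
private:
  std::string spelling_;
};

// Stand-in for a term that also records its position in the document.
class PosTerm : public TokenTerm {
public:
  PosTerm(const std::string &s, int pos) : TokenTerm(s), pos_(pos) {
    assert(pos >= 0);  // -1 is reserved below to mean "no position"
  }
  int getPosition() const { return pos_; }
private:
  int pos_;
};

// Hypothetical helper: returns the term's position, or -1 when the term
// is null or is not a PosTerm.
int positionOf(TokenTerm *term) {
  PosTerm *p = dynamic_cast<PosTerm *>(term);
  return p ? p->getPosition() : -1;
}
```

With this helper, an indexer can accept any TokenTerm and skip position recording when positionOf() returns -1, at the cost of a run-time type check per term.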
   


Constructor & Destructor Documentation

virtual DocStream::~DocStream ()   [inline, virtual]
 


Member Function Documentation

virtual bool DocStream::hasMore ()   [pure virtual]
 

Implemented in BasicDocStream.

virtual Document* DocStream::nextDoc ()   [pure virtual]
 

Return a pointer to the next document. The pointer refers to static memory, so do not delete the returned instance. Call hasMore() before calling nextDoc().

Implemented in BasicDocStream.
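
One way to satisfy the "do not delete" contract is for the stream to own a single Document object and refill it on every call. The sketch below illustrates that idea under stated assumptions: Document is a minimal stand-in, DocStream reproduces only the three pure virtual methods listed on this page, and VectorDocStream and collectIDs() are hypothetical names invented for this example. It does not describe how BasicDocStream is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Minimal stand-in for the toolkit's Document (assumption).
class Document {
public:
  virtual ~Document() {}
  void setID(const std::string &id) { id_ = id; }
  const std::string &getID() const { return id_; }
private:
  std::string id_;
};

// The interface as documented on this page.
class DocStream {
public:
  virtual ~DocStream() {}
  virtual void startDocIteration() = 0;
  virtual bool hasMore() = 0;
  virtual Document *nextDoc() = 0;
};

// Hypothetical in-memory stream that serves document IDs from a vector.
class VectorDocStream : public DocStream {
public:
  explicit VectorDocStream(const std::vector<std::string> &ids)
      : ids_(ids), pos_(0) {}
  void startDocIteration() { pos_ = 0; }
  bool hasMore() { return pos_ < ids_.size(); }
  Document *nextDoc() {
    assert(hasMore());             // callers must check hasMore() first
    current_.setID(ids_[pos_++]);  // overwrite the single reused object
    return &current_;              // owned by the stream: never delete it
  }
private:
  std::vector<std::string> ids_;
  std::size_t pos_;
  Document current_;
};

// Caller-side loop: copy whatever is needed before the next nextDoc() call,
// because the returned object is overwritten on each call.
std::vector<std::string> collectIDs(DocStream &stream) {
  std::vector<std::string> out;
  stream.startDocIteration();
  while (stream.hasMore()) {
    Document *doc = stream.nextDoc();
    out.push_back(doc->getID());
  }
  return out;
}
```

Because every call returns the same address, a pointer kept from an earlier nextDoc() call silently refers to the newer document. That is the practical reason to copy data out inside the loop rather than store the pointers.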

virtual void DocStream::startDocIteration ()   [pure virtual]
 

Start document iteration.

Typical usage:

    DocStream &myStream = ...; // some concrete DocStream
    ...
    myStream.startDocIteration();
    while (myStream.hasMore()) {
        Document *doc = myStream.nextDoc();
        doc->startTermIteration();
        while (doc->hasMore()) {
            TokenTerm *term = doc->nextTerm();
            ... process "term" ...
            // YOU MUST NOT DELETE term, as it is a pointer to local static memory
        }
        // YOU MUST NOT DELETE doc, as it is a pointer to local static memory
    }

See also:
Document

Implemented in BasicDocStream.


The documentation for this class was generated from the following file:
Generated on Wed Nov 3 12:59:30 2004 for Lemur Toolkit by doxygen 1.2.18