
InvPushIndex Class Reference

#include <InvPushIndex.hpp>

Inheritance diagram for InvPushIndex:

Base class: PushIndex. Derived classes: InvFPPushIndex, IncFPPushIndex, InvPassagePushIndex, IncPassagePushIndex.

Public Methods

 InvPushIndex ()
 InvPushIndex (char *prefix, int cachesize=128000000, long maxfilesize=2100000000, DOCID_T startdocid=1)
 ~InvPushIndex ()
void setName (char *prefix)
 Sets the name for this index. The name is used as the prefix for all files related to this index.

bool beginDoc (DocumentProps *dp)
 Begins a new document. Returns true if initialization was successful.

bool addTerm (Term &t)
 Adds a term to the current document. Returns true if the term was added successfully.

void endDoc (DocumentProps *dp)
 Signals the end of the current document.

virtual void endDoc (DocumentProps *dp, const char *mgr)
 Signals the end of the current document and associates it with the given document manager. This does not change the manager that was previously set.

void endCollection (CollectionProps *cp)
 Signals the end of this collection. Properties passed at the beginning of a collection should be handled by the constructor.

void setDocManager (const char *mgrID)
 Sets the document manager to use for succeeding documents.
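
The public methods above suggest a push-style call order: beginDoc, then addTerm for each term, then endDoc, and finally endCollection once all documents are in. The following is a minimal sketch of that order only. SketchIndex, SketchDoc, and indexAll are hypothetical stand-ins that use std::string instead of DocumentProps and Term; they are not the Lemur classes, and the real InvPushIndex may enforce different rules.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical input record: an external document id plus its terms in order.
struct SketchDoc {
  std::string id;
  std::vector<std::string> terms;
};

// Hypothetical stand-in that mirrors only the call order of the public
// methods; it is not the Lemur InvPushIndex implementation.
class SketchIndex {
public:
  // Open a document; fails if one is already open or the collection ended.
  bool beginDoc(const std::string &docid) {
    if (open_ || finished_) return false;
    open_ = true;
    current_ = docid;
    return true;
  }
  // Add a term to the open document; fails if no document is open.
  bool addTerm(const std::string &term) {
    if (!open_ || term.empty()) return false;
    ++tcount_;
    return true;
  }
  // Close the open document and record its external id.
  void endDoc() {
    assert(open_ && "endDoc called without a matching beginDoc");
    docIDs_.push_back(current_);
    open_ = false;
  }
  // Close the collection; later beginDoc calls are rejected.
  void endCollection() {
    assert(!open_ && "endCollection called inside a document");
    finished_ = true;
  }
  std::size_t docCount() const { return docIDs_.size(); }
  int termCount() const { return tcount_; }
  bool finished() const { return finished_; }

private:
  bool open_ = false;
  bool finished_ = false;
  std::string current_;
  std::vector<std::string> docIDs_;
  int tcount_ = 0;
};

// Push every document through the begin/add/end sequence, then close the
// collection. Returns the number of terms that were accepted.
inline int indexAll(SketchIndex &idx, const std::vector<SketchDoc> &docs) {
  int accepted = 0;
  for (const SketchDoc &doc : docs) {
    if (!idx.beginDoc(doc.id)) continue;
    for (const std::string &term : doc.terms) {
      if (idx.addTerm(term)) ++accepted;
    }
    idx.endDoc();
  }
  idx.endCollection();
  return accepted;
}
```

Checking the bool results of beginDoc and addTerm, as indexAll does, follows from the documented return values; what the real class does on failure is not stated on this page.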


Protected Methods

void writeTOC (int numinv)
void writeDocIDs ()
void writeCache ()
void lastWriteCache ()
void writeDTIDs ()
void writeDocMgrIDs ()
int docMgrID (const char *mgr)
virtual void doendDoc (DocumentProps *dp, int mgrid)

Protected Attributes

long maxfile
 The maximum size a file may reach.

MemCache * cache
 The main memory handler for building.

vector< char * > docIDs
 List of external docids in internal docid order.

vector< char * > termIDs
 List of terms in termid order.

vector< char * > tempfiles
 List of tempfiles we've written to flush cache.

vector< char * > dtfiles
 List of dt index files.

vector< char * > docmgrs
FILE * writetlookup
 Filestream for writing the lookup table to the docterm db.

ofstream writetlist
 Filestream for writing the list of located terms for each document.

int tcount
 Count of total terms.

int tidcount
 Count of unique terms.

int dtidcount
 Count of unique terms in the current doc.

char * name
 The prefix name.

int namelen
 The length of the name (avoids many calls to strlen).

TABLE_T wordtable
 Table of all terms and their doclists.

map< int, int > termlist
 Maps of terms and freqs.

int * membuf
 Memory to use for cache and buffers.

int membufsize
int curdocmgr
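
Reading tcount, tidcount, and dtidcount as the counts of total terms, unique terms, and unique terms in the current document, and termlist as a per-document map from term id to frequency, the bookkeeping could look roughly like the sketch below. TermCounters, startDoc, and countTerm are hypothetical names; the std::map used for wordtable is a stand-in (the real TABLE_T is not shown here), and numbering term ids from 1 and resetting termlist per document are assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical counters modelled on the attribute names above.
struct TermCounters {
  std::map<std::string, int> wordtable;  // term -> term id (stand-in for TABLE_T)
  std::map<int, int> termlist;           // term id -> frequency in the current doc
  int tcount = 0;     // total terms seen so far
  int tidcount = 0;   // unique terms seen so far
  int dtidcount = 0;  // unique terms in the current doc
};

// Reset the per-document state (assumed to happen when a document begins).
inline void startDoc(TermCounters &c) {
  c.termlist.clear();
  c.dtidcount = 0;
}

// Count one occurrence of term and return its term id.
inline int countTerm(TermCounters &c, const std::string &term) {
  assert(!term.empty());
  auto it = c.wordtable.find(term);
  if (it == c.wordtable.end()) {
    // First time this term is seen anywhere: assign the next id (from 1).
    it = c.wordtable.emplace(term, ++c.tidcount).first;
  }
  const int id = it->second;
  // First occurrence within this document bumps the per-doc unique count.
  if (++c.termlist[id] == 1) ++c.dtidcount;
  ++c.tcount;
  return id;
}
```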

Constructor & Destructor Documentation

InvPushIndex::InvPushIndex () [inline]
 

InvPushIndex::InvPushIndex (char * prefix, int cachesize = 128000000, long maxfilesize = 2100000000, DOCID_T startdocid = 1)
 

InvPushIndex::~InvPushIndex ()
 


Member Function Documentation

bool InvPushIndex::addTerm (Term & t) [virtual]
 

Adds a term to the current document. Returns true if the term was added successfully.

Implements PushIndex.

Reimplemented in IncPassagePushIndex, InvFPPushIndex, and InvPassagePushIndex.

bool InvPushIndex::beginDoc (DocumentProps * dp) [virtual]
 

Begins a new document. Returns true if initialization was successful.

Implements PushIndex.

Reimplemented in IncPassagePushIndex and InvPassagePushIndex.

int InvPushIndex::docMgrID (const char * mgr) [protected]
 

Returns the internal ID of the given document manager. If mgr is not already registered, it is added.
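
The lookup-or-register behaviour described for docMgrID can be sketched as follows. This is an illustration under assumptions, not the Lemur code: lookupOrRegister is a hypothetical free function, the registry holds std::string rather than the char* entries of docmgrs, and using the position in the list as the internal ID is a guess.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: return the position of mgr in the registry,
// appending it first if it has not been registered yet.
inline int lookupOrRegister(std::vector<std::string> &registry, const char *mgr) {
  assert(mgr != nullptr);
  for (std::size_t i = 0; i < registry.size(); ++i) {
    if (registry[i] == mgr) return static_cast<int>(i);  // already known
  }
  registry.push_back(mgr);  // not found: register it
  return static_cast<int>(registry.size() - 1);
}
```

A linear scan is adequate if only a handful of document managers are ever registered, which is an assumption about typical use rather than something this page states.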

void InvPushIndex::doendDoc (DocumentProps * dp, int mgrid) [protected, virtual]
 

Reimplemented in IncPassagePushIndex, InvFPPushIndex, and InvPassagePushIndex.

void InvPushIndex::endCollection (CollectionProps * cp) [virtual]
 

Signals the end of this collection. Properties passed at the beginning of a collection should be handled by the constructor.

Implements PushIndex.

Reimplemented in InvFPPushIndex.

void InvPushIndex::endDoc (DocumentProps * dp, const char * mgr) [virtual]
 

Signals the end of the current document and associates it with the given document manager. This does not change the manager that was previously set.
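
The difference between the two endDoc overloads, as described, is that the two-argument form applies a manager to one document only. A minimal sketch of that behaviour, assuming the manager set by setDocManager is kept as an integer id (as curdocmgr suggests); DocMgrSketch and its members are hypothetical and the DocumentProps argument is omitted:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the default manager versus a per-document override.
class DocMgrSketch {
public:
  // Set the manager used for succeeding documents.
  void setDocManager(const std::string &mgr) { curdocmgr_ = idFor(mgr); }
  // End a document under the current default manager.
  void endDoc() { docToMgr_.push_back(curdocmgr_); }
  // End a document under mgr without touching the default.
  void endDoc(const std::string &mgr) { docToMgr_.push_back(idFor(mgr)); }

  int current() const { return curdocmgr_; }
  const std::vector<int> &docToMgr() const { return docToMgr_; }

private:
  // Look up mgr, registering it if needed; the id is its list position.
  int idFor(const std::string &mgr) {
    assert(!mgr.empty());
    for (std::size_t i = 0; i < docmgrs_.size(); ++i) {
      if (docmgrs_[i] == mgr) return static_cast<int>(i);
    }
    docmgrs_.push_back(mgr);
    return static_cast<int>(docmgrs_.size() - 1);
  }

  int curdocmgr_ = -1;                 // -1: no manager set yet (assumption)
  std::vector<std::string> docmgrs_;   // registered manager names
  std::vector<int> docToMgr_;          // manager id recorded per document
};
```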

void InvPushIndex::endDoc (DocumentProps * dp) [virtual]
 

Signals the end of the current document.

Implements PushIndex.

void InvPushIndex::lastWriteCache () [protected]
 

void InvPushIndex::setDocManager (const char * mgrID) [virtual]
 

Sets the document manager to use for succeeding documents.

Implements PushIndex.

void InvPushIndex::setName (char * prefix)
 

Sets the name for this index. The name is used as the prefix for all files related to this index.

void InvPushIndex::writeCache () [protected]
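
No description is given for writeCache or lastWriteCache. Read together with the tempfiles attribute ("list of tempfiles we've written to flush cache"), one plausible shape is a bounded in-memory buffer that is flushed to a new temporary file whenever it fills, with a final flush at the end. The sketch below illustrates that idea only: FlushingCacheSketch is hypothetical, the runs are kept in memory instead of on disk, and the ".temp" naming scheme and the sort before each flush are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of flushing a bounded cache into numbered temp runs.
class FlushingCacheSketch {
public:
  FlushingCacheSketch(std::string prefix, std::size_t capacity)
      : prefix_(std::move(prefix)), capacity_(capacity) {
    assert(capacity_ > 0);
  }

  // Buffer one term id; flush as soon as the buffer is full.
  void add(int termid) {
    buffer_.push_back(termid);
    if (buffer_.size() >= capacity_) writeCache();
  }

  // Flush the buffer as one sorted run and record its (made-up) file name.
  void writeCache() {
    if (buffer_.empty()) return;
    std::sort(buffer_.begin(), buffer_.end());
    tempfiles_.push_back(prefix_ + ".temp" + std::to_string(tempfiles_.size()));
    runs_.push_back(buffer_);  // stands in for writing the run to disk
    buffer_.clear();
  }

  // Final flush for whatever is still buffered at the end of indexing.
  void lastWriteCache() { writeCache(); }

  const std::vector<std::string> &tempfiles() const { return tempfiles_; }
  const std::vector<std::vector<int>> &runs() const { return runs_; }

private:
  std::string prefix_;
  std::size_t capacity_;
  std::vector<int> buffer_;
  std::vector<std::string> tempfiles_;
  std::vector<std::vector<int>> runs_;
};
```

Keeping each run sorted would let a later step merge the runs in a single pass; whether the Lemur implementation does this is not stated on this page.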
 

void InvPushIndex::writeDocIDs () [protected]
 

void InvPushIndex::writeDocMgrIDs () [protected]
 

void InvPushIndex::writeDTIDs () [protected]
 

void InvPushIndex::writeTOC (int numinv) [protected]
 

Reimplemented in InvFPPushIndex.


Member Data Documentation

MemCache* InvPushIndex::cache [protected]
 

The main memory handler for building.

int InvPushIndex::curdocmgr [protected]
 

vector<char*> InvPushIndex::docIDs [protected]
 

List of external docids in internal docid order.

vector<char*> InvPushIndex::docmgrs [protected]
 

vector<char*> InvPushIndex::dtfiles [protected]
 

List of dt index files.

int InvPushIndex::dtidcount [protected]
 

Count of unique terms in the current doc.

long InvPushIndex::maxfile [protected]
 

The maximum size a file may reach.

int* InvPushIndex::membuf [protected]
 

Memory to use for cache and buffers.

int InvPushIndex::membufsize [protected]
 

char* InvPushIndex::name [protected]
 

The prefix name.

int InvPushIndex::namelen [protected]
 

The length of the name (avoids many calls to strlen).

int InvPushIndex::tcount [protected]
 

Count of total terms.

vector<char*> InvPushIndex::tempfiles [protected]
 

List of tempfiles we've written to flush cache.

vector<char*> InvPushIndex::termIDs [protected]
 

List of terms in termid order.

map<int, int> InvPushIndex::termlist [protected]
 

Maps of terms and freqs.

Reimplemented in InvFPPushIndex.

int InvPushIndex::tidcount [protected]
 

Count of unique terms.

TABLE_T InvPushIndex::wordtable [protected]
 

Table of all terms and their doclists.

ofstream InvPushIndex::writetlist [protected]
 

Filestream for writing the list of located terms for each document.

FILE* InvPushIndex::writetlookup [protected]
 

Filestream for writing the lookup table to the docterm db.


The documentation for this class was generated from the following files:
Generated on Mon Sep 30 14:14:08 2002 for LEMUR by doxygen 1.2.18