
InvIndex Class Reference

#include <InvIndex.hpp>

Inheritance diagram for InvIndex:

InvIndex inherits from Index and is inherited by InvFPIndex.

Public Methods

 InvIndex ()
 InvIndex (const char *indexName)
 ~InvIndex ()
void setMesgStream (ostream *lemStream)
 Set the message stream.

Open index
bool open (const char *indexName)
 Open a previously created index with the given prefix; return true if it was opened successfully.

Spelling and index conversion
int term (const char *word)
 Convert a term spelling to a termID.

const char* term (int termID)
 Convert a termID to its spelling.

int document (const char *docIDStr)
 Convert a document spelling to a docID.

const char* document (int docID)
 Convert a docID to its spelling.

const char* docManager (int docID)
 A string identifier for the document manager, used to get at the source of the document with this docID.

Summary counts
int docCount ()
 Total count (i.e., number) of documents in collection.

int termCountUnique ()
 Total count of unique terms in collection.

int termCount (int termID) const
 Total count of a term in the collection.

int termCount () const
 Total count of all terms in the collection.

float docLengthAvg ()
 Average document length.

int docCount (int termID)
 Number of documents containing the given term.

int docLength (DOCID_T docID) const
 Total count of terms in a document, including stop words.

int docLengthCounted (int docID)
 Total count of terms in given document, not including stop words.
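
The summary counts are the usual raw inputs for term-weighting formulas. As a minimal sketch, the helpers below compute an average document length and a plain inverse document frequency from integers of the kind returned by termCount(), docCount() and docCount(termID). The names averageLength and idf are hypothetical and not part of InvIndex. Whether docLengthAvg() is computed as total term count divided by document count is an assumption; this page does not say.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of InvIndex): mean number of terms per
// document, given totals such as termCount() and docCount() would supply.
inline double averageLength(long totalTerms, long totalDocs) {
    assert(totalDocs > 0);  // an empty collection has no average
    return static_cast<double>(totalTerms) / static_cast<double>(totalDocs);
}

// Hypothetical helper: a plain inverse document frequency, log(N / df),
// where N plays the role of docCount() and df the role of
// docCount(termID).
inline double idf(long totalDocs, long docsWithTerm) {
    assert(totalDocs > 0 && docsWithTerm > 0 && docsWithTerm <= totalDocs);
    return std::log(static_cast<double>(totalDocs) /
                    static_cast<double>(docsWithTerm));
}
```

A term that occurs in every document gets an idf of zero under this formula, and rarer terms get larger values.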

Index entry access
DocInfoList* docInfoList (int termID)
 Document entries in a term index.
See also:
DocList, InvFPDocList.


TermInfoList* termInfoList (int docID)
 Word entries in a document index (bag of words).
See also:
TermList.
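
The two accessors above expose the same counts from opposite directions: per term, the documents that contain it; per document, the terms it contains. The sketch below illustrates that relationship with a toy in-memory structure. ToyIndex, Entry and EntryList are hypothetical stand-ins built on standard containers; they do not reproduce the real DocInfoList or TermInfoList classes, whose interfaces are documented elsewhere.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical stand-in types, not the library's own classes.
typedef std::pair<int, int> Entry;   // (id, count)
typedef std::vector<Entry> EntryList;

struct ToyIndex {
    std::map<int, EntryList> byTerm;  // termID -> (docID, count) entries
    std::map<int, EntryList> byDoc;   // docID  -> (termID, count) entries

    // Add one document given as a sequence of termIDs.
    void addDocument(int docID, const std::vector<int>& termIDs) {
        assert(docID >= 0);  // toy precondition for this sketch only
        std::map<int, int> counts;  // termID -> occurrences in this doc
        for (std::size_t i = 0; i < termIDs.size(); ++i)
            ++counts[termIDs[i]];
        for (std::map<int, int>::const_iterator it = counts.begin();
             it != counts.end(); ++it) {
            byTerm[it->first].push_back(Entry(docID, it->second));
            byDoc[docID].push_back(Entry(it->first, it->second));
        }
    }
};
```

In this toy, byTerm corresponds to the "term index" view and byDoc to the "bag of words" view of the same collection.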



Protected Methods

bool fullToc (const char *fileName)
 Read in the entire TOC.

bool indexLookup ()
 Read in the index lookup table.

bool invFileIDs ()
 Read in the inverted index filenames map.

bool docMgrIDs ()
 Read in the map of document manager internal and external IDs.

bool dtLookup ()
 Read in the dt index lookup table of format ver1.9 (and up?).

bool dtLookup_ver1 ()
 Read in the dt index lookup table of a format older than ver1.9.

bool dtFileIDs ()
 Read in the dt index filenames map.

bool termIDs ()
 Read in the map from termIDs to term spellings.

bool docIDs ()
 Read in the map from docIDs to document spellings.


Protected Attributes

int* counts
char** names
float aveDocLen
inv_entry* lookup
dt_entry* dtlookup
int dtloaded
TERM_T* terms
EXDOCID_T* docnames
char** dtfiles
char** invfiles
vector<char*> docmgrs
map<TERM_T, TERMID_T, ltstr> termtable
map<EXDOCID_T, DOCID_T, ltstr> doctable
ostream* msgstream
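
The termtable and doctable maps take a third template argument, ltstr, which suggests that their keys are C strings ordered by content rather than by pointer value. The sketch below shows one way such a two-way table could work. It assumes that TERM_T is a C-string type and that ltstr wraps strcmp; neither definition appears on this page. LtStr and ToyTermTable are hypothetical names, and -1 as a "not found" value is this sketch's own convention.

```cpp
#include <cassert>
#include <cstring>
#include <map>
#include <vector>

// Assumed shape of a comparator like ltstr: order C strings by content.
// Without it, a map keyed on const char* would compare pointer values.
struct LtStr {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

// Hypothetical two-way table: spelling -> ID through a map, ID -> spelling
// through a vector indexed by ID. The caller keeps the strings alive.
struct ToyTermTable {
    std::map<const char*, int, LtStr> toID;
    std::vector<const char*> toSpelling;

    // Return the existing ID for a spelling, or assign the next free one.
    int add(const char* spelling) {
        std::map<const char*, int, LtStr>::const_iterator it =
            toID.find(spelling);
        if (it != toID.end()) return it->second;
        int id = static_cast<int>(toSpelling.size());
        toSpelling.push_back(spelling);
        toID[spelling] = id;
        return id;
    }

    // Look up a spelling; -1 is this sketch's own "not found" marker.
    int id(const char* spelling) const {
        std::map<const char*, int, LtStr>::const_iterator it =
            toID.find(spelling);
        return it == toID.end() ? -1 : it->second;
    }

    // Reverse lookup from ID to spelling.
    const char* spelling(int id) const {
        assert(id >= 0 && id < static_cast<int>(toSpelling.size()));
        return toSpelling[id];
    }
};
```

Two separate buffers holding the same characters map to the same ID here, which is the behavior a content-based comparator provides.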

Constructor & Destructor Documentation

InvIndex::InvIndex ( )
 

InvIndex::InvIndex ( const char * indexName )
 

InvIndex::~InvIndex ( )
 


Member Function Documentation

int InvIndex::docCount ( int termID ) [virtual]
 

Number of documents containing the given term.

Reimplemented from Index.

int InvIndex::docCount ( ) [inline, virtual]
 

Total count (i.e., number) of documents in collection.

Reimplemented from Index.

bool InvIndex::docIDs ( ) [protected]
 

Read in the map from docIDs to document spellings.

DocInfoList * InvIndex::docInfoList ( int termID ) [virtual]
 

Document entries in a term index.

See also:
DocList, InvFPDocList.

Reimplemented from Index.

Reimplemented in InvFPIndex.

int InvIndex::docLength ( DOCID_T docID ) const
 

Total count of terms in a document, including stop words.

float InvIndex::docLengthAvg ( ) [virtual]
 

Average document length.

Reimplemented from Index.

int InvIndex::docLengthCounted ( int docID )
 

Total count of terms in given document, not including stop words.

const char * InvIndex::docManager ( int docID ) [virtual]
 

A string identifier for the document manager, used to get at the source of the document with this docID.

Reimplemented from Index.

bool InvIndex::docMgrIDs ( ) [protected]
 

Read in the map of document manager internal and external IDs.

const char * InvIndex::document ( int docID ) [virtual]
 

Convert a docID to its spelling.

Reimplemented from Index.

int InvIndex::document ( const char * docIDStr ) [virtual]
 

Convert a document spelling to a docID.

Reimplemented from Index.

bool InvIndex::dtFileIDs ( ) [protected]
 

Read in the dt index filenames map.

bool InvIndex::dtLookup ( ) [protected]
 

Read in the dt index lookup table of format ver1.9 (and up?).

bool InvIndex::dtLookup_ver1 ( ) [protected]
 

Read in the dt index lookup table of a format older than ver1.9.

bool InvIndex::fullToc ( const char * fileName ) [protected]
 

Read in the entire TOC.

bool InvIndex::indexLookup ( ) [protected]
 

Read in the index lookup table.

bool InvIndex::invFileIDs ( ) [protected]
 

Read in the inverted index filenames map.

bool InvIndex::open ( const char * indexName ) [virtual]
 

Open a previously created index with the given prefix; return true if it was opened successfully.

Reimplemented from Index.

void InvIndex::setMesgStream ( ostream * lemStream )
 

Set the message stream.

const char * InvIndex::term ( int termID ) [virtual]
 

Convert a termID to its spelling.

Reimplemented from Index.

int InvIndex::term ( const char * word ) [virtual]
 

Convert a term spelling to a termID.

Reimplemented from Index.

int InvIndex::termCount ( ) const [inline, virtual]
 

Total count of all terms in the collection.

Reimplemented from Index.

int InvIndex::termCount ( int termID ) const [virtual]
 

Total count of a term in the collection.

Reimplemented from Index.

int InvIndex::termCountUnique ( ) [inline, virtual]
 

Total count of unique terms in collection.

Reimplemented from Index.

bool InvIndex::termIDs ( ) [protected]
 

Read in the map from termIDs to term spellings.

TermInfoList * InvIndex::termInfoList ( int docID ) [virtual]
 

Word entries in a document index (bag of words).

See also:
TermList.

Reimplemented from Index.

Reimplemented in InvFPIndex.


Member Data Documentation

float InvIndex::aveDocLen [protected]
 

int * InvIndex::counts [protected]
 

vector< char *> InvIndex::docmgrs [protected]
 

EXDOCID_T * InvIndex::docnames [protected]
 

map< EXDOCID_T,DOCID_T,ltstr > InvIndex::doctable [protected]
 

char ** InvIndex::dtfiles [protected]
 

int InvIndex::dtloaded [protected]
 

dt_entry * InvIndex::dtlookup [protected]
 

char ** InvIndex::invfiles [protected]
 

inv_entry * InvIndex::lookup [protected]
 

ostream * InvIndex::msgstream [protected]
 

char ** InvIndex::names [protected]
 

TERM_T * InvIndex::terms [protected]
 

map< TERM_T,TERMID_T,ltstr > InvIndex::termtable [protected]
 


The documentation for this class was generated from the following files:
Generated at Fri Jul 26 18:27:03 2002 for LEMUR by doxygen 1.2.4, written by Dimitri van Heesch, © 1997-2000