
Index Class Reference

Abstract class for indexed document collection.

#include <Index.hpp>

Inheritance diagram for Index:

BasicIndex, IndexWithCat, InvIndex, KeyfileIncIndex, BasicIndexWithCat, InvFPIndex

Public Methods

virtual ~Index ()
virtual TermInfoList * termInfoListSeq (int docID) const
Open index
virtual bool open (const string &indexName)=0
 Open a previously created index; returns true if opened successfully. indexName should be the full name of the table-of-contents file for the index, e.g., "index.bsc" for an index built with the basic indexer.

Spelling and index conversion
virtual int term (const string &word) const=0
 Convert a term spelling to a termID; returns 0 if the term is out of vocabulary. Valid term IDs start at 1.

virtual const string term (int termID) const=0
 Convert a valid termID to its spelling.

virtual int document (const string &docIDStr) const=0
 Convert a document ID spelling to a docID; returns 0 if the document is unknown. Valid document IDs start at 1.

virtual const string document (int docID) const=0
 Convert a valid docID to its spelling.

virtual const DocumentManager * docManager (int docID) const
virtual const string termLexiconID () const
 Return a string ID for the term lexicon (usually the file name of the lexicon).

Summary counts
virtual int docCount () const=0
 Total count (i.e., number) of documents in collection.

virtual int termCountUnique () const=0
 Total count of unique terms in collection, i.e., the term vocabulary size.

virtual int termCount (int termID) const=0
 Total count of occurrences of a term in the collection.

virtual int termCount () const=0
 Total count of occurrences of all terms in the collection.

virtual float docLengthAvg () const=0
 Average document length.

virtual int docCount (int termID) const=0
 Number of documents containing a given term.

virtual int docLength (int docID) const=0
 Total count of terms in a document (the document length).

Index entry access
virtual DocInfoList * docInfoList (int termID) const=0
 Returns a new instance of DocInfoList representing the document entries for a term in the index; the caller must delete the instance later.
See also:
DocInfoList


virtual TermInfoList * termInfoList (int docID) const=0
 Returns a new instance of TermInfoList representing the word entries of a document in the index; the caller must delete the instance later.
See also:
TermInfoList



Detailed Description

Abstract Class for indexed document collection.

This is an abstract class that provides a uniform interface for access to an indexed document collection. The following is an example of using it.



Index &myIndex = ...; // bind to an instance of a concrete subclass, e.g., BasicIndex

myIndex.open("index-file");


int t1;
... 

// now fetch doc info list for term t1
// this returns a dynamic instance, so you'll need to delete it
DocInfoList *docList = myIndex.docInfoList(t1);

docList->startIteration();

DocInfo *entry;
while (docList->hasMore()) {
  entry = docList->nextEntry(); 
  // this returns a pointer to *static* memory, so don't delete entry!
  
  cout << "entry doc id: "<< entry->docID() <<endl;
  cout << "entry term count: "<< entry->termCount() << endl;
}

delete docList;
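
The ownership rule in the example above (the caller deletes the list, but never the entries) can also be expressed with a smart pointer. The following is only a sketch, assuming a C++11 compiler: DocInfo and DocInfoList here are hypothetical stand-ins written so the snippet compiles on its own, and makeDocInfoList() is a made-up helper that plays the role of Index::docInfoList().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-ins for the Lemur types, present only so the sketch is self-contained.
class DocInfo {
public:
  DocInfo(int id, int count) : id_(id), count_(count) {}
  int docID() const { return id_; }
  int termCount() const { return count_; }
private:
  int id_;
  int count_;
};

class DocInfoList {
public:
  explicit DocInfoList(const std::vector<DocInfo> &entries)
      : entries_(entries), pos_(0) {}
  void startIteration() { pos_ = 0; }
  bool hasMore() const { return pos_ < entries_.size(); }
  // Returns a pointer owned by the list; callers must not delete it.
  DocInfo *nextEntry() { return &entries_[pos_++]; }
private:
  std::vector<DocInfo> entries_;
  std::size_t pos_;
};

// Made-up helper mimicking Index::docInfoList: returns a heap-allocated
// list that the caller owns.
DocInfoList *makeDocInfoList() {
  std::vector<DocInfo> entries;
  entries.push_back(DocInfo(1, 3));
  entries.push_back(DocInfo(4, 2));
  return new DocInfoList(entries);
}

// Sums the term counts over all entries. The unique_ptr deletes the list
// when the function returns, even on an early exit.
int totalTermCount() {
  std::unique_ptr<DocInfoList> docList(makeDocInfoList());
  int total = 0;
  docList->startIteration();
  while (docList->hasMore()) {
    DocInfo *entry = docList->nextEntry();  // not owned: do not delete
    total += entry->termCount();
  }
  return total;
}
```

With a real index, the same pattern would wrap the pointer returned by docInfoList(t1) in place of the made-up helper.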


Constructor & Destructor Documentation

virtual Index::~Index () [inline, virtual]
 


Member Function Documentation

virtual int Index::docCount (int termID) const [pure virtual]
 

Number of documents containing a given term.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.

virtual int Index::docCount () const [pure virtual]
 

Total count (i.e., number) of documents in collection.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.

virtual DocInfoList* Index::docInfoList (int termID) const [pure virtual]
 

Returns a new instance of DocInfoList representing the document entries for a term in the index; the caller must delete the instance later.

See also:
DocInfoList

Implemented in BasicIndex, BasicIndexWithCat, InvFPIndex, InvIndex, and KeyfileIncIndex.

virtual int Index::docLength (int docID) const [pure virtual]
 

Total count of terms in a document (the document length).

Implemented in BasicIndex and BasicIndexWithCat.

virtual float Index::docLengthAvg () const [pure virtual]
 

Average document length.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.
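
As an illustration of how the summary counts relate to one another, here is a minimal sketch. It assumes that the average document length is the total term count divided by the number of documents, which this page does not state explicitly. TinyCollection is a hypothetical stand-in, not a Lemur class.

```cpp
#include <cassert>
#include <vector>

// Hypothetical in-memory collection; lengths[i] is the length of docID i + 1,
// following the convention that valid IDs start at 1.
struct TinyCollection {
  std::vector<int> lengths;

  int docCount() const { return static_cast<int>(lengths.size()); }

  int docLength(int docID) const {
    return lengths.at(static_cast<std::size_t>(docID - 1));
  }

  // Total count of all terms: the sum of all document lengths.
  int termCount() const {
    int total = 0;
    for (int id = 1; id <= docCount(); ++id) total += docLength(id);
    return total;
  }

  // Assumed definition: total term count divided by document count.
  float docLengthAvg() const {
    if (docCount() == 0) return 0.0f;
    return static_cast<float>(termCount()) / static_cast<float>(docCount());
  }
};
```

A concrete index may precompute or store this value differently; the sketch only shows the arithmetic.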

virtual const DocumentManager* Index::docManager (int docID) const [inline, virtual]
 

Returns the document manager used to get at the source of the document with this document ID.

Reimplemented in InvIndex and KeyfileIncIndex.

virtual const string Index::document (int docID) const [pure virtual]
 

Convert a valid docID to its spelling.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.

virtual int Index::document (const string & docIDStr) const [pure virtual]
 

Convert a document ID spelling to a docID; returns 0 if the document is unknown. Valid document IDs start at 1.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.

virtual bool Index::open (const string & indexName) [pure virtual]
 

Open a previously created index; returns true if opened successfully. indexName should be the full name of the table-of-contents file for the index, e.g., "index.bsc" for an index built with the basic indexer.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.

virtual const string Index::term (int termID) const [pure virtual]
 

Convert a valid termID to its spelling.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.

virtual int Index::term (const string & word) const [pure virtual]
 

Convert a term spelling to a termID; returns 0 if the term is out of vocabulary. Valid term IDs start at 1.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.
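
The ID convention used by the two term() overloads (0 means out of vocabulary, valid IDs start at 1) can be sketched as follows. TinyLexicon and roundTrip() are hypothetical names introduced for illustration only. They mirror the documented convention and do not reflect how any Lemur index stores its vocabulary.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory lexicon following the same ID convention as Index::term.
class TinyLexicon {
public:
  // Adds a word if unseen and returns its ID; IDs are assigned from 1 upward.
  int add(const std::string &word) {
    auto it = ids_.find(word);
    if (it != ids_.end()) return it->second;
    words_.push_back(word);
    int id = static_cast<int>(words_.size());
    ids_[word] = id;
    return id;
  }

  // Spelling -> ID; 0 signals an out-of-vocabulary word.
  int term(const std::string &word) const {
    auto it = ids_.find(word);
    return it == ids_.end() ? 0 : it->second;
  }

  // ID -> spelling; only meaningful for a valid ID (1..vocabulary size).
  const std::string term(int termID) const {
    return words_.at(static_cast<std::size_t>(termID - 1));
  }

private:
  std::map<std::string, int> ids_;
  std::vector<std::string> words_;
};

// Round-trips a word through the lexicon; returns an empty string for an
// out-of-vocabulary word instead of looking up the invalid ID 0.
std::string roundTrip(const TinyLexicon &lex, const std::string &word) {
  int id = lex.term(word);
  if (id == 0) return std::string();
  return lex.term(id);
}
```

The point of the check in roundTrip() is that ID 0 is a sentinel, so it should be tested before calling the ID-to-spelling overload.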

virtual int Index::termCount () const [pure virtual]
 

Total count of occurrences of all terms in the collection.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.

virtual int Index::termCount (int termID) const [pure virtual]
 

Total count of occurrences of a term in the collection.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.

virtual int Index::termCountUnique () const [pure virtual]
 

Total count of unique terms in collection, i.e., the term vocabulary size.

Implemented in BasicIndex, BasicIndexWithCat, InvIndex, and KeyfileIncIndex.

virtual TermInfoList* Index::termInfoList (int docID) const [pure virtual]
 

Returns a new instance of TermInfoList representing the word entries of a document in the index; the caller must delete the instance later.

See also:
TermInfoList

Implemented in BasicIndex, BasicIndexWithCat, InvFPIndex, InvIndex, and KeyfileIncIndex.

virtual TermInfoList* Index::termInfoListSeq (int docID) const [inline, virtual]
 

Reimplemented in InvFPIndex and KeyfileIncIndex.

virtual const string Index::termLexiconID () const [inline, virtual]
 

Return a string ID for the term lexicon (usually the file name of the lexicon).

This function should be pure virtual; the default implementation is just for convenience. Appropriate implementation to be done in the future.

Reimplemented in BasicIndex.


The documentation for this class was generated from the following file:
Generated on Fri Jul 2 16:25:42 2004 for Lemur Toolkit by doxygen 1.2.18