
BasicIndex Class Reference

Basic Indexer (with arbitrary compressor).

#include <BasicIndex.hpp>

Inherits Index.

Public Methods

 BasicIndex ()
 constructor (used when opening an index)

 BasicIndex (Compress *pc)
 constructor (used when building an index)

virtual ~BasicIndex ()
virtual bool open (const string &indexName)
 Open a previously created index; returns true if it was opened successfully.

void build (DocStream *collectionStream, const string &file, const string &outputPrefix, int totalDocs=0x1000000, int maxMemory=0x4000000, int minimumCount=1, int maxVocSize=2000000)
Spelling and index conversion
virtual int term (const string &word) const
 Convert a term spelling to a termID.

virtual const string term (int termID) const
 Convert a termID to its spelling.

virtual int document (const string &docIDStr) const
 Convert a document ID string (its spelling) to a docID.

virtual const string document (int docID) const
 Convert a docID to its spelling.

virtual const string termLexiconID () const
 Return the term lexicon ID.

Summary counts
virtual int docCount () const
 Total number of documents in the collection.

virtual int termCountUnique () const
 Number of unique terms in the collection.

virtual int termCount (int termID) const
 Total number of occurrences of a term in the collection.

virtual int termCount () const
 Total number of term occurrences in the collection.

virtual float docLengthAvg () const
 Average document length.

virtual int docCount (int termID) const
 Number of documents that contain a given term.

virtual int docLength (int docID) const
 Total number of term occurrences in a document (the document length).

Index entry access
virtual DocInfoList * docInfoList (int termID) const
 Document entries for a term in the term index. The caller should release the returned memory.
See also:
DocList


virtual TermInfoList * termInfoList (int docID) const
 Term entries for a document in the document index. The caller should release the returned memory.
See also:
TermList



Detailed Description

Basic Indexer (with arbitrary compressor).

BasicIndex is a basic implementation of Index. It creates and manages two indices (term->doc and doc->term) as well as a term lexicon and a document ID lexicon. The application can supply any compressor when building an index, via the BasicIndex (Compress *pc) constructor. See Index for an example of use.
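
The structure described above (two lexicons plus a term->doc and a doc->term index) can be illustrated with a small in-memory sketch. ToyIndex and addDocument are hypothetical names and are not part of the Lemur Toolkit. Reserving ID 0 for unknown spellings is an assumption of this sketch, and neither build nor the compressor is modeled. Only the accessor names and signatures follow the member list on this page.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// ToyIndex is a hypothetical stand-in, NOT the Lemur implementation.
// It keeps a term lexicon, a document ID lexicon, a term->doc index and
// a doc->term index in memory. ID 0 means "unknown" (sketch assumption).
class ToyIndex {
public:
    ToyIndex() : termNames_(1, "[unknown]"), docNames_(1, "[unknown]") {}

    // Hypothetical helper: index one document given as a list of words.
    void addDocument(const std::string& docIDStr,
                     const std::vector<std::string>& words) {
        int docID = intern(docIDStr, docNames_, docIDs_);
        for (const std::string& w : words) {
            int termID = intern(w, termNames_, termIDs_);
            ++termToDoc_[termID][docID];  // term->doc index
            ++docToTerm_[docID][termID];  // doc->term index
            ++totalTerms_;
        }
    }

    // Spelling and index conversion.
    int term(const std::string& word) const { return lookup(word, termIDs_); }
    std::string term(int termID) const {
        assert(termID >= 0 && termID < static_cast<int>(termNames_.size()));
        return termNames_[termID];
    }
    int document(const std::string& s) const { return lookup(s, docIDs_); }
    std::string document(int docID) const { return docNames_.at(docID); }

    // Summary counts.
    int docCount() const { return static_cast<int>(docNames_.size()) - 1; }
    int termCountUnique() const { return static_cast<int>(termNames_.size()) - 1; }
    int termCount() const { return totalTerms_; }
    int termCount(int termID) const { return sum(termToDoc_, termID); }
    int docLength(int docID) const { return sum(docToTerm_, docID); }
    int docCount(int termID) const {
        auto it = termToDoc_.find(termID);
        return it == termToDoc_.end() ? 0 : static_cast<int>(it->second.size());
    }

private:
    typedef std::map<int, std::map<int, int> > Postings;  // id -> (id -> count)

    // Return the ID of s, assigning the next free ID if s is new.
    static int intern(const std::string& s, std::vector<std::string>& names,
                      std::map<std::string, int>& ids) {
        auto it = ids.find(s);
        if (it != ids.end()) return it->second;
        int id = static_cast<int>(names.size());
        names.push_back(s);
        ids[s] = id;
        return id;
    }
    // Return the ID of s, or 0 if s is not in the lexicon.
    static int lookup(const std::string& s, const std::map<std::string, int>& ids) {
        auto it = ids.find(s);
        return it == ids.end() ? 0 : it->second;
    }
    // Sum all counts stored under key.
    static int sum(const Postings& m, int key) {
        auto it = m.find(key);
        if (it == m.end()) return 0;
        int total = 0;
        for (const auto& kv : it->second) total += kv.second;
        return total;
    }

    std::vector<std::string> termNames_, docNames_;  // ID -> spelling
    std::map<std::string, int> termIDs_, docIDs_;    // spelling -> ID
    Postings termToDoc_, docToTerm_;
    int totalTerms_ = 0;
};
```

Keeping both directions means docCount (termID) and docLength (docID) are each answered from a single lookup, at the cost of storing every occurrence count twice.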


Constructor & Destructor Documentation

BasicIndex::BasicIndex ()
 

constructor (used when opening an index)

BasicIndex::BasicIndex (Compress * pc)
 

constructor (used when building an index)

BasicIndex::~BasicIndex () [virtual]
 


Member Function Documentation

void BasicIndex::build (DocStream * collectionStream,
    const string & file,
    const string & outputPrefix,
    int totalDocs = 0x1000000,
    int maxMemory = 0x4000000,
    int minimumCount = 1,
    int maxVocSize = 2000000)
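
Two of the defaults above are written in hexadecimal, which hides their size. The sketch below restates them in decimal. The constant names and checkBuildDefaults are hypothetical and exist only for this illustration. This page does not state the unit of maxMemory, so none is assumed here.

```cpp
#include <cassert>

// Default values copied from the documented signature of build().
// The names are hypothetical; only the values come from this page.
const int kTotalDocs    = 0x1000000;  // 2^24 = 16,777,216
const int kMaxMemory    = 0x4000000;  // 2^26 = 67,108,864 (unit not documented)
const int kMinimumCount = 1;
const int kMaxVocSize   = 2000000;    // already decimal in the signature

// Compile-time confirmation of the hexadecimal-to-decimal conversion.
static_assert(kTotalDocs == 16777216, "0x1000000 is 2^24");
static_assert(kMaxMemory == 67108864, "0x4000000 is 2^26");

// Runtime version of the same checks, callable from a test.
inline void checkBuildDefaults() {
    assert(kTotalDocs == (1 << 24));
    assert(kMaxMemory == (1 << 26));
}
```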
 

int BasicIndex::docCount (int termID) const [virtual]

Number of documents that contain a given term.

Implements Index.

virtual int BasicIndex::docCount () const [inline, virtual]

Total number of documents in the collection.

Implements Index.

DocInfoList * BasicIndex::docInfoList (int termID) const [virtual]

Document entries for a term in the term index. The caller should release the returned memory.

See also:
DocList

Implements Index.
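
Because the caller owns the returned list, one way to avoid leaks is to wrap the raw pointer in a smart pointer as soon as it is returned. The sketch below assumes that "release" means a plain delete, which this page does not confirm. The DocInfoList and StubIndex types here are hypothetical stand-ins so that the example compiles without the Lemur headers.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the real DocInfoList; it only counts how many
// instances are currently allocated so the cleanup can be observed.
struct DocInfoList {
    static int live;
    DocInfoList() { ++live; }
    virtual ~DocInfoList() { --live; }
};
int DocInfoList::live = 0;

// Hypothetical stand-in for BasicIndex. It mirrors the documented
// signature: the result is a heap object owned by the caller.
struct StubIndex {
    DocInfoList* docInfoList(int /*termID*/) const { return new DocInfoList(); }
};

// Take ownership immediately so the list is deleted on every exit path.
// Returns the number of live lists while the result is still held.
int countLiveWhileHeld(const StubIndex& index, int termID) {
    std::unique_ptr<DocInfoList> docs(index.docInfoList(termID));
    assert(docs != nullptr);
    return DocInfoList::live;
}  // docs goes out of scope here and deletes the list
```

The same pattern would apply to the pointer returned by termInfoList.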

virtual int BasicIndex::docLength (int docID) const [inline, virtual]

Total number of term occurrences in a document (the document length).

Implements Index.

virtual float BasicIndex::docLengthAvg () const [inline, virtual]
 

Average document length.

Implements Index.

virtual const string BasicIndex::document (int docID) const [inline, virtual]
 

Convert a docID to its spelling.

Implements Index.

virtual int BasicIndex::document (const string & docIDStr) const [inline, virtual]

Convert a document ID string (its spelling) to a docID.

Implements Index.

bool BasicIndex::open (const string & indexName) [virtual]

Open a previously created index; returns true if it was opened successfully.

Implements Index.

virtual const string BasicIndex::term (int termID) const [inline, virtual]
 

Convert a termID to its spelling.

Implements Index.

virtual int BasicIndex::term (const string & word) const [inline, virtual]
 

Convert a term spelling to a termID.

Implements Index.

virtual int BasicIndex::termCount () const [inline, virtual]

Total number of term occurrences in the collection.

Implements Index.

virtual int BasicIndex::termCount (int termID) const [inline, virtual]

Total number of occurrences of a term in the collection.

Implements Index.

virtual int BasicIndex::termCountUnique () const [inline, virtual]

Number of unique terms in the collection.

Implements Index.

TermInfoList * BasicIndex::termInfoList (int docID) const [virtual]

Term entries for a document in the document index. The caller should release the returned memory.

See also:
TermList

Implements Index.

virtual const string BasicIndex::termLexiconID () const [inline, virtual]

Return the term lexicon ID.

Reimplemented from Index.


The documentation for this class was generated from the following files:
Generated on Fri Jul 2 16:25:40 2004 for Lemur Toolkit by doxygen 1.2.18