
BasicIndex Class Reference

Basic Indexer (with arbitrary compressor).

#include <BasicIndex.hpp>

Inherits Index.

Public Methods

 BasicIndex ()
 constructor (used when opening an index)

 BasicIndex (Compress *pc)
 constructor (used when building an index)

virtual ~BasicIndex ()
virtual bool open (const char *indexName)
 Open previously created Index, return true if opened successfully.

void build (DocStream *collectionStream, const char *file, const char *outputPrefix, int totalDocs=0x1000000, int maxMemory=0x4000000, int minimumCount=1, int maxVocSize=2000000)

Spelling and index conversion
virtual int term (const char *word)
 Convert a term spelling to a termID.

virtual const char * term (int termID)
 Convert a termID to its spelling.

virtual int document (const char *docIDStr)
 Convert a document ID spelling to a docID.

virtual const char * document (int docID)
 Convert a docID to its spelling.

virtual const char * termLexiconID ()
 Return the term lexicon ID.

Summary counts
virtual int docCount ()
 Total count (i.e., number) of documents in collection.

virtual int termCountUnique ()
 Total count of unique terms in collection.

virtual int termCount (int termID) const
 Total number of occurrences of a term in the collection.

virtual int termCount () const
 Total number of term occurrences in the collection.

virtual float docLengthAvg ()
 Average document length.

virtual int docCount (int termID)
 Number of documents that contain a given term.

virtual int docLength (int docID) const
 Length of a document, i.e., the total number of term occurrences in it.
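
The summary counts listed above can be illustrated with a small in-memory sketch. `ToyCounts` is a hypothetical stand-in, not the BasicIndex implementation: it borrows the member names from this page, assumes 0-based document IDs, and assumes that the average document length is the total term count divided by the number of documents.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

// Toy collection statistics over documents given as lists of term IDs.
// Hypothetical stand-in; only the member names are taken from this page.
class ToyCounts {
public:
    explicit ToyCounts(const std::vector<std::vector<int>> &docs) : docs_(docs) {
        for (const auto &doc : docs_) {
            total_ += static_cast<int>(doc.size());
            // Every occurrence counts toward the collection frequency.
            for (int t : doc) ++termFreq_[t];
            // Each distinct term counts once per document.
            for (int t : std::set<int>(doc.begin(), doc.end())) ++docFreq_[t];
        }
    }
    int docCount() const { return static_cast<int>(docs_.size()); }
    int termCountUnique() const { return static_cast<int>(termFreq_.size()); }
    int termCount() const { return total_; }
    int termCount(int termID) const { return lookup(termFreq_, termID); }
    int docCount(int termID) const { return lookup(docFreq_, termID); }
    int docLength(int docID) const {
        assert(docID >= 0 && docID < docCount());
        return static_cast<int>(docs_[docID].size());
    }
    // Assumption: average length = total term occurrences / number of documents.
    float docLengthAvg() const {
        return docs_.empty() ? 0.0f : static_cast<float>(total_) / docCount();
    }

private:
    // Returns 0 for a term ID that never occurs.
    static int lookup(const std::map<int, int> &m, int key) {
        auto it = m.find(key);
        return it == m.end() ? 0 : it->second;
    }
    std::vector<std::vector<int>> docs_;
    std::map<int, int> termFreq_;
    std::map<int, int> docFreq_;
    int total_ = 0;
};
```

For a collection of three documents with term IDs {1, 2, 2}, {2, 3} and {1}, this sketch reports a total of six term occurrences, three of them for term 2, which appears in two documents.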

Index entry access
virtual DocInfoList * docInfoList (int termID)
 Doc entries in a term index; the caller should release the memory. See also: DocList.

virtual TermInfoList * termInfoList (int docID)
 Word entries in a document index; the caller should release the memory. See also: TermList.



Detailed Description

Basic Indexer (with arbitrary compressor).

BasicIndex is a basic implementation of Index. It creates and manages two indices (term->doc and doc->term) as well as a term lexicon and a document ID lexicon. The application can supply any compressor: it is passed to the constructor used when building an index, and build() then uses it. See Index for an example of use.


Constructor & Destructor Documentation

BasicIndex::BasicIndex ()

constructor (used when opening an index)

BasicIndex::BasicIndex (Compress * pc)

constructor (used when building an index)

BasicIndex::~BasicIndex () [virtual]
 


Member Function Documentation

void BasicIndex::build (DocStream * collectionStream,
    const char * file,
    const char * outputPrefix,
    int totalDocs = 0x1000000,
    int maxMemory = 0x4000000,
    int minimumCount = 1,
    int maxVocSize = 2000000)
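
The defaults for totalDocs and maxMemory are written in hexadecimal. The sketch below only spells them out in decimal. The constant names and the `toMiB` helper are hypothetical, and the unit of maxMemory is not stated on this page; bytes are assumed here purely for illustration.

```cpp
// Hypothetical names; the values are the default arguments of build().
constexpr int kDefaultTotalDocs  = 0x1000000;  // 2^24
constexpr int kDefaultMaxMemory  = 0x4000000;  // 2^26
constexpr int kDefaultMinCount   = 1;
constexpr int kDefaultMaxVocSize = 2000000;

// Assumption: maxMemory is measured in bytes.
constexpr int toMiB(int bytes) {
    return bytes / (1024 * 1024);
}

static_assert(kDefaultTotalDocs == 16777216, "0x1000000 is 2^24");
static_assert(kDefaultMaxMemory == 67108864, "0x4000000 is 2^26");
static_assert(toMiB(kDefaultMaxMemory) == 64, "2^26 bytes is 64 MiB");
```

Under that assumption the default memory budget would be 64 MiB, and the default document capacity is about 16.8 million.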
 

int BasicIndex::docCount (int termID) [virtual]

Number of documents that contain a given term.

Implements Index.

virtual int BasicIndex::docCount () [inline, virtual]
 

Total count (i.e., number) of documents in collection.

Implements Index.

DocInfoList * BasicIndex::docInfoList (int termID) [virtual]

Doc entries in a term index; the caller should release the memory.

See also: DocList

Implements Index.
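
Because the caller is responsible for releasing the returned list, one way to avoid leaks is to hand the raw pointer to a smart pointer straight away. The sketch below uses stand-in types rather than the real classes: the `DocInfoList` struct, the free function `docInfoList`, and the `live` counter are all hypothetical, and it assumes the list is released with `delete`, which this page does not state.

```cpp
#include <cassert>
#include <memory>

// Stand-in for the real DocInfoList, whose interface is not shown on this page.
struct DocInfoList {
    static int live;                 // number of lists currently allocated
    DocInfoList() { ++live; }
    virtual ~DocInfoList() { --live; }
};
int DocInfoList::live = 0;

// Stand-in for BasicIndex::docInfoList: returns a heap-allocated list
// that the caller owns (assumption: it is released with delete).
DocInfoList *docInfoList(int termID) {
    assert(termID >= 0);
    return new DocInfoList();
}

// Takes ownership immediately so the list is deleted on every exit path.
int liveListsWhileInUse(int termID) {
    std::unique_ptr<DocInfoList> entries(docInfoList(termID));
    return DocInfoList::live;        // the list is still alive here
}                                    // entries is destroyed, the list is deleted
```

The same pattern would apply to termInfoList, which also leaves the release to the caller.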

virtual int BasicIndex::docLength (int docID) const [inline, virtual]

Length of a document, i.e., the total number of term occurrences in it.

Implements Index.

virtual float BasicIndex::docLengthAvg () [inline, virtual]
 

Average document length.

Implements Index.

virtual const char* BasicIndex::document (int docID) [inline, virtual]
 

Convert a docID to its spelling.

Implements Index.

virtual int BasicIndex::document (const char * docIDStr) [inline, virtual]

Convert a document ID spelling to a docID.

Implements Index.

bool BasicIndex::open (const char * indexName) [virtual]
 

Open previously created Index, return true if opened successfully.

Implements Index.

virtual const char* BasicIndex::term (int termID) [inline, virtual]
 

Convert a termID to its spelling.

Implements Index.

virtual int BasicIndex::term (const char * word) [inline, virtual]
 

Convert a term spelling to a termID.

Implements Index.
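
The two term() overloads form a two-way mapping between spellings and integer IDs. A minimal sketch of such a lexicon follows. `ToyLexicon` is hypothetical and not the BasicIndex implementation; in particular, the `add` method, the 0-based IDs, and the -1 returned for an unknown spelling are assumptions of this sketch, since this page does not say how unknown words are reported.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Toy two-way lexicon; only the term() overload names mirror this page.
class ToyLexicon {
public:
    // Returns the ID of an existing spelling, or assigns the next free ID.
    int add(const std::string &word) {
        auto it = ids_.find(word);
        if (it != ids_.end()) return it->second;
        int id = static_cast<int>(words_.size());
        words_.push_back(word);
        ids_.emplace(word, id);
        return id;
    }
    // Spelling -> ID; -1 for an unknown spelling (assumption of this sketch).
    int term(const char *word) const {
        auto it = ids_.find(word);
        return it == ids_.end() ? -1 : it->second;
    }
    // ID -> spelling; the pointer stays valid until the next add().
    const char *term(int termID) const {
        assert(termID >= 0 && termID < static_cast<int>(words_.size()));
        return words_[termID].c_str();
    }

private:
    std::vector<std::string> words_;            // ID -> spelling
    std::unordered_map<std::string, int> ids_;  // spelling -> ID
};
```

Overload resolution picks the conversion direction from the argument type, so a round trip such as `lex.term(lex.term("index"))` yields the original spelling for any word that was added.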

virtual int BasicIndex::termCount () const [inline, virtual]

Total number of term occurrences in the collection.

Implements Index.

virtual int BasicIndex::termCount (int termID) const [inline, virtual]

Total number of occurrences of a term in the collection.

Implements Index.

virtual int BasicIndex::termCountUnique () [inline, virtual]
 

Total count of unique terms in collection.

Implements Index.

TermInfoList * BasicIndex::termInfoList (int docID) [virtual]

Word entries in a document index; the caller should release the memory.

See also: TermList

Implements Index.

virtual const char* BasicIndex::termLexiconID () [inline, virtual]

Return the term lexicon ID.

Reimplemented from Index.


The documentation for this class was generated from the following files:
Generated on Tue Nov 25 11:27:00 2003 for Lemur Toolkit by doxygen 1.2.18