
LEMUR Compound List

Here are the classes, structs, unions and interfaces with brief descriptions:

AbsoluteDiscountDocModel (Absolute discount smoothing)
Array
ArrayAccumulator (Array Score Accumulator)
ArrayCounter (Counts stored in an array; when the element type is int, float/double typed counts are converted to integers)
ArrayQueryRep (Representation of a query with a double array)
BasicDocInfo (BasicDocInfo)
BasicDocInfoList (Implementation of DocInfoList for BasicIndex)
BasicDocStream (A DocStream handler of a stream with the basic lemur format)
BasicIndex (Basic Indexer (with arbitrary compressor))
BasicIndexWithCat (A basic implementation of IndexWithCat based on two Index's)
BasicTermInfo (BasicTermInfo)
BasicTermInfoList (Basic TermInfoList)
BasicTokenDoc (Doc representation for BasicDocStream)
BasicTokenTerm (Term representation for the BasicDocStream)
clink
CollectionProps (Abstract class for a set of collection properties)
Compress (Abstract Compressor)
Counter (Abstract Counter class)
CSet
ddinf_link
ddlink
ddpar_link
DirichletPriorDocModel (Bayesian smoothing with Dirichlet prior)
DirichletUnigramLM (Dirichlet prior smoothing)
DocInfo (Abstract Representation of Doc Information Entry)
DocInfoList (Abstract Interface of Doc Information List)
SimpleKLParameter::DocSmoothParam
DocStream (Abstract interface for a collection of documents)
Document (Abstract document class)
DocumentProps (Class for set of Document Properties)
DocumentRep (Representation of documents in a collection for efficient inverted index scoring)
DocUnigramCounter (Counter of unigrams in documents)
dt_entry
Exception (Default Exception class)
FastList
TFIDFParameter::FeedbackParam
OkapiParameter::FeedbackParam
FLL
FreqCount (Record with frequency information to be stored in a hash table)
FreqVector (Abstract class that represents a frequency vector accessible through an integer key)
GammaCompress (Gamma compressor)
HashFreqVector (Representation of a frequency vector with a hash table)
Index (Abstract Class for indexed document collection)
IndexCount (Class for collecting counts of an index)
IndexedReal (A list of indexed real numbers (similar to IndexProb))
IndexedRealVector
IndexManager (A group of index management functions)
IndexProb (A class for collecting probabilities for an index)
IndexReader
IndexWithCat (An abstract interface for access to an index with category information)
InterpUnigramLM (Linear interpolation smoothing)
inv_entry
InvFPDocInfo (Example Class for push method of building an index)
InvFPDocList
InvFPIndex
InvFPIndexMerge
InvFPPushIndex
InvFPTerm (Term class for InvFPIndex)
InvFPTermList
InvFPTextHandler (InvFPTextHandler builds an InvFPIndex using InvFPPushIndex. This class is a destination TextHandler)
ISet
JelinekMercerDocModel (Jelinek-Mercer interpolation)
LaplaceUnigramLM (Laplace-smoothed unigram language model)
List
LL
LocatedTerm
lt_str
ltstr
MemCache
MemList
MLUnigramLM (Maximum Likelihood Estimator)
ModifiableCounter (Modifiable counter, supports modification of counts)
Number
OkapiDocRep (Doc representation for Okapi model)
OkapiQueryRep (OkapiQueryRep carries an array to store the count of relevant docs with a term)
OkapiQueryTerm (Representation of a query term in the Okapi retrieval model; the term carries a count of the number of relevant docs containing the term)
OkapiRetMethod (The Okapi BM25 retrieval function, as described in their TREC-3 paper)
OkapiScoreFunc (The Okapi scoring function)
OneStepMarkovChain (One-step Markov chain translation model, not fully tested yet)
Parser (Provides a generic parser interface. Supports the TextHandler interface as a source (so foundDoc and foundWord have empty implementations). Also assumes that the parser uses an acronym list. If, when developing your parser, you do not use an acronym list, you can just provide an empty implementation of the setAcroList function)
PorterStemmer (Provides a wrapper to the Porter stemmer that supports the Stemmer interface, and by inheritance, the TextHandler interface)
PSet
PseudoFBDocs (Representation of a subset of feedback documents)
PushIndex (Abstract Class for push method of building an index)
Query (Abstract query)
SimpleKLParameter::QueryModelParam
QueryRep (Abstract query representation)
QueryTerm (A query term is assumed to have at least an ID and a weight)
QueryTextHandler (The QueryTextHandler is designed to help parse queries. The QueryTextHandler checks query terms against an Index and adds the uppercase form of the term to the query if it occurs more frequently than the parsed form passed to the QueryTextHandler. This is to help catch query terms that are acronyms but are not capitalized in the query)
ResultEntry (Hash table entry for storing results)
ResultFile (Representation of result file)
RetrievalMethod
ReutersParser (Parses documents in TREC format. Does case folding for words that are not in the acronym list. Contraction suffixes and possessive suffixes are stripped. U.S.A., USA's, and USAs are converted to USA. Does not recognize acronyms with numbers. The following fields are parsed: text, headline, title)
ScoreAccumulator (Abstract Score Accumulator)
ScoreFunction (Abstract interface for retrieval function with a default implementation (dot product))
SimpleKLDocModel (Doc representation for simple KL divergence retrieval model)
SimpleKLQueryModel (Query model representation for the simple KL divergence model)
SimpleKLRetMethod (KL Divergence retrieval model with simple document model smoothing)
SimpleKLScoreFunc (Simple KL-divergence scoring function)
SmoothedMLEstimator (Common implementation of a (smoothed) unigram LM estimated based on a counter)
Source
Stemmer (A generic interface for Stemmers. They should support the TextHandler interface)
Stopper (Provides a stopword list that can be chained with a Parser using the TextHandler class)
string
String
String_set
Target
Term (Basic term class)
TermInfo (Abstract Representation of Term Information Entry)
TermInfoList (Abstract Interface of Term Information List)
Terms
TextHandler (This class serves as an interface for classes working with the parsers. The setTextHandler function allows chaining of TextHandlers, so that information is passed from one TextHandler to the next. This is useful for chaining things like stopword lists and stemmers. A source in the chain of TextHandlers does not need to do anything in the foundDoc and foundWord functions. An example of a source is a parser. A destination in the chain of TextHandlers does not need to forward calls or store a TextHandler when the setTextHandler function is called. An example of a destination would be a class that pushes the words and documents into an InvFPPushIndex (InvFPTextHandler) or writes to file (WriterTextHandler). Classes in the middle of a chain, like Stopper or Stemmer, need to provide full functionality for all functions. When their foundDoc or foundWord is called, they may manipulate the data and then forward it by calling the foundDoc/foundWord function of their TextHandler)
TextQuery (A text query is an adaptor of Document)
TextQueryRep (Abstract representation of a text query as a sequence of weighted terms)
TextQueryRetMethod
TFIDFDocRep (Representation of a doc (as a weighted vector) in the TFIDF method)
TFIDFQueryRep (Representation of a query (as a weighted vector) in the TFIDF method)
TFIDFRetMethod (The TFIDF retrieval method with a few TF formula options)
OkapiParameter::TFParam
Timer
TokenTerm (Interface of a TokenTerm -- a term in a doc stream)
TrecParser (Parses documents in NIST's TREC format. Does case folding for words that are not in the acronym list. Contraction suffixes and possessive suffixes are stripped. U.S.A., USA's, and USAs are converted to USA. Does not recognize acronyms with numbers. The following fields are parsed: TEXT, HL, HEAD, HEADLINE, LP, TTL)
UnigramLM (Abstract Unigram Language Model class)
vector
WebParser (Parses documents in NIST's Web TREC format. Does case folding for words that are not in the acronym list. Contraction suffixes and possessive suffixes are stripped. U.S.A., USA's, and USAs are converted to USA. Does not recognize acronyms with numbers. The DOCHDR is ignored. Text in <script> tags is ignored. Text in HTML comments is ignored)
WeightedIDSet (A set of IDs with weights)
TFIDFParameter::WeightParam
WordSet (A generic class that provides a neat, easy-to-use wrapper around a hash_set<char *>)
WriterTextHandler (Outputs text in a format that can be used by RetEval (for queries) or BuildBasicIndex (for documents). This class is a destination TextHandler)
yy_buffer_state
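
The source / middle / destination roles described in the TextHandler entry can be illustrated with a minimal sketch. The `Handler` base class, `WhitespaceSource`, `StopFilter`, `Collector`, and `runChain` below are hypothetical names, and the `foundDoc`/`foundWord` signatures are assumed for illustration; they are not the LEMUR declarations.

```cpp
#include <set>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified handler interface (signatures are assumptions).
class Handler {
public:
  virtual ~Handler() = default;
  // Remember the next handler in the chain.
  virtual void setTextHandler(Handler* next) { next_ = next; }
  // Default behaviour: pass events through unchanged.
  virtual void foundDoc(const std::string& docId) { forwardDoc(docId); }
  virtual void foundWord(const std::string& word) { forwardWord(word); }

protected:
  void forwardDoc(const std::string& docId) { if (next_) next_->foundDoc(docId); }
  void forwardWord(const std::string& word) { if (next_) next_->foundWord(word); }

private:
  Handler* next_ = nullptr;
};

// Source: generates events, so its own foundDoc/foundWord do nothing.
class WhitespaceSource : public Handler {
public:
  void foundDoc(const std::string&) override {}
  void foundWord(const std::string&) override {}
  void parse(const std::string& docId, const std::string& text) {
    forwardDoc(docId);
    std::istringstream in(text);
    std::string word;
    while (in >> word) forwardWord(word);
  }
};

// Middle of the chain: may drop a word, otherwise forwards it.
class StopFilter : public Handler {
public:
  explicit StopFilter(std::set<std::string> stop) : stop_(std::move(stop)) {}
  void foundWord(const std::string& word) override {
    if (stop_.count(word) == 0) forwardWord(word);
  }

private:
  std::set<std::string> stop_;
};

// Destination: consumes events; it neither stores a next handler nor forwards.
class Collector : public Handler {
public:
  void setTextHandler(Handler*) override {}
  void foundDoc(const std::string& docId) override { docs.push_back(docId); }
  void foundWord(const std::string& word) override { words.push_back(word); }
  std::vector<std::string> docs;
  std::vector<std::string> words;
};

// Wire source -> stop filter -> collector and return the surviving words.
std::vector<std::string> runChain(const std::string& docId, const std::string& text) {
  WhitespaceSource source;
  StopFilter stopper({"the", "of"});
  Collector sink;
  source.setTextHandler(&stopper);
  stopper.setTextHandler(&sink);
  source.parse(docId, text);
  return sink.words;
}
```

Calling `runChain("d1", "the cat of doom")` in this sketch passes each token through the stop filter before it reaches the collector, so only the non-stopword tokens are stored.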

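The acronym heuristic described in the QueryTextHandler entry (add the uppercase form of a term when it is more frequent in the index than the parsed form) might look roughly like the sketch below. The `TermCounts` map is a hypothetical stand-in for an index lookup, `expandQuery` is an illustrative name, and keeping the original term alongside the uppercase form is an assumption based on the word "adds" in the description.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an index: term -> collection frequency.
using TermCounts = std::map<std::string, int>;

// Frequency of a term, or 0 if the term is not in the map.
int countOf(const TermCounts& counts, const std::string& term) {
  auto it = counts.find(term);
  return it == counts.end() ? 0 : it->second;
}

// ASCII uppercase copy of a string.
std::string toUpper(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(),
                 [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
  return s;
}

// Keep every parsed term; also add its uppercase form when that form
// is strictly more frequent than the parsed form.
std::vector<std::string> expandQuery(const std::vector<std::string>& parsed,
                                     const TermCounts& counts) {
  std::vector<std::string> out;
  for (const std::string& term : parsed) {
    out.push_back(term);
    const std::string upper = toUpper(term);
    if (upper != term && countOf(counts, upper) > countOf(counts, term)) {
      out.push_back(upper);
    }
  }
  return out;
}
```

With counts of 1 for `usa`, 9 for `USA`, and 5 for `trade`, the query `usa trade` would gain `USA` in this sketch, while `trade` is left alone because `TRADE` does not occur.
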
Generated at Fri Jul 26 18:22:40 2002 for LEMUR by