
XLingRetMethod Class Reference

#include <XLingRetMethod.hpp>

Inherits RetrievalMethod.

Public Methods

 XLingRetMethod (const Index &dbIndex, const Index &background, PDict &dict, ScoreAccumulator &accumulator, double l, double b, bool cacheDR, string &sBM, string &tBM, const Stopper *stp=NULL, Stemmer *stm=NULL)
 Constructor.

virtual ~XLingRetMethod ()
 clean up.

virtual DocumentRep * computeDocRep (DOCID_T docID)
 Create a document representation.

virtual double matchedTermWeight (TERMID_T id, double weight, const DocInfo *info, const DocumentRep *dRep) const
 Score a given term for a given document.

virtual double adjustedScore (double origScore, double pge) const
 Adjust the score for a given document.

virtual void scoreCollection (const QueryRep &qry, IndexedRealVector &results)
 Score all documents in the collection.

virtual void scoreInvertedIndex (const QueryRep &qryRep, IndexedRealVector &scores, bool scoreAll=false)
virtual QueryRep * computeQueryRep (const Query &qry)
 compute the representation for a query, semantics defined by subclass

virtual QueryRep * computeTargetKLRep (const QueryRep *qry)
virtual double scoreDoc (const QueryRep &qry, DOCID_T docID)
 Score a document identified by the id w.r.t. a query rep.

virtual void updateQuery (QueryRep &qryRep, const DocIDSet &relDocs)
 update the query -- noop


Protected Methods

virtual double scoreDocVector (const XLingQueryModel &qRep, DOCID_T docID, FreqVector &docVector)

Protected Attributes

double lambda
double beta
double numSource
double numTarget
bool docBasedSourceSmooth
bool docBasedTargetSmooth
ScoreAccumulator & scAcc
PDict & dictionary
Stemmer * stemmer
const Stopper * stopper
const Index & source
DocumentRep ** docReps
 cache document reps.

bool cacheDocReps
 whether or not to cache document representations

int docRepsSize
 number of documents plus 1, the size of the docReps array.

ScoreAccumulator * termScores

Detailed Description

Cross lingual retrieval method. Translation dictionary based retrieval, scoring queries in the source language against documents in the target language using:
P(Q_s|D_t) = Prod_w_in_Q_s(lambda(Sum_t_in_D_t P(t|D_t)P(w|t)) + (1-lambda)P(w|G_s))
where G_s is the background model for the source language.
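
As an illustration of the formula above, the following is a minimal sketch that computes log P(Q_s|D_t) from plain maps. The type aliases TermProbs and TransProbs and the function xlingLogScore are hypothetical names introduced here; they are not part of the toolkit, which works with Index, PDict, and ScoreAccumulator instead. Source words missing from the background model are assumed to have probability zero.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the toolkit's data structures.
using TermProbs = std::map<std::string, double>;  // P(t|D_t) or P(w|G_s)
using TransProbs =
    std::map<std::pair<std::string, std::string>, double>;  // P(w|t), keyed by (w, t)

// Returns the log of Prod_w (lambda * Sum_t P(t|D_t) P(w|t) + (1 - lambda) P(w|G_s)).
double xlingLogScore(const std::vector<std::string>& query,
                     const TermProbs& docModel,
                     const TransProbs& translation,
                     const TermProbs& background,
                     double lambda) {
    assert(lambda >= 0.0 && lambda <= 1.0);
    double logScore = 0.0;
    for (const std::string& w : query) {
        // Sum over target terms t in the document of P(t|D_t) * P(w|t).
        double translated = 0.0;
        for (const auto& entry : docModel) {
            auto it = translation.find(std::make_pair(w, entry.first));
            if (it != translation.end()) translated += entry.second * it->second;
        }
        // Assumption: a word absent from the background model has P(w|G_s) = 0.
        auto bg = background.find(w);
        double pBackground = (bg != background.end()) ? bg->second : 0.0;
        logScore += std::log(lambda * translated + (1.0 - lambda) * pBackground);
    }
    return logScore;
}
```

Summing logs instead of multiplying probabilities avoids underflow on long queries, and matches the log form returned by adjustedScore.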


Constructor & Destructor Documentation

XLingRetMethod::XLingRetMethod (const Index & dbIndex,
const Index & background,
PDict & dict,
ScoreAccumulator & accumulator,
double l,
double b,
bool cacheDR,
string & sBM,
string & tBM,
const Stopper * stp = NULL,
Stemmer * stm = NULL)

Constructor.

Parameters:
dbIndex  index for target language documents
background  index for source language background model
dict  PDict containing source->target translation probabilities
accumulator  ScoreAccumulator for intermediate results
l  lambda value to use for smoothing background model
b  beta value to use for smoothing P(t|D)
cacheDR  whether or not to cache document reps
sBM  whether to use term frequency (tf/|V|) or term doc frequency (docCount(t)/Sum_w_in_V(docCount(w))) for the source language background model. Default is term frequency.
tBM  whether to use term frequency (tf/|V|) or term doc frequency (docCount(t)/Sum_w_in_V(docCount(w))) for the target language background model. Default is term frequency.
stp  source language Stopper to use when getting translations.
stm  source language Stemmer to use when getting translations.
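
The two background model estimates named for sBM and tBM can be sketched as follows. This is only an illustration under stated assumptions: TermStats, Collection, and backgroundProb are hypothetical names, the statistics live in a plain map instead of an Index, and the term frequency estimate is read here as a term's share of all term occurrences in the collection.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical per-term collection statistics; not the toolkit's Index API.
struct TermStats {
    double termCount;  // total occurrences of the term in the collection
    double docCount;   // number of documents containing the term
};

using Collection = std::map<std::string, TermStats>;

// Background probability of `term`.
// docBased == false: termCount(t) / Sum_w termCount(w)  (assumed reading of tf/|V|)
// docBased == true:  docCount(t) / Sum_w docCount(w)
double backgroundProb(const Collection& stats, const std::string& term, bool docBased) {
    double total = 0.0;
    for (const auto& entry : stats) {
        assert(entry.second.termCount >= 0.0 && entry.second.docCount >= 0.0);
        total += docBased ? entry.second.docCount : entry.second.termCount;
    }
    auto it = stats.find(term);
    if (it == stats.end() || total == 0.0) return 0.0;
    return (docBased ? it->second.docCount : it->second.termCount) / total;
}
```

The document-based estimate dampens terms that occur many times in only a few documents, which is why the two choices can rank the same query differently.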

XLingRetMethod::~XLingRetMethod () [virtual]

clean up.


Member Function Documentation

virtual double XLingRetMethod::adjustedScore (double origScore, double pge) const [inline, virtual]

Adjust the score for a given document.

Parameters:
origScore  the original score
pge  the background probability to adjust by.
Returns:
log((lambda * origScore) + ((1 - lambda) * pge))
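
The documented return value can be written out as a free function. This is a sketch only: adjustedScoreSketch is a hypothetical name, and lambda is passed as an argument here, whereas the class reads it from its protected member.

```cpp
#include <cassert>
#include <cmath>

// Interpolates the translated document score with the background
// probability and returns the log of the mixture.
double adjustedScoreSketch(double origScore, double pge, double lambda) {
    assert(lambda >= 0.0 && lambda <= 1.0);
    return std::log((lambda * origScore) + ((1.0 - lambda) * pge));
}
```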

DocumentRep * XLingRetMethod::computeDocRep (DOCID_T docID) [virtual]

Create a document representation.

Parameters:
docID  the internal document id to create the representation for
Returns:
An instance of XLingDocRep

virtual QueryRep * XLingRetMethod::computeQueryRep (const Query & qry) [inline, virtual]

compute the representation for a query, semantics defined by subclass

Implements RetrievalMethod.

QueryRep * XLingRetMethod::computeTargetKLRep (const QueryRep * qry) [virtual]

virtual double XLingRetMethod::matchedTermWeight (TERMID_T id, double weight, const DocInfo * info, const DocumentRep * dRep) const [inline, virtual]

Score a given term for a given document.

Parameters:
id  the term id
weight  the weight for this term
info  the DocInfo for this document
dRep  the DocumentRep for this document
Returns:
P(t|D) * P(s|t)

virtual void XLingRetMethod::scoreCollection (const QueryRep & qry, IndexedRealVector & results) [inline, virtual]

Score all documents in the collection.

The default implementation provided by this class calls scoreDoc for each document, scoring documents one by one. This is inefficient, so a subclass should usually override this method when more efficient scoring, e.g., based on an inverted index, is possible.

Reimplemented from RetrievalMethod.
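
The one-by-one scoring described for scoreCollection can be sketched as a plain loop over document ids. The names Results and scoreAllDocs are hypothetical, the per-document scorer is passed in as a callback instead of being a virtual member, and document ids are assumed to run from 1 to the number of documents.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the toolkit's result vector: (docID, score) pairs.
using Results = std::vector<std::pair<int, double>>;

// Scores documents 1..docCount one at a time through a per-document callback.
// Assumption: internal document ids start at 1.
Results scoreAllDocs(int docCount, const std::function<double(int)>& scoreDoc) {
    assert(docCount >= 0);
    Results results;
    results.reserve(static_cast<std::size_t>(docCount));
    for (int docID = 1; docID <= docCount; ++docID) {
        results.emplace_back(docID, scoreDoc(docID));
    }
    return results;
}
```

Every document is visited even when it shares no terms with the query, which is the cost an inverted-index-based override avoids.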

double XLingRetMethod::scoreDoc (const QueryRep & qry, DOCID_T docID) [virtual]

Score a document identified by the id w.r.t. a query rep.

Implements RetrievalMethod.

double XLingRetMethod::scoreDocVector (const XLingQueryModel & qRep, DOCID_T docID, FreqVector & docVector) [protected, virtual]

void XLingRetMethod::scoreInvertedIndex (const QueryRep & qryRep, IndexedRealVector & scores, bool scoreAll = false) [virtual]

virtual void XLingRetMethod::updateQuery (QueryRep & qryRep, const DocIDSet & relDocs) [inline, virtual]

update the query -- noop

Implements RetrievalMethod.


Member Data Documentation

double XLingRetMethod::beta [protected]
 

bool XLingRetMethod::cacheDocReps [protected]
 

whether or not to cache document representations

PDict& XLingRetMethod::dictionary [protected]
 

bool XLingRetMethod::docBasedSourceSmooth [protected]
 

bool XLingRetMethod::docBasedTargetSmooth [protected]
 

DocumentRep** XLingRetMethod::docReps [protected]
 

cache document reps.

int XLingRetMethod::docRepsSize [protected]
 

number of documents plus 1, the size of the docReps array.
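
An array sized to the number of documents plus 1 can be indexed directly by document id when ids start at 1. The sketch below shows one way such a cache could work, under assumptions not stated in this page: DocRepSketch and DocRepCache are hypothetical names, slot 0 is left unused, and representations are built lazily on first request.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

struct DocRepSketch { int docID; };  // hypothetical stand-in for DocumentRep

class DocRepCache {
public:
    // One slot per document plus an unused slot 0, so ids index directly.
    explicit DocRepCache(int numDocs)
        : reps_(static_cast<std::size_t>(numDocs) + 1) {}

    // Builds the representation on first request and reuses it afterwards.
    const DocRepSketch& get(int docID) {
        assert(docID >= 1 && static_cast<std::size_t>(docID) < reps_.size());
        std::unique_ptr<DocRepSketch>& slot = reps_[static_cast<std::size_t>(docID)];
        if (!slot) {
            slot.reset(new DocRepSketch{docID});
            ++built_;
        }
        return *slot;
    }

    int built() const { return built_; }
    std::size_t size() const { return reps_.size(); }

private:
    std::vector<std::unique_ptr<DocRepSketch>> reps_;
    int built_ = 0;
};
```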

double XLingRetMethod::lambda [protected]
 

double XLingRetMethod::numSource [protected]
 

double XLingRetMethod::numTarget [protected]
 

ScoreAccumulator& XLingRetMethod::scAcc [protected]
 

const Index& XLingRetMethod::source [protected]
 

Stemmer* XLingRetMethod::stemmer [protected]
 

const Stopper* XLingRetMethod::stopper [protected]
 

ScoreAccumulator* XLingRetMethod::termScores [protected]
 


The documentation for this class was generated from the following files:
Generated on Wed Nov 3 13:00:02 2004 for Lemur Toolkit by doxygen 1.2.18