A

AbstractClassifier - class textminer.classification.AbstractClassifier.
The AbstractClassifier class is the superclass of all classes belonging to the classification package.
AbstractClassifier(SubtaskClassifiers) - Constructor for class textminer.classification.AbstractClassifier
Constructor of AbstractClassifier
AbstractClusterer - class textminer.clustering.AbstractClusterer.
The AbstractClusterer class is an abstract class of clustering algorithms.
AbstractClusterer(SubtaskClustering) - Constructor for class textminer.clustering.AbstractClusterer
Constructor of AbstractClusterer
AbstractDataRepresentation - class textminer.task.AbstractDataRepresentation.
The AbstractDataRepresentation class is an abstract class for data representation in text learning.
AbstractDataRepresentation() - Constructor for class textminer.task.AbstractDataRepresentation
Constructor of AbstractDataRepresentation
AbstractDataSetConverter - class textminer.datarepresentation.AbstractDataSetConverter.
The AbstractDataSetConverter class is implemented to carry out the task of converting text documents that belong to the test data set into machine-understandable form.
AbstractDataSetConverter(CorpusIndex, Lexicon, Vector, String, String, String) - Constructor for class textminer.datarepresentation.AbstractDataSetConverter
Constructor of AbstractDataSetConverter
AbstractFeatureSelector - class textminer.featureselection.AbstractFeatureSelector.
The AbstractFeatureSelector class is the superclass of all classes in the feature selection package.
AbstractFeatureSelector(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, String, boolean) - Constructor for class textminer.featureselection.AbstractFeatureSelector
Constructor of AbstractFeatureSelector
AbstractTaskWorker - class textminer.task.AbstractTaskWorker.
The AbstractTaskWorker class
AbstractTaskWorker() - Constructor for class textminer.task.AbstractTaskWorker
 
activeClasses - Variable in class textminer.task.Subtask
Labels of (true or target) classes
add(char) - Method in class textminer.text.Stemmer
Add a character to the word being stemmed.
add(char[], int) - Method in class textminer.text.Stemmer
Add a set of characters to the word being stemmed
afterprune_size - Variable in class textminer.text.DictionaryGenerator
number of selected terms in the given data set
alias - Variable in class textminer.classification.AbstractClassifier
 
alias - Variable in class textminer.task.Subtask
Alias of given task which is used for naming a set of (intermediate) result files
alias - Variable in class textminer.text.TextDocumentConverter
alias of given task, (e.g.
alignSentences(String) - Method in class textminer.text.TextNoiseRemover
Break the specified string, src, into a set of sentences and align them
ANN - class textminer.classification.ANN.
The ANN class is an implementation of the multi-layered artificial neural network.
ANN(int) - Constructor for class textminer.classification.ANN
Constructor of ANN
areFilesExist(Vector) - Static method in class textminer.util.IOUtil
Return true if all specified files exist
AVERAGE_LINK - Static variable in interface textminer.clustering.hacMethods
Definition of average (or group-average) link
average(double[]) - Static method in class textminer.util.MathUtil
Return the average value of a given array

B

BACK_PROPAGATION - Static variable in interface textminer.classification.ClassificationMethods
Multilayered neural networks with back-propagation
bagofwords - Variable in class textminer.text.TextModel
word feature set in ArrayList
bagofwords - Variable in class textminer.text.TermbyDocumentMatrix
word feature set in ArrayList
BagOfWords - class textminer.ds.BagOfWords.
The BagOfWords class is an abstract class that encapsulates the "bag of words" model of text data.
BagOfWords() - Constructor for class textminer.ds.BagOfWords
Constructor of BagOfWords
BAYESIAN_NET - Static variable in interface textminer.classification.ClassificationMethods
Bayesian Classification with a limited dependence between nodes
BayesianNet - class textminer.classification.BayesianNet.
The BayesianNet class is an implementation of k-dependence Bayesian networks (KDB).
BayesianNet(String, boolean) - Constructor for class textminer.classification.BayesianNet
Constructor of BayesianNet
body - Variable in class textminer.ds.NewsArticle
Content body of news article
BOOLEAN_MODEL - Static variable in interface textminer.datarepresentation.DataRepMethods
Boolean model
BooleanModel - class textminer.datarepresentation.BooleanModel.
The BooleanModel class is an implementation of "Boolean model."
BooleanModel(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, Lexicon, String, String, boolean) - Constructor for class textminer.datarepresentation.BooleanModel
Constructor of BooleanModel
BooleanTDMatrix - class textminer.text.BooleanTDMatrix.
The BooleanTDMatrix class is an abstraction of a term-by-document matrix whose elements have boolean values.
BooleanTDMatrix(CorpusIndex, String, String, Vector, ArrayList, int[]) - Constructor for class textminer.text.BooleanTDMatrix
Constructor of BooleanTDMatrix
buildTermDictionary() - Method in class textminer.text.DictionaryTDT
Build a term dictionary for the TDT pilot corpus
buildTermDictionary() - Method in class textminer.text.DictionaryReutersMultiClass
Build a term dictionary for the Reuters-21578 data set
buildTermDictionary() - Method in class textminer.text.DictionaryNewsgroup
Build a term dictionary
buildTermDictionary() - Method in class textminer.text.DictionaryGeneratorMultiClass
 
buildTermDictionary() - Method in class textminer.text.DictionaryGenerator
Build a term dictionary for a given data set
buildTermDictionary() - Method in class textminer.text.DictionaryFinancialMultiClass
Build (unique) term dictionary
buildTermDictionary() - Method in class textminer.text.DictionaryFinancial
Build a term dictionary for the Financial news data set
buildTermDictionary(int) - Method in class textminer.text.DictionaryMaker1
Build a term dictionary for a given data set

C

calc_ltc_TFIDF(int, int, int) - Static method in class textminer.featureselection.TFIDF
Return the weight of a given term calculated by the "ltc" (in SMART notation) version of TFIDF: w(t, d) = {(1 + log_2 tf(t, d)) x log_2(N / df(t))} / ||d||, where tf(t, d) is the term frequency of t in d, df(t) is the document frequency of t, N is the number of documents in the collection, and ||d|| is the 2-norm of vector d
calc_metrics() - Method in class textminer.evaluation.EvaluationMeasure
Evaluate the performance of the applied method according to the specified metrics
calc_TFIDF(int, int, int) - Static method in class textminer.featureselection.TFIDF
Return the weight of a given term calculated by standard TFIDF
caseSensitiveIndexOf(String[], String) - Static method in class textminer.util.TextUtil
Return the index of the entry in the string array, source, that exactly matches the specified string, element
CCC - Static variable in interface textminer.clustering.ClusteringMethods
Constructive-Competition Clustering
CHI_STAT - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by Chi statistics
ChiStat - class textminer.featureselection.ChiStat.
The ChiStat class is an implementation of a feature selection method using Chi square statistics.
ChiStat(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, String, boolean) - Constructor for class textminer.featureselection.ChiStat
Constructor of ChiStat
class_prior - Variable in class textminer.featureselection.AbstractFeatureSelector
prior probability of each class
classes - Variable in class textminer.datarepresentation.TextModel
class labels of given data set
classes - Variable in class textminer.featureselection.AbstractFeatureSelector
class labels of given data set
classes - Variable in class textminer.text.LexiconGenerator
name of classes in given dataset
CLASSIFICATION - Static variable in interface textminer.core.Constants
 
ClassificationMethods - interface textminer.classification.ClassificationMethods.
The ClassificationMethods interface defines a list of classification methods available in the TextMiner.
Classifier - class textminer.task.Classifier.
The Classifier class is a wrapper class which carries out the task of classification by invoking a class that implements a specific classification algorithm.
Classifier() - Constructor for class textminer.task.Classifier
Constructor of Classifier
Classifiers - class textminer.task.Classifiers.
The Classifiers class is a wrapper class that is responsible for handling classification requests from other classes, which specify the method and its parameters.
Classifiers() - Constructor for class textminer.task.Classifiers
Constructor of Classifiers
classModel - Variable in class textminer.text.TextModel
estimated model of a given data set, |number of classes| x |word feature set|
classModelFilename - Variable in class textminer.datarepresentation.TextModel
file name of estimated class model
clear() - Method in class textminer.clustering.ProximityArray
Removes all mappings from this map.
clear() - Method in class textminer.task.DataRepresentator1
Release all resources assigned to this class
clear() - Method in class textminer.text.DictionaryMaker1
Release all resources assigned to this class
clear() - Method in class textminer.text.DictionaryGeneratorMultiClass
Release all resources assigned to this class
clear() - Method in class textminer.text.DictionaryGenerator
Release all resources assigned to this class
clear() - Method in class textminer.util.FileCache
Release all resources that this class holds
clearOccurrences() - Method in class textminer.ds.InvertedMatrixTermEntry
Clear previous term occurrence info.
Clusterers - class textminer.task.Clusterers.
The Clusterers class is a wrapper class that is responsible for carrying out the task of clustering text documents.
Clusterers() - Constructor for class textminer.task.Clusterers
Constructor of Clusterers
CLUSTERING - Static variable in interface textminer.core.Constants
 
clusteringMethods - Static variable in interface textminer.clustering.ClusteringMethods
 
ClusteringMethods - interface textminer.clustering.ClusteringMethods.
The ClusteringMethods interface defines a set of clustering algorithms available in the TextMiner.
ClusterMembership - class textminer.ds.ClusterMembership.
The ClusterMembership class is intended to maintain a data structure that captures the non-overlapped membership of each instance.
ClusterMembership(int) - Constructor for class textminer.ds.ClusterMembership
Constructor of ClusterMembership
ClustersLD - class textminer.task.ClustersLD.
The ClustersLD class has the same functionality as the Clusterers class, but is intended to handle the clustering of larger amounts of text data.
ClustersLD() - Constructor for class textminer.task.ClustersLD
Constructor of ClustersLD
cMethodNames - Static variable in interface textminer.classification.ClassificationMethods
An array of definitions of classification methods
COMP_OF_CLUSTER - Static variable in interface textminer.core.Constants
 
COMP_OF_DOCUMENT - Static variable in interface textminer.core.Constants
 
Competitive - class textminer.clustering.Competitive.
The Competitive class is an implementation of the Constructive-Competition Clustering algorithm.
Competitive(SubtaskClustering) - Constructor for class textminer.clustering.Competitive
Constructor of Competitive
COMPLETE_LINK - Static variable in interface textminer.clustering.hacMethods
Definition of complete link
Condenser - class textminer.text.Condenser.
The Condenser class is responsible for condensing (financial) news articles into a pseudo-article made up of the sentences that contain the specified company's name.
Condenser() - Constructor for class textminer.text.Condenser
 
Condenser(CorpusIndex, String, String, String, String) - Constructor for class textminer.text.Condenser
 
CONDITIONAL_PROB - Static variable in interface textminer.datarepresentation.DataRepMethods
Conditional probability table for Bayesian Classification
ConditionalMutualInfo - class textminer.featureselection.ConditionalMutualInfo.
The ConditionalMutualInfo class is implemented for estimation of conditional mutual information.
ConditionalMutualInfo(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, String, boolean) - Constructor for class textminer.featureselection.ConditionalMutualInfo
Constructor of ConditionalMutualInfo
ConditionalProbTable - class textminer.datarepresentation.ConditionalProbTable.
The ConditionalProbTable class is implemented to generate a conditional probability table for each class in the given data set.
ConditionalProbTable(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, Lexicon, String, String, boolean) - Constructor for class textminer.datarepresentation.ConditionalProbTable
Constructor of ConditionalProbTable
CondProbTable - class textminer.ds.CondProbTable.
The CondProbTable class is a data structure for maintaining a conditional probability table.
CondProbTable() - Constructor for class textminer.ds.CondProbTable
Constructor of CondProbTable
Constants - interface textminer.core.Constants.
The Constants interface defines the global variables whose values affect all classes belonging to TextMiner.
constructMatrix() - Method in class textminer.text.TermbyDocumentMatrix
Construct term-by-document matrix by using the word feature set
constructMatrix() - Method in class textminer.text.RealTDMatrix
Construct term-by-document matrix by using the word feature set.
constructMatrix() - Method in class textminer.text.BooleanTDMatrix
Construct term-by-document matrix by using the word feature set.
convert_hashmaptovector() - Method in class textminer.ds.ClusterMembership
 
convert_hashmaptovector(int) - Method in class textminer.ds.ClusterMembership
Convert the HashMap of the group at the specified index into Vector
convertArraytoVector(ArrayList, ArrayList) - Static method in class textminer.util.VectorUtil
Merge the two specified arrays and return the combined result as a Vector
convertArrayToVector(double[]) - Static method in class textminer.util.VectorUtil
Convert the specified source into Vector
convertDicToVector(TermDictionary) - Static method in class textminer.ds.DictionaryHandler
 
convertDicToVector(TermDictionary) - Static method in class textminer.text.DictionaryHandler
Convert the given TermDictionary into Vector
convertFileToArrayList(String) - Static method in class textminer.util.IOUtil
Return the contents of the given file as an array. The returned array is an instance of java.util.ArrayList.
convertFiletoRowVector(String, int) - Static method in class textminer.util.VectorUtil
Convert the file with the specified filename into a row vector
convertFileToStringArray(String) - Static method in class textminer.util.IOUtil
Convert a file into an array of java.lang.String
convertFiletoVector(String) - Static method in class textminer.util.IOUtil
Convert a file into java.util.Vector
convertFiletoVector(String, int) - Static method in class textminer.util.VectorUtil
Convert the file with the specified file name into an instance of java.util.Vector
convertHashMaptoVector(HashMap, boolean) - Static method in class textminer.util.IOUtil
Convert HashMap into Vector
convertTextDataSet() - Method in class textminer.text.TextDocumentConverter
Convert each of the text documents in the given data set
convertTextDataSet() - Method in class textminer.text.TextConvertReuters
Convert all text documents of the Reuters-21578 data set into machine-readable form
corpus_index - Variable in class textminer.text.TextDocumentConverter
document index of given dataset
corpus_index - Variable in class textminer.text.LexiconGenerator
document index for given data set
corpus_index - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
corpus_index - Variable in class textminer.text.DictionaryGenerator
document index of given data set
corpus_stat_filename - Variable in class textminer.text.LexiconGenerator
name of file containing documents statistics of given dataset.
corpusindex - Variable in class textminer.text.TextModel
document index structure of a given data set
corpusindex - Variable in class textminer.text.TermbyDocumentMatrix
document index structure of a given data set
CorpusIndex - class textminer.ds.CorpusIndex.
The CorpusIndex class is a data structure that maintains information on each of the instances in the given data set.
CorpusIndex() - Constructor for class textminer.ds.CorpusIndex
Constructor of CorpusIndex
CorpusIndexEntry - class textminer.ds.CorpusIndexEntry.
The CorpusIndexEntry class is an encapsulation of an entry in CorpusIndex.
CorpusIndexEntry() - Constructor for class textminer.ds.CorpusIndexEntry
Constructor of CorpusIndexEntry
CorpusIndexEntry(int, String, String, String, boolean) - Constructor for class textminer.ds.CorpusIndexEntry
Constructor of CorpusIndexEntry
correlation(double[], double[]) - Static method in class textminer.util.Similarity
Returns the (Pearson) correlation coefficient of two vectors in double.
Note: the dimensions of two given vectors must be identical
correlation(double[], double[], int) - Static method in class textminer.util.StatUtil
Returns the correlation coefficient of two vectors.
Note: the length of two vectors must be identical
correlation(int[], int[]) - Static method in class textminer.util.Similarity
Returns the (Pearson) correlation coefficient of two vectors in integer.
Note: the dimensionality of two given vectors must be identical
cosine_similarity(double[], double[]) - Static method in class textminer.util.Similarity
Return the cosine similarity between two vectors.
Note: the dimensions of the two vectors must be identical.
CPTEntry - class textminer.ds.CPTEntry.
The CPTEntry class
CPTEntry(int, int) - Constructor for class textminer.ds.CPTEntry
 
createDir(String) - Static method in class textminer.util.IOUtil
Create a directory at the specified path for temporary use
createNetwork(int, int, int, float, float, float) - Method in class textminer.classification.ANN
Create a network with the specified parameters
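
The calc_ltc_TFIDF and cosine_similarity entries in this section describe formulas rather than code. The following is a minimal sketch of those two formulas, not the TextMiner implementation: the class name LtcTfidfSketch, the method names, and the array-based signatures are hypothetical, and it assumes that ||d|| is the 2-norm of the document's unnormalized weight vector and that a term with zero frequency gets weight 0.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of the textminer packages.
final class LtcTfidfSketch {

    // Base-2 logarithm, matching the log_2 in the index entry.
    public static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    // Unnormalized weight of one term: (1 + log_2 tf) * log_2(N / df).
    // Assumption: a term that does not occur (tf = 0) gets weight 0.
    public static double rawWeight(int tf, int df, int numDocs) {
        if (tf <= 0 || df <= 0) {
            return 0.0;
        }
        return (1.0 + log2(tf)) * log2((double) numDocs / df);
    }

    // "ltc" weights for one document: raw weights divided by their 2-norm.
    public static double[] ltcWeights(int[] tf, int[] df, int numDocs) {
        if (tf.length != df.length) {
            throw new IllegalArgumentException("tf and df must have the same length");
        }
        double[] w = new double[tf.length];
        double sumSq = 0.0;
        for (int i = 0; i < tf.length; i++) {
            w[i] = rawWeight(tf[i], df[i], numDocs);
            sumSq += w[i] * w[i];
        }
        double norm = Math.sqrt(sumSq);
        if (norm > 0.0) {
            for (int i = 0; i < w.length; i++) {
                w[i] /= norm;
            }
        }
        return w;
    }

    // Cosine similarity: dot(a, b) / (||a|| * ||b||).
    // Assumption: returns 0 when either vector has zero length.
    public static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimensions must be identical");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        int[] tf = {4, 1, 0};   // term frequencies in one document
        int[] df = {2, 4, 1};   // document frequencies of the same terms
        double[] w = ltcWeights(tf, df, 8);
        System.out.println(Arrays.toString(w));
        System.out.println(cosine(w, w));
    }
}
```

Because ltcWeights already divides by the 2-norm, the cosine similarity of two such weight vectors reduces to their dot product; the explicit normalization in cosine is kept so the method also works on unnormalized input.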

D

data_repository - Variable in class textminer.classification.AbstractClassifier
 
data_repository - Variable in class textminer.clustering.AbstractClusterer
 
data_repository - Variable in class textminer.task.Subtask
Path of (supporting) data directory
data_repository - Variable in class textminer.text.LexiconGenerator
path of application data directory
data_repository - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
data_repository - Variable in class textminer.text.DictionaryGenerator
path of data directory
dataFlow() - Method in class textminer.core.Main
Start to work on the specified text learning task
dataMatrix - Variable in class textminer.text.TermbyDocumentMatrix
a temporary array of all rows in the matrix
DataMerger - class textminer.util.DataMerger.
The DataMerger class is implemented to carry out the task of merging a couple of intermediate results.
DataMerger(String, String, String) - Constructor for class textminer.util.DataMerger
Constructor of DataMerger
DataRepClassification - class textminer.task.DataRepClassification.
The DataRepClassification class is responsible for converting a given text data set into machine-understandable form.
DataRepClassification() - Constructor for class textminer.task.DataRepClassification
Constructor of DataRepClassification
DataRepMethods - interface textminer.datarepresentation.DataRepMethods.
The DataRepMethods interface defines constants which are used for the task of data representation.
DataRepresentator1 - class textminer.task.DataRepresentator1.
The DataRepresentator1 class is responsible for converting a given data set into a machine-readable form for the clustering task.
DataRepresentator1() - Constructor for class textminer.task.DataRepresentator1
 
DataRepresentClassify - class textminer.task.DataRepresentClassify.
The DataRepresentClassify class is responsible for changing a given text data set into a machine-readable form and is used only for classification.
DataRepresentClassify() - Constructor for class textminer.task.DataRepresentClassify
Constructor of DataRepresentClassify
DataSet - class textminer.ds.DataSet.
The DataSet class is an encapsulation of a given data set which can be converted into a (real-valued) matrix.
dataset_dir - Variable in class textminer.classification.AbstractClassifier
 
dataset_dir - Variable in class textminer.clustering.AbstractClusterer
 
dataset_dir - Variable in class textminer.task.Subtask
Path of data set directory
dataset_dir - Variable in class textminer.text.TextDocumentConverter
path of data set directory
dataset_dir - Variable in class textminer.text.Indexer
path of data set directory
dataset_dir - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
dataset_dir - Variable in class textminer.text.DictionaryGenerator
path of data set directory
dataset_name - Variable in class textminer.text.TextModel
name of data set
dataset_name - Variable in class textminer.text.TermbyDocumentMatrix
name of data set
dataset_repository - Variable in class textminer.text.LexiconGenerator
path of data set directory
DataSet() - Constructor for class textminer.ds.DataSet
Constructor of DataSet
DataSet(String) - Constructor for class textminer.ds.DataSet
Constructor of DataSet
DataUtil - class textminer.util.DataUtil.
The DataUtil class provides a set of utilities for manipulating numerical data in its raw format.
DataUtil() - Constructor for class textminer.util.DataUtil
 
date - Variable in class textminer.ds.NewsArticle
Published date of news article
defClassificationMethod - Static variable in interface textminer.core.Constants
 
defClusteringMethod - Static variable in interface textminer.core.Constants
 
defDataSetNames - Static variable in interface textminer.core.Constants
 
defDataSetType - Static variable in interface textminer.core.Constants
 
defEvaluationMethod - Static variable in interface textminer.core.Constants
 
defFeatureSelection - Static variable in interface textminer.core.Constants
 
defRemoveNoises - Static variable in interface textminer.core.Constants
 
defRepModels - Static variable in interface textminer.core.Constants
 
defTaskNames - Static variable in interface textminer.core.Constants
 
defTaskOutputs - Static variable in interface textminer.core.Constants
 
df - Variable in class textminer.ds.tmTerm
Document frequency
DictionaryFinancial - class textminer.text.DictionaryFinancial.
The DictionaryFinancial class is responsible for generating a (unique) term dictionary of the Financial news data set.
DictionaryFinancial(CorpusIndex, String, String, String, String, String, boolean) - Constructor for class textminer.text.DictionaryFinancial
Constructor of DictionaryFinancial
DictionaryFinancialMultiClass - class textminer.text.DictionaryFinancialMultiClass.
The DictionaryFinancialMultiClass class is responsible for generating a (unique) term dictionary of the Financial news article data set.
DictionaryFinancialMultiClass(CorpusIndex, String, String, String, String, String, Vector, boolean) - Constructor for class textminer.text.DictionaryFinancialMultiClass
Constructor of DictionaryFinancialMultiClass
DictionaryGenerator - class textminer.text.DictionaryGenerator.
The DictionaryGenerator class is an abstract class that encapsulates the job of generating the unique term (word or phrase) dictionary for a given data set.
DictionaryGenerator(CorpusIndex, String, String, String, String, String, boolean) - Constructor for class textminer.text.DictionaryGenerator
Constructor of DictionaryGenerator
DictionaryGeneratorMultiClass - class textminer.text.DictionaryGeneratorMultiClass.
The DictionaryGeneratorMultiClass class is an abstract class which encapsulates the process of making the unique term (word or phrase) dictionary.
DictionaryGeneratorMultiClass(CorpusIndex, String, String, String, String, String, Vector, boolean) - Constructor for class textminer.text.DictionaryGeneratorMultiClass
Constructor of DictionaryGeneratorMultiClass
DictionaryHandler - class textminer.ds.DictionaryHandler.
The DictionaryHandler class provides a set of functions which facilitate the job of building one or more dictionaries for a given textual data set.
DictionaryHandler - class textminer.text.DictionaryHandler.
The DictionaryHandler class provides a set of utility functions for manipulating one or more TermDictionary instances.
DictionaryHandler() - Constructor for class textminer.ds.DictionaryHandler
 
DictionaryHandler() - Constructor for class textminer.text.DictionaryHandler
 
DictionaryMaker - class textminer.text.DictionaryMaker.
The DictionaryMaker class is responsible for generating the unique term dictionary of the given text document collection.
DictionaryMaker(CorpusIndex, Vector, String, String, String, boolean) - Constructor for class textminer.text.DictionaryMaker
Constructor of DictionaryMaker
DictionaryMaker1 - class textminer.text.DictionaryMaker1.
The DictionaryMaker1 class is responsible for generating the unique term dictionary of the given text document collection.
DictionaryMaker1(CorpusIndex, String, String, String, String, boolean) - Constructor for class textminer.text.DictionaryMaker1
Constructor of DictionaryMaker1
DictionaryNewsgroup - class textminer.text.DictionaryNewsgroup.
The DictionaryNewsgroup class is responsible for generating a (unique) term dictionary of the 20 Newsgroups data set.
DictionaryNewsgroup(CorpusIndex, String, String, String, String, String, boolean) - Constructor for class textminer.text.DictionaryNewsgroup
Constructor of DictionaryNewsgroup
DictionaryReutersMultiClass - class textminer.text.DictionaryReutersMultiClass.
The DictionaryReutersMultiClass class is implemented to build term dictionaries for categories in the Reuters-21578 data set.
DictionaryReutersMultiClass(CorpusIndex, String, String, String, String, String, Vector, boolean) - Constructor for class textminer.text.DictionaryReutersMultiClass
Constructor of DictionaryReutersMultiClass
DictionaryTDT - class textminer.text.DictionaryTDT.
The DictionaryTDT class is responsible for generating a (unique) term dictionary of the Topic Detection and Tracking (TDT) pilot corpus.
DictionaryTDT(CorpusIndex, String, String, String, String, String, boolean) - Constructor for class textminer.text.DictionaryTDT
Constructor of DictionaryTDT
display_network() - Method in class textminer.clustering.SOM
Print winner and its array
display_results() - Method in class textminer.clustering.emLD
 
display_results() - Method in class textminer.clustering.EM
Display the clustering result
display_status() - Method in class textminer.classification.ANN
Display current network
displayAll() - Method in class textminer.ds.Lexicon
Display all elements in the Lexicon
displayAll() - Method in class textminer.ds.DataSet
Display all instances in the dataset
DIST - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by distributional clustering of words
distance(double[], double[]) - Static method in class textminer.util.Similarity
Return the Euclidean distance value between two vectors.
Euclidean distance is the special case p = 2 of the Minkowski distance, which is defined by:
D(x, y) = (sum_{i=1}^{d} |x_i - y_i|^p)^{1/p}
Note: the dimensions of two vectors must be identical
DISTRIBUTIONAL_MODEL - Static variable in interface textminer.datarepresentation.DataRepMethods
Distributional model
DistributionalModel - class textminer.text.DistributionalModel.
The DistributionalModel class is an implementation of a language model that is intended to generate a probability distribution model of text data with the "Laplace smoothing" estimator.
DistributionalModel(CorpusIndex, String, String, ArrayList) - Constructor for class textminer.text.DistributionalModel
Constructor of DistributionalModel
DistributionalWordClustering - class textminer.featureselection.DistributionalWordClustering.
The DistributionalWordClustering class is an implementation of an algorithm that groups words according to their distributions.
DistributionalWordClustering(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, String, boolean) - Constructor for class textminer.featureselection.DistributionalWordClustering
Constructor of DistributionalWordClustering
dMethodNames - Static variable in interface textminer.datarepresentation.DataRepMethods
String definitions of data representation methods
doClassification() - Method in class textminer.task.Classifiers
Perform text classification task with the specified method
doClassification() - Method in class textminer.task.Classifier
Perform text classification task with the specified method
doClassify() - Method in class textminer.classification.AbstractClassifier
Perform text classification with the specified method
doClassify(int[]) - Method in class textminer.classification.NaiveBayes1
Classify the given document, doc, which is an array of term locations in the global lexicon
doClassify(int[]) - Method in class textminer.classification.NaiveBayes
Classify a given example into one of the target classes. The example is represented by an array containing the locations of its elements.
doClassify(int[]) - Method in class textminer.classification.BayesianNet
Classify the given document, doc, which is an array of term locations in the global lexicon
doClassify(int[], double[]) - Method in class textminer.classification.WidrowHoff
Classify the specified document which is represented by terms' location and their weights
doClassify(int[], double[]) - Method in class textminer.classification.EGradient
Classify the specified document which is represented by terms' location and their weights
doClustering() - Method in class textminer.clustering.SOM
Perform SOM clustering
doClustering() - Method in class textminer.clustering.PDDP
Perform PDDP clustering.
Maximize Fisher's linear discriminant, |distance between centroids| / |distance within centroids|
doClustering() - Method in class textminer.clustering.kMeansLD
Perform k-means clustering for relatively large data sets
doClustering() - Method in class textminer.clustering.kMeans
Perform kMeans
doClustering() - Method in class textminer.clustering.hacSM
Perform HAC clustering for relatively small data sets
doClustering() - Method in class textminer.clustering.hacLD
Perform HAC clustering for large data sets
doClustering() - Method in class textminer.clustering.HAC
Perform HAC clustering
doClustering() - Method in class textminer.clustering.GAC
Perform GAC clustering
doClustering() - Method in class textminer.clustering.emLD
Perform EM clustering algorithm
doClustering() - Method in class textminer.clustering.EM
Perform EM algorithm
doClustering() - Method in class textminer.clustering.Competitive
Perform constructive-competition clustering while the condition is met
doClustering() - Method in class textminer.clustering.AbstractClusterer
 
doClustering() - Method in class textminer.task.ClustersLD
 
doClustering() - Method in class textminer.task.Clusterers
Perform clustering
doCondensing() - Method in class textminer.text.Condenser
 
doConversion() - Method in class textminer.datarepresentation.ReutersDataSetConverter
Return true if the task of conversion is done successfully
doConversion() - Method in class textminer.datarepresentation.AbstractDataSetConverter
Return true if the task of conversion is done successfully
documentFrequency - Variable in class textminer.text.TextModel
array containing the document frequencies corresponding to the bagofwords array
documentFrequency - Variable in class textminer.text.TermbyDocumentMatrix
array of document frequency
DocumentFrequency - class textminer.featureselection.DocumentFrequency.
The DocumentFrequency class is an implementation of a feature selection method by document frequency.
DocumentFrequency(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, String, boolean) - Constructor for class textminer.featureselection.DocumentFrequency
Constructor of DocumentFrequency
documentIndex - Variable in class textminer.datarepresentation.TextModel
document index of given data set
documentIndex - Variable in class textminer.featureselection.AbstractFeatureSelector
document index of given data set
docvec_repository - Variable in class textminer.classification.AbstractClassifier
 
docvec_repository - Variable in class textminer.clustering.AbstractClusterer
 
docvec_repository - Variable in class textminer.text.TextModel
path of directory containing vectorized text document set
doDataRepresentation() - Method in class textminer.task.DataRepresentClassify
Represent a given data set in machine-readable form.
doDataRepresentation() - Method in class textminer.task.DataRepresentator1
Perform data representation
doDataRepresentation() - Method in class textminer.task.DataRepClassification
Return true if the task of data representation is done successfully
doDataRepresentation() - Method in class textminer.task.AbstractDataRepresentation
Perform the task of data representation based on the specified text model
doFCP() - Method in class textminer.featureselection.FCP
Perform FCP
doFeatureSelection() - Method in class textminer.task.FeatureSelector
Perform the task of feature selection with the specified methods
DOMAIN_EXPERTS - Static variable in interface textminer.classification.ClassificationMethods
Domain Experts classification
DomainExperts - class textminer.classification.DomainExperts.
The DomainExperts class is a classification algorithm, a variant of the weighted-majority algorithm, for classifying textual data sets.
DomainExperts(String) - Constructor for class textminer.classification.DomainExperts
 
doPreprocessing() - Method in class textminer.task.Preprocessor
Perform pre-processing according to the specification
doTasks() - Method in class textminer.task.TaskManager
Perform sub-procedures as specified.

E

EGradient - class textminer.classification.EGradient.
The EGradient class is an implementation of the Exponentiated Gradient (EG) algorithm.
EGradient(int, int, double[][], double) - Constructor for class textminer.classification.EGradient
Constructor of EGradient
elements_afterprun_class - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
elements_per_class - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
EM - class textminer.clustering.EM.
The EM class is an implementation of the Expectation-Maximization (EM) algorithm.
EM - Static variable in interface textminer.clustering.ClusteringMethods
Expectation-Maximization
EM(SubtaskClustering) - Constructor for class textminer.clustering.EM
Constructor of EM
emLD - class textminer.clustering.emLD.
The emLD class is an extended version of EM.
emLD() - Constructor for class textminer.clustering.emLD
Constructor of emLD
entropy(double[]) - Static method in class textminer.util.DataUtil
Return entropy of given array of probabilities
entropy(int[]) - Static method in class textminer.util.DataUtil
Return entropy of given array of frequencies for each element
entry_class - Variable in class textminer.ds.tmDocVecIndexEntry
label of class which this instance belongs to
entry_class - Variable in class textminer.ds.CorpusIndexEntry
target value (class) of this entry
entry_key - Variable in class textminer.ds.tmDocVecIndexEntry
unique identifier
entry_key - Variable in class textminer.ds.CorpusIndexEntry
key (i.e.
entry_location - Variable in class textminer.ds.CorpusIndexEntry
path of file where this entry exists
entry_num - Variable in class textminer.ds.CorpusIndexEntry
serial number of this entry
entry_path - Variable in class textminer.ds.tmDocVecIndexEntry
path of this instance
entry_purpose - Variable in class textminer.ds.tmDocVecIndexEntry
purpose of this instance: for training or for evaluation
entry_purpose - Variable in class textminer.ds.CorpusIndexEntry
indicates whether this entry belongs to the training data set (true: belongs to the training set; false: otherwise)
Env - class textminer.core.Env.
The Env class captures the environmental properties in which the TextMiner is deployed.
Env(String) - Constructor for class textminer.core.Env
Constructor of Env
estimate_performance() - Method in class textminer.evaluation.Evaluator
Estimate the performance of an output by using the specified measures
estimate_weight() - Method in class textminer.featureselection.ModifiedTF
Calculate each term's weight
estimateClassModel() - Method in class textminer.text.VectorSpaceModel
Estimate text models for each class from the training data set.
estimateClassModel() - Method in class textminer.text.TFIDFModel
Return estimated text model in array of doubles
estimateClassModel() - Method in class textminer.text.TextModel
Estimate text models of a given data set for each class
estimateClassModel() - Method in class textminer.text.MultinomialGenerativeModel
Estimate text models for each class of a given data set from the training data set. To avoid zero probabilities, the Laplace estimator is applied.
estimateClassModel() - Method in class textminer.text.DistributionalModel
Estimate text models for each class based on the word feature set
estimateModel() - Method in class textminer.text.VectorSpaceModel
Return estimated text model in array of doubles
estimateModel() - Method in class textminer.text.TFIDFModel
Return text model of given data set using TFIDF
estimateModel() - Method in class textminer.text.TextModel
Estimate a text model of a given data set
estimateModel() - Method in class textminer.text.MultinomialGenerativeModel
Estimate a text model for a given data set from the training data set
estimateModel() - Method in class textminer.text.DistributionalModel
Estimate a text model of a given data set based on the word feature set
evaluation_measures - Static variable in interface textminer.core.Constants
 
EvaluationMeasure - class textminer.evaluation.EvaluationMeasure.
The EvaluationMeasure class is an abstraction of the standard evaluation metrics from the text learning domain.
EvaluationMeasure() - Constructor for class textminer.evaluation.EvaluationMeasure
 
Evaluator - class textminer.evaluation.Evaluator.
The Evaluator class is responsible for generating the contingency table for each class.
Evaluator() - Constructor for class textminer.evaluation.Evaluator
Constructor of Evaluator
examples_per_class - Variable in class textminer.classification.AbstractClassifier
 
examples_per_class - Variable in class textminer.datarepresentation.TextModel
examples (e.g.
examples_per_class - Variable in class textminer.featureselection.AbstractFeatureSelector
examples (e.g.
examples_per_class - Variable in class textminer.text.TextModel
number of examples (e.g.
examples_per_class - Variable in class textminer.text.TermbyDocumentMatrix
number of examples in each class of given data set
examples_per_class - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
ExpectedCrossEntropy - class textminer.featureselection.ExpectedCrossEntropy.
The ExpectedCrossEntropy class is an implementation of the feature selection by the expected Cross Entropy measure.
ExpectedCrossEntropy - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by expected cross entropy
ExpectedCrossEntropy(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, String, boolean) - Constructor for class textminer.featureselection.ExpectedCrossEntropy
Constructor of ExpectedCrossEntropy
EXPO_GRADIENT - Static variable in interface textminer.classification.ClassificationMethods
Exponentiated Gradient
ext_condensed_index_file - Static variable in interface textminer.core.InterOutcomes
 
ext_corpus_stat_file - Static variable in interface textminer.core.InterOutcomes
 
ext_dvec_file - Static variable in interface textminer.core.InterOutcomes
 
ext_fsmethod_file - Static variable in interface textminer.core.InterOutcomes
 
ext_index_file - Static variable in interface textminer.core.InterOutcomes
 
ext_judgment_file - Static variable in interface textminer.core.InterOutcomes
 
ext_lexicon_file - Static variable in interface textminer.core.InterOutcomes
 
ext_matrix_file - Static variable in interface textminer.core.InterOutcomes
 
ext_model_file - Static variable in interface textminer.core.InterOutcomes
 
ext_output_file - Static variable in interface textminer.core.InterOutcomes
 
ext_result_file - Static variable in interface textminer.core.InterOutcomes
 
ext_termdic_file - Static variable in interface textminer.core.InterOutcomes
 
ext_vec_index_file - Static variable in interface textminer.core.InterOutcomes
 
extractBody() - Method in class textminer.ds.TDTData
 
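The two entropy entries listed under E (entropy(double[]) and entropy(int[]) in textminer.util.DataUtil) can be illustrated with a minimal sketch. The class name EntropySketch, the base-2 logarithm, and the handling of zero entries are assumptions made for illustration; the index does not state which base or conventions DataUtil actually uses.

```java
/** Hypothetical illustration; not the textminer.util.DataUtil source. */
public final class EntropySketch {
    private EntropySketch() {}

    /** Shannon entropy in bits (assumed base 2) of an array of probabilities. */
    public static double entropy(double[] probabilities) {
        double h = 0.0;
        for (double p : probabilities) {
            if (p > 0.0) { // by convention, 0 * log(0) contributes nothing
                h -= p * (Math.log(p) / Math.log(2.0));
            }
        }
        return h;
    }

    /** Entropy of an array of frequencies: normalize to probabilities first. */
    public static double entropy(int[] frequencies) {
        long total = 0;
        for (int f : frequencies) {
            total += f;
        }
        if (total == 0) {
            return 0.0; // assumption: an empty distribution has zero entropy
        }
        double[] p = new double[frequencies.length];
        for (int i = 0; i < frequencies.length; i++) {
            p[i] = (double) frequencies[i] / total;
        }
        return entropy(p);
    }
}
```

Under these assumptions, a uniform distribution over two outcomes has an entropy of about 1 bit, and four equally frequent elements give about 2 bits.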

F

FCP - class textminer.featureselection.FCP.
The FCP class is intended to select the frequently co-occurring phrases (FCP) from the text document collection.
FCP - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by Frequently Cooccurred Phrase (FCP)
FCP() - Constructor for class textminer.featureselection.FCP
Constructor of FCP
FeatureSelectionMethods - interface textminer.featureselection.FeatureSelectionMethods.
The FeatureSelectionMethods interface defines a set of constants to be used for feature selection.
FeatureSelector - class textminer.task.FeatureSelector.
The FeatureSelector class is designed to perform the job of feature (subset) selection.
FeatureSelector() - Constructor for class textminer.task.FeatureSelector
Constructor of FeatureSelector
FileCache - class textminer.util.FileCache.
The FileCache class is an implementation of a software cache that speeds up I/O.
FileCache(String) - Constructor for class textminer.util.FileCache
Constructor of FileCache
fileseparator - Variable in class textminer.text.TextDocumentConverter
platform-dependent file separator
fileseparator - Variable in class textminer.text.Indexer
platform-dependent file separator
fileseparator - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
fileseparator - Variable in class textminer.text.DictionaryGenerator
platform-dependent file separator
FinancialData - class textminer.ds.FinancialData.
The FinancialData class is the abstraction of an instance of financial news data.
FinancialData() - Constructor for class textminer.ds.FinancialData
Constructor of FinancialData
FinancialData(String, String, String, String, String, String, String, String) - Constructor for class textminer.ds.FinancialData
Constructor of FinancialData
FinancialDataMaker - class textminer.util.FinancialDataMaker.
The FinancialDataMaker class is implemented to generate a data structure which contains an instance or a set of instances of the "Financial" data set in a structural and machine-readable form.
FinancialDataMaker() - Constructor for class textminer.util.FinancialDataMaker
 
FinancialDataSet - class textminer.ds.FinancialDataSet.
The FinancialDataSet is the abstraction of a collection of financial news articles.
FinancialDataSet() - Constructor for class textminer.ds.FinancialDataSet
Constructor of FinancialDataSet
find_cluster(String) - Method in class textminer.ds.HACClusterSet
Search the cluster by the specified key
find_data(String) - Method in class textminer.ds.DataSet
Return the index of a specific data (i.e.
find(int) - Method in class textminer.ds.CorpusIndex
Return an item at the specified index
find(String) - Method in class textminer.clustering.ProximityArray
Return the proximity value of the specified key.
find(String) - Method in class textminer.ds.ReutersDataSet
Search an instance associated with the specified id
find(String) - Method in class textminer.ds.FinancialDataSet
Find a news article associated with the specified id
find(String) - Method in class textminer.ds.CorpusIndex
Return an item matched with the specified key
fmethod - Variable in class textminer.task.SubtaskSelector
String expression of current feature selection method
fMethodNames - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Definition of feature selection methods

G

GAC - class textminer.clustering.GAC.
The GAC class is an implementation of a variant of the hierarchical agglomerative clustering (HAC) algorithm that was implemented in [Yang et al., 1999].
GAC(SubtaskClustering) - Constructor for class textminer.clustering.GAC
Constructor of GAC
generate_subgroup(String) - Method in class textminer.evaluation.Result
Generate a subset according to the specified label
generateIndex() - Method in class textminer.text.IndexTDT
Generate index for TDT dataset
generateIndex() - Method in class textminer.text.IndexReuters
Generate an index for the Reuters-21578 data set
generateIndex() - Method in class textminer.text.IndexNewsgroup
Generate an index for the 20 Newsgroups data set
generateIndex() - Method in class textminer.text.IndexFinancial
Return true if the job of generating the index of financial news data set is done successfully
generateIndex() - Method in class textminer.text.Indexer
Return true if the job of generating the index is done successfully
generateInvertedMatrix() - Method in class textminer.task.InvertedMatrixGenerator
Perform the task of generating an Inverted Matrix
generateLexicon() - Method in class textminer.text.LexiconGeneratorReuters
Generate a Lexicon for Reuters-21578
generateLexicon() - Method in class textminer.text.LexiconGenerator
Generate a Lexicon of a given data set
get_cluster(int) - Method in class textminer.ds.HACClusterSet
Return the cluster at the specified index
get_clusters_size() - Method in class textminer.ds.HACClusterSet
Return number of clusters
get_clusters() - Method in class textminer.clustering.hacSM
Return clusters
get_flag_feature() - Method in class textminer.task.TaskOrder
Return value of flag for feature selection
get_flag_learning() - Method in class textminer.task.TaskOrder
Return value of flag for learning method
get_flag_num_preprocess_task() - Method in class textminer.task.TaskOrder
Return number of flags
get_flag_preprocess(int) - Method in class textminer.task.TaskOrder
Return value of flag at the specified index
get_freq() - Method in class textminer.ds.Term
Return the frequency of this term
get_id() - Method in class textminer.evaluation.ResultEntry
Return the identifier of this class
get_label() - Method in class textminer.evaluation.ResultEntry
Return the label of this class
get_nAttrs() - Method in class textminer.ds.DataSet
Return the number of actively used attributes in an instance
get_nInstances() - Method in class textminer.ds.DataSet
Return the number of instances in the given dataset
get_numMembers() - Method in class textminer.ds.ClusterMembership
Return number of instances
get_results() - Method in class textminer.clustering.kMeansLD
 
get_results() - Method in class textminer.clustering.hacLD
 
get_results() - Method in class textminer.clustering.emLD
 
get_size_bow() - Method in class textminer.task.DataRepresentator1
Return the number of components in bag of words
get_term() - Method in class textminer.ds.Term
Return the term in this class
get_weight() - Method in class textminer.ds.weightedTerm
Return weight
getActiveClasses() - Method in class textminer.task.SubtaskClassifiers
Return target labels of a given data set
getAlias() - Method in class textminer.task.SubtaskClustering
Return task alias
getAlias() - Method in class textminer.task.SubtaskClassifiers
Return the alias of current task
getAppName() - Method in class textminer.core.Env
Return name of application
getArray(Vector, boolean) - Method in class textminer.datarepresentation.TextModel
 
getArray(Vector, boolean) - Method in class textminer.featureselection.AbstractFeatureSelector
 
getAvgRMSerror() - Method in class textminer.classification.ANN
 
getBody() - Method in class textminer.ds.TDTData
 
getBody() - Method in class textminer.ds.NewsgroupInstance
 
getBody() - Method in class textminer.ds.FinancialData
Get the content body of this news article
getBodyField() - Method in class textminer.ds.ReutersInstance
Return body text assigned to this instance
getChildNode(int) - Method in class textminer.ds.CondProbTable
 
getClassesAssociatedDocs(int[]) - Method in class textminer.datarepresentation.TextModel
 
getClassesAssociatedDocs(int[]) - Method in class textminer.featureselection.AbstractFeatureSelector
 
getClassesAssociatedTerm(int[]) - Method in class textminer.datarepresentation.TextModel
 
getClassesAssociatedTerm(int[]) - Method in class textminer.featureselection.AbstractFeatureSelector
 
getClassificationMethod() - Method in class textminer.task.SubtaskClassifiers
Return current method for text classification
getCompany() - Method in class textminer.ds.FinancialData
Get the company that this news article mainly discusses
getCompleteVector() - Method in class textminer.ds.TextDocument
Return the TextDocument as a Vector containing all words and their weights
getCondProbOneParent(boolean) - Method in class textminer.ds.CPTEntry
 
getCondProbTwoParents(boolean, boolean) - Method in class textminer.ds.CPTEntry
 
getDataDir() - Method in class textminer.core.Env
Return (relative) path of data directory
getDataRepository() - Method in class textminer.task.SubtaskClustering
Return path of data directory
getDataRepository() - Method in class textminer.task.SubtaskClassifiers
Return the directory containing data
getDataSetClasses() - Method in class textminer.task.Task
Return classes (target, true) labels of a given data set
getDataSetDir() - Method in class textminer.task.Task
Return path of a given data set
getDataSetDir() - Method in class textminer.task.SubtaskClustering
Return path of data set directory
getDataSetDir() - Method in class textminer.task.SubtaskClassifiers
Return the directory containing the data set
getDataSetFormat() - Method in class textminer.task.Task
Return format of a given data set
getDataSetName() - Method in class textminer.task.Task
Return name of a given data set
getDataSetTratio() - Method in class textminer.task.Task
Return proportion of training set to a given data set
getDataSetType() - Method in class textminer.task.Task
Return type of a given data set
getDate() - Method in class textminer.ds.TDTData
 
getDate() - Method in class textminer.ds.FinancialData
Get the published date of this news article
getDateField() - Method in class textminer.ds.ReutersInstance
Return date of this instance
getDesiredClusters() - Method in class textminer.task.Task
Return desired number of clusters
getDesiredClusters() - Method in class textminer.task.SubtaskClustering
Return desired number of clusters
getDocFreq(int[]) - Method in class textminer.datarepresentation.TextModel
 
getDocFreq(int[]) - Method in class textminer.featureselection.AbstractFeatureSelector
 
getDocIDArray() - Method in class textminer.ds.IntermediateTextFileEntry
Return an array of integers containing the ids of the documents in which the term occurred
getElement(int) - Method in class textminer.ds.SequentialHashMap
Return an element at the specified index
getElement(Object) - Method in class textminer.ds.SequentialHashMap
Return an element associated with the specified key
getEntry(int) - Method in class textminer.ds.InvertedIndex
 
getErrorMsg() - Method in class textminer.task.Task
Return error messages that occurred while parsing the task specification file
getExchangeField() - Method in class textminer.ds.ReutersInstance
Return exchanges field assigned to this instance
getFeatureSelectionMethod() - Method in class textminer.task.SubtaskClassifiers
Return current method of feature selection
getFieldValues(String) - Method in class textminer.datarepresentation.TextModel
 
getFieldValues(String) - Method in class textminer.featureselection.AbstractFeatureSelector
 
getHashMapOccurrence() - Method in class textminer.ds.InvertedMatrixTermEntry
Return info about term occurrences in HashMap
getId() - Method in class textminer.ds.NewsgroupInstance
 
getID() - Method in class textminer.ds.TDTData
 
getID() - Method in class textminer.ds.LexiconEntry
Return identifier
getID() - Method in class textminer.ds.FinancialData
Get the identifier of this news article
getIdField() - Method in class textminer.ds.ReutersInstance
Return id of this instance
getKeySet() - Method in class textminer.ds.SequentialHashMap
Return keyset
getLabel() - Method in class textminer.ds.FinancialData
Get the target label of this news article
getLearnMethod() - Method in class textminer.task.Task
Return name of learning method for given task
getLexiconVector() - Method in class textminer.ds.Lexicon
Return Lexicon in Vector
getMethod() - Method in class textminer.task.SubtaskClustering
Return current clustering method
getNodeID() - Method in class textminer.ds.CPTEntry
 
getNumberOfDistinctTerms() - Method in class textminer.ds.Lexicon
Return number of distinct terms in the Lexicon
getNumberOfFiles(String) - Static method in class textminer.util.IOUtil
Return number of files in the specified path
getNumberOfInstances() - Method in class textminer.ds.SequentialHashMap
 
getNumberOfInstances() - Method in class textminer.ds.CorpusIndex
Return number of instances in this class
getNumberOfParents() - Method in class textminer.ds.CPTEntry
 
getNumberOfTerms() - Method in class textminer.ds.IntermediateTextFile
Return number of terms in this class
getNumOfLabels() - Method in class textminer.evaluation.ResultEntry
 
getNumOfTerms() - Method in class textminer.core.Env
Return the number of components (terms) per (document) vector
getNumOfTerms() - Method in class textminer.task.Preprocessor
Return number of terms in the term dictionary
getNumOfTerms() - Method in class textminer.text.DictionaryGeneratorMultiClass
Return the number of terms in the master term dictionary
getOrgField() - Method in class textminer.ds.ReutersInstance
Get org field
getParentID(int) - Method in class textminer.ds.CPTEntry
 
getParentNode(int) - Method in class textminer.ds.CondProbTable
 
getPeopleField() - Method in class textminer.ds.ReutersInstance
Return people field assigned to this instance
getPlaceField() - Method in class textminer.ds.ReutersInstance
Return place string assigned to this instance
getPrepNoise(int) - Method in class textminer.task.Task
Return true if preprocessing of noise is specified
getPrepStemming() - Method in class textminer.task.Task
Return true if stemming option is specified
getProximityArrayVector() - Method in class textminer.clustering.ProximityArray
Return ProximityArray in Vector
getRandomNumber(double) - Static method in class textminer.util.MathUtil
Return a random number as a double
getRandomNumber(int) - Static method in class textminer.util.MathUtil
Return a random number as an integer
getRepFS() - Method in class textminer.task.Task
Return name of method for feature selection
getRepModel() - Method in class textminer.task.Task
Return name of model for data representation
getRepModel() - Method in class textminer.task.AbstractDataRepresentation
Get the current model of data representation
getResultBuffer() - Method in class textminer.text.Stemmer
Returns a reference to a character buffer containing the results of the stemming process.
getResultDir() - Method in class textminer.core.Env
Return (relative) path of result directory
getResultLength() - Method in class textminer.text.Stemmer
Returns length of the word resulting from the stemming
getResultRepository() - Method in class textminer.task.SubtaskClustering
Return path of result directory
getResultRepository() - Method in class textminer.task.SubtaskClassifiers
Return the directory containing intermediate results
getSize() - Method in class textminer.util.FileCache
Return the size of the cached file.
getSource() - Method in class textminer.ds.TDTData
 
getSource() - Method in class textminer.ds.FinancialData
Get the news provider of this news article
getStringOccurrence() - Method in class textminer.ds.InvertedMatrixTermEntry
Return a string representation of term occurrences
getStringOccurrence() - Method in class textminer.ds.IntermediateTextFileEntry
Return a string expression of term occurrences
getSubArray(int[], int, int) - Static method in class textminer.util.DataUtil
Return the subarray of the given source array that starts at the specified index, from, and has the specified size
getSubject() - Method in class textminer.ds.NewsgroupInstance
 
getTaskAssignment() - Method in class textminer.task.Task
Return numerical representation of current task
getTaskName() - Method in class textminer.task.Task
Return name of task
getTaskName() - Method in class textminer.task.SubtaskClustering
Return task name
getTaskName() - Method in class textminer.task.SubtaskClassifiers
Return name of a given task
getTaskOutDir() - Method in class textminer.task.Task
Return path of task output
getTaskOutputFilename() - Method in class textminer.task.Task
Return file name of task output
getTaskOutType() - Method in class textminer.task.Task
Return output type of current task
getTerm() - Method in class textminer.ds.LexiconEntry
Return term
getTermArray() - Method in class textminer.ds.IntermediateTextFile
Return an array of integers containing term identifier
getTermDictionary(int) - Method in class textminer.featureselection.ModifiedTF
Return an instance of term dictionary at the specified index
getTermID() - Method in class textminer.ds.InvertedMatrixTermEntry
Return term identifier
getTermID() - Method in class textminer.ds.IntermediateTextFileEntry
Return term identifier
getTermID(String) - Method in class textminer.featureselection.AbstractFeatureSelector
 
getTitle() - Method in class textminer.ds.TDTData
 
getTitle() - Method in class textminer.ds.FinancialData
Get the title of this news article
getTitleField() - Method in class textminer.ds.ReutersInstance
Return title field assigned to this instance
getTopicField() - Method in class textminer.ds.ReutersInstance
Return topic assigned to this instance
getTotalNumberOfTerms() - Method in class textminer.ds.Lexicon
Return total number of terms in the Lexicon
getURL() - Method in class textminer.ds.FinancialData
Get the url of this news article
getVecSize() - Method in class textminer.task.SubtaskClustering
Return size of (document) vector
getVector() - Method in class textminer.ds.TextDocument
Return the TextDocument as a Vector containing all words. The returned vector represents a text document as the combination of words that occur in the document, each listed once regardless of its term frequency.
getVerbose() - Method in class textminer.task.Task
Return true if verbose is on
getVerbose() - Method in class textminer.task.SubtaskClustering
Return the state of verbosity
global_termdic - Variable in class textminer.text.DictionaryGenerator
term dictionary for given data set
globalLexicon - Variable in class textminer.datarepresentation.TextModel
global lexicon
globalLexicon - Variable in class textminer.featureselection.AbstractFeatureSelector
Global Lexicon of the given data set
globalLexiconFilename - Variable in class textminer.datarepresentation.TextModel
file name of global lexicon file

H

HAC - class textminer.clustering.HAC.
The HAC class is an implementation of Hierarchical Agglomerative Clustering (HAC).
HAC - Static variable in interface textminer.clustering.ClusteringMethods
Hierarchical Agglomerative Clustering
HAC(SubtaskClustering) - Constructor for class textminer.clustering.HAC
Constructor of HAC
HACClusterSet - class textminer.ds.HACClusterSet.
The HACClusterSet class is intended to maintain a tree structure which contains intermediate results produced by Hierarchical Agglomerative Clustering.
HACClusterSet() - Constructor for class textminer.ds.HACClusterSet
Constructor of HACClusterSet
HACClusterSet(int) - Constructor for class textminer.ds.HACClusterSet
Constructor of HACClusterSet
hacLD - class textminer.clustering.hacLD.
The hacLD class is an extended version of HAC.
hacLD() - Constructor for class textminer.clustering.hacLD
Constructor of hacLD
hacMethods - interface textminer.clustering.hacMethods.
The hacMethods interface defines three binding methods for Hierarchical Agglomerative Clustering (HAC).
hacSM - class textminer.clustering.hacSM.
The hacSM class is an extended version of HAC.
hacSM() - Constructor for class textminer.clustering.hacSM
Constructor of hacSM
hamming_distance(String, String) - Static method in class textminer.util.TextUtil
Return the hamming distance between str1 and str2.
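As a minimal sketch of what a Hamming distance between two strings usually means (the number of positions at which the characters differ), consider the following. The class name HammingSketch and the choice to reject strings of unequal length are assumptions; the index does not say how textminer.util.TextUtil treats that case.

```java
/** Hypothetical illustration; not the textminer.util.TextUtil source. */
public final class HammingSketch {
    private HammingSketch() {}

    /** Count the positions at which str1 and str2 hold different characters. */
    public static int hammingDistance(String str1, String str2) {
        if (str1.length() != str2.length()) {
            // Assumption: the classic definition only covers equal-length strings.
            throw new IllegalArgumentException("strings must have equal length");
        }
        int distance = 0;
        for (int i = 0; i < str1.length(); i++) {
            if (str1.charAt(i) != str2.charAt(i)) {
                distance++;
            }
        }
        return distance;
    }
}
```

With this sketch, hammingDistance("karolin", "kathrin") is 3, because the strings differ at the third, fourth, and fifth characters.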

I

identifySentenceBoundary(String) - Static method in class textminer.util.TextUtil
Detect the sentence boundary from the specified text and return a sentence with boundaries
index_filename - Variable in class textminer.text.Indexer
file name of index to be written
Indexer - class textminer.text.Indexer.
The Indexer class is an abstract class that provides a set of constants and common functions required for the task of generating the index of given data set.
Indexer(String, String, String, int, boolean) - Constructor for class textminer.text.Indexer
Constructor of Indexer
IndexFinancial - class textminer.text.IndexFinancial.
The IndexFinancial class is implemented to perform the job of generating the index file for the Financial news data set.
IndexFinancial(String, String, String, int, boolean) - Constructor for class textminer.text.IndexFinancial
Constructor of IndexFinancial
IndexNewsgroup - class textminer.text.IndexNewsgroup.
The IndexNewsgroup class is responsible for generating the index file of the 20 Newsgroups data set.
IndexNewsgroup(String, String, String, int, boolean) - Constructor for class textminer.text.IndexNewsgroup
Constructor of IndexNewsgroup
indexOf(String[], String) - Static method in class textminer.util.TextUtil
Return the index of the entry in a string array that matches the specified string, element
IndexReuters - class textminer.text.IndexReuters.
The IndexReuters class is responsible for generating the index file for the Reuters-21578 data set.
IndexReuters(String, String, String, int, boolean) - Constructor for class textminer.text.IndexReuters
Constructor of IndexReuters
IndexTDT - class textminer.text.IndexTDT.
The IndexTDT class is responsible for generating the index file of the Topic Detection and Tracking (TDT) data set.
IndexTDT(String, String, String, int, boolean) - Constructor for class textminer.text.IndexTDT
Constructor of IndexTDT
INFO_GAIN - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by Information Gain
InfoGain - class textminer.featureselection.InfoGain.
The InfoGain class is an implementation of a feature selection method by information gain.
InfoGain(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, String, boolean) - Constructor for class textminer.featureselection.InfoGain
Constructor of InfoGain
init_classifier() - Method in class textminer.classification.AbstractClassifier
 
init_clusterer() - Method in class textminer.clustering.AbstractClusterer
Initialize a clustering algorithm
init_components(int) - Method in class textminer.ds.Instance
Initialize this instance's components
init_network(int, int) - Method in class textminer.clustering.SOM
Create a network structure for SOM with the specified parameters
init() - Method in class textminer.featureselection.FCP
Initialize
init() - Method in class textminer.task.TaskManager
Initialize the TaskManager and specify task orders for sub-procedures
init(dataSet, int) - Method in class textminer.clustering.PDDP
Initialize
init(double, double) - Method in class textminer.clustering.GAC
Initialize
init(double, int, DataSet, int) - Method in class textminer.clustering.hacSM
Initialize
init(double, int, int, int) - Method in class textminer.clustering.HAC
Initialize
init(int, double, int, int, int, boolean) - Method in class textminer.clustering.hacLD
Initialize
init(int, int) - Method in class textminer.clustering.Competitive
Initialize
init(int, int, int) - Method in class textminer.clustering.kMeans
Initialize
init(int, int, int) - Method in class textminer.clustering.EM
Initialize EM class
init(int, int, int, boolean) - Method in class textminer.clustering.kMeansLD
Initialize
init(int, int, int, int, boolean) - Method in class textminer.clustering.emLD
Initialize
init(Result, Result, Vector, int, int, String, String, String) - Method in class textminer.evaluation.EvaluationMeasure
Get values and initialize the EvaluationMeasure class
init(String) - Method in class textminer.core.Main
Initialize configuration for current task
init(String, String, boolean, String, String, String, boolean) - Method in class textminer.evaluation.Evaluator
Initialization of Evaluator. Note: the output file should be formatted as follows:
identifier, true label (e.g., 1, 0)
init(SubtaskClassifiers) - Method in class textminer.task.Classifiers
Initialization of Classifiers
init(SubtaskClassifiers) - Method in class textminer.task.Classifier
Initialization of Classifiers
init(SubtaskClustering) - Method in class textminer.task.ClustersLD
Initialize by specification, subtasks
init(SubtaskClustering) - Method in class textminer.task.Clusterers
Initialize with specification, subtasks
init(SubtaskPreprocess) - Method in class textminer.task.Preprocessor
Initialize Preprocessor
init(SubtaskRepresentator) - Method in class textminer.task.DataRepresentClassify
Initialization of DataRepresentClassify
init(SubtaskRepresentator) - Method in class textminer.task.DataRepresentator1
Initialize DataRepresentator1
init(SubtaskRepresentator) - Method in class textminer.task.DataRepClassification
Initialize DataRepClassification
init(SubtaskSelector) - Method in class textminer.task.FeatureSelector
Initialize FeatureSelector using given task specification, subtask
init(tmTermDictionary[], int[], int[]) - Method in class textminer.featureselection.ModifiedTF
Initialization of ModifiedTF
initCriterion() - Method in class textminer.classification.DomainExperts
 
initializeWeights() - Method in class textminer.classification.ANN
Initialize weights in range of -1 to +1
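A minimal sketch of initializing weights in the range -1 to +1, as described above. This is not the textminer.classification.ANN source: the class name WeightInitSketch, the count and seed parameters, and the choice of a uniform distribution are assumptions made for illustration.

```java
import java.util.Random;

/** Hypothetical sketch; not the textminer.classification.ANN implementation. */
final class WeightInitSketch {
    private WeightInitSketch() {}

    /** Returns count weights drawn uniformly from [-1, 1) (distribution is assumed). */
    static float[] initializeWeights(int count, long seed) {
        Random rng = new Random(seed);
        float[] weights = new float[count];
        for (int i = 0; i < count; i++) {
            // nextFloat() is in [0, 1), so 2x - 1 is in [-1, 1).
            weights[i] = 2.0f * rng.nextFloat() - 1.0f;
        }
        return weights;
    }
}
```

A fixed seed keeps runs reproducible, which helps when comparing training results.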
initWeights() - Method in class textminer.classification.DomainExperts
 
insert(int, int, int) - Method in class textminer.ds.IntermediateTextFile
Insert a term with the specified info
insert(int, String) - Method in class textminer.ds.Lexicon
Insert the specified term and its identifier in this lexicon
insert(Object, Object) - Method in class textminer.ds.SequentialHashMap
Insert a given item by associating the specified value with the specified key
insert(ReutersInstance) - Method in class textminer.ds.ReutersDataSet
Insert the specified instance to this data set
insert(String) - Method in class textminer.ds.TextDocument
Insert the specified source into this class
insert(String) - Method in class textminer.ds.TermDictionary
Inserts the specified element in this TermDictionary.
insert(String) - Method in class textminer.ds.Lexicon
Insert the specified term in this Lexicon
insert(String) - Method in class textminer.ds.InvertedIndex
 
insert(String, double) - Method in class textminer.clustering.ProximityArray
Insert the specified key and value by associating the pair in this HashMap.
insert(String, int) - Method in class textminer.ds.TermDictionary
Insert the specified element with df in this TermDictionary
insert(String, String) - Method in class textminer.evaluation.Result
Insert the specified id and its label to this class
insert(String, String, String, String, String, String, String, String) - Method in class textminer.ds.FinancialDataSet
Insert the specified element to this FinancialDataSet
insert(String, String, String, String, String, String, String, String, String) - Method in class textminer.ds.ReutersDataSet
Insert the specified element to this data set
insert(Term) - Method in class textminer.ds.TermDictionary
Inserts the specified element in this TermDictionary.
insertChildNode(int, int) - Method in class textminer.ds.CondProbTable
Insert the specified node_id as a child node with the given number of parents, num_parents
insertChildNode(int, int, int, boolean, boolean, double) - Method in class textminer.ds.CondProbTable
Insert the specified node_id with its two parents (parent1 and parent2), flags indicating their presence, and the conditional probability
insertParentNode(int, double) - Method in class textminer.ds.CondProbTable
Insert the given node as a parent node with conditional probability
Instance - class textminer.ds.Instance.
The Instance class is an encapsulation of the instance of given data set.
Instance() - Constructor for class textminer.ds.Instance
Constructor of Instance
instances_per_class - Variable in class textminer.text.LexiconGenerator
number of instances per class
IntermediateTextFile - class textminer.ds.IntermediateTextFile.
The IntermediateTextFile class is an encapsulation of a text document.
IntermediateTextFile() - Constructor for class textminer.ds.IntermediateTextFile
Constructor of IntermediateTextFile
IntermediateTextFileEntry - class textminer.ds.IntermediateTextFileEntry.
The IntermediateTextFileEntry class is an encapsulation of a term in the inverted index.
IntermediateTextFileEntry() - Constructor for class textminer.ds.IntermediateTextFileEntry
Constructor of IntermediateTextFileEntry
IntermediateTextFileEntry(int, int, int) - Constructor for class textminer.ds.IntermediateTextFileEntry
Constructor of IntermediateTextFileEntry
InterOutcomes - interface textminer.core.InterOutcomes.
The InterOutcomes interface defines a set of file extensions which are used to name intermediate result files.
interpretFinalResult() - Method in class textminer.util.DataMerger
 
inverted_index_size - Variable in class textminer.datarepresentation.TextModel
size of inverted index.
inverted_index_size - Variable in class textminer.featureselection.AbstractFeatureSelector
size of inverted index.
invertedIndex - Variable in class textminer.datarepresentation.TextModel
inverted index in Vector
invertedIndex - Variable in class textminer.featureselection.AbstractFeatureSelector
inverted index in Vector
InvertedIndex - class textminer.ds.InvertedIndex.
The InvertedIndex class
InvertedIndex() - Constructor for class textminer.ds.InvertedIndex
 
InvertedIndexEntry - class textminer.ds.InvertedIndexEntry.
The InvertedIndexEntry class
InvertedIndexEntry() - Constructor for class textminer.ds.InvertedIndexEntry
 
InvertedMatrixGenerator - class textminer.task.InvertedMatrixGenerator.
The InvertedMatrixGenerator class is intended to perform the task of generating an inverted matrix.
InvertedMatrixGenerator(String, String, String, String, String, String, String, boolean) - Constructor for class textminer.task.InvertedMatrixGenerator
Constructor of InvertedMatrixGenerator
InvertedMatrixTermEntry - class textminer.ds.InvertedMatrixTermEntry.
The InvertedMatrixTermEntry class
InvertedMatrixTermEntry() - Constructor for class textminer.ds.InvertedMatrixTermEntry
Constructor of InvertedMatrixTermEntry
InvertedMatrixTermEntry(int) - Constructor for class textminer.ds.InvertedMatrixTermEntry
Constructor of InvertedMatrixTermEntry
IOUtil - class textminer.util.IOUtil.
The IOUtil class provides a set of utilities related to disk I/O.
IOUtil() - Constructor for class textminer.util.IOUtil
 
isExistTdicfiles() - Method in class textminer.text.DictionaryMaker1
Return true if term dictionary files exist
isExistTdicfiles() - Method in class textminer.text.DictionaryMaker
Return true if term dictionary files exist
isExistTdicfiles() - Method in class textminer.text.DictionaryGeneratorMultiClass
Verify whether files of selected term dictionaries exist in the result repository, in order to avoid duplicating work
isExistTdicfiles() - Method in class textminer.text.DictionaryGenerator
Return true if term dictionary files exist
isFileExist(String) - Static method in class textminer.util.IOUtil
Return true if the specified filename exists
IsItOccur(int) - Method in class textminer.ds.CondProbTable
 
IsLexiconExist() - Method in class textminer.text.LexiconGenerator
Return true if the lexicon file exists
isMatrixfileExist() - Method in class textminer.text.TermbyDocumentMatrix
Return true if the task of constructing term-by-document matrix is already done
isTerminator(char) - Static method in class textminer.util.TextUtil
Return true if the specified character, c, is a sentence terminator

K

KLDivergence(double[], double[]) - Static method in class textminer.util.Similarity
Return Kullback-Leibler (KL) divergence between two given probability distributions.
KL divergence is a measure of how different two probability distributions (over the same event space) are.
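As a rough illustration of this measure, the sketch below computes D(P || Q) = sum over i of p[i] * log2(p[i] / q[i]). It is not the textminer.util.Similarity source: the class name KlDivergenceSketch, the base-2 logarithm, and the handling of zero entries are assumptions.

```java
/** Hypothetical sketch; not the textminer.util.Similarity implementation. */
final class KlDivergenceSketch {
    private KlDivergenceSketch() {}

    /**
     * Returns D(p || q) in bits for two distributions over the same event space.
     * Terms with p[i] == 0 contribute nothing; q[i] == 0 with p[i] > 0 gives infinity.
     */
    static double klDivergence(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("distributions must have the same length");
        }
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] == 0.0) {
                continue; // 0 * log(0 / q) is taken to be 0
            }
            if (q[i] == 0.0) {
                return Double.POSITIVE_INFINITY;
            }
            sum += p[i] * (Math.log(p[i] / q[i]) / Math.log(2.0));
        }
        return sum;
    }
}
```

Note that the measure is not symmetric: klDivergence(p, q) and klDivergence(q, p) generally differ.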
kMeans - class textminer.clustering.kMeans.
The kMeans class is an implementation of k-means clustering.
KMEANS - Static variable in interface textminer.clustering.ClusteringMethods
k-means
kMeans(SubtaskClustering) - Constructor for class textminer.clustering.kMeans
Constructor of kMeans
kMeansLD - class textminer.clustering.kMeansLD.
The kMeansLD class is an extended version of k-Means.
kMeansLD() - Constructor for class textminer.clustering.kMeansLD
Constructor of kMeansLD
KNN - class textminer.classification.KNN.
The KNN class is an implementation of k-Nearest Neighbors (kNN).
KNN() - Constructor for class textminer.classification.KNN
Constructor of KNN
KNN(TermDictionary, String) - Constructor for class textminer.classification.KNN
Constructor of KNN

L

learning() - Method in class textminer.classification.ANN
Running network
learnWeight(int[], double[], int) - Method in class textminer.classification.WidrowHoff
Modify weight vectors by using the specified document which is represented by terms' location and their weights
learnWeight(int[], double[], int) - Method in class textminer.classification.EGradient
Modify weight vectors by using the specified document which is represented by terms' location and their weights
lexicon - Variable in class textminer.text.TextDocumentConverter
(global) lexicon of given data set
lexicon - Variable in class textminer.text.LexiconGenerator
a structure for global lexicon
Lexicon - class textminer.ds.Lexicon.
The Lexicon class is a data structure for maintaining all unique words that appear in the given data set.
lexicon_filename - Variable in class textminer.text.LexiconGenerator
name of global lexicon file
lexicon_stat_filename - Variable in class textminer.text.LexiconGenerator
name of file containing lexicon statistics of given dataset.
Lexicon() - Constructor for class textminer.ds.Lexicon
Constructor of Lexicon
LexiconEntry - class textminer.ds.LexiconEntry.
The LexiconEntry class is a data structure of an entry in Lexicon for a given data set.
LexiconEntry() - Constructor for class textminer.ds.LexiconEntry
Constructor of LexiconEntry
LexiconEntry(String, int) - Constructor for class textminer.ds.LexiconEntry
Constructor of LexiconEntry
LexiconGenerator - class textminer.text.LexiconGenerator.
The LexiconGenerator is designed to carry out the task of generating a lexicon for a given data set.
LexiconGenerator(CorpusIndex, String, String, String, String, String, String, String, Vector, boolean) - Constructor for class textminer.text.LexiconGenerator
Constructor of LexiconGenerator
LexiconGeneratorReuters - class textminer.text.LexiconGeneratorReuters.
The LexiconGeneratorReuters class is responsible for generating a lexicon for Reuters-21578 data sets.
LexiconGeneratorReuters(CorpusIndex, String, String, String, String, String, String, String, Vector, boolean) - Constructor for class textminer.text.LexiconGeneratorReuters
Constructor of LexiconGeneratorReuters
LinearClassifier - class textminer.classification.LinearClassifier.
The LinearClassifier class is an abstract class for linear classifiers.
LinearClassifier(int, int, double[][], double) - Constructor for class textminer.classification.LinearClassifier
Constructor of LinearClassifier
lineseparator - Variable in class textminer.text.Indexer
platform-dependent line separator
load_datafile() - Method in class textminer.ds.DataSet
Load the dataset file from the hard disk. Note: the attribute separator should be a comma (,).
loadCPT() - Method in class textminer.classification.BayesianNet
Load conditional probability tables for target classes
loadData() - Method in class textminer.task.Classifiers
Load various data for text classification from disk.
loadGlobalLexicon() - Method in class textminer.featureselection.AbstractFeatureSelector
 
loadIndex() - Method in class textminer.task.ClustersLD
 
loadIndex() - Method in class textminer.task.AbstractDataRepresentation
Generate the structure of index from disk
loadIndexfile(String, int) - Method in class textminer.ds.CorpusIndex
Load an index from the specified file according to the option (0: classification, 1: clustering)
loadLexiconfile(String) - Method in class textminer.ds.Lexicon
Load lexicon from the specified filename
loadModels() - Method in class textminer.classification.AbstractClassifier
Load the estimated word feature set, class text models, and term-by-document matrix from disk
loadProximityArray(String) - Method in class textminer.clustering.ProximityArray
 
loadStatistics(String, int) - Static method in class textminer.util.IOUtil
Load the statistics of a given data set from the specified stat_filename on the disk
loadTestData(String, Vector) - Static method in class textminer.util.IOUtil
Load data from the specified filename.
Convert a textual data file into an array of java.util.Vector
local_lexicon_filename - Variable in class textminer.text.LexiconGenerator
common name of local (each class) lexicon file
local_lexicons - Variable in class textminer.text.LexiconGenerator
array of lexicons, one for each class
localLexiconFilename - Variable in class textminer.datarepresentation.TextModel
common file name of local lexicon file
log(String) - Method in class textminer.clustering.AbstractClusterer
 
log(String) - Method in class textminer.datarepresentation.TextModel
 
log(String) - Method in class textminer.datarepresentation.AbstractDataSetConverter
Print the given msg
log(String) - Method in class textminer.ds.BagOfWords
 
log(String) - Method in class textminer.featureselection.AbstractFeatureSelector
 
log(String) - Method in class textminer.task.AbstractDataRepresentation
 
log(String) - Method in class textminer.text.TextModel
Print the specified msg out at the standard output
log(String) - Method in class textminer.text.Indexer
 
log(String) - Method in class textminer.text.DictionaryGeneratorMultiClass
 
log(String) - Method in class textminer.text.DictionaryGenerator
 
log2(double) - Static method in class textminer.util.MathUtil
Return logarithm of x for base 2.
LOWER_BOUND - Static variable in class textminer.text.DictionaryGeneratorMultiClass
Definition of the threshold for infrequent words

M

Main - class textminer.core.Main.
The Main class is intended to set up the environment for TextMiner and make TextMiner ready to go.
Main(String) - Constructor for class textminer.core.Main
Constructor of Main
make_FinancialData(String, String) - Static method in class textminer.util.FinancialDataMaker
Return a data structure for a financial data associated with the specified key
make_NewsgroupInstance(String) - Static method in class textminer.util.NewsgroupdataMaker
Return NewsgroupInstance from the specified filename
make_results(String) - Method in class textminer.clustering.PDDP
 
make_results(String) - Method in class textminer.clustering.kMeans
Write the clustering results to the specified filename
make_results(String) - Method in class textminer.clustering.hacLD
Write the clustering result to the specified filename
make_results(String) - Method in class textminer.clustering.HAC
Write the clustering result to the specified filename
make_results(String) - Method in class textminer.clustering.emLD
 
make_results(String) - Method in class textminer.clustering.EM
Generate the output file containing each instance paired with its predicted cluster
make_results(String) - Method in class textminer.clustering.Competitive
Generate the output file containing each instance paired with its predicted cluster
make_results(String, boolean) - Method in class textminer.clustering.hacSM
 
make_results(String, Vector[]) - Method in class textminer.clustering.kMeansLD
Write the clustering result to the specified filename
make_TDTData(String) - Static method in class textminer.util.TDTdataMaker
Make an instance of TDTData from the specified filename
makeDataset(String) - Static method in class textminer.util.FinancialDataMaker
Make FinancialDataSet from the specified filename
makeDataSet(String) - Static method in class textminer.util.ReutersDataMaker
Make ReutersDataSet from the specified filename
makeDataset(Vector) - Static method in class textminer.util.FinancialDataMaker
Make FinancialDataSet from the specified vector
makeDecision(NewsArticle) - Method in class textminer.classification.KNN
Classify the specified article
makeDecision(NewsArticle, String) - Method in class textminer.classification.DomainExperts
 
makeDecision(TermDictionary) - Method in class textminer.classification.KNN
Classify the specified test document
makeInstance(String, String) - Static method in class textminer.util.ReutersDataMaker
Make an instance of ReutersInstance associated with the specified key from the specified file.
makeTextDocument(String) - Static method in class textminer.util.TextUtil
Return an instance of textminer.ds.TextDocument from a given text, source
MarkovBlanket - class textminer.featureselection.MarkovBlanket.
The MarkovBlanket class is an implementation of the feature selection by the Markov Blanket measure.
MarkovBlanket - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by Markov blanket
MarkovBlanket(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, String, boolean) - Constructor for class textminer.featureselection.MarkovBlanket
Constructor of MarkovBlanket
markPurpose(int) - Method in class textminer.text.IndexFinancial
Mark which data set (training or test) each data item belongs to
MathUtil - class textminer.util.MathUtil.
The MathUtil class provides a set of mathematical utility functions.
MathUtil() - Constructor for class textminer.util.MathUtil
 
matrix - Variable in class textminer.text.TermbyDocumentMatrix
array of full-matrix
matrix_filename - Variable in class textminer.text.TermbyDocumentMatrix
name of term-by-document matrix file
max(double[]) - Static method in class textminer.util.MathUtil
Return the index of the maximum value in the given array
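Because this method returns a position rather than a value, a short sketch may help. The class name ArgMaxSketch and the tie-breaking rule (first occurrence wins) are assumptions; the textminer.util.MathUtil source may behave differently.

```java
/** Hypothetical sketch; not the textminer.util.MathUtil implementation. */
final class ArgMaxSketch {
    private ArgMaxSketch() {}

    /** Returns the index of the largest element; ties resolve to the first occurrence. */
    static int indexOfMax(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        int best = 0;
        for (int i = 1; i < values.length; i++) {
            if (values[i] > values[best]) {
                best = i;
            }
        }
        return best;
    }
}
```

The index of the minimum follows the same pattern with the comparison reversed.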
mean(double[]) - Static method in class textminer.util.StatUtil
Returns the mean of the specified vector
merge_cluster(String, String, double) - Method in class textminer.ds.HACClusterSet
Merge the specified cluster1 and cluster2 with the specified similarity value
Merge(int) - Method in class textminer.util.DataMerger
Perform merge at the first phase
mergeArrays(int[], int[]) - Static method in class textminer.util.DataUtil
Merge two given arrays into one
mergesort(int[], int[]) - Static method in class textminer.util.DataUtil
Return an array resulting from the merging and sorting of two given arrays.
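A minimal sketch of merging two integer arrays into one sorted array. It concatenates the inputs and delegates to java.util.Arrays.sort rather than a hand-written merge sort, so it illustrates the result only; the class name MergeSortedSketch and the decision to keep duplicates are assumptions.

```java
import java.util.Arrays;

/** Hypothetical sketch; not the textminer.util.DataUtil implementation. */
final class MergeSortedSketch {
    private MergeSortedSketch() {}

    /** Returns a new ascending array holding every element of both inputs. */
    static int[] mergeAndSort(int[] first, int[] second) {
        // Copy the first array into a buffer large enough for both.
        int[] merged = Arrays.copyOf(first, first.length + second.length);
        // Append the second array after the first.
        System.arraycopy(second, 0, merged, first.length, second.length);
        Arrays.sort(merged);
        return merged;
    }
}
```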
min(double[]) - Static method in class textminer.util.MathUtil
Return the index of the minimum value in the given array
model - Variable in class textminer.text.TextModel
estimated model of a given data set
MODIFIED_TF - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by modified Term Frequency
ModifiedTF - class textminer.featureselection.ModifiedTF.
The ModifiedTF class is an extension of term-frequency based feature selection method.
ModifiedTF() - Constructor for class textminer.featureselection.ModifiedTF
Constructor of ModifiedTF
MultinomialGenerativeModel - class textminer.text.MultinomialGenerativeModel.
The MultinomialGenerativeModel is an abstraction of text model.
MultinomialGenerativeModel(CorpusIndex, String, String, Vector, ArrayList, int[]) - Constructor for class textminer.text.MultinomialGenerativeModel
Constructor of MultinomialGenerativeModel
MultinomialModel - class textminer.datarepresentation.MultinomialModel.
The MultinomialModel class is an implementation of a language model that considers words to be "events."
MultinomialModel(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, Lexicon, String, String, boolean) - Constructor for class textminer.datarepresentation.MultinomialModel
Constructor of MultinomialModel
MUTUAL_INFO - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by Mutual Information
MutualInfo - class textminer.featureselection.MutualInfo.
The MutualInfo class is an implementation of a feature selection method by mutual information.
MutualInfo(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, String, boolean) - Constructor for class textminer.featureselection.MutualInfo
Constructor of MutualInfo

N

NAIVE_BAYES - Static variable in interface textminer.classification.ClassificationMethods
naive Bayes classification
NaiveBayes - class textminer.classification.NaiveBayes.
The NaiveBayes class is an implementation of the naive Bayes classification for text data.
NaiveBayes(double[][], double[]) - Constructor for class textminer.classification.NaiveBayes
Constructor of NaiveBayes
NaiveBayes1 - class textminer.classification.NaiveBayes1.
The NaiveBayes1 class is an implementation of naive Bayes classification for text data.
NaiveBayes1(int, double[], Vector) - Constructor for class textminer.classification.NaiveBayes1
Constructor of NaiveBayes1
NewsArticle - class textminer.ds.NewsArticle.
The NewsArticle class is the abstraction of a news article.
NewsArticle() - Constructor for class textminer.ds.NewsArticle
Constructor of NewsArticle
NewsArticle(String, String, String, String) - Constructor for class textminer.ds.NewsArticle
Constructor of NewsArticle
NewsgroupdataMaker - class textminer.util.NewsgroupdataMaker.
The NewsgroupdataMaker is implemented to generate a data structure which contains an instance or a set of instances of "20 Newsgroup" data set in a structural and machine-readable form.
NewsgroupdataMaker() - Constructor for class textminer.util.NewsgroupdataMaker
 
NewsgroupInstance - class textminer.ds.NewsgroupInstance.
The NewsgroupInstance class is a data structure for an instance of the "20 Newsgroup" data set.
NewsgroupInstance() - Constructor for class textminer.ds.NewsgroupInstance
Constructor of NewsgroupInstance
NewsgroupInstance(String, String, String) - Constructor for class textminer.ds.NewsgroupInstance
Constructor of NewsgroupInstance
NOFS - Static variable in interface textminer.featureselection.FeatureSelectionMethods
No feature selection
noiseRemover - Variable in class textminer.text.TextDocumentConverter
instance of TextNoiseRemover
noiseRemover - Variable in class textminer.text.LexiconGenerator
instance of TextNoiseRemover
noiseRemover - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
noiseRemover - Variable in class textminer.text.DictionaryGenerator
instance of TextNoiseRemover
normal_density(double, double, double) - Static method in class textminer.util.StatUtil
Returns the value of (univariate) normal distribution density function
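For reference, the univariate normal density is f(x) = exp(-(x - mean)^2 / (2 * sd^2)) / (sd * sqrt(2 * pi)). The sketch below evaluates it directly. The class name NormalDensitySketch, the parameter order (x, mean, standard deviation), and the use of a standard deviation rather than a variance are assumptions; the index does not state what the three double arguments of normal_density mean.

```java
/** Hypothetical sketch; not the textminer.util.StatUtil implementation. */
final class NormalDensitySketch {
    private NormalDensitySketch() {}

    /** Density of a normal distribution with the given mean and standard deviation at x. */
    static double normalDensity(double x, double mean, double stdDev) {
        if (stdDev <= 0.0) {
            throw new IllegalArgumentException("stdDev must be positive");
        }
        double z = (x - mean) / stdDev; // standardized distance from the mean
        return Math.exp(-0.5 * z * z) / (stdDev * Math.sqrt(2.0 * Math.PI));
    }
}
```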
normalization() - Method in class textminer.ds.DataSet
Copy the original data, in order to apply normalization
normalization1() - Method in class textminer.ds.DataSet
Normalize each instance by dividing each component by the maximum value in the field.
normalization2() - Method in class textminer.ds.DataSet
Normalize each instance by Euclidean norm
normalization3() - Method in class textminer.ds.DataSet
Normalize each attribute value by subtracting the attribute's mean and dividing by the standard deviation
normalize_euclideannorm(double[]) - Static method in class textminer.util.DataUtil
Normalize a given vector by Euclidean Norm.
normalize_sum(double[]) - Static method in class textminer.util.DataUtil
Normalize a given vector by the sum of its components
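The two vector normalizations listed here can be sketched as follows. This is an illustration only: the class name NormalizeSketch, returning a new array instead of modifying the input, and leaving the vector unchanged when the divisor is zero are all assumptions, not documented behavior of textminer.util.DataUtil.

```java
/** Hypothetical sketch; not the textminer.util.DataUtil implementation. */
final class NormalizeSketch {
    private NormalizeSketch() {}

    /** Divides each component by the Euclidean (L2) norm of the vector. */
    static double[] byEuclideanNorm(double[] v) {
        double sumOfSquares = 0.0;
        for (double x : v) {
            sumOfSquares += x * x;
        }
        return scale(v, Math.sqrt(sumOfSquares));
    }

    /** Divides each component by the sum of all components. */
    static double[] bySum(double[] v) {
        double sum = 0.0;
        for (double x : v) {
            sum += x;
        }
        return scale(v, sum);
    }

    /** Returns a scaled copy; a zero divisor leaves the copy unchanged (assumed). */
    private static double[] scale(double[] v, double divisor) {
        double[] out = v.clone();
        if (divisor == 0.0) {
            return out;
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= divisor;
        }
        return out;
    }
}
```

After byEuclideanNorm the vector has unit length; after bySum its components add up to 1, which suits probability-like weights.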
normalized - Variable in class textminer.ds.DataSet
 
num_instances - Variable in class textminer.clustering.AbstractClusterer
 
num_of_class - Variable in class textminer.classification.AbstractClassifier
 
num_of_classes - Variable in class textminer.datarepresentation.TextModel
number of classes in given data set
num_of_classes - Variable in class textminer.featureselection.AbstractFeatureSelector
number of classes in given data set
num_of_classes - Variable in class textminer.text.TextModel
number of classes (target labels)
num_of_classes - Variable in class textminer.text.TermbyDocumentMatrix
number of classes in a given data set
num_of_classes - Variable in class textminer.text.LexiconGenerator
number of classes in given data set
num_of_classes - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
num_of_examples - Variable in class textminer.classification.AbstractClassifier
 
num_of_examples - Variable in class textminer.text.TermbyDocumentMatrix
number of examples in a given data set
num_of_instances - Variable in class textminer.text.TextDocumentConverter
number of instances (e.g.
num_of_instances - Variable in class textminer.text.LexiconGenerator
number of instances in given data set
num_of_instances - Variable in class textminer.text.Indexer
total number of instances (e.g.
num_of_instances - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
num_of_instances - Variable in class textminer.text.DictionaryGenerator
number of instances in the given data set
num_of_terms - Variable in class textminer.text.TermbyDocumentMatrix
number of terms in the word feature set
num_of_terms - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
num_of_train - Variable in class textminer.text.TermbyDocumentMatrix
total number of examples in the train set
num_of_train - Variable in class textminer.text.Indexer
number of train data set
num_of_unique_terms - Variable in class textminer.datarepresentation.TextModel
number of unique terms in given data set
num_of_unique_terms - Variable in class textminer.featureselection.AbstractFeatureSelector
number of unique terms in given data set

O

offStopWatch(String) - Method in class textminer.util.ResourceManager
Stop measuring the elapsed time for a sub-task
onStopWatch() - Method in class textminer.util.ResourceManager
Start measuring the elapsed time for a sub-task
optimizeMemory(long) - Method in class textminer.util.ResourceManager
Optimize memory usage by the specified ratio
original_size - Variable in class textminer.text.DictionaryGenerator
number of original terms in the given data set

P

patternMatch(char[], char[]) - Static method in class textminer.util.TextUtil
Returns the number of occurrences of pattern within string.
It is an implementation of Boyer-Moore String matching algorithm.
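To illustrate the idea of counting pattern occurrences with a skip table, the sketch below uses the Boyer-Moore-Horspool simplification (bad-character shifts only), not the full Boyer-Moore algorithm named above. The class name PatternCountSketch, the argument order (text first, pattern second), and counting overlapping matches are assumptions; the textminer.util.TextUtil source may differ on each point.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch using Boyer-Moore-Horspool; not the textminer.util.TextUtil code. */
final class PatternCountSketch {
    private PatternCountSketch() {}

    /** Counts occurrences of pattern in text, including overlapping ones. */
    static int countOccurrences(char[] text, char[] pattern) {
        int n = text.length;
        int m = pattern.length;
        if (m == 0 || m > n) {
            return 0;
        }
        // Shift table: distance from a character's last position
        // (excluding the final pattern position) to the pattern's end.
        Map<Character, Integer> shift = new HashMap<>();
        for (int i = 0; i < m - 1; i++) {
            shift.put(pattern[i], m - 1 - i);
        }
        int count = 0;
        int pos = 0;
        while (pos <= n - m) {
            // Compare right to left within the current window.
            int j = m - 1;
            while (j >= 0 && text[pos + j] == pattern[j]) {
                j--;
            }
            if (j < 0) {
                count++;
            }
            // Slide by the shift of the window's last character (default: pattern length).
            pos += shift.getOrDefault(text[pos + m - 1], m);
        }
        return count;
    }
}
```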
PDDP - class textminer.clustering.PDDP.
The PDDP class is an implementation of the Principal Direction Divisive Partitioning (PDDP).
PDDP() - Constructor for class textminer.clustering.PDDP
Constructor of PDDP
performDataRepresentation() - Method in class textminer.datarepresentation.VectorSpaceModel
Return true if the task of data representation is done successfully
performDataRepresentation() - Method in class textminer.datarepresentation.TextModel
Perform the task of feature selection by using the specified method
performDataRepresentation() - Method in class textminer.datarepresentation.MultinomialModel
Return true if the task of data representation is done successfully
performDataRepresentation() - Method in class textminer.datarepresentation.ConditionalProbTable
Return true if the job of generating conditional probability table for given data set is done successfully
performDataRepresentation() - Method in class textminer.datarepresentation.BooleanModel
Return true if the task of data representation is done successfully
performFeatureSelection() - Method in class textminer.featureselection.MutualInfo
Perform the job of feature selection by Mutual Information
performFeatureSelection() - Method in class textminer.featureselection.MarkovBlanket
Perform the job of feature selection by the Markov Blanket measure
performFeatureSelection() - Method in class textminer.featureselection.InfoGain
Perform the task of feature selection by Information Gain
performFeatureSelection() - Method in class textminer.featureselection.ExpectedCrossEntropy
Perform the job of feature selection by the expected cross entropy measure
performFeatureSelection() - Method in class textminer.featureselection.DocumentFrequency
Perform the task of feature selection by document frequency
performFeatureSelection() - Method in class textminer.featureselection.DistributionalWordClustering
Perform the task of feature selection by distributional word clustering
performFeatureSelection() - Method in class textminer.featureselection.ConditionalMutualInfo
Perform the job of feature selection by Mutual Information
performFeatureSelection() - Method in class textminer.featureselection.ChiStat
Perform the job of feature selection by Chi square statistics
performFeatureSelection() - Method in class textminer.featureselection.AbstractFeatureSelector
Perform the task of feature selection by using the specified method
Preprocessor - class textminer.task.Preprocessor.
The Preprocessor class is primarily responsible for making a given data set ready for next steps.
Preprocessor() - Constructor for class textminer.task.Preprocessor
Constructor of Preprocessor
Pretokenizer - class textminer.text.Pretokenizer.
The Pretokenizer class is intended to separate a string into a set of grammatical units.
Pretokenizer() - Constructor for class textminer.text.Pretokenizer
 
print_clusters() - Method in class textminer.clustering.kMeansLD
Print the clustering result
print_clusters() - Method in class textminer.clustering.kMeans
Print the clustering results
print_clusters() - Method in class textminer.ds.HACClusterSet
Print all information about final discovered clusters
print_components() - Method in class textminer.ds.Instance
Print all components
print_performance(boolean) - Method in class textminer.evaluation.EvaluationMeasure
Print the result on the standard output or a file
print_result() - Method in class textminer.ds.HACClusterSet
Print the final members of each cluster
printAll() - Method in class textminer.clustering.ProximityArray
Print all elements in this array
printAll() - Method in class textminer.ds.TextDocument
Print out all elements of this class
printAll() - Method in class textminer.ds.TermDictionary
Display all elements in this TermDictionary
printAll() - Method in class textminer.ds.ReutersDataSet
Print all elements belonging to this ReutersDataSet
printAll() - Method in class textminer.ds.FinancialDataSet
Print all elements of this FinancialDataSet
printAll() - Method in class textminer.ds.BagOfWords
 
printAll() - Method in class textminer.evaluation.Result
Print all elements of this class
printAll() - Method in class textminer.task.Task
Print specification of task
printAll(int) - Method in class textminer.ds.CorpusIndex
Print all items in this CorpusIndex object
printArray(String, float[]) - Method in class textminer.classification.ANN
Print network with weights
printEnvVariables() - Method in class textminer.core.Env
Print all environmental variables assigned to TextMiner
printHashMap(HashMap) - Static method in class textminer.util.IOUtil
Print all members of the specified source
prob_each_class - Variable in class textminer.classification.AbstractClassifier
 
ProximityArray - class textminer.clustering.ProximityArray.
The ProximityArray class is a data structure for maintaining the proximity matrix.
ProximityArray() - Constructor for class textminer.clustering.ProximityArray
Constructor of ProximityArray
ProximityArray(int) - Constructor for class textminer.clustering.ProximityArray
Constructs an empty ProximityArray with the specified initial capacity.
pruneEntry(TermDictionary, int, int) - Static method in class textminer.ds.DictionaryHandler
 
pruneEntry(TermDictionary, int, int) - Static method in class textminer.text.DictionaryHandler
Remove the entries whose frequencies do not satisfy the predefined thresholds (lower or upper)
pruneTerms(tmTermDictionary, int) - Static method in class textminer.util.VectorUtil
Prune the specified TermDictionary, source, according to the specified size
PUNCT - Static variable in class textminer.text.Pretokenizer
Definition of punctuation separators
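
The pruneEntry entries above describe removing dictionary entries whose frequency falls outside a lower or upper threshold. A minimal sketch of that idea follows. It is not the TextMiner implementation: the class name PruneSketch, the use of a plain Map in place of TermDictionary, and the choice to treat both bounds as inclusive are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a term dictionary: term -> frequency.
final class PruneSketch {

    // Removes every entry whose frequency lies outside [lower, upper].
    // Assumption: both bounds are inclusive; the index does not say.
    static void prune(Map<String, Integer> frequencies, int lower, int upper) {
        frequencies.entrySet()
                   .removeIf(e -> e.getValue() < lower || e.getValue() > upper);
    }

    public static void main(String[] args) {
        Map<String, Integer> frequencies = new HashMap<>();
        frequencies.put("the", 100);   // too frequent
        frequencies.put("miner", 5);   // kept
        frequencies.put("zyx", 1);     // too rare
        prune(frequencies, 2, 50);
        System.out.println(frequencies.keySet());
    }
}
```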

Q

quicksort(int[]) - Static method in class textminer.util.DataUtil
Sorts a given array of integers in ascending order.
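
For readers unfamiliar with the algorithm behind the quicksort(int[]) entry, here is a minimal sketch of an in-place quicksort in ascending order. The class name QuickSortSketch is hypothetical, and the index does not say whether DataUtil sorts in place or which pivot rule it uses, so both are assumptions here.

```java
// Illustrative in-place quicksort; not the DataUtil source.
final class QuickSortSketch {

    // Sorts the array in ascending order, modifying it in place (assumption).
    static void quicksort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) {
            return; // zero or one element: already sorted
        }
        int pivot = a[lo + (hi - lo) / 2]; // middle element as pivot (assumption)
        int i = lo;
        int j = hi;
        while (i <= j) {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j) {
                int tmp = a[i];
                a[i] = a[j];
                a[j] = tmp;
                i++;
                j--;
            }
        }
        sort(a, lo, j); // left partition
        sort(a, i, hi); // right partition
    }

    public static void main(String[] args) {
        int[] data = {5, 3, 8, 1, 9, 2, 3};
        quicksort(data);
        System.out.println(java.util.Arrays.toString(data));
    }
}
```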

R

RANDOM_GUESS - Static variable in interface textminer.classification.ClassificationMethods
Random guess
ratio_of_trainset - Variable in class textminer.task.SubtaskPreprocess
 
readIndex(int) - Method in class textminer.text.DictionaryMaker
Read instances from CorpusIndex
readLine(int) - Method in class textminer.util.FileCache
Return a character string at the specified index
RealTDMatrix - class textminer.text.RealTDMatrix.
The RealTDMatrix class is an abstraction of term-by-document matrix.
RealTDMatrix(CorpusIndex, String, String, Vector, ArrayList, int[]) - Constructor for class textminer.text.RealTDMatrix
Constructor of RealTDMatrix
removeAll() - Method in class textminer.ds.ReutersDataSet
Remove all elements from this instance
removeAll() - Method in class textminer.ds.FinancialDataSet
Remove all elements from this FinancialDataSet
removeDir(String) - Static method in class textminer.util.IOUtil
Remove the directory by the specified path
removeHTMLTAG(String) - Method in class textminer.text.TextNoiseRemover
Remove HTML tags from the specified string, src
removeNoise(String, int) - Method in class textminer.text.TextNoiseRemover
Remove noise that has no bearing on the performance of text learning.
removeSTOPWORD(String) - Method in class textminer.text.TextNoiseRemover
Remove the stop-words from the specified string, src
removeSymbolTerm(String) - Method in class textminer.text.TextNoiseRemover
Remove any words that start with numerical expressions or other symbols
reportMemoryStatus(String) - Method in class textminer.util.ResourceManager
Display current memory availability for the specified job
representDocument(int[], double[]) - Method in class textminer.classification.LinearClassifier
Represent a text document as a multidimensional real-valued vector
ResourceManager - class textminer.util.ResourceManager.
The ResourceManager class is intended to provide efficient management of the computing resources, such as memory and disk space, that are assigned to TextMiner.
ResourceManager(boolean) - Constructor for class textminer.util.ResourceManager
Constructor of ResourceManager
Result - class textminer.evaluation.Result.
The Result class is an abstraction of a pair of an instance and its labels (target or output) in a given data set.
result_dir - Variable in class textminer.text.TermbyDocumentMatrix
path of result directory
result_repository - Variable in class textminer.classification.AbstractClassifier
 
result_repository - Variable in class textminer.clustering.AbstractClusterer
 
result_repository - Variable in class textminer.task.Subtask
Path of result directory
result_repository - Variable in class textminer.text.TextModel
path of result directory
result_repository - Variable in class textminer.text.LexiconGenerator
directory containing intermediate results
result_repository - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
result_repository - Variable in class textminer.text.DictionaryGenerator
path of result directory
Result() - Constructor for class textminer.evaluation.Result
Constructor of Result
resultDir - Variable in class textminer.datarepresentation.TextModel
Path of result directory
resultDir - Variable in class textminer.featureselection.AbstractFeatureSelector
Path of result directory
ResultEntry - class textminer.evaluation.ResultEntry.
The ResultEntry is an abstraction of an instance in a given data set.
ResultEntry() - Constructor for class textminer.evaluation.ResultEntry
Constructor of ResultEntry
ResultEntry(String, String) - Constructor for class textminer.evaluation.ResultEntry
Constructor of ResultEntry
resultFileName - Variable in class textminer.datarepresentation.TextModel
name of file containing results of feature selection
resultFileName - Variable in class textminer.featureselection.AbstractFeatureSelector
name of file containing results of feature selection
ReutersDataMaker - class textminer.util.ReutersDataMaker.
The ReutersDataMaker class is primarily used to generate a data structure for containing an instance (a document) or a set of instances in a structural and machine-readable form of the "Reuters-21578" data set.
ReutersDataMaker() - Constructor for class textminer.util.ReutersDataMaker
 
ReutersDataSet - class textminer.ds.ReutersDataSet.
The ReutersDataSet is an encapsulation of a collection of Reuters-21578 data.
ReutersDataSet() - Constructor for class textminer.ds.ReutersDataSet
Constructor of ReutersDataSet
ReutersDataSetConverter - class textminer.datarepresentation.ReutersDataSetConverter.
The ReutersDataSetConverter class is implemented to convert text documents belonging to the test data set of the Reuters-21578 dataset into machine-understandable form.
ReutersDataSetConverter(CorpusIndex, Lexicon, Vector, String, String, String) - Constructor for class textminer.datarepresentation.ReutersDataSetConverter
Constructor of ReutersDataSetConverter
ReutersInstance - class textminer.ds.ReutersInstance.
The ReutersInstance class is an abstraction of an instance of "Reuters-21578" data set.
ReutersInstance() - Constructor for class textminer.ds.ReutersInstance
Constructor of ReutersInstance
ReutersInstance(String, String, String, String, String, String, String, String, String) - Constructor for class textminer.ds.ReutersInstance
Constructor of ReutersInstance
round(double) - Static method in class textminer.util.MathUtil
Rounds a double to the next nearest integer value.
roundDouble(double, int) - Static method in class textminer.util.MathUtil
Rounds a double to the given number of decimal places.
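
The two MathUtil rounding entries above can be illustrated with standard library calls. This is a sketch only: the class name RoundingSketch is hypothetical, and the index does not state the return type of round(double) or the tie-breaking rule of roundDouble, so the long return type and the HALF_UP mode below are assumptions.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

// Illustrative rounding helpers; not the MathUtil source.
final class RoundingSketch {

    // Rounds to the nearest integer; Math.round resolves ties toward positive infinity.
    static long round(double value) {
        return Math.round(value);
    }

    // Rounds to the given number of decimal places, ties away from zero (assumption).
    // Note: BigDecimal.valueOf rejects NaN and infinite inputs with NumberFormatException.
    static double roundDouble(double value, int places) {
        return BigDecimal.valueOf(value)
                         .setScale(places, RoundingMode.HALF_UP)
                         .doubleValue();
    }

    public static void main(String[] args) {
        System.out.println(round(2.4));
        System.out.println(roundDouble(3.14159, 3));
    }
}
```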

S

search(int) - Method in class textminer.ds.IntermediateTextFile
Search the term with the specified identifier, no
search(Integer) - Method in class textminer.ds.IntermediateTextFileEntry
Search an occurrence at the specified document, doc_id
search(String) - Method in class textminer.ds.TextDocument
Search the specified element from this class
search(String) - Method in class textminer.ds.TermDictionary
Search the specified element in this TermDictionary
search(String) - Method in class textminer.ds.Lexicon
Search the specified term from this Lexicon
search(String) - Method in class textminer.evaluation.Result
Search the specified id at this class
searchChildNode(int) - Method in class textminer.ds.CondProbTable
Return child node if found
selected_termdic - Variable in class textminer.text.DictionaryGenerator
term dictionary for given data set
SequentialHashMap - class textminer.ds.SequentialHashMap.
The SequentialHashMap class is intended to provide sequential access for java.util.HashMap.
SequentialHashMap() - Constructor for class textminer.ds.SequentialHashMap
Constructor of SequentialHashMap
SequentialHashMap(int) - Constructor for class textminer.ds.SequentialHashMap
Constructor of SequentialHashMap
set_cluster(String) - Method in class textminer.ds.HACClusterSet
Set the specified key as a member of this cluster
set_data(double[][], int, int) - Method in class textminer.ds.DataSet
Set an array of double into normalized data
set_data(tmDocVectorIndex) - Method in class textminer.clustering.hacLD
Set vectorized data
set_data(tmDocVectorIndex) - Method in class textminer.clustering.emLD
Set vectorized data set
set_data(tmDocVectorIndex, int) - Method in class textminer.clustering.kMeansLD
Set vectorized data set according to the specified size
set_flag_feature(String) - Method in class textminer.task.TaskOrder
Set flag of feature selection with the specified value
set_flag_learning(String) - Method in class textminer.task.TaskOrder
Set flag of learning method with the specified value
set_flag_preprocess(int, boolean) - Method in class textminer.task.TaskOrder
Set flag of pre-processor at the specified index by value
set_freq(int) - Method in class textminer.ds.Term
Replaces the frequency of this term with the specified freq
set_id(String) - Method in class textminer.evaluation.ResultEntry
Set the identifier of this class to the specified id
set_label(String) - Method in class textminer.evaluation.ResultEntry
Set the label of this class to the specified label
set_member(int, String) - Method in class textminer.ds.ClusterMembership
Set the specified key in the group at the specified index
set_runningmode(boolean) - Method in class textminer.clustering.SOM
Set current running mode
set_term(char[]) - Method in class textminer.ds.Term
Replaces the term in this Class with the specified term
set_weight(double) - Method in class textminer.ds.weightedTerm
Set the specified weight
setBody(String) - Method in class textminer.ds.TDTData
 
setBody(String) - Method in class textminer.ds.NewsgroupInstance
 
setBody(String) - Method in class textminer.ds.FinancialData
Set the specified body as the content of this news article
setBodyField(String) - Method in class textminer.ds.ReutersInstance
Set body field
setCompany(String) - Method in class textminer.ds.FinancialData
Set the specified company as the company that this news article mainly discusses
setCondProbOneParent(boolean, double) - Method in class textminer.ds.CPTEntry
 
setCondProbTwoParents(boolean, boolean, double) - Method in class textminer.ds.CPTEntry
 
setDate(String) - Method in class textminer.ds.TDTData
 
setDate(String) - Method in class textminer.ds.FinancialData
Set the specified date as the published date of this news article
setDateField(String) - Method in class textminer.ds.ReutersInstance
Set date field
setExchangeField(String) - Method in class textminer.ds.ReutersInstance
Set exchanges field
setID(int) - Method in class textminer.ds.LexiconEntry
Replace the specified identifier
setId(String) - Method in class textminer.ds.NewsgroupInstance
 
setID(String) - Method in class textminer.ds.TDTData
 
setID(String) - Method in class textminer.ds.FinancialData
Set the specified id as the identifier of this news article
setIdField(String) - Method in class textminer.ds.ReutersInstance
Set the specified id
setKeySet(Vector) - Method in class textminer.ds.SequentialHashMap
Assign a set of keys
setLabel(String) - Method in class textminer.ds.FinancialData
Set the specified label as the target label of this news article
setMode(boolean) - Method in class textminer.classification.ANN
Set current network mode, training or testing
setNumOfTerms(int) - Method in class textminer.core.Env
Assign the number of components per (document) vector
setOccurrence(Integer, String) - Method in class textminer.ds.IntermediateTextFileEntry
Set an occurrence
setOccurrence(String, String) - Method in class textminer.ds.InvertedMatrixTermEntry
Set the specified doc_id and value
setOrgField(String) - Method in class textminer.ds.ReutersInstance
Set org field
setParentID(int, int) - Method in class textminer.ds.CPTEntry
 
setPeopleField(String) - Method in class textminer.ds.ReutersInstance
Set people field
setPlaceField(String) - Method in class textminer.ds.ReutersInstance
Set place field
setRepModel(String) - Method in class textminer.task.Classifiers
Set the model of data representation
setSource(String) - Method in class textminer.ds.TDTData
 
setSource(String) - Method in class textminer.ds.FinancialData
Set the specified source as the news provider of this news article
setSubject(String) - Method in class textminer.ds.NewsgroupInstance
 
setTerm(String) - Method in class textminer.ds.LexiconEntry
Replace the specified term
setTermID(int) - Method in class textminer.ds.InvertedMatrixTermEntry
Replace the old term identifier with the specified termid
setTitle(String) - Method in class textminer.ds.TDTData
 
setTitle(String) - Method in class textminer.ds.FinancialData
Set the specified title as the title of this news article
setTitleField(String) - Method in class textminer.ds.ReutersInstance
Set title field
setTopicField(String) - Method in class textminer.ds.ReutersInstance
Set topics, separated by a space (ASCII 32)
setupData(tmVectorizedDS, int, int) - Method in class textminer.classification.ANN
Set vectorized data
setupTask() - Method in class textminer.task.Task
Setup a task from given task specification file
setURL(String) - Method in class textminer.ds.FinancialData
Set the specified url as the URL of this news article
signedAverage(double[]) - Static method in class textminer.util.MathUtil
Return the average value of a given array. Note: only elements with a positive sign are eligible for the computation of the average.
signedMin(double[]) - Static method in class textminer.util.MathUtil
Return the index of the minimum value in the given array. Note: only elements with a positive sign are compared with each other.
Similarity - class textminer.util.Similarity.
The Similarity class provides a set of measures that are used to estimate the similarity between two abstract objects, such as (real-valued and multi-dimensional) vectors and probability distributions.
Similarity() - Constructor for class textminer.util.Similarity
 
SINGLE_LINK - Static variable in interface textminer.clustering.hacMethods
Definition of single link
size_index - Variable in class textminer.text.TextModel
number of instances of given data set
size_model - Variable in class textminer.text.TextModel
number of dimensions (it can be interpreted in various ways: attributes of an instance, number of terms in bag of words, and number of components in a document vector)
SIZE_OF_DOCVEC - Static variable in interface textminer.core.Constants
 
size_of_index - Variable in class textminer.text.DictionaryGenerator
number of entries in the index of the given data set, i.e.
SOM - class textminer.clustering.SOM.
The SOM class is an implementation of the Self-Organizing Map.
SOM - Static variable in interface textminer.clustering.ClusteringMethods
SOM
SOM(SubtaskClustering) - Constructor for class textminer.clustering.SOM
Constructor of SOM
sortbyWeight(ArrayList, ArrayList) - Static method in class textminer.util.VectorUtil
Sort the specified weights in descending order and return an array of ArrayList: terms and their weights
sortbyWeight(double[], double[]) - Static method in class textminer.util.VectorUtil
Sort the specified weights in descending order and return an array of ArrayList: terms and their weights
source - Variable in class textminer.ds.NewsArticle
Source (news provider) of a news article, such as CNN, Reuters, and so on.
stat_filename - Variable in class textminer.classification.AbstractClassifier
 
StatUtil - class textminer.util.StatUtil.
The StatUtil class provides a set of common statistical functions.
StatUtil() - Constructor for class textminer.util.StatUtil
 
stem() - Method in class textminer.text.Stemmer
Stem the word placed into the Stemmer buffer through calls to add().
Stemmer - class textminer.text.Stemmer.
The Stemmer class is an implementation of Porter's stemming algorithm [Porter, 1980] and is intended to transform a word into its root form.
Stemmer() - Constructor for class textminer.text.Stemmer
Constructor of Stemmer
Subtask - class textminer.task.Subtask.
The Subtask is an encapsulation of the specification of a given sub-procedure (or task).
Subtask() - Constructor for class textminer.task.Subtask
Constructor of Subtask
SubtaskClassifiers - class textminer.task.SubtaskClassifiers.
The SubtaskClassifiers is a specification for text classification.
SubtaskClassifiers() - Constructor for class textminer.task.SubtaskClassifiers
Constructor of SubtaskClassifiers
SubtaskClustering - class textminer.task.SubtaskClustering.
The SubtaskClustering is a specification for the task of text clustering.
SubtaskClustering() - Constructor for class textminer.task.SubtaskClustering
Constructor of SubtaskClustering
SubtaskPreprocess - class textminer.task.SubtaskPreprocess.
The SubtaskPreprocess is a specification for the task of preprocessing.
SubtaskPreprocess() - Constructor for class textminer.task.SubtaskPreprocess
Constructor of SubtaskPreprocess
SubtaskRepresentator - class textminer.task.SubtaskRepresentator.
The SubtaskRepresentator is a specification for the task of data representation.
SubtaskRepresentator() - Constructor for class textminer.task.SubtaskRepresentator
Constructor of SubtaskRepresentator
subtasks - Variable in class textminer.task.SubtaskPreprocess
 
SubtaskSelector - class textminer.task.SubtaskSelector.
The SubtaskSelector is a specification for the task of feature (subset) selection.
SubtaskSelector() - Constructor for class textminer.task.SubtaskSelector
Constructor of SubtaskSelector
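
The signedAverage and signedMin entries in this section restrict their computation to elements with a positive sign. The sketch below shows one way to read that rule. It is not the MathUtil source: the class name SignedStatsSketch is hypothetical, zero is treated as not positive, and the values returned when no element qualifies (NaN and -1) are assumptions.

```java
// Illustrative positive-only statistics; not the MathUtil source.
final class SignedStatsSketch {

    // Averages only the strictly positive elements; NaN if there are none (assumption).
    static double signedAverage(double[] values) {
        double sum = 0.0;
        int count = 0;
        for (double v : values) {
            if (v > 0) {
                sum += v;
                count++;
            }
        }
        return count == 0 ? Double.NaN : sum / count;
    }

    // Returns the index of the smallest strictly positive element; -1 if there are none (assumption).
    static int signedMin(double[] values) {
        int best = -1;
        for (int i = 0; i < values.length; i++) {
            if (values[i] > 0 && (best < 0 || values[i] < values[best])) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[] values = {-1.0, 4.0, 2.0, -3.0, 6.0};
        System.out.println(signedAverage(values));
        System.out.println(signedMin(values));
    }
}
```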

T

targetLabels - Variable in class textminer.classification.AbstractClassifier
 
targetLabels - Variable in class textminer.text.TextModel
target (class) labels of given data set
targetLabels - Variable in class textminer.text.TermbyDocumentMatrix
target (class) labels of given data set
targetLabels - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
Task - class textminer.task.Task.
The Task class is an abstraction of specifications on the given task.
task_alias - Variable in class textminer.clustering.AbstractClusterer
 
task_alias - Variable in class textminer.text.LexiconGenerator
task alias (e.g.
task_alias - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
task_alias - Variable in class textminer.text.DictionaryGenerator
name of task alias (dataset name)
task_assignment - Variable in class textminer.task.Subtask
Identifier of given task
task_assignment - Variable in class textminer.text.Indexer
numerical expression of given task
task_attributes - Static variable in interface textminer.core.Constants
 
task_name - Variable in class textminer.clustering.AbstractClusterer
 
task_name - Variable in class textminer.task.Subtask
Name of given task (e.g.
task_name - Variable in class textminer.text.LexiconGenerator
task name (e.g.
task_name - Variable in class textminer.text.Indexer
name of given task
task_name - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
task_name - Variable in class textminer.text.DictionaryGenerator
name of task (classification or clustering)
task_names - Static variable in interface textminer.core.Constants
 
task_orders - Static variable in interface textminer.core.Constants
 
task_spec - Variable in class textminer.classification.AbstractClassifier
 
Task(String, String, String) - Constructor for class textminer.task.Task
Constructor of Task
TaskManager - class textminer.task.TaskManager.
The TaskManager class is responsible for managing all sub-tasks of a given text learning task (e.g.
TaskManager(Task, Env) - Constructor for class textminer.task.TaskManager
Constructor of the TaskManager
taskname - Variable in class textminer.text.TextDocumentConverter
name of task
TaskOrder - class textminer.task.TaskOrder.
The TaskOrder class is an encapsulation of the specification of a given task assigned to a particular subtasker.
TaskOrder() - Constructor for class textminer.task.TaskOrder
Constructor of TaskOrder
td_class - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
td_filename - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
td_master - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
td_master_filename - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
td_selected - Variable in class textminer.text.DictionaryGeneratorMultiClass
 
tdic_filename - Variable in class textminer.text.DictionaryGenerator
name of term dictionary file
tdic_filename1 - Variable in class textminer.text.DictionaryGenerator
name of term dictionary file
tdt_heuristics(double[], int, int) - Static method in class textminer.util.VectorUtil
Implementation of a TDT heuristic
TDTData - class textminer.ds.TDTData.
The TDTData is the abstraction of a text document from Topic Detection and Tracking (TDT) data set.
TDTData() - Constructor for class textminer.ds.TDTData
Constructor of TDTData
TDTData(String, String, String, String, String) - Constructor for class textminer.ds.TDTData
Constructor of TDTData
TDTdataMaker - class textminer.util.TDTdataMaker.
The TDTdataMaker class is implemented to generate a data structure which is capable of containing an instance or a set of instances in a structural and machine-readable form of the "TDT pilot" corpus.
TDTdataMaker() - Constructor for class textminer.util.TDTdataMaker
 
term - Variable in class textminer.ds.tmTerm
Term
Term - class textminer.ds.Term.
The Term class is an abstraction of a term (word or phrase) in text data set.
TERM_STRENGTH - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by Term Strength
Term() - Constructor for class textminer.ds.Term
Constructor of Term class
Term(char[], int) - Constructor for class textminer.ds.Term
Constructor of Term class
TermbyDocumentMatrix - class textminer.text.TermbyDocumentMatrix.
The TermbyDocumentMatrix is an implementation of "Term-by-Document" (inverted) matrix.
TermbyDocumentMatrix(CorpusIndex, String, String, Vector, ArrayList, int[]) - Constructor for class textminer.text.TermbyDocumentMatrix
Constructor of TermbyDocumentMatrix
TermDictionary - class textminer.ds.TermDictionary.
The TermDictionary class is a data structure containing unique terms (words or phrases) and their occurrence info.
TermDictionary() - Constructor for class textminer.ds.TermDictionary
Constructor of TermDictionary
testdata_list - Variable in class textminer.classification.AbstractClassifier
 
TextConvertReuters - class textminer.text.TextConvertReuters.
The TextConvertReuters class is implemented to carry out the task of converting all text documents in Reuters-21578 into machine-readable form.
TextConvertReuters(CorpusIndex, Lexicon, String, String, String, String, String, TextNoiseRemover, boolean) - Constructor for class textminer.text.TextConvertReuters
Constructor of TextConvertReuters
TextDocument - class textminer.ds.TextDocument.
The TextDocument class is an implementation of a text document in natural language (i.e.
TextDocument() - Constructor for class textminer.ds.TextDocument
Constructor of TextDocument
TextDocumentConverter - class textminer.text.TextDocumentConverter.
The TextDocumentConverter class is an abstract superclass of all classes that convert text documents into machine-readable form.
TextDocumentConverter(CorpusIndex, Lexicon, String, String, String, String, String, TextNoiseRemover, boolean) - Constructor for class textminer.text.TextDocumentConverter
Constructor of TextDocumentConverter
textminer.classification - package textminer.classification
 
textminer.clustering - package textminer.clustering
 
textminer.core - package textminer.core
 
textminer.datarepresentation - package textminer.datarepresentation
 
textminer.ds - package textminer.ds
 
textminer.evaluation - package textminer.evaluation
 
textminer.featureselection - package textminer.featureselection
 
textminer.task - package textminer.task
 
textminer.text - package textminer.text
 
textminer.util - package textminer.util
 
TextModel - class textminer.datarepresentation.TextModel.
The TextModel class is an abstract class that encapsulates the task of text data representation and provides a set of common functions and variables.
TextModel - class textminer.text.TextModel.
The TextModel class is an abstraction of the text document collection.
TextModel(CorpusIndex, String, String, ArrayList) - Constructor for class textminer.text.TextModel
Constructor of TextModel
TextModel(CorpusIndex, String, String, Vector, ArrayList, int[]) - Constructor for class textminer.text.TextModel
Constructor of TextModel
TextModel(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, Lexicon, String, String, boolean) - Constructor for class textminer.datarepresentation.TextModel
Constructor of TextModel
TextNoiseRemover - class textminer.text.TextNoiseRemover.
The TextNoiseRemover class is a utility class for removing all noise from natural language text before applying any other text learning techniques.
TextNoiseRemover(String) - Constructor for class textminer.text.TextNoiseRemover
Constructor of TextNoiseRemover
TextUtil - class textminer.util.TextUtil.
The TextUtil class provides a set of primitive utility functions facilitating the manipulation of (natural language, usu.
TextUtil() - Constructor for class textminer.util.TextUtil
 
tf - Variable in class textminer.ds.tmTerm
Term frequency
TFIDF - class textminer.featureselection.TFIDF.
The TFIDF class is an implementation of the term frequency-inverse document frequency (TF-IDF) algorithm.
TFIDF - Static variable in interface textminer.featureselection.FeatureSelectionMethods
Feature selection by TFIDF (term frequency by inverse document frequency)
TFIDF() - Constructor for class textminer.featureselection.TFIDF
 
TFIDFModel - class textminer.text.TFIDFModel.
The TFIDFModel class is intended to generate a model of text document set by using TFIDF.
TFIDFModel(CorpusIndex, String, String, ArrayList) - Constructor for class textminer.text.TFIDFModel
Constructor of TFIDFModel
threshold - Variable in class textminer.task.SubtaskSelector
Threshold
title - Variable in class textminer.ds.NewsArticle
Title of news article
tmDocVecIndexEntry - class textminer.ds.tmDocVecIndexEntry.
The tmDocVecIndexEntry class is an encapsulation of an instance of a given dataset.
tmDocVecIndexEntry() - Constructor for class textminer.ds.tmDocVecIndexEntry
Constructor of tmDocVecIndexEntry
tmDocVecIndexEntry(String, String, String, boolean) - Constructor for class textminer.ds.tmDocVecIndexEntry
Constructor of tmDocVecIndexEntry
tmDocVectorIndex - class textminer.ds.tmDocVectorIndex.
The tmDocVectorIndex class is an index structure for maintaining all instances of given data set.
tmDocVectorIndex() - Constructor for class textminer.ds.tmDocVectorIndex
Constructor of tmDocVectorIndex
tmDVI_find(String) - Method in class textminer.ds.tmDocVectorIndex
Find the instance associated with the specified key, source
tmDVI_insert(String, String, String, boolean) - Method in class textminer.ds.tmDocVectorIndex
Insert an instance with the specified elements
tmDVI_loadIndex(String, int) - Method in class textminer.ds.tmDocVectorIndex
Load the specified index file, filename, according to given option.
tmDVI_printAll(int) - Method in class textminer.ds.tmDocVectorIndex
Print all instances belonging to this index according to the specified option
tmp_dir - Variable in class textminer.text.TextDocumentConverter
path of directory for containing temporary results
tmTD_find(String) - Method in class textminer.ds.tmTermDictionary
Return the instance associated with the specified term
tmTD_insert(String) - Method in class textminer.ds.tmTermDictionary
Insert an instance with specified items
tmTD_insert(String, int, int) - Method in class textminer.ds.tmTermDictionary
Insert an instance with the specified items
tmTD_insert(tmTerm) - Method in class textminer.ds.tmTermDictionary
Insert the specified instance of tmTerm
tmTD_insert(tmTerm, boolean) - Method in class textminer.ds.tmTermDictionary
Insert the specified instance of tmTerm
tmTD_insert(tmTermExt) - Method in class textminer.ds.tmTermDictionary
Insert the specified instance of tmTermExt
tmTD_makeArray() - Method in class textminer.ds.tmTermDictionary
Return this dictionary in array of ArrayList, in order to facilitate the task of sorting
tmTD_makeVector() - Method in class textminer.ds.tmTermDictionary
Return the dictionary in Vector, in order to facilitate the task of printing, esp.
tmTD_makeVector(String) - Method in class textminer.ds.tmTermDictionary
Return the dictionary in Vector, in order to facilitate the task of printing, esp.
tmTD_printAll() - Method in class textminer.ds.tmTermDictionary
Print all instances in this dictionary
tmTD_pruneEntry(int, int) - Method in class textminer.ds.tmTermDictionary
Remove the instances whose document frequencies are less than the lower bound or greater than the upper bound
tmTerm - class textminer.ds.tmTerm.
The tmTerm class is an encapsulation of a term in text processing.
tmTerm() - Constructor for class textminer.ds.tmTerm
Constructor of tmTerm
tmTerm(String, int, int) - Constructor for class textminer.ds.tmTerm
Constructor of tmTerm
tmTermDictionary - class textminer.ds.tmTermDictionary.
The tmTermDictionary class is a data structure containing unique terms and their occurrence info.
tmTermDictionary() - Constructor for class textminer.ds.tmTermDictionary
Constructor of tmTermDictionary
tmTermExt - class textminer.ds.tmTermExt.
The tmTermExt class is an extension of tmTerm
tmTermExt() - Constructor for class textminer.ds.tmTermExt
Constructor of tmTermExt
tmTermExt(String, int, int, double) - Constructor for class textminer.ds.tmTermExt
Constructor of tmTermExt
tokenize(String) - Static method in class textminer.text.Pretokenizer
Convert the specified string to a string consisting of a set of grammatical units.
toString() - Method in class textminer.ds.ReutersInstance
 
toString() - Method in class textminer.ds.InvertedMatrixTermEntry
Return a string containing identifier of term and its occurrences info
toString() - Method in class textminer.ds.IntermediateTextFileEntry
Return a string expression of this class
toString() - Method in class textminer.text.Stemmer
Return String representation of stemmed word
total_elements_per_class - Variable in class textminer.datarepresentation.TextModel
total elements per class
total_elements_per_class - Variable in class textminer.featureselection.AbstractFeatureSelector
total elements per class
total_examples - Variable in class textminer.text.TextModel
total number of examples in the train set
total_num_of_examples - Variable in class textminer.datarepresentation.TextModel
total number of examples in given data set
total_num_of_examples - Variable in class textminer.featureselection.AbstractFeatureSelector
total number of examples in given data set
total_num_of_terms - Variable in class textminer.datarepresentation.TextModel
total number of terms in given data set
total_num_of_terms - Variable in class textminer.featureselection.AbstractFeatureSelector
total number of terms in given data set
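
The TFIDF and TFIDFModel entries in this section refer to term frequency weighted by inverse document frequency. As a rough illustration, the sketch below computes one common variant, tf * ln(N / df). The index does not say which variant TextMiner uses, so the formula, the class name TfIdfSketch, and the parameter names are assumptions.

```java
// Illustrative TF-IDF weight; the exact TextMiner formula is not given in the index.
final class TfIdfSketch {

    // tf: occurrences of the term in one document
    // df: number of documents containing the term
    // numDocs: total number of documents in the collection
    static double weight(int tf, int df, int numDocs) {
        if (tf <= 0 || df <= 0) {
            return 0.0; // an absent term carries no weight
        }
        return tf * Math.log((double) numDocs / df);
    }

    public static void main(String[] args) {
        // A term seen 3 times in a document and present in 10 of 100 documents.
        System.out.println(weight(3, 10, 100));
    }
}
```

A term that occurs in every document gets weight 0 under this variant, which is why the measure can serve as a feature selection criterion.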

U

unique_elements_per_class - Variable in class textminer.datarepresentation.TextModel
unique elements (e.g.
unique_elements_per_class - Variable in class textminer.featureselection.AbstractFeatureSelector
unique elements (e.g.
updateWeights(TermDictionary, int) - Method in class textminer.classification.DomainExperts
 
UPPER_BOUND - Static variable in class textminer.text.DictionaryGeneratorMultiClass
Definition of the threshold for high frequency word

V

variance(double[]) - Static method in class textminer.util.StatUtil
Returns the variance of the specified vector
VECTOR_SPACE_MODEL - Static variable in interface textminer.datarepresentation.DataRepMethods
Vector space model
vectorindex - Variable in class textminer.classification.AbstractClassifier
 
vectorindex - Variable in class textminer.clustering.AbstractClusterer
 
VectorSpaceModel - class textminer.datarepresentation.VectorSpaceModel.
The VectorSpaceModel class is an implementation of "Vector Space Model."
VectorSpaceModel - class textminer.text.VectorSpaceModel.
The VectorSpaceModel is an implementation of "vector space model."
VectorSpaceModel(CorpusIndex, String, String, Vector, ArrayList, int[]) - Constructor for class textminer.text.VectorSpaceModel
Constructor of VectorSpaceModel
VectorSpaceModel(CorpusIndex, Vector, Vector, int[], int[], int[], int, int, String, Lexicon, String, String, boolean) - Constructor for class textminer.datarepresentation.VectorSpaceModel
Constructor of VectorSpaceModel
VectorUtil - class textminer.util.VectorUtil.
The VectorUtil class provides a set of functions, in order to facilitate the operations of manipulating multi-dimensional vectors.
VectorUtil() - Constructor for class textminer.util.VectorUtil
 
verbose - Variable in class textminer.datarepresentation.TextModel
Flag of verbosity
verbose - Variable in class textminer.featureselection.AbstractFeatureSelector
Flag of verbosity
verbose - Variable in class textminer.task.Subtask
Flag of verbosity
verbose - Variable in class textminer.text.TextDocumentConverter
Flag of verbosity
verbose - Variable in class textminer.text.LexiconGenerator
Flag of verbosity
verifyOptions() - Method in class textminer.task.Task
Verify the task specification file
visualization() - Method in class textminer.clustering.SOM
Show network structure with weights

W

weight - Variable in class textminer.ds.tmTermExt
Weight
weightedTerm - class textminer.ds.weightedTerm.
The weightedTerm class is an abstraction of a term (word or phrase) in a text data set.
weightedTerm() - Constructor for class textminer.ds.weightedTerm
Constructor of weightedTerm
weightedTerm(char[], int) - Constructor for class textminer.ds.weightedTerm
Constructor of weightedTerm
weightVectors - Variable in class textminer.featureselection.AbstractFeatureSelector
Weight vectors for each class, resulting from feature selection
whichClass(int[]) - Method in class textminer.classification.BayesianNet
Return the output label from the target classes
whichClass(Vector, String) - Static method in class textminer.util.TextUtil
Verify whether a given label belongs to the set of class labels
WIDROW_HOFF - Static variable in interface textminer.classification.ClassificationMethods
Widrow-Hoff
WidrowHoff - class textminer.classification.WidrowHoff.
The WidrowHoff class is an implementation of the Widrow-Hoff algorithm.
WidrowHoff(int, int, double[][], double) - Constructor for class textminer.classification.WidrowHoff
Constructor of WidrowHoff
writeClusteringResults(Vector[], String) - Static method in class textminer.util.IOUtil
Write the result of a clustering experiment to the specified filename
writeDocVector(String, Vector) - Static method in class textminer.util.IOUtil
Write the data in the Vector to the specified filename
writeFile(String, Vector) - Static method in class textminer.util.IOUtil
Write the specified data in the Vector to the specified filename
writeFile(String, Vector, boolean) - Static method in class textminer.util.IOUtil
Write the specified data in Vector
writeLexicon(String, Vector) - Static method in class textminer.util.IOUtil
 
writeStatistics() - Method in class textminer.text.DictionaryGeneratorMultiClass
Write the statistics of a given data set to disk.
writeTestDocVector(String, String, Vector[], Vector) - Static method in class textminer.util.IOUtil
 
writeVectorToFile(Vector, String) - Static method in class textminer.util.IOUtil
Write the specified source to the specified filename
