Andrew
W. Moore
Academic Papers
 Theoretical
Justification of Popular Link Prediction Heuristics (COLT 2010)
 Fast
Nearestneighbor Search in Diskresident Graphs (KDD 2010)
 Fast
Dynamic
Reranking
in Large Graphs
(2009)
 Efficient
Algorithms for LargeScale Asteroid Discovery (2008)
 Forecasting
Web Page Views: Methods and Observations (JMLR 2008)
In order to book display advertisements in advance, you need to
estimate the new number of views in advance.

Fast
Incremental
Proximity Search in Large Graphs
(2008)
Slightly revised from the ICML cameraready version
 Fast
State Discovery
for HMM Model Selection and Learning
(2007)

A
Tractable Approach to
Finding Closest Truncatedcommutetime Neighbors in Large Graphs
(2007)
 Handbook of Biosurveillance, (Elsevier Books, 2006)

Sequential
Update of
ADtrees
(2006)

Dependency
Trees in
Sublinear Time and Bounded Memory
(2006)
Efficient learning of dependency trees for huge datasets.

Monitoring
Food Safety
by Detecting Patterns in Consumer Complaints
(2006)

Autonomous
Visualization
(2006)

Efficient
Analytics for
Effective Monitoring of Biomedical Security
(2005)

Making
Logistic
Regression A Core Data Mining Tool With TRIRLS
(2005)
This short paper is the easiest, fastest way
to learn about Truncated Regularized Iteratively Reweighted Least
Squares (TRIRLS), my algorithm for fast, parameterfree logistic
regression. TRIRLS can also be used for any generalized linear
model. This

A
Multiple Tree
Algorithm for the Efficient Association of Asteroid Observations
(2005)

Alias
Detection in Link
Data Sets
(2005)
Combining string similarity with contextual similarity when searching
for aliases using active learning.

Making
Logistic
Regression A Core Data Mining Tool: A Practical Investigation of
Accuracy, Speed, and Simplicity
(2005)
Regularized logistic regression can be fast, accurate, and simple. This
paper includes the most important findings of my thesis, and a few new
details.

Fast
Inference and
Learning in LargeStateSpace HMMs
(2005)

Efficiently
Identifying
Close Track/Observation Pairs in Continuous Timed Data
(2005)

Detecting
Significant
Multidimensional Spatial Clusters
(2005)
Applying the fast multidimensional spatial scan statistic to detect
clusters in epidemiological and brain imaging data.

A
Bayesian scan
statistic for spatial cluster detection
(2005)
A new Bayesian method for cluster detection

Dynamic
Social Network
Analysis using Latent Space Models
(2005)

A
Bayesian spatial scan
statistic
(2005)
A new Bayesian method for spatial cluster detection

Detecting
Anomalous
Patterns in Pharmacy Retail Data
(2005)
A biosurveillance system to collect disease outbreak feedback from
public health officials

Variable
KDTree
Algorithms for Spatial Pattern Search and Discovery
(2005)

Anomalous
Spatial
Cluster Detection
(2005)
A general and powerful framework for spatial cluster detection.

Finding
Optimal
Bayesian Networks by Dynamic Programming
(2005)
Learning the optimal Bayes net structure

An
Investigation of
Practical Approximate Nearest Neighbor Algorithms
(2004)
How to use variations on classic exact data structures for nearest
neighbor, if you want to get faster answers and are prepared to accept
approximation?

HighDimensional
Probabilistic Classification for Drug Discovery
(2004)
Discriminative probabilistic classifiers have been used successfully on
large lifesciences datasets, but high dimensionalities have prohibited
the use of nonparametric class probability estimation. This paper
explores a method (SLAMDUNK) which addresses

Fast
Nonlinear
Regression via Eigenimages Applied to Galactic Morphology
(2004)
Determining the shapes of millions of galaxies.

Active
Learning for
Anomaly and RareCategory Detection
(2004)
How to use active learning in a reallife scenario.

Semantic
based
Biomedical Image Indexing and Retrieval
(2004)
Volumetric pathological neuroimage retrieval under the framework of
classificationdriven feature selection.

The
IOC
algorithm:
Efficient ManyClass Nonparametric Classification for HighDimensional
Data
(2004)
Performing knearestneghbor classifications on multiclass problems
without actually finding the knearest neighbors.

Rapid
Detection of
Significant Spatial Clusters
(2004)
A new spatial scan algorith that searches over arbitrary rectangles in
addition to squares.

Alias
Detection in Link
Data Sets
(2004)
An active learning approach to deciding whether two names correspond to
the same entity, combining string similarity information and context
similarity.

Tractable
Learning of
Large Bayes Net Structures from Sparse Data
(2004)
in this paper we propose an algorithm that allows to learn a Bayes Net
structure from sparse data (e.g., powerlaw distributed) with over
100,000 variables. we also report time and performance accuracy when
applied to several very large datasets

Belief
State Approaches
to Signaling Alarms in Surveillance Systems
(2004)

A
Fast
MultiResolution
Method for Detection of Significant Spatial Disease Clusters
(2003)
Rapid detection of disease clusters using a fast spatial scan statistic
algorithm.

Rapid
Evaluation of
Multiple Density Models
(2003)
A way to quickly evaluate and compare multiple nonparametric density
estimates.

Efficient
Exact kNN
and Nonparametric Classification in High Dimensions
(2003)
Can we do nonapproximate kNN classification without actually finding
the kNN?

A
Fast
MultiResolution
Method for Detection of Significant Spatial Overdensities
(2003)
Scaling up the classical Kulldorff scan statistic.

Bayesian
Network
Anomaly Pattern Detection for Disease Outbreaks
(2003)
Handles temporal trends in data by replacing the baseline of WSARE 2.0
with a baseline generated by a Bayesian network.

Empirical
Bayes
Screening for Link Analysis
(2003)
An algorithm for discovering top N strange cooccurences of size 2,3,4,
etc Uses ideas of frequent sets, but stratifies them according to a
statistically justified hierarchical bayes model, using empirical bayes
to find the parameters

What's
Strange About
Recent Events
(2003)
A shorter paper on WSARE that was submitted to the Journal of Urban
Health

Optimal
Reinsertion: A
new search operator for accelerated and more accurate Bayesian network
structure learning
(2003)
Very aggressive but computationally efficient search steps for Bayes
net learning.

Tractable
Group
Detection on Large Link Data Sets
(2003)
We present the kgroups algorithm, an improvement of the GDA algorithm
that includes significant computational advantages. The kgroups
algorithm allows tractable group detection on large data sets.

Finding
Underlying
Connections: A Fast GraphBased Method for Link Analysis and
Collaboration Queries
(2003)
CGraph is an algorithm to quickly learn a graphbased model of the
underlying connections of a set of entities given link data.

cGraph:
A Fast
GraphBased Method for Link Analysis and Queries
(2003)
This paper is an extended version of the 2003 ICML conference paper.

Probabilistic
Noise
Identification and Data Cleaning
(2003)
We examine the use of explicit noise and corruption models to aid in
the task of noise identification and data cleaning.

A
Comparison of
Statistical and Machine Learning Algorithms on the Task of Link
Completion
(2003)
This paper examines the task of link completion, relative algorithm
performance, and what this can tell us about the structure of the data.

Fast
Robust Logistic
Regression for Large Sparse Datasets with Binary Outputs
(2003)
Logistic regression can provide faster, better results than SVM for
lifesciences datasets with hundreds of thousands of attributes.

Active
Learning in
Discrete Input Spaces
(2002)
Using modified Gittins indices to decide which datapoint to actively
label next whilst being rewarded for each new label.

Realvalued
AllDimensions search: Lowoverhead rapid searching over subsets of
attributes
(2002)
Searching over large numbers of contingency tables quickly

Using
Tarjan's Red Rule
for Fast Dependency Tree Construction
(2002)
Very fast growth of dependency trees.

Interpolating
Conditional Density Trees
(2002)
Very fast nonparametric Bayesian Network nodes

Efficient
Algorithms
for NonParametric Clustering with Clutter
(2002)
Finding and counting the high density regions in spatial data.

Stochastic
Link and
Group Detection
(2002)
This paper introduces the GDA algorithm. We use noisy link data
(ntuples of entities) to learn underlying groupings of entities.

Rulebased
Anomaly
Pattern Detection for Detecting Disease Outbreaks
(2002)
Answering the question: "What's Strange About Recent Events?"

Summary
of
Biosurveillancerelevant statistical and data mining technologies
(2002)
A brief informal survey of some techniques that have been used for
Biosurveillance.

Variable
Resolution
Discretization in Optimal Control
(2002)
A short paper on choosing the right resolution in a tessalation of
state space.

Reinforcement
Learning
for Cooperating and Communicating Reactive Agents in Electrical Power
Grids
(2001)
One approach to distributed reinforcement learning

ClassificationDriven
Pathological Neuroimage Retrieval Using Statistical Asymmetry Measures
(2001)
Using machine learning to detect abnormalities in neuroimaging output.

NBody
Problems in
Statistical Learning
(2001)
A way to use multiple trees simultaneously to solve a large class of
statistical problems efficiently.

Repairing
Faulty
Mixture Models using Density Estimation
(2001)
Intelligent automatic selection of new mixture model components

Mixtures
of Rectangles:
Interpretable Soft Clustering
(2001)
A mixture model that is easily readable by humans.

Direct
Policy Search
using Paired Statistical Tests
(2001)
If you're going to choose the best policy by rollouts, how can
statistical tests and "racing" help?

Mixnets:
Learning
Bayesian Networks with mixtures of discrete and continuous attributes
(2000)
Bayes Nets with Mixture Models in the nodes

Learning
Filaments
(2000)
A generative model and efficient algorithm for identifying noisy
networks of points in kdimensional space

Xmeans:
Extending
Kmeans with Efficient Estimation of the Number of Clusters
(2000)
Extension to popular Kmeans, where the number of clusters K is also
estimated.

A
Dynamic Adaptation of
ADtrees for Efficient Machine Learning on Large Data Sets
(2000)
Fast implementation of ondemand ADtrees for scores of higharity
attributes and millions of rows.

Condensed
Representations for computationally tractable data mining of massive
sky surveys
(2000)

A
Nonparametric
Approach to Noisy and Costly Optimization
(2000)
Optimizing a noisy process in a possibly discontinuous or noneuclidian
space.

The
Anchors Hierarchy:
Using the Triangle Inequality to Survive HighDimensional Data
(2000)
Using Balltrees allows cached sufficient statisticsbased
accelerations even in high dimensions. The Anchors approach quickly
optimizes the Balltree structure for this purpose.

MultiValueFunctions:
Efficient Automatic Action Hierarchies for Multiple Goal MDPs
(1999)
An efficient procedure to approximately compute all policies for all
possible goal states.

Accelerating
Exact
kmeans Algorithms with Geometric Reasoning (Extended version)
(1999)
This is an extended version of the KDD99 paper.

Accelerating
Exact
kmeans Algorithms with Geometric Reasoning
(1999)
Using cached counts and a different kind of search operator during
kmeans updates, with no approximation

Distributed
Value
Functions
(1999)
Distributed Reinforcement learning for applications such a power grids

Variable
Resolution
discretizations for highaccuracy solutions of optimal control problems
(1999)

Influence
and Variance
of a Markov Chain: Application to adaptive discretization in optimal
control
(1999)
You're using variable resolution splines to approximate a value
function. Where is most profitable to increasing the resolution?

Efficient
MultiObject
Dynamic Query Histograms
(1999)
Using Multiresolution kdtrees to accelerate visualization algorithms

Bayesian
Networks for
Lossless Dataset Compression
(1999)
Practical ways to compress large tabular datasets

Very
Fast EMbased
Mixture Model Clustering Using Multiresolution KDtrees
(1999)
Using kdtrees with centroids in the nodes can allow accurate EM updates
in time sublinear in the number of records

Value
Function Based
Production Scheduling
(1998)
Production scheduling in which we account for a probability
distribution on future jobs by means of kernelbased value function
approximation

Q2:
Memorybased active
learning for optimizing noisy continuous functions
(1998)
Maximizing a very noisy function in kdimensional space with few samples

Cached
Sufficient
Statistics for Efficient Machine Learning with Large Datasets
(1998)
Introduces ADtrees: a new way to implicitly precache the answers to
all possible counting queries for a dataset.

Learning
Evaluation
Functions for Global Optimization and Boolean Satisfiability
(1998)

ADtrees
for Fast
Counting and for Fast Learning of Association Rules
(1998)
Using ADtrees to learn conjunctive rules via beam search.

Applying
online search
techniques to ContinuousState reinforcement learning
(1998)
Using a specialized version of Astar to boost the performance of
approximate value functions

The
Racing Algorithm:
Model Selection for Lazy Learners
(1997)
A detailed analysis and study of Racing.

Efficient
Locally
Weighted Polynomial Regression Predictions
(1997)
Using Multiresolution KDtrees with cached first and second moments in
the nodes

Locally
Weighted
Learning
(1997)
Survey of the use of kernel functions in kernel regression, locally
weighted regression and related function approximators.

Using
Prediction to
Improve Combinatorial Optimization Search
(1997)
Automatically improving combinatorial search by
reinforcementlearningstyle analysis of earlier runs

Locally
Weighted
Learning for Control
(1997)
How can kernel methods and locally weighted regression help robots
learn to control themselves?

A
tutorial on using the
Vizier memorybased learning system
(1997)
A tutorial on using the Windows Vizier software for fast locally
weighted and kNN style classification and regression.

Simulationbased
optimization of a stochastic product coating problem using hashtable
cacheing of costs
(1997)

Memorybased
Stochastic
Optimization
(1996)
Using locally weighted regression to model response surfaces and to
choose the next experiment

Reinforcement
Learning:
A Survey
(1996)
Surveys MDPs, TD, Qlearning and many other Reinforcement Learning
staples.

Algorithms
for
Approximating Optimal Value Functions in Acyclic Domains
(1996)
Using "Rollouts" to make valuefunctionbased RL more practical

Multiresolution
Instancebased Learning
(1995)
Multiresolution Kdtrees with cached statistics for accelerating kernel
regression

The
Partigame
Algorithm for Variable Resolution Reinforcement Learning in
Multidimensional Statespaces
(1995)
Automatic variable resolution discretization of a multidimensional
state space during searches for shortest paths

Learning
Automated
Product Recommendations Without Observable Features: An Initial
Investigation
(1995)

Generalization
in
Reinforcement Learning: Safely Approximating the Value Function
(1995)
An introduction to the ways that naive application of function
approximation of value functions can fail.

An
Empirical
Investigation of Brute Force to choose Features, Smoothers and Function
Approximators
(1995)
What happens when you use very intense crossvalidation?

Proceedings
of the
Workshop on Value Function Approximation, Machine Learning Conference
1995.
(1995)
Short talks from a workshop about value function approximation

Variable
Resolution
Reinforcement Learning
(1995)
Briefly surveys a number of approaches to making Bellman updates faster

The
Partigame
Algorithm for Variable Resolution Reinforcement Learning in
Multidimensional Statespaces
(1994)
A short introduction to an efficient learningshortestpaths algorithm.

Efficient
Algorithms
for Minimizing Cross Validation Error
(1994)

Hoeffding
Races:
Accelerating Model Selection Search for Classification and Function
Approximation
(1994)
Perform many crossvalidation operations in a roundrobin fashion,
pruning likely nonwinners early.

A
short
tutorial note
on computing information gain from counts
(1994)
A simple 2page tutorial.

Prioritized
Sweeping:
Reinforcement Learning with Less Data and Less Real Time
(1993)
As estimates of rewards and transition probabilites improve during
learning, how can we efficiently allow the value function to keep up?

Memorybased
Reinforcement Learning: Efficient Computation with Prioritized Sweeping
(1992)
Using a priority queue to schedule the most useful value function
updates

Fast,
Robust Adaptive
Control by Learning only Forward Models
(1992)
A real robot pool player achieves high accuracy by learning in a
forward direction.

A
tutorial on kdtrees
(1991)
A description of Bentley et al's classic nearest neighbor algorithm

Variable
Resolution
Dynamic Programming: Efficiently Learning Action Maps in Multivariate
Realvalued Statespaces
(1991)

Knowledge
of Knowledge
and Intelligent Experimentation for Learning Control
(1991)

Efficient
Memorybased
Learning for Robot Control
(1990)
Using KDtrees, nearest neighbor and active learning.

Acquisition
of Dynamic
Control Knowledge for a Robotic Manipulator
(1990)

Experiments
in Adaptive
State Space Robotics
(1989)
Using a nearest neighbor classifier to design control experiments

Learning
Robotic
Control: PhD. Thesis Proposal
(1988)
Thesis Proposal