Radiation threat detection is an important security challenge for many countries. This paper introduces Canonical Autocorrelation Analysis (CAA), a method for finding multiple-to-multiple linear correlations within a single set of features. CAA is useful for uncovering hidden parsimonious structures in data, each involving only a small subset of the features. The discovered correlations are highly interpretable, as they are formed by pairs of sparse linear combinations of the original features, and they also yield useful visualizations of the data.
This paper shows how CAA can serve as a tool for anomaly detection when anomalous data fail to follow an expected structure of correlations. In radiation threat detection, this allows harmless radiation data to be characterized by the patterns they exhibit across bins of photon counts, and anomalies to be flagged when they fail to follow such patterns. The resulting technique performs significantly better than an unsupervised alternative prevalent in the domain, while providing valuable additional insights for threat analysis.
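The core idea of a sparse correlation pair can be sketched in a few lines. The following is an illustrative simplification, not the published CAA algorithm: it extracts a single pair of sparse vectors (u, v) with disjoint supports whose projections Xu and Xv are highly correlated, using hard top-k sparsity and alternating power iterations on the feature autocorrelation matrix. The function name and the top-k scheme are assumptions made for this sketch.

```python
import numpy as np

def caa_pair(X, k=1, n_iter=30):
    """Find one sparse canonical pair (u, v) with disjoint supports such that
    the projections Xu and Xv are highly correlated. Simplified sketch only:
    real CAA uses sparse penalties and extracts multiple pairs."""
    Xc = (X - X.mean(0)) / X.std(0)
    C = (Xc.T @ Xc) / len(Xc)                 # feature autocorrelation matrix

    def topk(a, forbid):
        a = a.copy()
        a[forbid] = 0.0                       # keep the two supports disjoint
        out = np.zeros_like(a)
        idx = np.argsort(-np.abs(a))[:k]
        out[idx] = a[idx]
        return out / (np.linalg.norm(out) + 1e-12)

    # Seed v at the feature involved in the strongest off-diagonal correlation.
    offdiag = np.abs(C).copy()
    np.fill_diagonal(offdiag, 0.0)
    v = np.zeros(C.shape[0])
    v[int(np.argmax(offdiag.max(0)))] = 1.0
    u = np.zeros_like(v)
    for _ in range(n_iter):
        u = topk(C @ v, forbid=np.abs(v) > 0)
        v = topk(C @ u, forbid=np.abs(u) > 0)
    a, b = Xc @ u, Xc @ v
    r = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))  # canonical correlation
    return u, v, r
```

For anomaly detection, a point whose projections (x·u, x·v) fall far from the linear relationship fitted on normal data would receive a high anomaly score.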
Artur Dubrawski (Advisor)
Simon Labov (Lawrence Livermore National Laboratory)
We consider modeling time series data with time-varying linear regression, a model that allows the predictor matrix to vary at every time point but penalizes this variation with the (multivariate) total variation norm. This corresponds to simultaneously learning the parameters of multiple linear systems as well as the change points at which the underlying process switches between linear models. Computationally, this formulation is appealing because parameter estimation can be done with convex methods; here we derive a fast Newton method by considering the dual problem and exploiting sparsity with an active-set approach. Our motivating example is the problem of modeling energy consumption at various levels of detail: we consider learning models of individual appliances (which are well described as switched linear systems) as well as modeling the energy consumption of an entire home, a complex process affected by unobserved latent variables. In both cases, as well as on synthetic data, we show that the time-varying linear regression model improves over reasonable baselines and that our proposed optimization algorithm quickly converges to highly accurate solutions.
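The convex formulation described above can be written as a group fused lasso: minimize 0.5 * sum_t (y_t - x_t . theta_t)^2 + lambda * sum_t ||theta_{t+1} - theta_t||_2. The sketch below solves it with a generic ADMM splitting rather than the fast dual Newton method the abstract derives; it is meant only to illustrate the objective, and all names and parameter defaults are assumptions.

```python
import numpy as np

def tv_linear_regression(X, y, lam=1.0, rho=1.0, n_iter=300):
    """ADMM for min 0.5*sum_t (y_t - x_t.theta_t)^2 + lam*sum_t ||theta_{t+1}-theta_t||_2.
    Illustrative sketch only; the paper uses a faster dual Newton method."""
    T, p = X.shape
    # Difference operator: D @ vec(theta) stacks theta_{t+1} - theta_t
    D1 = np.eye(T - 1, T, k=1) - np.eye(T - 1, T)
    D = np.kron(D1, np.eye(p))
    # Block-diagonal data term A = blkdiag(x_t x_t^T); b stacks y_t * x_t
    A = np.zeros((T * p, T * p))
    for t in range(T):
        A[t * p:(t + 1) * p, t * p:(t + 1) * p] = np.outer(X[t], X[t])
    b = (X * y[:, None]).reshape(-1)
    M = A + rho * D.T @ D
    z = np.zeros((T - 1) * p)
    w = np.zeros_like(z)
    for _ in range(n_iter):
        theta = np.linalg.solve(M, b + rho * D.T @ (z - w))   # quadratic step
        v = (D @ theta + w).reshape(T - 1, p)
        norms = np.maximum(np.linalg.norm(v, axis=1, keepdims=True), 1e-12)
        z = (v * np.maximum(0.0, 1 - (lam / rho) / norms)).reshape(-1)  # group shrinkage
        w += D @ theta - z                                    # dual update
    return theta.reshape(T, p)
```

The group shrinkage step zeroes most consecutive differences, so the recovered coefficient path is piecewise constant and the surviving jumps mark the estimated change points.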
We propose Discovering Novel Anomalous Patterns (DAP), a new method for continual and automated discovery of anomalous patterns in general datasets. Current general methods for anomalous pattern detection attempt to identify data patterns that are unexpected as compared to "normal" system behavior. We propose a novel approach for discovering data patterns that are unexpected given a profile of previously known system behavior, both normal and abnormal. This enables the DAP algorithm to identify previously unknown data patterns, add these newly discovered patterns to the profile of "known" system behavior, and continue to discover novel (unknown) patterns. We evaluate the performance of DAP in two domains of computer system intrusion detection (network intrusion detection and masquerade detection), demonstrating that DAP can successfully discover and characterize relevant patterns for these two tasks. As compared to the current state of the art, DAP provides a substantially improved ability to discover novel patterns in massive multivariate datasets.
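The continual discover-then-absorb loop can be illustrated with a deliberately simple stand-in for DAP's pattern score. In this sketch the "profile" is a list of Gaussian components for known behavior, a point is flagged when it is far (in Mahalanobis distance) from every known component, and each round characterizes a neighborhood of flagged points as a new pattern and adds it to the profile. This Gaussian heuristic is an assumption of the sketch; it is not DAP's actual scan statistic.

```python
import numpy as np

def mahalanobis(X, mean, cov_inv):
    d = X - mean
    return np.sqrt(np.einsum('ij,jk,ik->i', d, cov_inv, d))

def discover_novel_patterns(X, profile, thresh=5.0, radius=3.0, min_size=20):
    """Continual discovery loop (illustrative sketch, not the DAP statistic).
    profile: list of (mean, cov) pairs describing known behavior; it is
    extended in place as new patterns are found."""
    discovered = []
    while True:
        # Distance of every point to its closest known pattern
        score = np.min([mahalanobis(X, m, np.linalg.inv(c)) for m, c in profile], axis=0)
        anomalous = X[score > thresh]
        if len(anomalous) < min_size:
            break                                 # nothing novel left to explain
        # Characterize the neighborhood of one flagged point as a new pattern
        seed = anomalous[0]
        group = anomalous[np.linalg.norm(anomalous - seed, axis=1) < radius]
        new = (group.mean(0), np.cov(group.T))
        profile.append(new)                        # absorb into "known" behavior
        discovered.append(new)
    return discovered
```

The key property mirrored here is that once a pattern is discovered it no longer triggers alarms, so subsequent rounds surface genuinely new structure.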
A common classifier for unlabeled nodes on undirected graphs uses label propagation from the labeled nodes, equivalent to the harmonic predictor on Gaussian random fields (GRFs). For active learning on GRFs, the commonly used V-optimality criterion queries nodes that reduce the L2 (regression) loss. V-optimality satisfies a submodularity property, so greedy selection yields a (1-1/e)-factor approximation of the globally optimal reduction. However, L2 loss may not characterize the true nature of the 0/1 loss in classification problems and thus may not be the best choice for active learning.
We consider a new criterion we call Sigma-optimality, which queries the node that minimizes the sum of the elements in the predictive covariance. Sigma-optimality directly optimizes the risk of the surveying problem, which is to determine the proportion of nodes belonging to one class. In this paper we extend submodularity guarantees from V-optimality to Sigma-optimality using properties specific to GRFs. We further show that GRFs satisfy the suppressor-free condition in addition to the conditional independence inherited from Markov random fields. We test Sigma-optimality on real-world graphs with both synthetic and real data and show that it outperforms V-optimality and other related methods on classification.
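Both criteria admit the same greedy loop over the predictive covariance C of the unlabeled nodes: querying node i conditions the GRF, updating C <- C - C[:,i]C[i,:]/C[i,i], and the candidate gains have closed forms. The sketch below assumes the standard forms of these gains, (sum_j C[j,i])^2 / C[i,i] for Sigma-optimality and sum_j C[j,i]^2 / C[i,i] for V-optimality, and should be read as an illustration rather than the paper's implementation.

```python
import numpy as np

def greedy_select(C, budget, criterion='sigma'):
    """Greedy active-learning node selection on a GRF.
    C: predictive covariance over unlabeled nodes, e.g. inv(L_uu + reg*I)
    for a graph Laplacian L. Querying node i conditions the field:
    C <- C - outer(C[:, i], C[i, :]) / C[i, i].
    'sigma' maximizes (sum_j C[j,i])^2 / C[i,i]   (reduction in 1'C1)
    'v'     maximizes (sum_j C[j,i]^2) / C[i,i]   (reduction in trace(C))"""
    C = C.copy()
    chosen = []
    for _ in range(budget):
        diag = np.maximum(np.diag(C), 1e-12)
        if criterion == 'sigma':
            gain = C.sum(0) ** 2 / diag
        else:
            gain = (C ** 2).sum(0) / diag
        if chosen:
            gain[chosen] = -np.inf          # never re-query a node
        i = int(np.argmax(gain))
        C -= np.outer(C[:, i], C[i]) / C[i, i]
        chosen.append(i)
    return chosen
```

Because Sigma-optimality targets the total covariance mass rather than its diagonal, it tends to query cluster centers instead of outliers, which is the intuition behind its stronger classification performance.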