Publications

AuthorTitleYearJournal/Proceedings/VenueRef'd
A. P. Singh & G. J. Gordon A Unified View of Matrix Factorization Models 2008 Machine Learning and Knowledge Discovery in Databases, European Conference (ECML/PKDD)   ✔  
Abstract: We present a unified view of matrix factorization that frames the differences among popular methods, such as NMF, Weighted SVD, E-PCA, MMMF, pLSI, pLSI-pHITS, Bregman co-clustering, and many others, in terms of a small number of modeling choices. Many of these approaches can be viewed as minimizing a generalized Bregman divergence, and we show that (i) a straightforward alternating projection algorithm can be applied to almost any model in our unified view; (ii) the Hessian for each projection has special structure that makes a Newton projection feasible, even when there are equality constraints on the factors, which allows for matrix co-clustering; and (iii) alternating projections can be generalized to simultaneously factor a set of matrices that share dimensions. These observations immediately yield new optimization algorithms for the above factorization methods, and suggest novel generalizations of these methods such as incorporating row and column biases, and adding or relaxing clustering constraints.
BibTeX:
@inproceedings{Singh2008a,
  author = {Ajit P. Singh and Geoff J. Gordon},
  title = {A Unified View of Matrix Factorization Models},
  booktitle = {Machine Learning and Knowledge Discovery in Databases, European Conference (ECML/PKDD)},
  year = {2008},
  note = {ECML/PKDD-2008}
}
A. P. Singh & G. J. Gordon Relational Learning via Collective Matrix Factorization 2008 Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)   ✔  
Abstract: Relational learning is concerned with predicting unknown values of a relation, given a database of entities and observed relations between entities. An example of relational learning is movie rating prediction, where the entities are users, movies, genres, and actors. Relations encode each user’s ratings of movies, the genre of each movie, and the roles that actors played in movies. A common prediction technique given one pairwise relation, for example a #users × #movies matrix, is low-rank matrix factorization. In domains with multiple relations, represented as multiple matrices, we may improve predictive accuracy by exploiting information from one relation while predicting another. To this end, we propose a collective matrix factorization model: we simultaneously factor a set of matrices, sharing parameters among factors when an entity participates in multiple relations. Each relation can have a different value type and error distribution; to handle this diversity, we generalize our model to allow nonlinear relations between the parameters and outputs, using Bregman divergences to measure error. We extend standard alternating projection algorithms to collective factorization, and derive efficient Newton-Raphson updates. Furthermore, we propose stochastic optimization methods to deal with large, sparse matrices. Our model can handle any pairwise relational schema and a wide variety of error models. We demonstrate that additional relations improve prediction in the movie rating domain. Since our model generalizes several existing matrix factorization problems, this work also yields new large scale optimization algorithms for these problems.
BibTeX:
@inproceedings{Singh2008,
  author = {Ajit P. Singh and Geoffrey J. Gordon},
  title = {Relational Learning via Collective Matrix Factorization},
  booktitle = {Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)},
  year = {2008}
}
A. Krause, A. Singh & C. Guestrin Near-Optimal Sensor Placements in Gaussian Processes: Theory, Efficient Algorithms and Empirical Studies 2008 Journal of Machine Learning Research   ✔  
Abstract: When monitoring spatial phenomena, which can often be modeled as Gaussian processes (GPs), choosing sensor locations is a fundamental task. There are several common strategies to address this task, for example, geometry or disk models, placing sensors at the points of highest entropy (variance) in the GP model, and A-, D-, or E-optimal design. In this paper, we tackle the combinatorial optimization problem of maximizing the mutual information between the chosen locations and the locations which are not selected. We prove that the problem of finding the configuration that maximizes mutual information is NP-complete. To address this issue, we describe a polynomial-time approximation that is within (1-1/e) of the optimum by exploiting the submodularity of mutual information. We also show how submodularity can be used to obtain online bounds, and design branch and bound search procedures. We then extend our algorithm to exploit lazy evaluations and local structure in the GP, yielding significant speedups. We also extend our approach to find placements which are robust against node failures and uncertainties in the model. These extensions are again associated with rigorous theoretical approximation guarantees, exploiting the submodularity of the objective function. We demonstrate the advantages of our approach towards optimizing mutual information in a very extensive empirical study on two real-world data sets.
BibTeX:
@article{Krause2008,
  author = {Andreas Krause and Ajit Singh and Carlos Guestrin},
  title = {Near-Optimal Sensor Placements in {G}aussian Processes: Theory, Efficient Algorithms and Empirical Studies},
  journal = {Journal of Machine Learning Research},
  year = {2008},
  volume = {9},
  pages = {235--284},
  url = {http://jmlr.csail.mit.edu/papers/v9/}
}
T. van Allen, A. Singh, R. Greiner & P. Hooper Quantifying the uncertainty of a belief net response: Bayesian error-bars for belief net inference 2008 Artificial Intelligence   ✔  
Abstract: A Bayesian belief network models a joint distribution over variables using a DAG to represent variable dependencies and network parameters to represent the conditional probability of each variable given an assignment to its immediate parents. Existing algorithms assume each network parameter is fixed. From a Bayesian perspective, however, these network parameters can be random variables that reflect uncertainty in parameter estimates, arising because the parameters are learned from data, or because they are elicited from uncertain experts.

Belief networks are commonly used to compute responses to queries—i.e., return a number for . Parameter uncertainty induces uncertainty in query responses, which are thus themselves random variables. This paper investigates this query response distribution, and shows how to accurately model this distribution for any query and any network structure. In particular, we prove that the query response is asymptotically Gaussian and provide its mean value and asymptotic variance. Moreover, we present an algorithm for computing these quantities that has the same worst-case complexity as inference in general, and also describe straight-line code when the query includes all n variables. We provide empirical evidence that (1) our approximation of the variance is very accurate, and (2) a Beta distribution with these moments provides a very accurate model of the observed query response distribution. We also show how to use this to produce accurate error bars around these responses—i.e., to determine that the response to is x±y with confidence 1−δ.

BibTeX:
@article{Allen2008,
  author = {Tim van Allen and Ajit Singh and Russell Greiner and Peter Hooper},
  title = {Quantifying the uncertainty of a belief net response: Bayesian error-bars for belief net inference},
  journal = {Artificial Intelligence},
  year = {2008},
  volume = {172},
  number = {4-5},
  pages = {483--513},
  doi = {http://dx.doi.org/10.1016/j.artint.2007.09.004}
}
A. P. Singh, A. Gunawardana, C. Meek & A. C. Surendran Recommendations using Absorbing Random Walks 2007 North East Student Colloquium on Artificial Intelligence (NESCAI)    
Abstract: Collaborative filtering attempts to find items of interest for a user by utilizing the preferences of other users. In this paper we describe an approach to filtering that explicitly uses social relationships, such as friendship, to find items of interest to a user. Modeling user-item relations as a bipartite graph we augment it with user-user (social) links and propose an absorbing random walk that induces a set of stationary distributions, one per user, over all items. These distributions can be interpreted as personalized rankings for each user. We exploit sparsity of both user-item and user-user relationships to improve the efficiency of our algorithm.
BibTeX:
@inproceedings{Singh2007,
  author = {Ajit P. Singh and Asela Gunawardana and Chris Meek and Arun C. Surendran},
  title = {Recommendations using Absorbing Random Walks},
  booktitle = {North East Student Colloquium on Artificial Intelligence (NESCAI)},
  year = {2007}
}
J. Leskovec, A. Singh & J. M. Kleinberg Patterns of Influence in a Recommendation Network 2006 Advances in Knowledge Discovery and Data Mining, Tenth Pacific-Asia Conference (PAKDD)   ✔  
Abstract: Information cascades are phenomena in which individuals adopt a new action or idea due to influence by others. As such a process spreads through an underlying social network, it can result in widespread adoption overall. We consider information cascades in the context of recommendations, and in particular study the patterns of cascading recommendations that arise in large social networks. We investigate a large person-to-person recommendation network, consisting of four million people who made sixteen million recommendations on half a million products. Such a dataset allows us to pose a number of fundamental questions: What kinds of cascades arise frequently in real life? What features distinguish them? We enumerate and count cascade subgraphs on large directed graphs; as one component of this, we develop a novel efficient heuristic based on graph isomorphism testing that scales to large datasets. We discover novel patterns: the distribution of cascade sizes is approximately heavy-tailed; cascades tend to be shallow, but occasional large bursts of propagation can occur. The relative abundance of different cascade subgraphs suggests subtle properties of the underlying social network and recommendation process.
BibTeX:
@inproceedings{Leskovec2006,
  author = {Jure Leskovec and Ajit Singh and Jon M. Kleinberg},
  title = {Patterns of Influence in a Recommendation Network},
  booktitle = {Advances in Knowledge Discovery and Data Mining, Tenth Pacific-Asia Conference (PAKDD)},
  year = {2006},
  pages = {380-389}
}
A. Singh & A. W. Moore Finding Optimal Bayesian Networks by Dynamic Programming 2005 Tech report: CMU-CALD-05-106
Carnegie Mellon University  
 
Abstract: Finding the Bayesian network that maximizes a score function is known as structure learning or structure discovery. Most approaches use local search in the space of acyclic digraphs, which is prone to local maxima. Exhaustive enumeration requires super-exponential time. In this paper we describe a “merely” exponential space/time algorithm for finding a Bayesiannetwork that corresponds to a global maxima of a decomposable scoring function, such as BDeu or BIC.
BibTeX:
@techreport{Singh2005,
  author = {Ajit Singh and Andrew W. Moore},
  title = {Finding Optimal {B}ayesian Networks by Dynamic Programming},
  year = {2005}
  institution = {Carnegie Mellon University},
  number = {CMU-CALD-05-106}
}
C. Guestrin, A. Krause & A. P. Singh Near-optimal sensor placements in Gaussian processes
Winner of the Best Paper Runner-up Award
2005 Machine Learning, Proceedings of the Twenty-Second International Conference (ICML)   ✔  
Abstract: When monitoring spatial phenomena, which are often modeled as Gaussian Processes (GPs), choosing sensor locations is a fundamental task. A common strategy is to place sensors at the points of highest entropy (variance) in the GP model. We propose a mutual information criteria, and show that it produces better placements. Furthermore, we prove that finding the configuration that maximizes mutual information is NP-complete. To address this issue, we describe a polynomial-time approximation that is within (1 − 1/e) of the optimum by exploiting the submodularity of our criterion. This algorithm is extended to handle local structure in the GP, yielding significant speedups. We demonstrate the advantages of our approach on two real-world data sets.
BibTeX:
@inproceedings{Guestrin2005,
  author = {Carlos Guestrin and Andreas Krause and Ajit Paul Singh},
  title = {Near-optimal sensor placements in {G}aussian processes},
  booktitle = {Machine Learning, Proceedings of the Twenty-Second International Conference (ICML)},
  year = {2005},
  pages = {265-272},
  doi = {http://doi.acm.org/10.1145/1102351.1102385}
}
A. Singh What to do when you don’t have much data: Issues in small sample parameter learning in Bayesian networks 2003 M.Sc. Thesis School: Department of Computing Science, University of Alberta    
Abstract: A discrete Bayesian network is a factorization of a joint distribution over random variables. The most common use of these networks is for the computation of conditional probabilities (query responses). The parameters of these networks are often learned from data. Thus network parameters are themselves uncertain, which induces a distribution for any query response. When data sets are small, the effects of parameter uncertainty can be severe. In this thesis we argue that when data sets are small, the distribution of a query response is accurately modelled by a Beta distribution. Procedures for the modeling of the query response are also reviewed. Furthermore, we examine proposed techniques for parameter learning when data sets are small and only one query is of interest.
BibTeX:
@mastersthesis{Singh2003,
  author = {Ajit Singh},
  title = {What to do when you don’t have much data: Issues in small sample parameter learning in {B}ayesian networks},
  school = {Department of Computing Science, University of Alberta},
  year = {2003},
  note = {M.Sc. Thesis}
}
A. Singh, R. Greiner, V. Dorian & B. Lefebvre Automated diagnosis of IEMs using high-throughput NMR spectroscopy 2002 Poster: Intelligent Systems for Molecular Biology (ISMB),    
Abstract: We examine the utility of high throughput Nuclear Magnetic Resonance (NMR) data for the diagnosis of inborn errors of metabolism. We propose a two-layer Bayesian network with the novel addition of Bayesian error bars, which provide confidence intervals for our diagnoses.
BibTeX:
@misc{Singh2002,
  author = {Ajit Singh and Russell Greiner and Victor Dorian and Brent Lefebvre},
  title = {Automated diagnosis of IEMs using high-throughput NMR spectroscopy},
  year = {2002}
}
A. Singh Implementation of a Real-Time Kalman Filtering Navigation System 2000 Tech report: FR-132
National Research Council Canada  
 
BibTeX:
@techreport{Singh2000,
  author = {Ajit Singh},
  title = {Implementation of a Real-Time {K}alman Filtering Navigation System},
  year = {2000}
  institution = {National Research Council Canada},
  number = {FR-132}
}
R. Elio, A. Haddadi & A. Singh Abstract task specifications for conversation policies 2000 Proceedings of the fourth international conference on Autonomous agents (AGENTS)   ✔  
Abstract: Under some views, a crucial function for conversation policies is to "constrain the messages that appear on the wire,’’ for there can be a many-to-many mapping between an agent’s intention and the message primitive used to express that intention. In this paper, we argue that the way to constrain messages is to constrain intentions. We propose a pragmatic approach to doing this through an abstract task specification or model. Abstract task specifications are based on a simple state-space representation for problem formulation, and this representation can be used to delimit the agents’ task intentions and discourse intentions. This analysis supports a flexible and pragmatic way of handling "unexpected’’ messages by reference to the abstract task specification. We see an abstract task specification as one component of a publicly posted conversation policy. A simple search-assistant agent was implemented within a BDI architecture to illustrate application of these ideas.
BibTeX:
@inproceedings{Elio2000a,
  author = {Ren\'{e}e Elio and Afsaneh Haddadi and Ajit Singh},
  title = {Abstract task specifications for conversation policies},
  booktitle = {Proceedings of the fourth international conference on Autonomous agents (AGENTS)},
  publisher = {ACM},
  year = {2000},
  pages = {229--230},
  doi = {http://doi.acm.org/10.1145/336595.337393}
}
R. Elio, A. Haddadi & A. Singh Task Models, Intentions, and Agent Conversation Policies 2000 Sixth Pacific Rim International Conference on Artificial Intelligence (PRICAI)   ✔  
Abstract: It is possible to define conversation policies, such as communication or dialogue protocols, that are based strictly on what messages and, respectively, what performatives may follow each other. While such an approach has many practical applications, such protocols support only "local coherence" in a conversation. Lengthy message exchanges require some infrastructure to lend them "global coherence." Recognition of agent intentions about the joint task is essential for this global coherence, but there are further mechanisms needed to ensure that both local and global coherence are jointly maintained. This paper presents a general yet practical approach to designing, managing, and engineering agents that can do simple run-time intention recognition without creating complex multi-state protocols. In this approach we promote developing abstract task models and designing conversation policies in terms of such models. An implemented agent assistant based on these ideas is briefly described.
BibTeX:
@inproceedings{Elio2000,
  author = {Renee Elio and Afsaneh Haddadi and Ajit Singh},
  title = {Task Models, Intentions, and Agent Conversation Policies},
  booktitle = {Sixth Pacific Rim International Conference on Artificial Intelligence (PRICAI)},
  year = {2000},
  pages = {394-403}
}
A. Singh Directions in BDI Architectures 1999 B.Sc. Independent Study    
BibTeX:
@unpublished{Singh1999,
  author = {Ajit Singh},
  title = {Directions in {BDI} Architectures},
  year = {1999},
  note = {B.Sc. Independent Study}
}

Created by JabRef on 15/10/2008.