Research details


My research mainly concerns studying the evolutionary history of humans and other species using machine learning techniques. In the last few years, the amount of data available for such analyses has increased exponentially. Suitable machine learning methods let us use these large amounts of data to produce interpretable and meaningful results. The objective of my research is to develop methodology to model phenomena such as mutation, recombination, genetic drift, and selection that shape human evolutionary history.

My recent work uses probabilistic graphical models to model mutation and recombination in genetic data for various types of markers in humans and other species. We use generative mixed-membership models to represent mutations and recombinations in an intuitive way that is easy to understand. To enable efficient inference and parameter learning on the resulting graphical model, we developed a variational inference algorithm and a variational expectation-maximization (EM) algorithm. However, the behavior of variational approximations is not well understood, so it is useful to study their correctness and other properties before using them for general problem solving. Some previous work with that objective used simulation studies with bootstrapping to confirm the accuracy of such methods.
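As a rough illustration of what a generative mixed-membership model with mutation can look like, here is a minimal sketch in Python. It is not the model from the publications listed on this page: the function names, the uniform mutation scheme, and the Dirichlet prior on ancestry proportions are all assumptions made for the example, and recombination is left out entirely.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from a Dirichlet distribution by normalizing Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_individual(ancestral_alleles, alpha, mutation_rate, n_alleles, rng):
    """Generate one haploid individual under a toy admixture model.

    ancestral_alleles[k][locus] is the (hypothetical) founder allele of
    ancestral population k at that locus; alleles are integers in
    range(n_alleles).
    """
    # Mixed membership: each individual has its own ancestry proportions.
    theta = sample_dirichlet(alpha, rng)
    n_populations = len(ancestral_alleles)
    n_loci = len(ancestral_alleles[0])

    origins, genotype = [], []
    for locus in range(n_loci):
        # Choose the ancestral population for this locus.
        k = rng.choices(range(n_populations), weights=theta)[0]
        allele = ancestral_alleles[k][locus]
        # Assumed mutation scheme: with probability mutation_rate, replace
        # the founder allele by a different allele chosen uniformly.
        if rng.random() < mutation_rate:
            allele = rng.choice([a for a in range(n_alleles) if a != allele])
        origins.append(k)
        genotype.append(allele)
    return theta, origins, genotype


if __name__ == "__main__":
    rng = random.Random(0)
    founders = [[0, 0, 0, 0], [1, 1, 1, 1]]  # two ancestral populations
    theta, origins, genotype = sample_individual(
        founders, alpha=[1.0, 1.0], mutation_rate=0.05, n_alleles=3, rng=rng
    )
    print(theta, origins, genotype)
```

Inference runs in the opposite direction: given only the observed genotypes, one estimates the ancestry proportions and per-locus origins, which is where variational methods come in.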

Apart from population-level studies of evolution, I am also interested in studying evolution at the sequence level. Some of my previous work models the functional turnover of binding sites in Drosophila cis-regulatory modules (CRMs) using phylogenetic trees at multiple resolutions, treating functionality as evolving along the phylogenetic tree of the Drosophila species. Our method is novel in providing a generative model of binding site turnover, which offers a straightforward explanation of the observations.


Publications

S. Shringarpure and E. P. Xing, mStruct: A New Admixture Model for Inference of Population Structure in Light of Both Genetic Admixing and Allele Mutations, Proceedings of the 25th International Conference on Machine Learning (ICML 2008).
P. Ray, S. Shringarpure, M. Kolar and E. P. Xing, CSMET: Comparative Genomic Motif Detection via Multi-Resolution Phylogenetic Shadowing, PLoS Computational Biology (2008).
