Eric P. Xing
Graphical Models and Algorithms for Integrative Bioinformatics

Probabilistic graphical models are a formalism that combines the strengths of graph theory and probability theory to build complex models out of simpler pieces. They offer a powerful language for defining expressive distributions over high-dimensional spaces in complex scenarios, and they provide a systematic computational framework for probabilistic inference. These virtues have particular relevance in bioinformatics, where many core inferential problems (e.g., linkage analysis, phylogenetic analysis, and network reconstruction) are already naturally expressed in probabilistic terms and must deal with experimental data that have complex structure and temporal and/or spatial dynamics. I will discuss our recent work on graphical model inferential methodology in three areas of bioinformatics: (1) Population structure and recombination hotspot inference, using a novel approach based on Dirichlet process priors. I present a hidden Markov version of the Dirichlet process that allows us to infer recombination events among haplotypes in an "open" ancestral space. (2) Comparative genomic prediction of imperfectly conserved transcription factor binding sites, where multi-resolution phylogenetic inference is combined with Markovian inference to provide sensitive detection of motifs and their evolutionary turnover in eleven Drosophila species. (3) Reverse-engineering of temporally rewiring networks from gene expression time courses, where a novel hidden temporal exponential random graph model is used to model the temporal evolution of network topologies during a biological process, and to facilitate inference of the transient regulatory circuitry (rather than a single universal circuit) underlying each time point of the microarray time series.