Research synopsis:
My principal research interests lie
in the development of machine learning and statistical methodology,
especially for solving problems involving automated learning,
reasoning, and decision-making in high-dimensional, multimodal, and
dynamic possible worlds in social and biological systems.
Currently the following themes are studied in my group:
- Foundations of Statistical Learning, including theory and algorithms for:
1) Time/space varying-coefficient models with evolving structures; 2) Sparse structured
input/output models in high-dimensional problems; 3) Nonparametric Bayesian
techniques for infinite-dimensional models; 4) RKHS embedding,
nonparametric inference, and spectral methods for graphical
models; 5) Distributed and online algorithms for optimization,
approximate inference, and sampling on tera-scale data.
- Large-scale Information & Intelligent Systems: 1)
Multi-view latent space models, topic models, and sparse coding
for image/text/relational data mining; 2) Evolving structure, stable metrics,
and prediction for dynamic social networks, goal-driven network
design and optimization; 3) Web-scale image understanding, search,
prediction, and storyline synthesis; 4) User modeling,
personalization, temporal analysis, and computational
advertising; 5) Information visualization, indexing and storage,
web/mobile app development.
- Computational Biology: 1) Understanding genome-microenvironment
interactions in cancer and embryogenesis via joint
analysis of genomic, proteomic, and pathway
signaling data; 2) Genetic analysis of
population variation, demography and evolution; 3) Statistical inference
of genome-transcriptome-phenome association in complex diseases; 4)
Personalized diagnosis and treatment of spectrum diseases via
next generation sequencing and computational "omic" analysis;
5) Biological image and text mining.
Recent Activities:
Teaching:
I am teaching Probabilistic Graphical Models (10708) in Spring 2013.
Previously I co-taught Machine Learning (10701) with Prof. Aarti Singh in Fall 2012,
and I taught Computational Genomics (10810) in Spring 2009.
The Dragon Star Lectures: Advanced Machine Learning, @ Peking/Tsinghua Univ., Beijing, Summer 2009.
Services:
I am a member of the DARPA Information Science and Technology (ISAT) Advisory Group.
I also serve on the NIH Bio-Data Management and Analysis (BDMA) Study Section.
Sabbatical:
I was on sabbatical from 2010 to 2011 as a visiting professor at the Department of Statistics, Stanford University.
I was also a visiting professor during 2010-2011 at Facebook, working on a variety of projects on social media.
Talks:
I gave an invited talk on "On Learning Sparse Structured Input-Output Models" [slides] at the
Conference on Empirical Methods in Natural Language Processing and
Computational Natural Language Learning (EMNLP 2012).
I gave a tutorial on "Topic Models, Latent Space Models, Sparse Coding, and All That: A systematic understanding of probabilistic semantic extraction in large corpus" [slides] at the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012).
Dr. Seyoung Kim and I gave a tutorial on "Modern Statistical Methods for Genetic Association Study: Structured
Genome-Transcriptome-Phenome Association Analysis" [slides] at the Nineteenth International Conference on Intelligent Systems for Molecular Biology (ISMB 2011).
I gave a keynote talk on "Sparsity and Learning Large Scale Models" [slides] at the 2011 CVPR Workshop on
Large Scale Learning for Vision.
I gave a keynote talk on "Dynamic Network Analysis: Model, Algorithm, Theory, and Application" [slides] at the Eighth Workshop on Mining and Learning with Graphs, 2010.
I gave a keynote talk on "Genome-Phenome Association Analysis of Complex Diseases - a Structured Sparse Regression Approach" [slides] at the Tenth Annual International Workshop on Bioinformatics and Systems Biology, 2010.
I gave a keynote talk on "Jointly Maximum Margin and Maximum Entropy Learning of Graphical Models" [slides] at the NIPS 2009 Workshop on "Approximate Learning of Large Scale Graphical Models: Theory and Applications".
I gave a keynote talk on "Time Varying Graphical Models: reverse engineering and analyzing rewiring networks" [slides] at the NIPS 2009 Mini-Symposium on Machine Learning in Computational Biology.
I gave a keynote talk on "Recent Advances in Learning Sparse Structured Input/Output Model: Models, Algorithms, and Applications" at the NIPS 2008 Workshop on "Structured Input, Structured Output".
I gave a talk on "Time-Varying Networks: Reconstructing Temporally/Spatially Rewiring Gene Interactions" at the 2008 RECOMB Regulatory Genomics workshop.
I co-organized NIPS 2012 Workshop on "Spectral Learning".
I co-organized ICML 2011 Workshop on "Structured Sparsity: Learning and Inference".
I co-organized NIPS 2008 Workshop on "Analyzing Graphs: Theories and Applications".
I co-organized ICML 2007 Workshop on Learning in Structured Output Spaces.
I co-organized NIPS 2007 Workshop on Statistical Models of Networks.
I gave a keynote talk on "Graphical models and algorithms for integrative bioinformatics" at the 6th annual Graybill Conference.
I gave a keynote talk on "Probabilistic graphical models --- theory, algorithm, and application" at ICMLA'07.