Recent Activities:
Research and Development:

On June 11th, 2020, we launched the Petuum ML open-source consortium, which brings our research and development at Petuum Inc. and the CMU Sailing Lab on Distributed ML (e.g., AutoDist, AdaptDL), Automated ML (e.g., Dragonfly, ProBO), and Composable ML (e.g., Texar, Forte), implemented across PyTorch and TensorFlow, under a unified umbrella.

On December 25th, 2013, we made an initial
open-source release of Petuum,
a new framework for distributed machine learning with massive data, big
models, and a wide spectrum of algorithms. Updates on Petuum are released every
three months. The latest release (version 1.1) was made in July 2015.
Teaching:

I have been teaching Probabilistic Graphical Models
(10708), an advanced graduate course on theory, algorithm, and application for multivariate modeling, inference, and deep learning since 2005 at CMU. For all the past versions, please see here.

Video lectures of Probabilistic Graphical Models (10708):
2014,
2019,
2020.

I regularly teach
Graduate Machine Learning (10701), which is a
general Ph.D.-level introduction to ML for CMU students from all majors.
Sabbatical and Leave:
Talks and Tutorials:

From Learning, to Meta-Learning, to "Lego-Learning": A pathway toward autonomous AI
[video][slides], CMU AI Seminar, 2022.

From Learning, to Meta-Learning, to "Lego-Learning": theory, system, and applications
[video], Baidu Seminar, 2021.

It is time for deep learning to understand its expense bills
[video], KDD Deep Learning Day 2021.

Learning-to-learn through Model-based Optimization: HPO, NAS, and Distributed Systems
[video], ACL 2021 workshop on Meta Learning and Its Applications to Natural Language Processing.

A Data-Centric View for Composable Natural Language Processing
[video1] [video2], ICML 2021 Machine Learning for Data Workshop.

Thoughts and Efforts on AI Meeting Production
[video], Jeffrey L. Elman Distinguished Lecture Series, Halicioglu Data Science Inst., UC San Diego, 2021.

Simplifying and Automating Parallel Machine Learning via a Programmable and Composable Parallel ML System
[slides]
[video],
Tutorial, AAAI 2021.

From Performance-oriented AI to Production- and Industrial-AI
[video],
Michigan Institute for Data Science, 2020.

A Blueprint of Standardized and Composable Machine Learning
[slides]
[video],
Institute for Advanced Study, Princeton, 2020.

Compositionality in Machine Learning
[slides]
[video],
Open Data Science Conference (ODSC) West 2019.

A Civil Engineering Perspective on Artificial Intelligence From Petuum
[slides],
Distinguished Lectures in Computational Innovation, Columbia University, 2018.

A Statistical Machine Learning Perspective of Deep Learning: Algorithm, Theory, and Scalable Computing
[slides],
tutorial at the International Summer School on Deep Learning, Genova, Italy, 2018.

Standardized Tests as benchmarks for Artificial Intelligence
[slides],
tutorial at EMNLP, Melbourne, Australia, 2018.

PetuumMed: algorithms and system for EHR-based medical decision support
[slides], MIT, 2018.

System and Algorithm Co-Design, Theory and Practice, for Distributed Machine Learning
[slides],
[video],
at the Simons Institute for the Theory of Computing, Berkeley, 2017.

Strategies & Principles for Distributed Machine Learning
[slides],
[video],
Allen Institute for AI, 2016.

The Machine Learning Behind Reading and Comprehension
[slides],
Summit of Language and AI, China, 2016.

A New Look at the System, Algorithm and Theory Foundations of Distributed Machine Learning
[slides],
tutorial with Dr. Qirong Ho at the
21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2015).

Big ML Software for Modern ML Algorithms
[slides],
tutorial with Dr. Qirong Ho at the
2014 IEEE International Conference on Big Data (IEEE BigData 2014).

Topic Models, Latent Space Models, Sparse Coding, and All That: A systematic understanding of probabilistic semantic extraction in large corpus
[slides], tutorial at the
50th Annual Meeting of the Association for Computational Linguistics (ACL 2012).

Modern Statistical Methods for Genetic Association Study: Structured
Genome-Transcriptome-Phenome Association Analysis
[slides],
tutorial with Dr. Seyoung Kim, at the
Nineteenth International
Conference on Intelligent Systems for Molecular Biology
(ISMB 2011).
Some earlier talks:
I gave an invited talk on "On Learning Sparse Structured Input-Output Models" [slides] at the
Conference on Empirical Methods in Natural Language Processing and
Computational Natural Language Learning (EMNLP 2012).
I gave a tutorial on "Topic Models, Latent Space Models, Sparse Coding, and All That: A systematic understanding of probabilistic semantic extraction in large corpus" [slides] at the 50th Annual Meeting of the Association for Computational Linguistics (ACL 2012).
With Dr. Seyoung Kim, I gave a tutorial on "Modern Statistical Methods for Genetic Association Study: Structured Genome-Transcriptome-Phenome Association Analysis" [slides] at the Nineteenth International Conference on Intelligent Systems for Molecular Biology (ISMB 2011).
I gave a keynote talk on "Sparsity and Learning Large Scale Models" [slides] at the 2011 CVPR Workshop on
Large Scale Learning for Vision.
I gave a keynote talk on "Dynamic Network Analysis: Model, Algorithm, Theory, and Application" [slides] at the Eighth Workshop on Mining and Learning with Graphs, 2010.
I gave a keynote talk on "Genome-Phenome Association Analysis of Complex Diseases: a Structured Sparse Regression Approach" [slides] at the Tenth Annual International Workshop on Bioinformatics and Systems Biology, 2010.
I gave a keynote talk on "Jointly Maximum Margin and Maximum Entropy Learning of Graphical Models" [slides] at the NIPS 2009 Workshop on "APPROXIMATE LEARNING OF LARGE SCALE GRAPHICAL MODELS: THEORY AND APPLICATIONS".
I gave a keynote talk on "Time Varying Graphical Models: reverse engineering and analyzing rewiring networks" [slides] at the NIPS 2009 Mini-Symposium on Machine Learning in Computational Biology.
I gave a keynote talk on "Recent Advances in Learning Sparse Structured Input/Output Model: Models, Algorithms, and Applications" at the NIPS 2008 Workshop on "Structured Input, Structured Output".
I gave a talk on "Time-Varying Networks: Reconstructing Temporally/Spatially Rewiring Gene Interactions" at the 2008 RECOMB Regulatory Genomics workshop.
I co-organized the NIPS 2012 Workshop on "Spectral Learning".
I co-organized the ICML 2011 Workshop on "Structured Sparsity: Learning and Inference".
I co-organized the NIPS 2008 Workshop on "Analyzing Graphs: Theories and Applications".
I co-organized the ICML 2007 Workshop on Learning in Structured Output Spaces.
I co-organized the NIPS 2007 Workshop on Statistical Models of Networks.
I gave a keynote talk on "Graphical models and algorithms for integrative bioinformatics" at the 6th annual Graybill Conference.
I gave a keynote talk on "Probabilistic graphical models: theory, algorithm, and application" at ICMLA 07.
Services:

I am a member of the DARPA Information Science and Technology (ISAT) Advisory Group.

I serve on the NIH BioData Management and Analysis (BDMA) Study Section.