Feras Saad

I recently completed my PhD in EECS at MIT (MEng/SB 2016), where I worked with Dr. Vikash Mansinghka and Prof. Martin Rinard. My research integrates ideas from programming languages, artificial intelligence, and statistics to enable sound and scalable systems for probabilistic inference.

I am pleased to be joining the Computer Science Department at Carnegie Mellon University as an Assistant Professor in Fall 2023. Before joining, I am spending one year as a Visiting Research Scientist at Google.

My group is recruiting students and postdocs! If you are interested in the research areas described below, please send me an email and (prospective students) apply to the CMU CS PhD program.


I have broad interests in developing new techniques that enable large-scale probabilistic modeling, inference, and computation across many applications. Some current research themes are described below.

Probabilistic programming. Programs are a uniquely expressive formalism for modeling and understanding complex empirical phenomena. I am interested in building systems that help automate, formalize, and scale up hard aspects of modeling and inference. Projects include synthesizing probabilistic programs for automated model discovery, symbolic solvers for fast Bayesian inference, modeling and query DSLs for Bayesian databases, and general-purpose systems for integrating symbolic, probabilistic, and neural approaches to engineering intelligent systems.

Automatically discovering models from data. How can we rapidly convert datasets into probabilistic models that surface interpretable patterns and make accurate predictions? Our approach is to perform Bayesian nonparametric inference over symbolic model representations that combine simple rules to form powerful models in aggregate. These methods operate within domain-specific data modeling languages and have been applied to discovering models of cross-sectional data, relational systems, and univariate and multivariate time series.

Statistical estimators and tests. As computer programs become the standard representation for complex probability distributions (e.g., stochastic simulators or probabilistic programs) we need new techniques to analyze their statistical properties through the black-box interfaces they expose. Examples include goodness-of-fit tests for programs that simulate random discrete data structures and estimators of entropy and information for probabilistic generative models.
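As a small illustration of the idea (a minimal sketch, not code from these projects): when a distribution is only available through a black-box sampler and log-density interface, quantities like entropy can be estimated by Monte Carlo, since H(p) = -E[log p(X)].

```python
import math
import random

def monte_carlo_entropy(sample, logpdf, n=10000):
    """Estimate H(p) = -E[log p(X)] in nats, using only the black-box
    interface a generative program exposes: draw via sample(), score
    via logpdf(). Averages -logpdf(x) over n independent draws."""
    return -sum(logpdf(sample()) for _ in range(n)) / n

# Example: a fair six-sided die; the true entropy is log(6) ≈ 1.79 nats.
random.seed(0)
est = monte_carlo_entropy(lambda: random.randrange(6),
                          lambda x: math.log(1 / 6))
```

The estimator never inspects the program's internals, which is what makes this style of analysis applicable to arbitrary stochastic simulators.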

Fast random sampling algorithms. Generating random variates is a fundamental operation underlying probabilistic computation. I am interested in exploring the computational limits of sampling, devising algorithms that are theoretically optimal or near-optimal in entropy and error, and engineering samplers with extremely efficient runtime and memory; see optimal approximate sampling and the fast loaded dice roller.
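For context, here is the naive baseline these algorithms improve upon (a sketch for illustration, not the fast loaded dice roller itself): sampling a "loaded die" with integer weights by a linear scan over the cumulative sum.

```python
import random

def sample_weighted(weights, rng=random):
    """Draw index i with probability weights[i] / sum(weights), given
    nonnegative integer weights, by drawing one uniform integer and
    scanning the cumulative sum. Exact, but uses O(n) time per draw
    and more random bits than entropy-optimal samplers require."""
    total = sum(weights)
    u = rng.randrange(total)  # uniform integer in [0, total)
    acc = 0
    for i, w in enumerate(weights):
        acc += w
        if u < acc:
            return i

idx = sample_weighted([1, 2, 3])  # returns 0, 1, or 2
```

Entropy-optimal methods achieve the same exactness while consuming close to the information-theoretic minimum number of random bits per sample.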

Applications. To be useful in modern applications, all these methods must be implemented in performant software systems, many of which are available as open-source projects. We aim to build software that makes probabilistic learning more broadly accessible and helps domain specialists solve applied problems in the sciences, engineering, and public interest.



Software and repositories from research projects (2500+ GitHub stars).

Sublime Text users: check out these productivity plugins (8000+ users):
AddRemoveFolder; RemoveLineBreaks; ViewSetting.


Videos of presentations at conferences, workshops, and seminars.


My research is occasionally covered in the press.