Feras Saad

I recently completed my PhD in EECS at MIT (MEng/SB 2016), where I worked with Dr. Vikash Mansinghka and Prof. Martin Rinard. My research spans programming languages, artificial intelligence, and computational statistics.

I am pleased to be joining the Computer Science Department at Carnegie Mellon University as an Assistant Professor in Fall 2023. Before joining, I am spending one year as a Visiting Research Scientist at Google.

My group is recruiting students and postdocs! If you are interested in the research areas described below, please send me an email and, if you are a prospective student, apply to the CMU CS PhD program.

Research

I have broad interests in developing new techniques that enable large-scale probabilistic modeling, inference, and computation across many applications. My work integrates ideas from programming and probability and encompasses the following research areas:

Probabilistic programming. Programs are a uniquely expressive formalism for modeling and understanding complex empirical phenomena. I am interested in building systems that help automate, formalize, and scale up hard aspects of modeling and inference. Projects include synthesizing probabilistic programs for automated model discovery, symbolic solvers for fast Bayesian inference, modeling and query DSLs for Bayesian databases, and general-purpose systems for integrating symbolic, probabilistic, and neural approaches to engineering intelligent systems.

Automatically discovering models from data. How can we rapidly convert datasets into probabilistic models that surface interpretable patterns and make accurate predictions? My approach is to perform Bayesian nonparametric inference over flexible symbolic model representations that combine simple forms of judgement to form powerful models in aggregate. These methods operate within domain-specific data modeling languages and have been applied to discovering models of cross-sectional data, relational systems, and univariate and multivariate time series.

Statistical estimators and tests. Now that most interesting probability distributions are expressed as computer programs (e.g., stochastic simulators or probabilistic programs), we need new techniques to analyze their statistical properties through the black-box interfaces they expose. Examples include goodness-of-fit tests for programs that simulate random discrete data structures and estimators of entropy and information for probabilistic generative models.

Fast random sampling algorithms. Generating random variates is a fundamental operation enabling probabilistic computation. I am interested in exploring the fundamental computational limits of sampling, devising algorithms that are theoretically optimal or near-optimal in entropy and error, and engineering samplers with extremely efficient runtime and memory; see optimal approximate sampling and fast loaded dice roller.

Applications. To be useful in modern applications, all these methods must be implemented in performant software systems, many of which are available as open-source projects. A long-term aim is to build software that makes probabilistic learning more broadly accessible and helps domain specialists solve applied problems in the sciences, engineering, and public interest.

Publications

Software

Software and repositories from research projects (2500+ GitHub stars).

Sublime Text users, check out these productivity plugins (8000+ users):
AddRemoveFolder; RemoveLineBreaks; ViewSetting.

Talks

Videos of presentations at conferences, workshops, and seminars.

Press

My research is occasionally covered in the press.

Awards