Ben Lengerich

Hello! I'm a Ph.D. student in the CS Department at Carnegie Mellon University, advised by Prof. Eric Xing. I've also been very blessed to spend time working with Rich Caruana at Microsoft Research (2019, 2020) and Chris Potts at Roam Analytics (2017). My work has been supported by the CMLH Fellowship.

Research Interests: I am motivated by the promise of precision medicine and focus on methodology for uncovering meaningful patterns in complex data, which often contain heterogeneous samples with conflicting patterns. My thesis focuses on sample-specific models: what would it mean if we could estimate a model for each individual sample? For more, please refer here.

Email address:	`blengeri@cs.cmu.edu`
Office:	`9005 Gates`

News

Defended my PhD thesis, "Sample-Specific Models for Precision Medicine". Thanks to my thesis committee of Eric Xing, Zico Kolter, Ziv Bar-Joseph, Rich Caruana, and Manolis Kellis for all of their help.

December 2020

Our pre-print "Disentangling Increased Testing from Covid-19 Epidemic Spread" is now on medRxiv.

July 10, 2020

Our pre-print "On Dropout, Overfitting, and Interaction Effects" is now on arXiv.

July 2, 2020

Our pre-print "Discriminative Subtyping of Lung Cancers" is now on medRxiv.

June 26, 2020

I'll be (virtually) heading back to MSR this summer to continue work with Rich Caruana on models of mortality risk.

March 4, 2020

Presented "Interaction Effects: Helpful or Harmful?" at CMU's AI Seminar.

February 18, 2020

Selected Publications

Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models

Benjamin J. Lengerich, Sarah Tan, Chun-Hao Chang, Giles Hooker, Rich Caruana

@InProceedings{pmlr-v108-lengerich20a,
  title = {Purifying Interaction Effects with the Functional ANOVA: An Efficient Algorithm for Recovering Identifiable Additive Models},
  author = {Lengerich, Benjamin and Tan, Sarah and Chang, Chun-Hao and Hooker, Giles and Caruana, Rich},
  pages = {2402--2412},
  year = {2020},
  editor = {Silvia Chiappa and Roberto Calandra},
  volume = {108},
  series = {Proceedings of Machine Learning Research},
  address = {Online},
  month = {26--28 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v108/lengerich20a/lengerich20a.pdf},
  url = {http://proceedings.mlr.press/v108/lengerich20a.html},
  abstract = {Models which estimate main effects of individual variables alongside interaction effects have an identifiability challenge: effects can be freely moved between main effects and interaction effects without changing the model prediction. This is a critical problem for interpretability because it permits ``contradictory'' models to represent the same function. To solve this problem, we propose pure interaction effects: variance in the outcome which cannot be represented by any subset of features. This definition has an equivalence with the Functional ANOVA decomposition. To compute this decomposition, we present a fast, exact algorithm that transforms any piecewise-constant function (such as a tree-based model) into a purified, canonical representation. We apply this algorithm to Generalized Additive Models with interactions trained on several datasets and show large disparity, including contradictions, between the apparent and the purified effects. These results underscore the need to specify data distributions and ensure identifiability before interpreting model parameters.}
}

AISTATS 2020

Learning Sample-Specific Models with Low-Rank Personalized Regression

Benjamin J. Lengerich, Bryon Aragam, Eric P. Xing

NeurIPS 2019

Retrofitting Distributional Embeddings to Knowledge Graphs with Functional Relations

Benjamin J. Lengerich, Andrew L. Maas, Christopher Potts

COLING 2018

Personalized Regression Enables Sample-Specific Pan-Cancer Analysis

Benjamin J. Lengerich, Bryon Aragam, Eric P. Xing

ISMB 2018

A more complete list of my publications can be found here.
