I am a rising third-year Ph.D. student at Carnegie Mellon University, advised by Eric Xing. I am interested in statistical machine learning for healthcare and the theoretical problems that arise from the constraints of real-world data. These interests include building interpretable, robust prediction systems for structured genomic, medical, and other types of data.

Previously, I studied math and computer science as a Braddock Scholar in the Schreyer Honors College at Penn State University. I did research in Sharon Hammes-Schiffer's Theoretical Chemistry Lab and Raj Acharya's ALISA lab. In my free time, I enjoy backpacking and composing for piano.

In Fall 2017, I will be co-founding the AI+ club at CMU. If you are interested in sponsoring talks, contact me by email.

Contact me on Twitter, by email, or in person.


August 2017 Our preprint "Retrofitting Distributional Embeddings to Knowledge Graphs with Functional Relations" is available on arXiv.
July 2017 "Towards Visual Explanations for Convolutional Neural Networks via Input Resampling" has been accepted to the ICML Workshop on Visualization for Deep Learning.
May 2017 Starting my internship at Roam in San Mateo, CA.
March 2017 Attending ENAR in Washington, DC.
February 2017 Presenting "Improving the Accuracy of GWAS" at the Pittsburgh Center for Drug Abuse Research.


My heart is motivated by the promise of precision medicine and my mind is captivated by the puzzles of statistical machine learning. You can find a list of my publications on Google Scholar, Semantic Scholar, or DBLP.

Publications and Presentations

Towards Visual Explanations for Convolutional Neural Networks via Input Resampling
Benjamin J. Lengerich*, Sandeep Konam, Eric P. Xing, Stephanie Rosenthal, Manuela Veloso
ICML Workshop on Visualization for Deep Learning

Opportunities and Obstacles for Deep Learning in Biology and Medicine
Author order was determined by a randomized algorithm.
Travers Ching, Daniel S. Himmelstein, Brett K. Beaulieu-Jones, Alexandr A. Kalinin, Brian T. Do, Gregory P. Way, Enrico Ferrero, Paul-Michael Agapow, Wei Xie, Gail L. Rosen, Benjamin J. Lengerich, Johnny Israeli, Jack Lanchantin, Stephen Woloszynek, Anne E. Carpenter, Avanti Shrikumar, Jinbo Xu, Evan M. Cofer, David J. Harris, Dave DeCaprio, Yanjun Qi, Anshul Kundaje, Yifan Peng, Laura K. Wiley, Marwin H.S. Segler, Anthony Gitter, Casey S. Greene

GenAMap on the Web: Intuitive and Scalable Machine Learning for Structured Association Mapping
Benjamin J. Lengerich*, Haohan Wang*, Min Kyung Lee, Eric P. Xing
The 66th Annual Meeting of The American Society of Human Genetics, October 19, 2016, Vancouver, BC.

Experimental and Computational Mutagenesis To Investigate the Positioning of a General Base within an Enzyme Active Site
Jason P. Schwans, Philip Hanoian, Benjamin J. Lengerich, Fanny Sunden, Ana Gonzalez, Yingssu Tsai, Sharon Hammes-Schiffer, and Daniel Herschlag


On the Origin of Sequences: Computational Analysis of Somatic Hypermutation for Probabilistic Immunoglobulin Predecessor Identification
Benjamin J. Lengerich
Adviser: Raj Acharya, Supervisor: Jesse Barlow.

Undergraduate thesis completed in fulfillment of Schreyer Honors College requirements for honors in computer science.


In Spring 2017, I was a TA for Prof. Jian Ma and Prof. Maria Chikina's course 02-410/710 Computational Genomics. I gave a lecture on statistical methods for discovering genetic associations [pdf].

In Fall 2016, I was a TA for Prof. Chris Langmead's course 02-450/750 Automation of Biological Research: Robotics and Active Learning. I gave a lecture on active learning for Bayesian Networks [pdf].


GenAMap is an open-source platform for visual machine learning of structured association mappings between genotypes and phenotypes.

My GitHub page.