I am a 3rd-year Ph.D. student at Carnegie Mellon University. Under the guidance of Prof. Eric Xing, I study machine learning and computational biology with the goal of building systems to answer meaningful questions.

During my undergrad, I studied math and computer science as a Braddock Scholar in the Schreyer Honors College at Penn State University. In my spare time, I enjoy backpacking and composing for piano.

Contact me on Twitter, at blengeri[at]cs."$school".edu, or in person.


May 2017 Starting my internship at Roam in San Mateo, CA.
March 2017 Attending ENAR in Washington, DC.
February 2017 Presenting "Improving the Accuracy of GWAS" at the Pittsburgh Center for Drug Abuse Research.
Oct 2016 GenAMap has been accepted to the NSF I-Corps Fall 2016 cohort.
Oct 2016 Presenting GenAMap at ASHG.
Jun 2016 Volunteering at ICML.
Apr 2016 Created a pipeline for TCGA IPython notebooks at the "Hacking Cancer" Hackathon.
Apr 2016 Awarded an NSF GRFP Honorable Mention.
Aug 2015 Starting my Ph.D. at CMU!


My heart is motivated by the promise of precision medicine and my mind is captivated by the puzzles of statistical machine learning. My Google Scholar profile is here.

Working Papers

Pre-prints available on request.
Hybrid Subspace Learning for High-Dimensional Data
Micol Marchetti-Bowick, Benjamin J. Lengerich, Ankur Parikh, Eric P. Xing.
Precision Lasso: Accounting for Correlations and Linear Dependencies in High-Dimensional Genomic Data
Haohan Wang, Benjamin J. Lengerich, Bryon Aragam, Eric P. Xing.
GenAMap on the Web: Visual Machine Learning for Next-Generation Genome Wide Association Studies
Haohan Wang*, Benjamin J. Lengerich*, Min Kyung Lee, Eric P. Xing.

Publications and Presentations

Opportunities And Obstacles For Deep Learning In Biology And Medicine.
Author order was determined by a randomized algorithm.
Travers Ching, Daniel S. Himmelstein, Brett K. Beaulieu-Jones, Alexandr A. Kalinin, Brian T. Do, Gregory P. Way, Enrico Ferrero, Paul-Michael Agapow, Wei Xie, Gail L. Rosen, Benjamin J. Lengerich, Johnny Israeli, Jack Lanchantin, Stephen Woloszynek, Anne E. Carpenter, Avanti Shrikumar, Jinbo Xu, Evan M. Cofer, David J. Harris, Dave DeCaprio, Yanjun Qi, Anshul Kundaje, Yifan Peng, Laura K. Wiley, Marwin H.S. Segler, Anthony Gitter, Casey S. Greene

GenAMap on the Web: Intuitive and Scalable Machine Learning for Structured Association Mapping.
Benjamin J. Lengerich*, Haohan Wang*, Min Kyung Lee, Eric P. Xing.
Presented at the 66th Annual Meeting of The American Society of Human Genetics, October 19, 2016, Vancouver, BC.

Experimental and Computational Mutagenesis To Investigate the Positioning of a General Base within an Enzyme Active Site.
Jason P. Schwans, Philip Hanoian, Benjamin J. Lengerich, Fanny Sunden, Ana Gonzalez, Yingssu Tsai, Sharon Hammes-Schiffer, and Daniel Herschlag.


On the Origin of Sequences: Computational Analysis of Somatic Hypermutation for Probabilistic Immunoglobulin Predecessor Identification
Benjamin J. Lengerich
Adviser: Raj Acharya, Supervisor: Jesse Barlow.

Undergraduate Thesis in completion of Schreyer Honors College requirements for honors in computer science.


In Spring 2017, I was a TA for Prof. Jian Ma and Prof. Maria Chikina's course 02-410/710 Computational Genomics. I gave a lecture on statistical methods for discovering genetic associations [pdf].

In Fall 2016, I was a TA for Prof. Chris Langmead's course 02-450/750 Automation of Biological Research: Robotics and Active Learning. I gave a lecture on active learning for Bayesian Networks [pdf].


My GitHub page.

GenAMap is an open-source platform for visual machine learning of structured association mappings between genotypes and phenotypes.


  • Roam Analytics

      San Mateo, CA.    Summer 2017

    Developing machine learning methods that can organize information from a 1-billion-edge knowledge graph.

  • PhD Student at CMU

      Pittsburgh, PA.   From 09/2015

    Working with Prof. Eric Xing to design and implement systems and methods for statistical machine learning.

  • Google AdWords Express Team

      Mountain View, CA.    Summer 2014

    Implemented a new targeting method for Google AdWords Express based on machine learning of document localization tendencies.

  • Pennsylvania State University

      University Park, PA.    2011 - 2015

    Undergraduate degrees in computer science and mathematics. Research in theoretical chemistry.