Seunghak Lee’s Homepage

Seunghak Lee

6219 GHC
Computer Science Department
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh PA 15213

I am a PhD student in the Computer Science Department at Carnegie Mellon University, advised by Prof. Eric Xing. I received my M.Sc. in Computer Science from the University of Toronto under the supervision of Prof. Michael Brudno, and my B.S. in Chemistry and in Computer Science and Engineering from POSTECH in Korea.

My areas of research are computational biology and machine learning. More specifically, I am interested in developing models and algorithms that incorporate biological information into the inference of genome-wide association (GWA) mapping. For example, one of my research directions is to use pathway data, gene-gene interaction networks, gene ontology, trait networks, and genome annotations to boost the power to detect trait-associated genetic variants. I am also interested in developing scalable algorithms and systems that enable us to solve large-scale GWA mapping problems. One of my recent projects aims to solve whole-genome regression (i.e., an ultra-high-dimensional regression problem that includes all SNPs in the human genome) while controlling false positives in a statistically sound manner.


Petuum: Our Distributed Large-Scale Machine Learning Framework

Latest Manuscripts

  • W. Dai, J. Wei, X. Zheng, J. Kim, S. Lee, J. Yin, Q. Ho, E. P. Xing,
    "Petuum: A Framework for Iterative-Convergent Distributed ML"
    in Manuscript, arXiv:1312.7651v1

  • S. Lee and E. P. Xing,
    "Efficient Algorithm for Extremely Large Multi-task Regression with Massive Structured Sparsity"
    in Manuscript, arXiv:1208.3014

  • S. Lee and E. P. Xing,
    "Structured Input-Output Lasso, with Application to eQTL Mapping, and a Thresholding Algorithm for Fast Estimation"
    in Manuscript, arXiv:1205.1989

Publications

  • S. Lee, J. K. Kim, X. Zheng, Q. Ho, G. A. Gibson, E. P. Xing,
    "Primitives for Dynamic Big Model Parallelism"
    will appear in Advances in Neural Information Processing Systems 27 (NIPS 2014)

  • H. Cui, J. Cipar, Q. Ho, J. Kim, S. Lee, A. Kumar, J. Wei, W. Dai, G. R. Ganger, P. B. Gibbons, G. A. Gibson, and E. P. Xing,
    "Exploiting Bounded Staleness to Speed up Big Data Analytics"
    in USENIX Annual Technical Conference, June 19-20, 2014. Philadelphia, PA (ATC 2014)

  • E. P. Xing, R. Curtis, G. Schoenherr, S. Lee, J. Yin, K. Puniyani, W. Wu, P. Kinnaird,
    "GWAS in a Box: Statistical and Visual Analytics of Structured Associations via GenAMap"
    in PLoS One, 2014

  • Q. Ho, J. Cipar, H. Cui, J. Kim, S. Lee, P. B. Gibbons, G. Gibson, G. R. Ganger and E. P. Xing,
    "More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server"
    in Advances in Neural Information Processing Systems 26 (NIPS 2013)

  • J. Cipar, Q. Ho, J. Kim, S. Lee, G. R. Ganger, G. Gibson, K. Keeton and E. P. Xing,
    "Solving the Straggler Problem with Bounded Staleness"
    in The 14th Workshop on Hot Topics in Operating Systems (HotOS XIV, 2013)

  • S. Lee and E. P. Xing,
    "Leveraging Input and Output Structures For Joint Mapping of Epistatic and Marginal eQTLs"
    in Proceedings of the 20th Annual International Conference on Intelligent Systems for Molecular Biology; Bioinformatics, 28:i137-i146 (ISMB 2012)

  • S. Lee, J. Zhu, and E. P. Xing,
    "Adaptive Multi-Task Lasso: with Application to eQTL Detection"
    in Advances in Neural Information Processing Systems 23 (NIPS 2010)
    [MATLAB source code (tar.gz)] (source code updated for speed-up, 2/10/2014)

  • S. Lee, E. P. Xing and M. Brudno,
    "MoGUL: Detecting Common Insertions and Deletions in a Population"
    in Fourteenth International Conference on Research in Computational Molecular Biology (RECOMB 2010)

  • S. Lee, F. Hormozdiari, C. Alkan and M. Brudno,
    "MoDIL: detecting small indels from clone-end sequencing with mixtures of distributions"
    in Nature Methods, 2009

  • S. Lee and S. Choi, "Ensembles of Landmark Multidimensional Scaling"
    in IEEE Int'l Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009)

  • S. Lee and S. Choi, "Landmark MDS Ensemble"
    in Pattern Recognition, 2008

  • S. Lee, E. Cheran and M. Brudno, "A Robust Framework for Detecting Structural Variations in a Genome"
    in Proc. Int'l Conference on Intelligent Systems for Molecular Biology (ISMB 2008)
    Toronto, Canada, July 19-23, 2008 (accepted for oral presentation)

  • S. Lee, I. Jeong and S. Choi, "Dynamically Weighted Hidden Markov Model for Spam Deobfuscation"
    in Proc. Int'l Joint Conference on Artificial Intelligence (IJCAI 2007)
    Hyderabad, India, January 6-12, 2007 (accepted for oral presentation)

  • S. Lee and I. Jeong, "SDD: high performance code clone detection system for large scale source code"
    in Companion to the 20th annual ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications (OOPSLA 2005)