RESEARCH


    In the SAILING Lab, we strive to explore a standardized and unified approach to Machine Learning. Our research spans core machine learning algorithms and theory, adaptive and scalable systems for machine learning, and applications of machine learning in a number of areas including healthcare, augmented intelligence, and autonomous control. We develop trustworthy, explainable, and personalizable methods for reasoning and decision-making with visual, textual, bio-omic, and time-series data, and general theoretical and system frameworks for compositional ML, meta ML, and auto ML.




    The following themes have been or are being studied in my group:

    o   Core Machine Learning: with emphasis on theory and algorithms for learning complex probabilistic models, learning with prior knowledge, and reasoning and decision-making in open, evolving, and uncertain possible worlds. Of particular interest are:

    1)     Theory and algorithms for learning time/space varying-coefficient models with evolving structures or sample-specific (personalized) structures

    2)     Meta ML and trustworthy ML for generalizable and adversary-robust algorithms

    3)     The "standard equation" for ML: building a unifying framework for various ML paradigms via standardized forms of loss, model, and solver.

    4)     Theory and algorithms for learning sparse structured input/output models and multi-task models in ultra high-dimensional space

    5)     Nonparametric Bayesian methods, infinite mixture models, algorithms and applications of Bayesian nonparametrics for data mining and object/topic/event tracking in open, evolving possible worlds

    6)     Nonparametric graphical models, RKHS embedding, and spectral algorithms for general graph models

    7)     Distributed and online algorithms for optimization, approximate inference, and Monte Carlo sampling on large-scale data and models

    o   System Architecture and Strategies for Large-Scale ML: with emphasis on developing general-purpose systems for machine learning on massive data with massive models on industrial-scale multicore and distributed systems. Of particular interest are:

    1)     Design and implementation of representations and systems for composable ML parallelism

    2)     Global and local protocols for adaptive scheduling in multi-tenant multi-job distributed ML

    3)     Theoretical analysis of distributed ML system behaviors

    4)     Automated model learning and tuning via neural architecture search (NAS) and hyperparameter optimization (HPO)

    o   Healthcare and Medical Applications: with emphasis on developing algorithms and solutions that address problems of practical clinical, medical, and biological concerns. Of particular interest are:

    1)     Robot radiologist: reasoning on radiological images, clinical case report generation, medical training image generation

    2)     EHR-based patient modeling and prediction, ICD coding

    3)     Sample-specific models for panomic-microenvironment interactions in cancer development or cell differentiation via joint analysis of genomic, proteomic, cytogenetic, and pathway signaling data

    4)     Statistical inference on genetic fingerprints, pedigrees, and their associations with diseases and other complex traits; application to clinical diagnosis and forensic analysis

    o   Information and Intelligent Systems: with emphasis on developing web-scale, multi-core, and online machine learning systems for social media, computer vision, and HCI applications. Of particular interest are:

    1)     Multi-view latent space models, topic models, and sparse coding methods for image/text/relational information retrieval

    2)     Evolving structure, stable metrics, and prediction for large-scale dynamic social networks; goal-driven network design/modification/optimization

    3)     Web-scale image understanding, search, annotation, and retrieval; photo storyline; analysis of video and multimedia

    4)     User modeling and personalization, computational advertising, and temporal analysis based on image, text, and activities

Last updated 06/01/19