Yi Zhang

 

I am a third-year PhD student in the Machine Learning Department, School of Computer Science, Carnegie Mellon University. I currently work with Jeff Schneider on learning in high-dimensional space with minimal supervision, large-scale web mining, and parallel machine learning (see research interests for details). I have worked as a summer intern on behavioral targeting and computational advertising at Yahoo! Labs and on parallel machine learning on Hadoop at IBM T.J. Watson.

News

I recently won the Yahoo! Key Scientific Challenges Award (20 awardees worldwide).

I was recently awarded the IBM PhD Fellowship.

Curriculum Vitae

Available as pdf.

Education

09/2007 -- present Machine Learning Department, Carnegie Mellon University
GPA: 4.20/4.00
09/2005 -- 09/2006 Department of Computer Science and Technology, Tsinghua University
GPA: 97.2/100 (Rank: 1/84 in the department)
09/2001 -- 06/2005 School of Software, Tsinghua University
Bachelor of Engineering (with the highest honor) in Computer Software
GPA: 89.2/100 (Rank: 1/51 in the school)

Graduate Courses

10-701 Machine Learning: A+     (Instructor: Carlos Guestrin)
10-702 Statistical Machine Learning: A+     (Instructor: Larry Wasserman)
10-725 Optimization: A+     (Instructors: Geoffrey Gordon and Carlos Guestrin)
10-708 Probabilistic Graphical Models: A+     (Instructor: Carlos Guestrin)
15-826 Multimedia Databases and Data Mining: A+     (Instructor: Christos Faloutsos)
10-705 Intermediate Statistics: A     (Instructor: Matthew Harrison)
36-724 Applied Bayesian Methods: A     (Instructor: Surya Tokdar)
15-853 Algorithms in the Real World: A     (Instructors: Guy Blelloch and Daniel Golovin)
47-811 Econometrics I: A     (Instructor: Fallaw Sowell)

Research Interests

1) Learning in high-dimensional space with minimal supervision: regularization with model compression, learning with unlabeled text from the Web, matrix-variate distributions for modeling multiple tasks, constraint-driven active learning.

2) Web mining: web information extraction, computational advertising and behavioral targeting.

3) Parallel machine learning, e.g., ensemble learning and additive models on Hadoop.

Industry Experience

06/2008 - 08/2008: Yahoo! Labs (Mentor: Ye Chen)   Behavioral Targeting and Computational Advertising
06/2009 - 08/2009: IBM T.J. Watson (Mentor: Rong Yan)   Parallel Machine Learning for Multimedia Analysis

Publications (Organized by Topics)

1. Learning in High-Dimensional Space with Minimal Supervision

Yi Zhang. Smart PCA. The 21st International Joint Conference on Artificial Intelligence (IJCAI), 2009. (pdf)

Yi Zhang, Jeff Schneider and Artur Dubrawski. Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text. The 21st Neural Information Processing Systems (NIPS), 2008. (pdf)

2. Model Selection in Ensemble Learning

Yi Zhang and Xiaoming Jin. Concept Sampling: Towards Systematic Selection in Large-Scale Mixed Concepts in Machine Learning. The 20th International Joint Conference on Artificial Intelligence (IJCAI), 2007. (pdf)

3. Mining Streaming Data

Yi Zhang and Xiaoming Jin. An Automatic Construction and Organization Strategy for Ensemble Learning on Data Streams. SIGMOD Record, Vol. 35, No. 3, 2006. (pdf)

Yi Zhang and Xiaoming Jin. Classifying Data Streams by Training Data Combination. The 1st China Symposium on Classification and Applications, 2005.

4. Neural Computation with Complex Networks: Small-World and Scale-Free

Zhidong Deng and Yi Zhang. Complex Systems Modeling Using Scale-Free Highly-Clustered Echo State Network. International Joint Conference on Neural Networks, 2006. (pdf)

Zhidong Deng and Yi Zhang. Collective Behavior of a Small-World Recurrent Neural System with Scale-Free Distribution. IEEE Transactions on Neural Networks, Vol. 18, Issue 5, 2007. (pdf)

5. Computational Biology: Analyzing Time-Series Gene Expression Data

Yi Zhang and Zhidong Deng. Identifying Biological Pathways via Phase Decomposition and Profile Extraction. Computational Systems Bioinformatics, 2006. (pdf)

Working Papers

Yi Zhang, Jeff Schneider and Artur Dubrawski. Learning Compressible Models.

Yi Zhang and Jeff Schneider. Learning Multiple Tasks with Matrix-Variate Distributions.

Yi Zhang and Duen Horng Chau. Portfolio Optimization with Regularized Predictive Distributions: Sparse-Gaussian Kalman Filters.

 

----- Where Am I? -----

Contact Info

email:
yizhang1 at cs.cmu.edu

office:
3721 Wean Hall
Carnegie Mellon University

phone:
412.589.2501