Yahoo! Search Sciences
2821 Mission College Blvd, Santa Clara, CA 95054
E-mail: zzkou AT yahoo-inc DOT com
I am now with the Search Sciences Department at Yahoo! as a Relevance Scientist.
My current project is machine learning for ranking.
· Interests: Machine Learning, Information Extraction/Retrieval, Data Mining
· Minorthird: software for text learning, classification, extraction and annotations
· SLIF: Subcellular Location Image Finder
· CALO: Cognitive Assistant that Learns and Organizes
My thesis, stacked graphical learning, is a statistical learning model for collective inference over relational data. The most important feature of stacked graphical learning is that it is much more efficient than existing models and thus very competitive in applications. I have applied the idea of my thesis to document classification and named entity extraction. I have also applied it to some inter-related subtasks in a complex information extraction system.
· Who Rated What
I worked with Yan Liu to develop a link prediction model for movie recommendation, which ranked 3rd (second runner-up) in KDD Cup 07.
Please check out our paper for details.
· Stacked Graphical Learning package in Minorthird
I designed and implemented the Stacked Graphical Learning package in Minorthird for classification on relational datasets. Stacked Graphical Learning is an efficient and effective statistical model for collective classification.
· Protein name extractors
I developed several protein name extractors, including a protein name extractor trained with conditional random fields (CRFs) (download) and an extractor trained with dictionary hidden Markov models (Dictionary-HMM, download). Dictionary-HMM combines a dictionary with a Markov model to perform soft matching and extract names from free text.
· A tool for protein name annotation
· Curriculum Vitae [HTML]
· Yan Liu, Zhenzhen Kou, Claudia Perlich and Richard Lawrence (2008): Intelligent System for Workforce Classification, in KDD 2008 Workshop on Data Mining for Business Applications.
· Zhenzhen Kou, Vitor R. Carvalho and William W. Cohen (2007): Online Stacked Graphical Learning, in NIPS 2007 Workshop on Efficient Machine Learning.
· Yan Liu and Zhenzhen Kou (2007): Predicting Who Rated What in Large-Scale Datasets, in Proceedings of KDD Cup and Workshop 2007.
· Zhenzhen Kou and William W. Cohen (2007): Notes for Stacked Graphical Models for Efficient Inference in Markov Random Fields, Technical Report: CMU-ML-07-101.
· Zhenzhen Kou and William W. Cohen (2007): Stacked Graphical Models for Efficient Inference in Markov Random Fields in SDM 07.
· Zhenzhen Kou, William W. Cohen & Robert F. Murphy (2007): A Stacked Graphical Model for Associating Information from Text And Images In Figures in PSB07.
· Zhenzhen Kou, William W. Cohen & Robert F. Murphy (2005): High-Recall Protein Entity Recognition Using a Dictionary in ISMB-2005.
· R. Murphy, Z. Kou, J. Hua, M. Joffe, W. W. Cohen (2005): Extracting Structured Information from Text and Images in On-line Journal Articles for Localization Proteomics, in Biolink05.
· Robert F. Murphy, Zhenzhen Kou, Juchang Hua, Matthew Joffe, William W. Cohen (2004): Extracting and Structuring Subcellular Location Information from On-line Journal Articles: The Subcellular Location Image Finder in KSCE-2004.
· William W. Cohen, Zhenzhen Kou & Robert F. Murphy (2003): Extracting Information from Text and Images for Location Proteomics in BIOKDD 2003: 2-9.
· Zhenzhen Kou, Liang Ji and Xuegong Zhang (2001): Karyotyping of CGH human metaphase by using support vector machines, Cytometry, December 2001.
· Zhenzhen Kou, Jianhua Xu, Xuegong Zhang and Liang Ji (2001): An Improved Support Vector Machine Using Class-Median Vectors, in Proceedings of the 8th International Conference on Neural Information Processing, 2001, Shanghai, China, pp. 883-887.