Zhenzhen Kou
Yahoo! Search Sciences
2821 Mission College Blvd, Santa Clara, CA 95054
E-mail: zkou AT andrew DOT cmu DOT edu
I am now with the Search Sciences Department at Yahoo! as a Relevance Scientist. My current project is machine learning for ranking.
Research at CMU:
· Interests: Machine Learning, Information Extraction/Retrieval, Data Mining
· Advisors: William W. Cohen and Robert F. Murphy
· Minorthird: software for text learning, classification, extraction and annotation
· SLIF: Subcellular Location Image Finder
· CALO: Cognitive Assistant that Learns and Organizes
Thesis:
My thesis, stacked graphical learning, is a statistical learning model for collective inference over relational data. Its most important feature is that it is more efficient than existing models, which makes it very competitive in applications. I have applied the idea to document classification and named entity extraction, as well as to several inter-related subtasks in a complex information extraction system.
Projects:
· Who Rated What: I worked with Yan Liu to develop a link prediction model for movie recommendation, which ranked 3rd (second runner-up) in KDD Cup 07. Please check out our paper for details.
· Stacked Graphical Learning package in Minorthird: I designed and implemented the Stacked Graphical Learning package in Minorthird for classification on relational datasets. Stacked Graphical Learning is an efficient and effective statistical model for collective classification. Please find more about the model in our SDM07 paper. Here is a tutorial for the package.
· Protein name extractors: I developed several protein name extractors, including one trained with conditional random fields (CRFs) (download) and one trained with dictionary hidden Markov models (Dictionary-HMM, download). Dictionary-HMM combines a dictionary with a Markov model to perform soft matching and extract names from free text. Please find more details about the Dictionary-HMM algorithm in our ISMB05 paper. Here is how to use the extractors.
· SLIF: I also worked on Optical Character Recognition (bioKDD03) and designed and implemented a web interface to an SQL database (KSCE-2004). Please check out our SLIF webpage.
· A tool for protein name annotation: I modified the labeling package in Minorthird; here is a labeling tool for protein name annotation. Please find the tutorial here on how to use the labeling tool.
Resume:
· Curriculum Vitae [HTML]
Publications
· Yan Liu, Zhenzhen Kou, Claudia Perlich and Richard Lawrence (2008): Intelligent System for Workforce Classification, in KDD 2008 Workshop on Data Mining for Business Applications.
· Zhenzhen Kou, Vitor R. Carvalho and William W. Cohen (2007): Online Stacked Graphical Learning, in NIPS 2007 Workshop on Efficient Machine Learning.
· Yan Liu and Zhenzhen Kou (2007): Predicting Who Rated What in Large-Scale Datasets, in Proceedings of KDD Cup and Workshop 2007.
· Zhenzhen Kou and William W. Cohen (2007): Notes for Stacked Graphical Models for Efficient Inference in Markov Random Fields, Technical Report CMU-ML-07-101.
· Zhenzhen Kou and William W. Cohen (2007): Stacked Graphical Models for Efficient Inference in Markov Random Fields, in SDM 07.
· Zhenzhen Kou, William W. Cohen and Robert F. Murphy (2007): A Stacked Graphical Model for Associating Information from Text and Images in Figures, in PSB07.
· Zhenzhen Kou, William W. Cohen and Robert F. Murphy (2005): High-Recall Protein Entity Recognition Using a Dictionary, in ISMB-2005.
· R. Murphy, Z. Kou, J. Hua, M. Joffe, W. W. Cohen (2005): Extracting Structured Information from Text and Images in On-line Journal Articles for Localization Proteomics, in Biolink05.
· Robert F. Murphy, Zhenzhen Kou, Juchang Hua, Matthew Joffe, William W. Cohen (2004): Extracting and Structuring Subcellular Location Information from On-line Journal Articles: The Subcellular Location Image Finder, in KSCE-2004.
· William W. Cohen, Zhenzhen Kou and Robert F. Murphy (2003): Extracting Information from Text and Images for Location Proteomics, in BIOKDD 2003: 2-9.
· Zhenzhen Kou, Liang Ji and Xuegong Zhang (2001): Karyotyping of CGH human metaphase by using support vector machines, Cytometry, December 2001.
· Zhenzhen Kou, Jianhua Xu, Xuegong Zhang and Liang Ji (2001): An Improved Support Vector Machine Using Class-Median Vectors, in Proceedings of the 8th International Conference on Neural Information Processing, 2001, Shanghai, China, pp. 883-887.