| 
 | Zhenzhen Kou
  Yahoo! Search Sciences, 2821 Mission College Blvd, Santa Clara, CA 95054
  E-mail: zzkou AT yahoo-inc DOT com |  | 
| I am now with the Search Sciences
  Department at Yahoo! as a Relevance Scientist. My current project is machine
  learning for ranking. | ||
| Research at CMU:
  · Interests: Machine Learning, Information Extraction/Retrieval, Data Mining
  · Advisors: William W. Cohen and Robert F. Murphy
  · Minorthird: software for text learning, classification, extraction and annotation
  · SLIF: Subcellular Location Image Finder
  · CALO: Cognitive Assistant that Learns and Organizes
  Thesis: My thesis, stacked graphical learning,
  is a statistical learning model for collective inference over relational
  data. Its most important feature is that it is more efficient than
  existing models, which makes it very competitive in applications.
  I have applied the idea of my thesis to document classification and named
  entity extraction. I have also applied it to some inter-related subtasks in a
  complex information extraction system.
  Projects:
  · Who Rated What: I worked with Yan Liu to develop a link
  prediction model for movie recommendation, which ranked 3rd (second
  runner-up) in the KDD Cup 07. Please check out our
  paper for details.
  · Stacked Graphical Learning package in Minorthird: I designed and
  implemented the Stacked Graphical Learning package in Minorthird for
  classification on relational datasets. Stacked Graphical Learning is an
  efficient and effective statistical model for collective classification. Please find more
  about the model in our SDM07 paper.
  Here is a tutorial for the package.
  · Protein name extractors: I developed
  several protein name extractors, including one trained
  with conditional random fields (CRFs) (download) and
  one trained with dictionary hidden Markov models (Dictionary-HMM, download).
  Dictionary-HMM combines a dictionary with a hidden Markov model to perform soft matching and
  extract names from free text. Please find more
  details about the Dictionary-HMM algorithm in our ISMB05 paper.
  Here is how to use the extractors.
  · SLIF: I also worked on Optical Character Recognition (bioKDD03) and
  designed and implemented a web interface to an SQL database (KSCE-2004).
  Please check out our SLIF
  webpage.
  · A tool for protein name annotation: I modified the labeling package in Minorthird, and here is the resulting labeling tool for protein name
  annotation. Please find the tutorial here
  on how to use the labeling tool.
  Resume:
  · Curriculum Vitae [HTML]
  Publications:
  · Yan Liu, Zhenzhen Kou, Claudia Perlich and Richard Lawrence (2008): Intelligent System for Workforce Classification, in KDD 2008 Workshop on Data
  Mining for Business Applications.
  · Zhenzhen Kou, Vitor R. Carvalho and William W. Cohen (2007): Online Stacked Graphical Learning, in NIPS 2007 Workshop on Efficient Machine Learning.
  · Yan Liu and Zhenzhen Kou (2007): Predicting Who Rated What in Large-Scale Datasets, in Proceedings of KDD Cup and Workshop 2007.
  · Zhenzhen Kou and William W. Cohen (2007): Notes for Stacked Graphical Models for Efficient Inference in Markov Random Fields,
  Technical Report: CMU-ML-07-101.
  · Zhenzhen Kou and William W. Cohen (2007): Stacked Graphical Models for Efficient Inference in Markov Random Fields, in SDM 07.
  · Zhenzhen Kou, William W. Cohen & Robert F. Murphy (2007): A Stacked Graphical Model for Associating Information from Text and Images in Figures, in PSB07.
  · Zhenzhen Kou, William W. Cohen & Robert F. Murphy (2005): High-Recall Protein Entity Recognition Using a Dictionary, in ISMB-2005.
  · R. Murphy, Z. Kou, J. Hua, M. Joffe, W. W. Cohen (2005): Extracting Structured Information from Text and Images in On-line Journal Articles for Localization Proteomics, in Biolink05.
  · Robert F. Murphy, Zhenzhen Kou, Juchang Hua, Matthew Joffe, William W. Cohen (2004): Extracting and Structuring Subcellular Location Information from On-line Journal Articles: The Subcellular Location Image Finder, in KSCE-2004.
  · William W. Cohen, Zhenzhen Kou & Robert F. Murphy (2003): Extracting Information from Text and Images for Location Proteomics, in BIOKDD 2003: 2-9.
  · Zhenzhen Kou, Liang Ji and Xuegong Zhang (2001): Karyotyping of CGH human metaphase by using support vector machines, Cytometry, December 2001.
  · Zhenzhen Kou, Jianhua Xu, Xuegong Zhang and Liang Ji (2001): An Improved Support Vector Machine Using Class-Median Vectors, in Proceedings of the 8th International Conference on Neural Information Processing, 2001, Shanghai, China, pp. 883-887. | ||