Richard Wang

Richard C. Wang
王俊晴

Software Engineer
Google Inc., New York, NY 10011

Ph.D. & M.S. in Language Technologies
B.S. in Computer Science
School of Computer Science
Carnegie Mellon University
E-mail:
Resume: resume.pdf (last updated on 6/10/2009)

Ph.D. Thesis

Systems

  1. Automatic Set Instance Acquirer (ASIA)

    Instance Acquisition refers to extracting instances of a given semantic class name (e.g., car makers => ford, nissan, toyota). ASIA extracts set instances by combining Hearst patterns with the state-of-the-art set expansion technique implemented in SEAL (see below). ASIA currently supports input in multiple languages, including Chinese, Japanese, and English.

  2. Set Expander for Any Language (SEAL)

    Set Expansion refers to expanding a given partial set of objects into a more complete set (e.g., ford, nissan => toyota, audi, buick). A well-known web-based set expansion system is Google Sets. SEAL uses a novel method for expanding sets of named entities. The approach can be applied to semi-structured documents written in any markup language and in any human language.
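
    The two ideas above can be illustrated with a toy sketch. This is not the ASIA or SEAL implementation: the function names (hearst_instances, expand_set), the single "such as" pattern, and the fixed-width character contexts are all simplifying assumptions made for illustration only.

```python
import re
from collections import Counter


def hearst_instances(text, class_name):
    """Collect candidates from phrases like 'CLASS such as A, B and C'.

    Hypothetical helper: only the 'such as' Hearst pattern is handled.
    """
    pattern = re.compile(
        re.escape(class_name) + r"\s+such as\s+([^.;]+)", re.IGNORECASE
    )
    # Split the captured list on commas and on 'and' / 'or' connectives.
    splitter = re.compile(r"\s*,\s*(?:and\s+|or\s+)?|\s+(?:and|or)\s+")
    found = []
    for match in pattern.finditer(text):
        for item in splitter.split(match.group(1)):
            item = item.strip().lower()
            if item:
                found.append(item)
    return found


def expand_set(documents, seeds, width=4):
    """Toy set expansion over semi-structured text.

    For each document, find (left, right) character contexts of a fixed
    width (an assumption of this sketch) that surround every seed, then
    return other strings wrapped by the same contexts.
    """
    candidates = Counter()
    for doc in documents:
        wrappers = None
        for seed in seeds:
            contexts = set()
            for m in re.finditer(re.escape(seed), doc):
                left = doc[max(0, m.start() - width):m.start()]
                right = doc[m.end():m.end() + width]
                if len(left) == width and len(right) == width:
                    contexts.add((left, right))
            # Keep only contexts shared by all seeds seen so far.
            wrappers = contexts if wrappers is None else wrappers & contexts
        for left, right in wrappers or ():
            wrapped = re.escape(left) + r"(.+?)" + re.escape(right)
            for m in re.finditer(wrapped, doc):
                candidates[m.group(1)] += 1
    for seed in seeds:
        candidates.pop(seed, None)  # report only new members
    return [name for name, _ in candidates.most_common()]


page = "<ul><li>ford</li><li>nissan</li><li>toyota</li><li>audi</li></ul>"
print(expand_set([page], ["ford", "nissan"]))
print(hearst_instances("Popular car makers such as Ford, Nissan and Toyota.",
                       "car makers"))
```

    Because the contexts are matched at the character level, the sketch does not depend on any particular markup or human language, which is the property the SEAL description emphasizes.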

Publications (DBLP)

  1. Andrew Carlson, Justin Betteridge, Richard C. Wang, Estevam R. Hruschka Jr. and Tom M. Mitchell: Coupled Semi-Supervised Learning for Information Extraction. In Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM 2010), New York (Brooklyn), New York, USA. 2010.

  2. Tom M. Mitchell, Justin Betteridge, Andrew Carlson, Estevam R. Hruschka Jr. and Richard C. Wang: Populating the Semantic Web by Macro-Reading Internet Text. Invited paper. In Proceedings of the 8th International Semantic Web Conference (ISWC 2009), Chantilly, Virginia, USA. 2009.

  3. Richard C. Wang and William W. Cohen: Character-Level Analysis of Semi-Structured Documents for Set Expansion. In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP 2009), Suntec City, Singapore. 2009.


  4. Richard C. Wang and William W. Cohen: Automatic Set Instance Extraction using the Web. In Proceedings of Joint Conference of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing (ACL-IJCNLP 2009), Suntec City, Singapore. 2009.


  5. Richard C. Wang and William W. Cohen: Iterative Set Expansion of Named Entities using the Web. In Proceedings of IEEE International Conference on Data Mining (ICDM 2008), Pisa, Italy. 2008.


  6. Richard C. Wang, Nico Schlaefer, William W. Cohen and Eric Nyberg: Automatic Set Expansion for List Question Answering. In Proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP 2008), Honolulu, Hawaii, USA. 2008.


  7. Eric Nyberg, Eric Riebling, Richard C. Wang and Robert Frederking: Integrating a Natural Language Message Pre-Processor with UIMA. In Proceedings of LREC Workshop - Towards Enhanced Interoperability for Large HLT Systems: UIMA for NLP, 2008.


  8. Richard C. Wang, Anthony Tomasic, Robert E. Frederking, Isaac Simmons and William W. Cohen: Learning to Extract Gene-Protein Names from Weakly-Labeled Text. In CMU SCS Technical Report Series (CMU-LTI-08-004), 2008.

  9. Richard C. Wang and William W. Cohen: Language-Independent Set Expansion of Named Entities using the Web. In Proceedings of IEEE International Conference on Data Mining (ICDM 2007), Omaha, NE, USA. 2007.


  10. Einat Minkov, Richard C. Wang, Anthony Tomasic and William W. Cohen: NER Systems that Suit User's Preferences: Adjusting the Recall-Precision Trade-off for Entity Extraction. In Proceedings of Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL 2006), New York, NY, USA. 2006, pp 93-96.

  11. Einat Minkov, Richard C. Wang and William W. Cohen: Extracting Personal Names from Emails: Applying Named Entity Recognition to Informal Text. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005), Vancouver, B.C., Canada. 2005, pp 443-450.


  12. William W. Cohen, Richard C. Wang and Robert Murphy: Understanding Captions in Biomedical Publications. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 2003, pp 499-504.