William W. Cohen's Papers: Information Extraction

  1. Lidong Bing, Bhuwan Dhingra, Kathryn Mazaitis, Jong Hyuk Park, William W. Cohen (2016): Bootstrapping Distantly Supervised IE using Joint Learning and Small Well-structured Corpora in arxiv 1606.03398.
  2. Lidong Bing, William W. Cohen, Bhuwan Dhingra, and Richard C. Wang (2016): Using Graphs of Classifiers to Impose Constraints on Semi-supervised Relation Extraction in WAKBC-2016.
  3. Zhilin Yang, Ruslan Salakhutdinov, William Cohen (2016): Revisiting Semi-Supervised Learning with Graph Embeddings in ICML-2016.
  4. Zhilin Yang, Ruslan Salakhutdinov, William Cohen (2016): Multi-Task Cross-Lingual Sequence Tagging from Scratch in arxiv 1603.06270.
  5. Lidong Bing, Mingyang Ling, Richard C. Wang, William W. Cohen (2016): Distant IE by Bootstrapping Using Lists and Document Structure in AAAI-2016.
  6. Lidong Bing, Sneha Chaudhari, Richard C. Wang, and William W. Cohen (2015): Improving Distant Supervision for Information Extraction Using Label Propagation Through Lists in EMNLP-2015.
  7. Bhavana Dalvi, Einat Minkov, Partha P. Talukdar, and William W. Cohen (2015): Automatic Gloss Finding for a Knowledge Base using Ontological Constraints in WSDM-2015.
  8. T. Mitchell, W. Cohen, E. Hruschka, P. Talukdar, J. Betteridge, A. Carlson, B. Dalvi, M. Gardner, B. Kisiel, J. Krishnamurthy, N. Lao, K. Mazaitis, T. Mohamed, N. Nakashole, E. Platanios, A. Ritter, M. Samadi, B. Settles, R. Wang, D. Wijaya, A. Gupta, X. Chen, A. Saparov, M. Greaves, J. Welling (2015): Never-Ending Learning in AAAI-2015.
  9. Jay Pujara, Hui Miao, Lise Getoor and William W. Cohen (2014): Using Semantics & Statistics to Turn Data into Knowledge in AI Magazine 2014.
  10. Jay Pujara, Hui Miao, Lise Getoor, and William W. Cohen (2013): Ontology-Aware Partitioning for Knowledge Graph Identification in AKBC-2013.
  11. Bhavana Dalvi, William W. Cohen, and Jamie Callan (2013): Classifying Entities into an Incomplete Ontology in AKBC-2013.
  12. Jay Pujara, Hui Miao, Lise Getoor, and William W. Cohen (2013): Knowledge Graph Identification in ISWC-2013 (Best Student Paper at ISWC-2013).
  13. Ramnath Balasubramanyan, Bhavana Dalvi and William W. Cohen (2013): From Topic Models to Semi-Supervised Learning: Biasing Mixed-membership Models to Exploit Topic-Indicative Features in Entity Clustering in ECML/PKDD-2013.
  14. Bhavana Dalvi and William W. Cohen and Jamie Callan (2013): Exploratory Learning in ECML/PKDD-2013.
  15. Bhavana Dalvi and William W. Cohen (2013): Very Fast Similarity Queries on Semi-Structured Data from the Web in SDM-2013.
  16. Freddy Chong Tat Chua, William W. Cohen, Justin Betteridge, and Ee-Peng Lim (2012): Community-Based Classification of Noun Phrases in Twitter in CIKM-2012 (short paper).
  17. Ni Lao, Amar Subramanya, Fernando Pereira and William W. Cohen (2012): Reading The Web with Learned Syntactic-Semantic Inference Rules in EMNLP-CoNLL-2012.
  18. Bhavana Dalvi, William W. Cohen, and Jamie Callan (2012): Collectively Representing Semi-Structured Data from the Web in AKBC-2012 (Honorable Mention for Best Paper at AKBC-2012).
  19. Dana Movshovitz-Attias and William W. Cohen (2012): Alignment-based Extraction of Abbreviations from Biomedical Text in BioNLP-2012.
  20. Dana Movshovitz-Attias and William W. Cohen (2012): Bootstrapping Biomedical Ontologies for Scientific Text using NELL in BioNLP-2012.
  21. Bhavana Dalvi, William W. Cohen, and Jamie Callan (2012): WebSets: Extracting Sets of Entities from the Web Using Unsupervised Information Extraction in WSDM-2012.
  22. Ni Lao, Tom Mitchell, and William W. Cohen (2011): Random Walk Inference and Learning in A Large Scale Knowledge Base in EMNLP-2011.
  23. Jacob Eisenstein, Tae Yano, William W. Cohen, Noah A. Smith, and Eric P. Xing (2011): Structured Databases of Named Entities from Bayesian Nonparametrics in UNSUP-2011.
  24. Bhavana Dalvi, Jamie Callan, and William W. Cohen (2011): Entity List Completion Using Set Expansion Techniques in TREC 2011.
  25. Einat Minkov and William W. Cohen (2010): Improving Graph-Walk Based Similarity with Reranking: Case Studies for Personal Information Management in TOIS-2010.
  26. L. P. Coelho, A. Ahmed, A. Arnold, J. Kangas, A.-S. Sheikh, E. Xing, W. Cohen, and R. F. Murphy (2010): Structured Literature Image Finder: Extracting Information from Text and Images in Biomedical Literature in Lecture Notes in Bioinformatics.
  27. A. Ahmed, A. Arnold, L. P. Coelho, J. Kangas, A.-S. Sheikh, E. Xing, W. Cohen, and R. F. Murphy (2010): Structured Literature Image Finder: Parsing Text and Figures in Biomedical Literature in Journal of Web Semantics.
  28. Richard Wang and William W. Cohen (2009): Character-level Analysis of Semi-Structured Documents for Set Expansion in EMNLP 2009.
  29. Richard Wang and William W. Cohen (2009): Automatic Set Instance Extraction using the Web in ACL-IJCNLP 2009.
  30. Richard Wang and William W. Cohen (2008): Iterative Set Expansion of Named Entities Using the Web in ICDM-2008.
  31. Andrew Arnold and William W. Cohen (2008): Intra-document Structural Frequency Features for Semi-Supervised Domain Adaptation in CIKM-2008.
  32. Andrew Arnold, Ramesh Nallapati and William W. Cohen (2008): Exploiting Feature Hierarchy for Transfer Learning in Named Entity Recognition in ACL-2008.
  33. Andrew Arnold, Ramesh Nallapati and William W. Cohen (2007): A Comparative Study of Methods for Transductive Transfer Learning in ICDM Workshop on Mining and Management of Biological Data.
  34. Richard Wang and William Cohen (2007): Language-Independent Set Expansion of Named Entities using the Web in ICDM-2007.
  35. Zhenzhen Kou and William W. Cohen (2007): Stacked Graphical Models for Efficient Inference in Markov Random Fields in SDM-2007.
  36. Zhenzhen Kou, William W. Cohen, and Robert F. Murphy (2007): A Stacked Graphical Model for Associating Information from Text And Images In Figures in PSB-2007.
  37. Richard C. Wang, Anthony Tomasic, Robert E. Frederking, William W. Cohen (2006): Learning to Extract Gene-Protein Names from Weakly-Labeled Text in CMU SCS Technical Report Series (CMU-LTI-08-04).
  38. Einat Minkov, Richard C. Wang, Anthony Tomasic and William W. Cohen (2006): NER Systems that Suit User's Preferences: Adjusting the Recall-Precision Trade-off for Entity Extraction in HLT/NAACL-2006 (short paper).
  39. William W. Cohen (2006): A Graph-Search Framework for GeneId Ranking (Extended Abstract) in BioNLP'06.
  40. William W. Cohen & Einat Minkov (2006): A Graph-Search Framework for Associating Gene Identifiers with Documents in BMC Bioinformatics.
  41. Einat Minkov, Richard C. Wang, and William W. Cohen (2005): Extracting Personal Names from Email: Applying Named Entity Recognition to Informal Text in EMNLP/HLT-2005.
  42. William W. Cohen, Einat Minkov & Anthony Tomasic (2005): Learning to Understand Web Site Update Requests in IJCAI-2005.
  43. Zhenzhen Kou, William W. Cohen & Robert F. Murphy (2005): High-Recall Protein Entity Recognition Using a Dictionary in ISMB-2005.
  44. Einat Minkov, Richard Wang & William Cohen (2004): Extracting Personal Names from Emails: Applying Named Entity Recognition to Informal Text in preparation.
  45. Sunita Sarawagi & William W. Cohen (2004): Semi-Markov Conditional Random Fields for Information Extraction in NIPS 2004.
  46. Robert F. Murphy, Zhenzhen Kou, Juchang Hua, Matthew Joffe, William W. Cohen (2004): Extracting and Structuring Subcellular Location Information from On-line Journal Articles: The Subcellular Location Image Finder in KSCE-2004.
  47. Anthony Tomasic, William W. Cohen, Einat Minkov (2004): Learning to Navigate Web Forms in IIWeb 2004.
  48. Vitor Carvalho & William W. Cohen (2004): Learning to Extract Signature and Reply Lines from Email in CEAS 2004.
  49. William W. Cohen & Sunita Sarawagi (2004): Exploiting Dictionaries in Named Entity Extraction: Combining Semi-Markov Extraction Processes and Data Integration Methods in KDD 2004: 89-98.
  50. William W. Cohen (2003): Learning and Discovering Structure in Web Pages in IEEE Data Eng. Bull. 26(3): 3-10 (2003).
  51. William W. Cohen, Zhenzhen Kou & Robert F. Murphy (2003): Extracting Information from Text and Images for Location Proteomics in BIOKDD 2003: 2-9.
  52. William W. Cohen, Richard Wang & Robert Murphy (2003): Understanding Captions in Biomedical Publications in KDD 2003: 499-504.
  53. William W. Cohen (2003): Infrastructure Components for Large-Scale Information Extraction Systems in IAAI 2003: 71-78.
  54. William W. Cohen (2002): Improving A Page Classifier with Anchor Extraction and Link Analysis in NIPS 2002.
  55. William W. Cohen, Matthew Hurst & Lee S. Jensen (2003): A Flexible Learning System for Wrapping Tables and Lists in HTML Documents in Web Document Analysis: Challenges and Opportunities, ed. Antonacopoulos & Hu, World Scientific Publishing. (Originally published as: William W. Cohen, Matthew Hurst & Lee S. Jensen (2002): A Flexible Learning System for Wrapping Tables and Lists in HTML Documents in WWW 2002: 232-241; Lee S. Jensen & William W. Cohen (2001): A Structured Wrapper Induction System for Extracting Information from Semi-Structured Documents in Proc. of the IJCAI-2001 Workshop on Adaptive Text Extraction and Mining).
  56. William W. Cohen (2001): Issues in Extracting Information from the Web (Extended Abstract) in IWPT 2001.
  57. William W. Cohen (2000): Extracting Information from the Web for Concept Learning and Collaborative Filtering in ALT 2000: 1-12.
  58. William W. Cohen, Andrew McCallum, Dallan Quass (2000): Learning to Understand the Web in IEEE Data Eng. Bull. 23(3): 17-24 (2000).
  59. William W. Cohen and Wei Fan (1999): Learning Page-Independent Heuristics for Extracting Data from Web Pages in Computer Networks 31(11-16): 1641-1652 (1999). (Originally published as: William W. Cohen and Wei Fan (1999): Learning Page-Independent Heuristics for Extracting Data from Web Pages in WWW 1999).
  60. William W. Cohen (1999): Reasoning about Textual Similarity in a Web-Based Information Access System in Autonomous Agents and Multi-Agent Systems 2(1): 65-86 (1999).
  61. William W. Cohen (1999): A Demonstration of WHIRL (demonstration abstract) in SIGIR 1999: 327.
