Papers by year
Statistical language learning
- A Supertag-Context Model for Weakly-Supervised CCG Parser Learning. Dan Garrette, Chris Dyer, Jason Baldridge, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2015), Beijing, China, July 2015.
- Weakly-Supervised Grammar-Informed Bayesian CCG Parser Learning. Dan Garrette, Chris Dyer, Jason Baldridge, and Noah A. Smith. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2015), Austin, TX, January 2015.
- Conditional Random Field Autoencoders for Unsupervised Structured Prediction. Waleed Ammar, Chris Dyer, and Noah A. Smith. In Advances in Neural Information Processing Systems 27 (NIPS 2014), Montréal, Québec, December 2014.
- Weakly-Supervised Bayesian Learning of a CCG Supertagger. Dan Garrette, Chris Dyer, Jason Baldridge, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2014), Baltimore, MD, June 2014.
- Dynamic Models of Streaming Text. Dani Yogatama, Chong Wang, Bryan R. Routledge, Noah A. Smith, and Eric P. Xing. Transactions of the Association for Computational Linguistics 2:181–192, April 2014.
- Knowledge-Rich Morphological Priors for Bayesian Language Models. Victor Chahuneau, Noah A. Smith, and Chris Dyer. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2013), Atlanta, GA, June 2013.
- Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning. Shay B. Cohen and Noah A. Smith. Computational Linguistics 38(3), September 2012.
- Concavity and Initialization for Unsupervised Dependency Parsing. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2012), Montréal, Québec, June 2012.
- Unsupervised Bilingual POS Tagging with Markov Random Fields. Desai Chen, Chris Dyer, Shay B. Cohen, and Noah A. Smith. In Proceedings of the EMNLP Workshop on Unsupervised Learning in NLP (UNSUP 2011), Edinburgh, UK, July 2011.
- Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance. Shay B. Cohen, Dipanjan Das, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011.
- Empirical Risk Minimization with Approximations of Probabilistic Grammars. Shay B. Cohen and Noah A. Smith. In Advances in Neural Information Processing Systems 23 (NIPS 2010), Vancouver, BC, December 2010.
Also available: appendix.
- Covariance in Unsupervised Learning of Probabilistic Grammars. Shay B. Cohen and Noah A. Smith. Journal of Machine Learning Research 11:3017–3051, November 2010.
- Viterbi Training for PCFGs: Hardness Results and Competitiveness of Uniform Initialization. Shay B. Cohen and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2010), pages 1502–1511, Uppsala, Sweden, July 2010.
- Variational Inference for Adaptor Grammars. Shay B. Cohen, David M. Blei, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010.
- Variational Inference for Grammar Induction with Prior Knowledge. Shay B. Cohen and Noah A. Smith. In Proceedings of the Joint Conference of the Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing, companion volume (ACL 2009), pages 1–4, Singapore, August 2009.
- Shared Logistic Normal Distributions for Soft Parameter Tying in Unsupervised Grammar Induction. Shay B. Cohen and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2009), pages 74–82, Boulder, CO, May/June 2009.
- Logistic Normal Priors for Unsupervised Probabilistic Grammar Induction. Shay B. Cohen, Kevin Gimpel, and Noah A. Smith. In Advances in Neural Information Processing Systems 21 (NIPS 2008), pages 321–328, Vancouver, BC, December 2008.
- The Shared Logistic Normal Distribution for Grammar Induction. Shay B. Cohen and Noah A. Smith. In Proceedings of the NIPS Workshop on Speech and Language: Unsupervised Latent-Variable Models, Whistler, BC, December 2008.
- Weighted and Probabilistic Context-Free Grammars Are Equally Expressive. Noah A. Smith and Mark Johnson. Computational Linguistics 33(4):477–491, December 2007.
- Probabilistic Models of Nonprojective Dependency Trees. David A. Smith and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007), pages 132–140, Prague, Czech Republic, June 2007.
- Computationally Efficient M-Estimation of Log-Linear Structure Models. Noah A. Smith, Douglas L. Vail, and John D. Lafferty. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2007), pages 752–759, Prague, Czech Republic, June 2007.
Also available: talk slides.
- Novel Estimation Methods for Unsupervised Discovery of Latent Structure in Natural Language Text. Noah A. Smith. Ph.D. thesis, Department of Computer Science, Johns Hopkins University, Baltimore, MD, October 2006.
- Annealing Structural Bias in Multilingual Weighted Grammar Induction. Noah A. Smith and Jason Eisner. In Proceedings of the International Conference on Computational Linguistics and Annual Meeting of the Association for Computational Linguistics (COLING-ACL 2006), pages 569–576, Sydney, Australia, July 2006.
- Guiding Unsupervised Grammar Induction Using Contrastive Estimation. Noah A. Smith and Jason Eisner. In Proceedings of the IJCAI Workshop on Grammatical Inference Applications, pages 73–82, Edinburgh, UK, July 2005.
- Contrastive Estimation: Training Log-Linear Models on Unlabeled Data. Noah A. Smith and Jason Eisner. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2005), pages 354–362, Ann Arbor, MI, June 2005.
- Annealing Techniques for Unsupervised Statistical Language Learning. Noah A. Smith and Jason Eisner. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2004), pages 487–494, Barcelona, Spain, July 2004.
Text analysis for computational social science
- Open Extraction of Fine-Grained Political Statements. David Bamman and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
- A Utility Model of Authors in the Scientific Community. Yanchuan Sim, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
- The Media Frames Corpus: Annotations of Frames Across Issues. Dallas Card, Amber E. Boydstun, Justin H. Gross, Philip Resnik, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2015), Beijing, China, July 2015.
- Contextualized Sarcasm Detection on Twitter. David Bamman and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2015), Oxford, UK, May 2015.
- Modeling User Arguments, Interactions, and Attributes for Stance Prediction in Online Debate Forums. Minghui Qiu, Yanchuan Sim, Noah A. Smith, and Jing Jiang. In Proceedings of the SIAM Conference on Data Mining (SDM 2015), Vancouver, BC, April/May 2015.
Also available: appendix.
- The Utility of Text: The Case of Amicus Briefs and the Supreme Court. Yanchuan Sim, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2015), Austin, TX, January 2015.
- Diffusion of Language Change in Social Media. Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, and Eric P. Xing. PLoS ONE, November 2014.
- Unsupervised Discovery of Biographical Structure from Text. David Bamman and Noah A. Smith. Transactions of the Association for Computational Linguistics 2:363–376, October 2014.
- Tracking the Development of Media Frames within and across Policy Issues. Amber E. Boydstun, Dallas Card, Justin H. Gross, Philip Resnik, and Noah A. Smith. August 2014.
- A Bayesian Mixed Effects Model of Literary Character. David Bamman, Ted Underwood, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014.
- Overview of the 2014 NLP Unshared Task in PoliInformatics. Noah A. Smith, Claire Cardie, Anne L. Washington, and John D. Wilkerson. In Proceedings of the ACL 2014 Workshop on Language Technologies and Computational Social Science, pages 5–7, Baltimore, MD, June 2014.
- Narrative Framing of Consumer Sentiment in Online Restaurant Reviews. Dan Jurafsky, Victor Chahuneau, Bryan R. Routledge, and Noah A. Smith. First Monday 19(4), April 2014.
- Learning Topics and Positions from Debatepedia. Swapna Gottipati, Minghui Qiu, Yanchuan Sim, Jing Jiang, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), Seattle, WA, October 2013.
Also available: appendix.
- Measuring Ideological Proportions in Political Speeches. Yanchuan Sim, Brice D. L. Acree, Justin H. Gross, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), Seattle, WA, October 2013.
Also available: appendix.
- Learning to Extract International Relations from Political Context. Brendan O'Connor, Brandon Stewart, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 2013.
Also available: appendix.
- Learning Latent Personas of Film Characters. David Bamman, Brendan O'Connor, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 2013.
- Testing the Etch-a-Sketch Hypothesis: A Computational Analysis of Mitt Romney's Ideological Makeover During the 2012 Primary vs. General Elections. Justin H. Gross, Brice Acree, Yanchuan Sim, and Noah A. Smith. Presented at the Annual Meeting of the American Political Science Association, August 2013.
- A Penny for your Tweets: Campaign Contributions and Capitol Hill Microblogs. Tae Yano, Dani Yogatama, and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2013), Boston, MA, July 2013.
- Mapping the Geographical Diffusion of New Words. Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, and Eric P. Xing. In Proceedings of the NIPS Workshop on Social Network and Social Media Analysis: Methods, Models and Applications, Lake Tahoe, NV, December 2012.
- Word Salad: Relating Food Prices and Descriptions. Victor Chahuneau, Kevin Gimpel, Bryan R. Routledge, Lily Scherlis, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP 2012), Jeju, Korea, July 2012.
Also available: appendix.
- Discovering Factions in the Computational Linguistics Community. Yanchuan Sim, Noah A. Smith, and David A. Smith. In Proceedings of the ACL Workshop on Rediscovering Fifty Years of Discoveries, Jeju, Korea, July 2012.
- Censorship and Content Deletion in Chinese Social Media. David Bamman, Brendan O'Connor, and Noah A. Smith. First Monday 17(3), March 2012.
- Computational Text Analysis for Social Science: Model Complexity and Assumptions. Brendan O'Connor, David Bamman, and Noah A. Smith. In Proceedings of the NIPS Workshop on Computational Social Science and the Wisdom of Crowds, Sierra Nevada, Spain, December 2011.
- Discovering Sociolinguistic Associations with Structured Sparsity. Jacob Eisenstein, Noah A. Smith, and Eric P. Xing. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2011), Portland, OR, June 2011.
- Author Age Prediction from Text using Linear Regression. Dong Nguyen, Noah A. Smith, and Carolyn P. Rosé. In Proceedings of the ACL Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LATECH 2011), Portland, OR, June 2011.
- Discovering Demographic Language Variation. Brendan O'Connor, Jacob Eisenstein, Eric P. Xing, and Noah A. Smith. In Proceedings of the NIPS Workshop on Machine Learning for Social Computing, Whistler, BC, December 2010.
- A Latent Variable Model for Geographic Lexical Variation. Jacob Eisenstein, Brendan O'Connor, Noah A. Smith, and Eric P. Xing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), Cambridge, MA, October 2010.
- Shedding (a Thousand Points of) Light on Biased Language. Tae Yano, Philip Resnik, and Noah A. Smith. In Proceedings of the NAACL-HLT Workshop on Creating Speech and Language Data With Mechanical Turk, Los Angeles, CA, June 2010.
- From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. Brendan O'Connor, Ramnath Balasubramanyan, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2010), pages 122–129, Washington, DC, May 2010.
- From Episodes to Sagas: Understanding the News by Identifying Temporally Related Story Sequences. Ramnath Balasubramanyan, Frank Lin, William W. Cohen, Matthew Hurst, and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2009), San Jose, CA, May 2009.
Text-driven forecasting
- A Sparse and Adaptive Prior for Time-Dependent Model Parameters. Dani Yogatama, Bryan R. Routledge, and Noah A. Smith. October 2013.
- Predicting the NFL Using Twitter. Shiladitya Sinha, Chris Dyer, Kevin Gimpel, and Noah A. Smith. In Proceedings of the ECML/PKDD Workshop on (Machine Learning and Data Mining for) Sports Analytics, Prague, Czech Republic, September 2013.
- Textual Predictors of Bill Survival in Congressional Committees. Tae Yano, Noah A. Smith, and John D. Wilkerson. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2012), pages 793–802, Montréal, Québec, June 2012.
Also available: talk slides.
- Predicting a Scientific Community's Response to an Article. Dani Yogatama, Michael Heilman, Brendan O'Connor, Chris Dyer, Bryan R. Routledge, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011.
Minor revisions.
Also available: extended technical report.
- Movie Reviews and Revenues: An Experiment in Text Regression. Mahesh Joshi, Dipanjan Das, Kevin Gimpel, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010.
- What's Worthy of Comment? Content and Comment Volume in Political Blogs. Tae Yano and Noah A. Smith. In Proceedings of the International AAAI Conference on Weblogs and Social Media (ICWSM 2010), Washington, DC, May 2010.
- Text-Driven Forecasting. Noah A. Smith. March 2010.
- Predicting Risk from Financial Reports with Regression. Shimon Kogan, Dimitry Levin, Bryan R. Routledge, Jacob S. Sagi, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2009), pages 272–280, Boulder, CO, May/June 2009.
Also available: talk slides.
- Predicting Response to Political Blog Posts with Topic Models. Tae Yano, William W. Cohen, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2009), pages 477–485, Boulder, CO, May/June 2009.
Core NLP: semantics, syntax, morphology, and algorithms
- Improved Transition-based Parsing by Modeling Characters instead of Words with LSTMs. Miguel Ballesteros, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
- Bayesian Optimization of Text Representations. Dani Yogatama, Lingpeng Kong, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
- Transition-Based Dependency Parsing with Stack Long Short-Term Memory. Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2015), Beijing, China, July 2015.
- Sparse Binary Word Vector Representations. Manaal Faruqui, Yulia Tsvetkov, Dani Yogatama, Chris Dyer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2015), Beijing, China, July 2015.
- Frame-Semantic Role Labeling with Heterogeneous Annotations. Meghana Kshirsagar, Sam Thomson, Nathan Schneider, Jaime Carbonell, Noah A. Smith, and Chris Dyer. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2015), Beijing, China, July 2015.
- Learning Word Representations with Hierarchical Sparse Coding. Dani Yogatama, Manaal Faruqui, Chris Dyer, and Noah A. Smith. In Proceedings of the International Conference on Machine Learning (ICML 2015), Lille, France, July 2015.
- Retrofitting Word Vectors to Semantic Lexicons. Manaal Faruqui, Jesse Dodge, Sujay Kumar Jauhar, Chris Dyer, Eduard Hovy, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2015), Denver, CO, June 2015.
- Transforming Dependencies into Phrase Structures. Lingpeng Kong, Alexander M. Rush, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2015), Denver, CO, June 2015.
- AD3: Alternating Directions Dual Decomposition for MAP Inference in Graphical Models. André F. T. Martins, Mário A. T. Figueiredo, Pedro M. Q. Aguiar, Noah A. Smith, and Eric P. Xing. Journal of Machine Learning Research 16:495–545, March 2015.
- A Dependency Parser for Tweets. Lingpeng Kong, Nathan Schneider, Swabha Swayamdipta, Archna Bhatia, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), Doha, Qatar, October 2014.
- CMU: Arc-Factored, Discriminative Semantic Dependency Parsing. Sam Thomson, Brendan O'Connor, Jeffrey Flanigan, David Bamman, Jesse Dodge, Swabha Swayamdipta, Nathan Schneider, Chris Dyer, and Noah A. Smith. In Proceedings of the International (COLING) Workshop on Semantic Evaluations (SemEval 2014), Dublin, Ireland, August 2014.
- Distributed Representations of Geographically Situated Language. David Bamman, Chris Dyer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014.
- A Discriminative Graph-Based Parser for the Abstract Meaning Representation. Jeffrey Flanigan, Sam Thomson, Jaime Carbonell, Chris Dyer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014.
- An Empirical Comparison of Parsing Methods for Stanford Dependencies. Lingpeng Kong and Noah A. Smith. April 2014.
- Discriminative Lexical Semantic Segmentation with Gaps: Running the MWE Gamut. Nathan Schneider, Emily Danchik, Chris Dyer, and Noah A. Smith. Transactions of the Association for Computational Linguistics 2:193–206, April 2014.
- Frame-Semantic Parsing. Dipanjan Das, Desai Chen, André F. T. Martins, Nathan Schneider, and Noah A. Smith. Computational Linguistics 40(1):9–56, March 2014.
- Turning on the Turbo: Fast Third-Order Non-Projective Turbo Parsers. André F. T. Martins, Miguel Almeida, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2013), Sofia, Bulgaria, August 2013.
- A Framework for (Under)specifying Dependency Syntax without Overloading Annotators. Nathan Schneider, Brendan O'Connor, Naomi Saphra, David Bamman, Manaal Faruqui, Noah A. Smith, Chris Dyer, and Jason Baldridge. In Proceedings of the ACL Linguistic Annotation Workshop (LAW 2013), Sofia, Bulgaria, August 2013.
Also available: extended technical report.
- Improved Part-of-Speech Tagging for Online Conversational Text with Word Clusters. Olutobi Owoputi, Brendan O'Connor, Chris Dyer, Kevin Gimpel, Nathan Schneider, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2013), Atlanta, GA, June 2013.
- Supersense Tagging for Arabic: the MT-in-the-Middle Attack. Nathan Schneider, Behrang Mohit, Chris Dyer, Kemal Oflazer, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2013), Atlanta, GA, June 2013.
- Linguistic Structure Prediction with the Sparseptron. Noah A. Smith and André F. T. Martins. ACM Crossroads 19(3):44–48, April 2013.
- Adversarial Evaluation for Models of Natural Language. Noah A. Smith. July 2012.
- Transliteration by Sequence Labeling with Lattice Encodings and Reranking. Waleed Ammar, Chris Dyer, and Noah A. Smith. In Proceedings of the ACL Named Entities Workshop, Jeju, Korea, July 2012.
- Coarse Lexical Semantic Annotation with Supersenses: An Arabic Case Study. Nathan Schneider, Behrang Mohit, Kemal Oflazer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2012), Jeju, Korea, July 2012.
- A Probabilistic Model for Canonicalizing Named Entity Mentions. Dani Yogatama, Yanchuan Sim, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2012), Jeju, Korea, July 2012.
- An Exact Dual Decomposition Algorithm for Shallow Semantic Parsing with Constraints. Dipanjan Das, André F. T. Martins, and Noah A. Smith. In Proceedings of the Joint Conference on Lexical and Computational Semantics (*SEM 2012), Montréal, Québec, June 2012.
- Graph-Based Lexicon Expansion with Sparsity-Inducing Penalties. Dipanjan Das and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2012), Montréal, Québec, June 2012.
Also available: talk slides.
- Recall-Oriented Learning of Named Entities in Arabic Wikipedia. Behrang Mohit, Nathan Schneider, Rishav Bhowmick, Kemal Oflazer, and Noah A. Smith. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon, France, April 2012.
Also available: extended technical report.
Also available: appendix.
- Structured Databases of Named Entities from Bayesian Nonparametrics. Jacob Eisenstein, Tae Yano, William W. Cohen, Noah A. Smith, and Eric P. Xing. In Proceedings of the EMNLP Workshop on Unsupervised Learning in NLP (UNSUP 2011), Edinburgh, UK, July 2011.
Also available: talk slides.
- Dual Decomposition with Many Overlapping Components. André F. T. Martins, Noah A. Smith, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011.
- Semi-Supervised Frame-Semantic Parsing for Unknown Predicates. Dipanjan Das and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2011), Portland, OR, June 2011.
- Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments. Kevin Gimpel, Nathan Schneider, Brendan O'Connor, Dipanjan Das, Daniel Mills, Jacob Eisenstein, Michael Heilman, Dani Yogatama, Jeffrey Flanigan, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2011), Portland, OR, June 2011.
- Linguistic Structure Prediction. Noah A. Smith. Morgan and Claypool, May 2011.
- Products of Weighted Logic Programs. Shay B. Cohen, Robert J. Simmons, and Noah A. Smith. Theory and Practice of Logic Programming 11(2–3):263–296, January 2011.
- Favor Short Dependencies: Parsing with Soft and Hard Constraints on Dependency Length. Jason Eisner and Noah A. Smith. In Harry Bunt, Paola Merlo, and Joakim Nivre (eds.), Trends in Parsing Technology: Dependency Parsing, Domain Adaptation, and Deep Parsing, Text, Speech, and Language Technology 43, chapter 8, pages 121–150, Springer, 2011.
- Turbo Parsers: Dependency Parsing by Approximate Variational Inference. André F. T. Martins, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2010), Cambridge, MA, October 2010.
- SEMAFOR: Frame Argument Resolution with Log-Linear Models. Desai Chen, Nathan Schneider, Dipanjan Das, and Noah A. Smith. In Proceedings of the International (ACL) Workshop on Semantic Evaluations (SemEval 2010), Uppsala, Sweden, July 2010.
- Distributed Asynchronous Online Learning for Natural Language Processing. Kevin Gimpel, Dipanjan Das, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2010), Uppsala, Sweden, July 2010.
- Visualizing Topical Quotations Over Time to Understand News Discourse. Nathan Schneider, Rebecca Hwa, Philip Gianfortoni, Dipanjan Das, Michael Heilman, Alan W. Black, Frederick L. Crabbe, and Noah A. Smith. Pittsburgh, PA, July 2010.
- Probabilistic Frame-Semantic Parsing. Dipanjan Das, Nathan Schneider, Desai Chen, and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010.
Also available: extended technical report.
- Tree Edit Models for Recognizing Textual Entailments, Paraphrases, and Answers to Questions. Michael Heilman and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010.
Also available: appendix.
- Paraphrase Identification as Probabilistic Quasi-Synchronous Recognition. Dipanjan Das and Noah A. Smith. In Proceedings of the Joint Conference of the Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing (ACL 2009), pages 468–476, Singapore, August 2009.
- Concise Integer Linear Programming Formulations for Dependency Parsing. André F. T. Martins, Noah A. Smith, and Eric P. Xing. In Proceedings of the Joint Conference of the Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing (ACL 2009), pages 342–350, Singapore, August 2009.
- Cube Summing, Approximate Inference with Non-Local Features, and Dynamic Programming without Semirings. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference of the European Chapter of the Association for Computational Linguistics (EACL 2009), pages 157–166, Athens, Greece, March/April 2009.
- Dynamic Programming Algorithms as Products of Weighted Logic Programs. Shay B. Cohen, Robert J. Simmons, and Noah A. Smith. In Proceedings of the International Conference on Logic Programming (ICLP 2008), Udine, Italy, December 2008.
Also available: extended technical report.
- Stacking Dependency Parsers. André F. T. Martins, Dipanjan Das, Noah A. Smith, and Eric P. Xing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2008), pages 157–166, Waikiki, HI, October 2008.
- Review of Computational Approaches to Morphology and Syntax by Brian Roark and Richard Sproat. Noah A. Smith. Computational Linguistics 34(3):453–457, September 2008.
- Competitive Grammar Writing. Jason Eisner and Noah A. Smith. In Proceedings of the ACL Workshop on Issues in Teaching Computational Linguistics, pages 97–105, Columbus, OH, June 2008.
- SOUR CREAM: Toward Semantic Processing of Recipes. Dan Tasse and Noah A. Smith. Pittsburgh, PA, May 2008.
- Relative Keyboard Input System. Daniel R. Rashid and Noah A. Smith. In Proceedings of the International Conference on Intelligent User Interfaces (IUI 2008), pages 397–400, Canary Islands, Spain, January 2008.
- Joint Morphological and Syntactic Disambiguation. Shay B. Cohen and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007), pages 208–217, Prague, Czech Republic, June 2007.
- Vine Parsing and Minimum Risk Reranking for Speed and Precision. Markus Dreyer, David A. Smith, and Noah A. Smith. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL 2006), pages 201–205, New York, NY, June 2006.
- Compiling Comp Ling: Practical Weighted Dynamic Programming and the Dyna Language. Jason Eisner, Eric Goldlust, and Noah A. Smith. In Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (EMNLP 2005), pages 281–290, Vancouver, BC, October 2005.
- Context-Based Morphological Disambiguation with Random Fields. Noah A. Smith, David A. Smith, and Roy W. Tromble. In Proceedings of the Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (EMNLP 2005), pages 475–482, Vancouver, BC, October 2005.
- Parsing with Soft and Hard Constraints on Dependency Length. Jason Eisner and Noah A. Smith. In Proceedings of the International Workshop on Parsing Technologies (IWPT 2005), pages 30–41, Vancouver, BC, October 2005.
- Dyna: A Declarative Language for Implementing Dynamic Programs. Jason Eisner, Eric Goldlust, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2004), pages 218–221, Barcelona, Spain, July 2004.
- Bilingual Parsing with Factored Estimation: Using English to Parse Korean. David A. Smith and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), pages 49–56, Barcelona, Spain, July 2004.
Transformations on text: question answering, question generation, summarization, and compression
- Extractive Summarization by Maximizing Semantic Volume. Dani Yogatama, Fei Liu, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2015), Lisbon, Portugal, September 2015.
- Toward Abstractive Summarization Using Semantic Representations. Fei Liu, Jeffrey Flanigan, Sam Thomson, Norman Sadeh, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2015), Denver, CO, June 2015.
- A Step Towards Usable Privacy Policy: Automatic Alignment of Privacy Statements. Fei Liu, Rohan Ramanath, Norman Sadeh, and Noah A. Smith. In Proceedings of the International Conference on Computational Linguistics (COLING 2014), Dublin, Ireland, August 2014.
- Unsupervised Alignment of Privacy Policies using Hidden Markov Models. Rohan Ramanath, Fei Liu, Norman Sadeh, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014.
Also available: appendix.
- New Alignment Methods for Discriminative Summarization. David Bamman and Noah A. Smith. May 2013.
- Automatic Categorization of Privacy Policies: A Pilot Study. Waleed Ammar, Shomir Wilson, Norman Sadeh, and Noah A. Smith. Pittsburgh, PA, December 2012.
- Good Question! Statistical Ranking for Question Generation. Michael Heilman and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010.
Also available: extended technical report.
- Rating Computer-Generated Questions with Mechanical Turk. Michael Heilman and Noah A. Smith. In Proceedings of the NAACL-HLT Workshop on Creating Speech and Language Data With Mechanical Turk, Los Angeles, CA, June 2010.
- Extracting Simplified Statements for Factual Question Generation. Michael Heilman and Noah A. Smith. In Proceedings of the AIED Workshop on Question Generation, Pittsburgh, PA, June 2010.
- Leveraging Structural Relations for Fluent Compressions at Multiple Compression Rates. Sourish Chaudhuri, Naman K. Gupta, Noah A. Smith, and Carolyn P. Rosé. In Proceedings of the Joint Conference of the Annual Meeting of the Association for Computational Linguistics and the International Joint Conference on Natural Language Processing, companion volume (ACL 2009), pages 101–104, Singapore, August 2009.
- Ranking Automatically Generated Questions as a Shared Task. Michael Heilman and Noah A. Smith. In Proceedings of the AIED Workshop on Question Generation, Brighton, UK, July 2009.
- Question Generation via Overgenerating Transformations and Ranking. Michael Heilman and Noah A. Smith. Pittsburgh, PA, June 2009.
- Summarization with a Joint Model for Sentence Extraction and Compression. André F. T. Martins and Noah A. Smith. In Proceedings of the NAACL-HLT Workshop on Integer Linear Programming for Natural Language Processing, Boulder, CO, June 2009.
- Question Generation as a Competitive Undergraduate Course Project. Noah A. Smith, Michael Heilman, and Rebecca Hwa. In Proceedings of the NSF Workshop on the Question Generation Shared Task and Evaluation Challenge, Arlington, VA, September 2008.
- What is the Jeopardy Model? A Quasi-Synchronous Grammar for QA. Mengqiu Wang, Noah A. Smith, and Teruko Mitamura. In Proceedings of the Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL 2007), pages 22–32, Prague, Czech Republic, June 2007.
Machine translation and parallel corpora
- Phrase Dependency Machine Translation with Quasi-Synchronous Tree-to-Tree Features. Kevin Gimpel and Noah A. Smith. Computational Linguistics 40(2), June 2014.
- Translating into Morphologically Rich Languages with Synthetic Phrases. Victor Chahuneau, Eva Schlinger, Chris Dyer, and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2013), Seattle, WA, October 2013.
- A Simple, Fast, and Effective Reparameterization of IBM Model 2. Chris Dyer, Victor Chahuneau, and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2013), Atlanta, GA, June 2013.
- pycdec: A Python Interface to cdec. Victor Chahuneau, Noah A. Smith, and Chris Dyer. Prague Bulletin of Mathematical Linguistics 98:51–61, October 2012.
- Structured Ramp Loss Minimization for Machine Translation. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2012), Montréal, Québec, June 2012.
- The CMU-Oxford Translation System for the NIST Open Machine Translation 2012 Evaluation. Chris Dyer, Noah A. Smith, Graham Morehead, Phil Blunsom, and Abby Levenberg. May 2012.
- The CMU-ARK German-English Translation System. Chris Dyer, Kevin Gimpel, Jonathan H. Clark, and Noah A. Smith. In Proceedings of the EMNLP Workshop on Statistical Machine Translation (SMT 2011), Edinburgh, UK, July 2011.
- Quasi-Synchronous Phrase Dependency Grammars for Machine Translation. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011.
- Generative Models of Monolingual and Bilingual Gappy Patterns. Kevin Gimpel and Noah A. Smith. In Proceedings of the EMNLP Workshop on Statistical Machine Translation (SMT 2011), Edinburgh, UK, July 2011.
- Better Hypothesis Testing for Statistical Machine Translation: Controlling for Optimizer Instability. Jonathan H. Clark, Chris Dyer, Alon Lavie, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2011), Portland, OR, June 2011.
- Unsupervised Word Alignment with Arbitrary Features. Chris Dyer, Jonathan H. Clark, Alon Lavie, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2011), Portland, OR, June 2011.
- Nonparametric Word Segmentation for Machine Translation. ThuyLinh Nguyen, Stephan Vogel, and Noah A. Smith. In Proceedings of the International Conference on Computational Linguistics (COLING 2010), Beijing, China, August 2010.
- Feature-Rich Translation by Quasi-Synchronous Lattice Parsing. Kevin Gimpel and Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2009), pages 219–228, Singapore, August 2009.
- Preference Grammars: Softening Syntactic Constraints to Improve Statistical Machine Translation. Ashish Venugopal, Andreas Zollmann, Noah A. Smith, and Stephan Vogel. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2009), pages 236–244, Boulder, CO, May/June 2009.
- Wider Pipelines: N-Best Alignments and Parses in MT Training. Ashish Venugopal, Andreas Zollmann, Noah A. Smith, and Stephan Vogel. In Proceedings of the Conference of the Association for Machine Translation in the Americas (AMTA 2008), Waikiki, HI, October 2008.
- Rich Source-Side Context for Statistical Machine Translation. Kevin Gimpel and Noah A. Smith. In Proceedings of the ACL Workshop on Statistical Machine Translation (SMT 2008), pages 9–17, Columbus, OH, June 2008.
- The Web as a Parallel Corpus. Philip Resnik and Noah A. Smith. Computational Linguistics 29(3):349–380, September 2003.
- From Words to Corpora: Recognizing Translation. Noah A. Smith. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2002), pages 95–102, Philadelphia, PA, July 2002.
- Detection of Translational Equivalence. Noah A. Smith. Technical report 4253, Department of Computer Science, University of Maryland College Park, College Park, MD, May 2001.
- Cairo: An Alignment Visualization Tool. Noah A. Smith and Michael E. Jahr. In Proceedings of the Language Resources and Evaluation Conference (LREC 2000), pages 549–552, Athens, Greece, May/June 2000.
- Statistical Machine Translation. Yaser Al-Onaizan, Jan Curin, Michael Jahr, Kevin Knight, John Lafferty, I. Dan Melamed, Noah A. Smith, Franz-Josef Och, David Purdy, and David Yarowsky. Technical report, CLSP Research Notes 42, Johns Hopkins University, Baltimore, MD, 1999.
Core machine learning
- Linguistic Structured Sparsity in Text Categorization. Dani Yogatama and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2014), Baltimore, MD, June 2014.
Also available: talk slides.
- Making the Most of Bag of Words: Sentence Regularization with Alternating Direction Method of Multipliers. Dani Yogatama and Noah A. Smith. In Proceedings of the International Conference on Machine Learning (ICML 2014), Beijing, China, June 2014.
- Structured Sparsity in Structured Prediction. André F. T. Martins, Noah A. Smith, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2011), Edinburgh, UK, July 2011.
- An Augmented Lagrangian Approach to Constrained MAP Inference. André F. T. Martins, Pedro M. Q. Aguiar, Mário A. T. Figueiredo, Noah A. Smith, and Eric P. Xing. In Proceedings of the International Conference on Machine Learning (ICML 2011), Bellevue, WA, June/July 2011.
- Online Learning of Structured Predictors with Multiple Kernels. André F. T. Martins, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS 2011), Fort Lauderdale, FL, April 2011.
- Online Multiple Kernel Learning for Structured Prediction. André F. T. Martins, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the NIPS Workshop on New Directions in Multiple Kernel Learning, Whistler, BC, December 2010.
- Augmenting Dual Decomposition for MAP Inference. André F. T. Martins, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. In Proceedings of the International Workshop on Optimization for Machine Learning (OPT 2010), Whistler, BC, December 2010.
- Softmax-Margin CRFs: Training Log-Linear Models with Cost Functions. Kevin Gimpel and Noah A. Smith. In Proceedings of the North American Chapter of the Association for Computational Linguistics Human Language Technologies Conference (NAACL 2010), Los Angeles, CA, June 2010.
Also available: extended technical report.
- Softmax-Margin Training for Structured Log-Linear Models. Kevin Gimpel and Noah A. Smith. Pittsburgh, PA, June 2010.
- Aggressive Online Learning of Structured Classifiers. André F. T. Martins, Kevin Gimpel, Noah A. Smith, Eric P. Xing, Pedro M. Q. Aguiar, and Mário A. T. Figueiredo. Pittsburgh, PA, June 2010.
- Polyhedral Outer Approximations with Application to Natural Language Parsing. André F. T. Martins, Noah A. Smith, and Eric P. Xing. In Proceedings of the International Conference on Machine Learning (ICML 2009), pages 713–720, Montréal, Québec, June 2009.
- Nonextensive Information Theoretic Kernels on Measures. André F. T. Martins, Noah A. Smith, Eric P. Xing, Mário A. T. Figueiredo, and Pedro M. Q. Aguiar. Journal of Machine Learning Research 10:935–975, April 2009.
- Nonextensive Entropic Kernels. André F. T. Martins, Mário A. T. Figueiredo, Pedro M. Q. Aguiar, Noah A. Smith, and Eric P. Xing. In Proceedings of the International Conference on Machine Learning (ICML 2008), pages 640–647, Helsinki, Finland, July 2008.
Linguistics and linguistic resources
- A Corpus and Model Integrating Multiword Expressions and Supersenses. Nathan Schneider and Noah A. Smith. In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2015), Denver, CO, June 2015.
- Simplified Dependency Annotations with GFL-Web. Michael T. Mordowanec, Nathan Schneider, Chris Dyer, and Noah A. Smith. In Proceedings of the Annual Meeting of the Association for Computational Linguistics, companion volume (ACL 2014 demonstration track), Baltimore, MD, June 2014.
- Comprehensive Annotation of Multiword Expressions in a Social Web Corpus. Nathan Schneider, Spencer Onuffer, Nora Kazour, Emily Danchik, Michael T. Mordowanec, Henrietta Conrad, and Noah A. Smith. In Proceedings of the Language Resources and Evaluation Conference (LREC 2014), Reykjavik, Iceland, May 2014.
- An Adjective Analysis. Noah A. Smith. January 2002.
- Ellipsis Happens, and Deletion is How. Noah A. Smith. In Andrea Gualmini, Soo-Min Hong, and Mitsue Motomura (eds.), University of Maryland Working Papers in Linguistics, pages 176–191, Department of Linguistics, University of Maryland, 2001.