Noah Smith designs algorithms for automated analysis of human language. He often exploits the web to this end, including mining the web for translations (Resnik and Smith, 2003), measuring public opinion from social messages (O'Connor et al., 2010), and inferring geographic linguistic variation (Eisenstein et al., 2010).
نوح سميث (Noah Smith)
Photo by Karen Meyers.
Smith has also contributed algorithms tackling the core problems of natural language processing: parsing sentences into syntactic representations (Eisner et al., 2005; Martins et al., 2009) and semantic representations (Das et al., 2010; Flanigan et al., 2014), as well as cross-cutting techniques for unsupervised language learning (Smith and Eisner, 2005; Cohen and Smith, 2009). His 2011 book, Linguistic Structure Prediction, synthesizes many statistical modeling techniques for language.
Such methods advance applications for automatic translation (Al-Onaizan et al., 1999; Gimpel and Smith, 2011), empirical work in the social sciences (Kogan et al., 2009; Yano et al., 2009; Sim et al., 2013) and humanities (Bamman et al., 2014), education (Heilman and Smith, 2010), and other next-generation language technologies.
Smith is associate professor of Computer Science & Engineering at the University of Washington.
Formerly, he was assistant professor (2006–11), Finmeccanica associate professor (2011–14), then tenured associate professor (2014–15) at the Language Technologies Institute and the Machine Learning Department in the School of Computer Science at CMU.
Before that, he was a Hertz Foundation Fellow at Johns Hopkins University, where he completed his Ph.D. in 2006. He is a clarinetist, tanguero, and swimmer.
Active courses at CMU:
- Sparsity in NLP: a tutorial on models, algorithms, and applications of structured sparsity in NLP
- NLP Demystified (June 28, 2013): brief tutorial on three common paradigms for solving problems in NLP, presented at the NSF SoCS PI meeting
- Probability and Structure in NLP (November 2014): an invited tutorial course at the University of Heidelberg
(earlier version from July–August 2012, presented at the International Summer School in Language and Speech Technologies)
(earlier version from May 2011, with Shay Cohen at IBM's T. J. Watson Research Center)
- Sequence Models (July 2011, July 2012, July 2013): a three-hour lecture at the Lisbon Machine Learning Summer School (a video of the lecture is available)
- Structured Prediction for NLP (ICML, June 14, 2009): a tutorial on statistical models for linguistic structure
- Probabilistic Graphical Models: advanced graduate course in machine learning, taught Fall 2010
- Text-Driven Forecasting: seminar-project hybrid course for graduate students, taught Fall 2009
- Language and Statistics II: advanced statistical NLP for graduate students, taught Fall 2006, Fall 2007, Fall 2008, and Fall 2009
- Empirical Research Methods in Computer Science (JHU, Fall 2005)
- Computational Genomics: Sequence Modeling (JHU, Fall 2004)
- Hidden Markov Models: All the Glorious Gory Details (October 2004)
- Log-Linear Models (December 2004)
- Predicting English (JHU CLSP summer workshop lab with Jason Eisner, Summers 2002 and 2003, taught by others since); read the paper
Research in NLP (Natural Language Processing)
How can computer programs intelligently process text data? My research brings together linguistic abstractions, statistical reasoning, and computational formalisms to develop general NLP methods and models. The results are used in software applications (e.g., machine translation, information extraction, text mining, question answering, and text-driven forecasting) and also serve scientific discovery wherever text serves as data (e.g., sociolinguistics, political science, and
Some research activities and events I have helped organize include:
Curriculum vitae highlights
See also: biographical blurbs and long C.V.
- 2015: to join UW faculty
- 2006: joined CMU faculty (promoted in 2011, tenured in 2014)
- 2006: Ph.D. in Computer Science, JHU, affiliated with the CLSP
- 2004: interned at Microsoft Research
- 2002: junior research scientist at NYU
- 2001–6: Hertz Foundation Fellowship
- 2001: B.A. in Linguistics and B.S. in Computer Science, U. Maryland
- 2000: visiting student, U. Edinburgh
- 1999: participant, JHU CLSP summer workshop
- 1997–2001: Banneker/Key Scholarship, U. Maryland
- 1996: performer, North Sea and Montreux Jazz Festivals, with the Glenelg Jazz Ensemble
- 1992: visiting student, Western Maryland College (now McDaniel College)