David Bamman
School of Computer Science
Language Technologies Institute
Carnegie Mellon University
School of Information
University of California, Berkeley
email: dbamman at cs.cmu.edu
I'm a PhD student in the School of Computer Science at CMU, advised by Noah Smith. My research applies natural language processing and machine learning to empirical questions in the humanities and social sciences. I work on NLP centered around people. Before CMU, I was a senior researcher in computational linguistics at the Perseus Project at Tufts University, where I developed the Ancient Greek and Latin Dependency Treebanks. In fall 2015, I'll be joining UC Berkeley as an assistant professor in the School of Information.
- Winner, 2014 Alan J. Perlis Graduate Student Teaching Award
- Spring 2014: TA, Natural Language Processing (11-411/611)
- Fall 2013: Co-instructor, Digital Literary and Cultural Studies (76-429/829)
- Bamman, David, Adam Anderson, and Noah Smith, "Inferring Social Rank in an Old Assyrian Trade Network," Digital Humanities (2013) [ArXiv]
- Schneider, Nathan, Brendan O'Connor, Naomi Saphra, David Bamman, Manaal Faruqui, Jason Baldridge, Noah A. Smith, and Chris Dyer, "A Framework for (Under)specifying Dependency Syntax without Overloading Annotators," In Proceedings of the ACL Linguistic Annotation Workshop (LAW 2013), Sofia, Bulgaria, August 2013. [Extended version]
- O'Connor, Brendan, David Bamman and Noah A. Smith, "Computational Text Analysis for Social Science: Model Assumptions and Complexity," NIPS Workshop on Computational Social Science and the Wisdom of Crowds (2011). [pdf] [bib]
- Bamman, David, and Gregory Crane, "Measuring Historical Word Sense Variation," in: Proceedings of the 11th ACM/IEEE Joint Conference on Digital Libraries (JCDL 2011). Runner up, Best Paper Award. [pdf] [bib]
- Bamman, David, and Gregory Crane, "The Ancient Greek and Latin Dependency Treebanks," in: Caroline Sporleder, Antal van den Bosch and Kalliopi Zervanou (eds.), Language Technology for Cultural Heritage (Springer, 2011). [pdf] [bib]
- Bamman, David, "Mapping the Demographics of American English with Twitter," Language Log, May 18, 2010. [html]
- Bamman, David, Alison Babeu, and Gregory Crane, "Transferring Structural Markup Across Translations Using Multilingual Alignment and Projection," in: Proceedings of the 10th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL 2010). Winner, Best Paper Award. [pdf] [bib]
- Bamman, David, Francesco Mambrini and Gregory Crane, "An Ownership Model of Annotation: The Ancient Greek Dependency Treebank," in: Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories (TLT8) (Milan, Italy: 2009). [pdf] [bib]
- Bamman, David, Marco Passarotti and Gregory Crane, "A Case Study in Treebank Collaboration and Comparison: Accusativus cum Infinitivo and Subordination in Latin," Prague Bulletin of Mathematical Linguistics 90 (2008). [pdf] [bib]
11K Latin Books. 11,261 OCR'd Latin texts from the Internet Archive (1.38B words), along with associated metadata detailing the dates of composition.
CMU Book Summary Dataset. 16,559 book plot summaries + metadata.
CMU Movie Summary Dataset. 42,306 movie plot summaries + metadata.
Twitter14K Dataset. Aggregated word counts from 14,464 Twitter users (9.2M tweets).
- Oct. 24, 2014, 1-5pm. "Tutorial: Machine Learning and the Computational Humanities," Digital Humanities and Computer Science Colloquium (Northwestern University)