My research focuses on cross-lingual methods for improving low-resource NLP, on statistical machine translation of both text and speech, and on other cross-lingual and cross-domain statistical NLP tasks.
Prior to joining CMU, I completed my Master's thesis on Hebrew multiword expressions at the Department of Computer Science, University of Haifa, where I was fortunate to work with Shuly Wintner.
Here are my (outdated) CV and my Google Scholar page.
Evaluation of Word Vector Representations by Subspace Alignment. In Proc. EMNLP'15. PDF CODE
Not All Contexts Are Created Equal: Better Word Representations with Variable Attention. In Proc. EMNLP'15. PDF
Lexicon Stratification for Translating Out-of-Vocabulary Words. In Proc. ACL'15. PDF
Sparse Overcomplete Word Vector Representations. In Proc. ACL'15. PDF
A Bottom Up Approach to Category Mapping and Meaning Change. In Proc. NetWordS'15. PDF
Constraint-Based Models of Lexical Borrowing. In Proc. NAACL'15. PDF
Identification of Multi-word Expressions by Combining Multiple Linguistic Information Sources. Computational Linguistics, 40(2):449-468, 2014. PDF
Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation. In Proc. EACL'14. PDF
Automatic Classification of Communicative Functions of Definiteness. In Proc. COLING'14. PDF
The CMU Machine Translation Systems at WMT 2014. In Proc. WMT'14. PDF
Generating English Determiners in Phrase-Based Translation with Synthetic Translation Options. In Proc. WMT'13. PDF
The CMU Machine Translation Systems at WMT 2013: Syntax, Synthetic Translation Options, and Pseudo-References. In Proc. WMT'13. PDF
Identifying the L1 of Non-native Writers: the CMU-Haifa System. In Proc. of the 8th Workshop on Innovative Use of NLP for Building Educational Applications, 2013. PDF
Cross-Lingual Metaphor Detection Using Common Semantic Features. In Proc. Meta4NLP Workshop, 2013. PDF
Identification and Modeling of Word Fragments in Spontaneous Speech. In Proc. ICASSP'13. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. In Natural Language Engineering 18(4):549-573, 2012. PDF
Identification of Multi-word Expressions by Combining Multiple Linguistic Information Sources. In Proc. EMNLP'11. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. University of Haifa M.Sc. thesis, September 2010. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. In Proc. COLING'10. PDF
Automatic Acquisition of Parallel Corpora from Websites with Dynamic Content. In Proc. LREC'10. PDF