My research focuses on cross-lingual and cross-domain problems in statistical natural language processing, including statistical machine translation of both text and speech, computational phonology and morphology, and distributional and lexical semantics.
Prior to joining CMU, I completed my Master's thesis on Hebrew multiword expressions at the Department of Computer Science, University of Haifa, where I was fortunate to work with Shuly Wintner.
Here are my CV and Google Scholar page.
*Update* In 2016, I will join the Stanford NLP Group for one year as a postdoc with Dan Jurafsky.
In 2017, I'll join the Language Technologies Institute at CMU as a tenure-track assistant professor.
Correlation-based Intrinsic Evaluation of Word Vector Representations. In RepEval'16. PDF CODE
Problems With Evaluation of Word Embeddings Using Word Similarity Tasks. In RepEval'16. PDF
Polyglot Neural Language Models: Case Study in Cross-Lingual Phonetic Representation Learning. In Proc. NAACL'16. PDF
Morphological Inflection Generation Using Character Sequence to Sequence Learning. In Proc. NAACL'16. PDF
Cross-Lingual Bridges with Models of Lexical Borrowing. Journal of Artificial Intelligence Research (JAIR). PDF
Not All Contexts Are Created Equal: Better Word Representations with Variable Attention. In Proc. EMNLP'15. PDF
Lexicon Stratification for Translating Out-of-Vocabulary Words. In Proc. ACL'15. PDF
Sparse Overcomplete Word Vector Representations. In Proc. ACL'15. PDF
A Bottom Up Approach to Category Mapping and Meaning Change. In Proc. NetWordS'15. PDF
Constraint-Based Models of Lexical Borrowing. In Proc. NAACL'15. PDF
Identification of Multi-word Expressions by Combining Multiple Linguistic Information Sources. Computational Linguistics, 40(2):449-468, 2014. PDF
Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation. In Proc. EACL'14. PDF
Automatic Classification of Communicative Functions of Definiteness. In Proc. COLING'14. PDF
The CMU Machine Translation Systems at WMT 2014. In Proc. WMT'14. PDF
Generating English Determiners in Phrase-Based Translation with Synthetic Translation Options. In Proc. WMT'13. PDF
The CMU Machine Translation Systems at WMT 2013: Syntax, Synthetic Translation Options, and Pseudo-References. In Proc. WMT'13. PDF
Identifying the L1 of non-native writers: the CMU-Haifa system. In Proc. of the 8th Workshop on Innovative Use of NLP for Building Educational Applications, 2013. PDF
Cross-Lingual Metaphor Detection Using Common Semantic Features. In Proc. Meta4NLP Workshop, 2013. PDF
Identification and Modeling of Word Fragments in Spontaneous Speech. In Proc. ICASSP'13. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. Natural Language Engineering, 18(4):549-573, 2012. PDF
Identification of Multi-word Expressions by Combining Multiple Linguistic Information Sources. In Proc. EMNLP'11. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. University of Haifa M.Sc. thesis, September 2010. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. In Proc. COLING'10. PDF
Automatic Acquisition of Parallel Corpora from Websites with Dynamic Content. In Proc. LREC'10. PDF