During my PhD at the Language Technologies Institute, School of Computer Science, Carnegie Mellon University, I worked on advancing machine learning techniques for cross-lingual and cross-domain problems in natural language processing, focusing on computational phonology and morphology, distributional and lexical semantics, and statistical machine translation of both text and speech.
Here are my CV and Google Scholar page.
*** In 2017, I'll join the Language Technologies Institute at CMU as an assistant professor. I'll be hiring a PhD student, so please email me if you are interested!
Correlation-based Intrinsic Evaluation of Word Vector Representations. In RepEval'16. PDF CODE
Problems With Evaluation of Word Embeddings Using Word Similarity Tasks. In RepEval'16. PDF
Polyglot Neural Language Models: Case Study in Cross-Lingual Phonetic Representation Learning. In Proc. NAACL'16. PDF
Morphological Inflection Generation Using Character Sequence to Sequence Learning. In Proc. NAACL'16. PDF
Massively Multilingual Word Embeddings. arXiv preprint. PDF
Cross-Lingual Bridges with Models of Lexical Borrowing. Journal of Artificial Intelligence Research (JAIR). PDF
Not All Contexts Are Created Equal: Better Word Representations with Variable Attention. In Proc. EMNLP'15. PDF
Lexicon Stratification for Translating Out-of-Vocabulary Words. In Proc. ACL'15. PDF
Sparse Overcomplete Word Vector Representations. In Proc. ACL'15. PDF
A Bottom Up Approach to Category Mapping and Meaning Change. In Proc. NetWordS'15. PDF
Constraint-Based Models of Lexical Borrowing. In Proc. NAACL'15. PDF
Identification of Multi-word Expressions by Combining Multiple Linguistic Information Sources. Computational Linguistics, 40(2):449-468, 2014. PDF
Augmenting Translation Models with Simulated Acoustic Confusions for Improved Spoken Language Translation. In Proc. EACL'14. PDF
Automatic Classification of Communicative Functions of Definiteness. In Proc. COLING'14. PDF
The CMU Machine Translation Systems at WMT 2014. In Proc. WMT'14. PDF
Generating English Determiners in Phrase-Based Translation with Synthetic Translation Options. In Proc. WMT'13. PDF
The CMU Machine Translation Systems at WMT 2013: Syntax, Synthetic Translation Options, and Pseudo-References. In Proc. WMT'13. PDF
Identifying the L1 of non-native writers: the CMU-Haifa system. In Proc. of the 8th Workshop on Innovative Use of NLP for Building Educational Applications, 2013. PDF
Cross-Lingual Metaphor Detection Using Common Semantic Features. In Proc. Meta4NLP Workshop, 2013. PDF
Identification and Modeling of Word Fragments in Spontaneous Speech. In Proc. ICASSP'13. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. Natural Language Engineering, 18(4):549-573, 2012. PDF
Identification of Multi-word Expressions by Combining Multiple Linguistic Information Sources. In Proc. EMNLP'11. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. University of Haifa M.Sc. thesis, September 2010. PDF
Extraction of Multi-word Expressions from Small Parallel Corpora. In Proc. COLING'10. PDF
Automatic Acquisition of Parallel Corpora from Websites with Dynamic Content. In Proc. LREC'10. PDF