I am interested in the intersection of Natural Language Processing, Information Retrieval, and Machine Learning. My research experience includes the following topics:
grounded language learning, learning semantic relations, topic models, mining software repositories and software-focused corpora, bootstrapping on biomedical ontologies, knowledge base population, bootstrap learning and semantic drift, seed set refinement, text alignment with Hidden Markov Models, social media analysis, and computational biology.
Before coming to CMU, I received my M.Sc. and B.Sc. degrees in the Computer Science and Computational Biology program at the School of Computer Science and Engineering of The Hebrew University of Jerusalem. During that time, I did research at the Furman Lab (Dept. of Molecular Genetics and Biotechnology), where my adviser was Prof. Ora Schueler-Furman. In this group, we used computational methods to understand protein-protein interactions from a structural bioinformatics perspective. More specifically, we predicted the structural changes that proteins undergo during docking.
Apart from doing research, I have had the chance to gain some great industry experience working at IBM, Facebook, and Google.
You can find my full research and work history in my CV.
Among the things I enjoy most are hiking and traveling around the world. So far, one of my favorite hiking locations has been New Zealand, and I plan to return! I have an awesome husband, who was also a CSD PhD student at CMU.
KB-LDA: Jointly Learning a Knowledge Base of Hierarchy, Relations, and Facts
Dana Movshovitz-Attias and William Cohen, 2015, Association for Computational Linguistics (ACL)
[pdf] [data] [ACL presentation] [bibtex]
Discovering Subsumption Relationships for Web-Based Ontologies
Dana Movshovitz-Attias, Steven Euijong Whang, Natalya Noy, and Alon Halevy, 2015, Proc. 18th International Workshop on the Web and Databases (WebDB) at ACM SIGMOD
Winner of the WebDB Best Paper Award.
[pdf] [WebDB presentation] [bibtex]
Natural Language Models for Predicting Programming Comments
Dana Movshovitz-Attias and William Cohen, 2013, Association for Computational Linguistics (ACL)
[pdf] [corpus] [code (as Eclipse plugin)] [ACL presentation] [bibtex]
Analysis of the Reputation System and User Contributions on a Question Answering Website: StackOverflow
Dana Movshovitz-Attias*, Yair Movshovitz-Attias*, Peter Steenkiste and Christos Faloutsos, 2013, ASONAM
Alignment-HMM-based Extraction of Abbreviations from Biomedical Text
Dana Movshovitz-Attias and William Cohen, 2012, BioNLP in NAACL
[pdf] [github code (within the second-string package)] [code description and downloadable data] [abbreviations extracted from PubMed] [BioNLP presentation] [bibtex]
Detection of Peptide-Binding Sites on Protein Surfaces: The First Step Towards the Modeling and Targeting of Peptide-Mediated Interactions
Assaf Lavi, Chi Ho Ngan, Dana Movshovitz-Attias, Tanggis Bohnuud, Christine Yueh, Dmitri Beglov, Ora Schueler-Furman, Dima Kozakov, 2013, Proteins: Structure, Function and Bioinformatics
Can Self-Inhibitory Peptides Be Derived from the Interfaces of Globular Protein-Protein Interactions?
Nir London, Barak Raveh, Dana Movshovitz-Attias and Ora Schueler-Furman, 2010, Proteins: Structure, Function and Bioinformatics
On The Use of Structural Templates for High-Resolution Docking
Dana Movshovitz-Attias, Nir London and Ora Schueler-Furman, 2010, Proteins: Structure, Function and Bioinformatics
[pdf] [pubmed] [bibtex]
Poster presented at the 11th Israeli Bioinformatics Symposium at Tel-Aviv University, Israel, 4/2008.
The Structural Basis of Peptide-Protein Binding Strategies
Nir London, Dana Movshovitz-Attias and Ora Schueler-Furman, 2010, Structure
[pdf] [pubmed] [bibtex]
Poster presented at the 12th Israeli Bioinformatics Symposium at Weizmann Institute, Israel, 4/2009.
Code, tools, and research-related data.
If you have questions about this content, or if there is other data you would like to use, please contact me at: dma [at] cs.cmu.edu
Dataset based on StackOverflow that was used to train the KB-LDA model from our ACL2015 paper.
This is an abbreviation extractor based on a Hidden Markov Model. With this code you can extract abbreviations and their definitions from a text corpus. The Abbreviation Alignment HMM code is a part of the second-string open source package.
com.wcohen.ss.AbbreviationAlignment is an implementation of the abbreviation alignment metric.
com.wcohen.ss.expt.ExtractAbbreviations is a utility for extracting abbreviations from a text corpus using our method.
Courses and TA experience