In the world there are truly many kinds of languages that people speak, and each one of them has its own meaning.


Matt Gormley
Associate Teaching Professor
ML Minor/Concentration Director
Machine Learning Department (ML)
School of Computer Science (SCS)
Carnegie Mellon University (CMU)
Affiliate: Language Technologies Institute (LTI)

email: mgormley at cs dot cmu dot edu
office: Gates-Hillman Center (GHC) 8103
phone: 412-268-7205 (office)

Research Interests

Natural language processing: dialogue summarization, multi-document/long-document summarization, NLP for medical text, multilinguality, low-resource languages and domains, syntactic parsing, semantic parsing, grammar induction.

Machine learning: approximate inference and search, unsupervised learning, low-resource learning, approximation-aware learning, autoregressive models, computationally efficient ML.


Papers

2023

  • It's MBR All the Way Down: Modern Generation Techniques Through the Lens of Minimum Bayes Risk.
    Amanda Bertsch, Alex Xie, Graham Neubig, Matthew R. Gormley.
    Big Picture Workshop at EMNLP. 2023.
    [paper]
  • Unlimiformer: Long-Range Transformers with Unlimited Length Input.
    Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley.
    NeurIPS. 2023.
    [paper]
  • MDACE: MIMIC Documents Annotated with Code Evidence.
    Hua Cheng, Rana Jafari, April Russell, Russell Klopfer, Edmond Lu, Benjamin Striner, Matthew Gormley.
    ACL. 2023.
    [paper]
  • SummQA at MEDIQA-Chat 2023: In-Context Learning with GPT-4 for Medical Summarization.
    Yash Mathur, Sanketh Rangreji, Raghav Kapoor, Medha Palavalli, Amanda Bertsch, Matthew R. Gormley.
    Clinical NLP Workshop at ACL 2023. 2023.
    [paper]

2022

  • He Said, She Said: Style Transfer for Shifting the Perspective of Dialogues.
    Amanda Bertsch, Graham Neubig, Matthew R. Gormley.
    Findings of EMNLP. 2022.
    [paper] [bibtex]
    @inproceedings{bertsch_he_2022,
            title = {He {Said}, {She} {Said}: {Style} {Transfer} for {Shifting} the {Perspective} of {Dialogues}},
            url = {http://arxiv.org/abs/2210.15462},
            author = {Bertsch, Amanda and Neubig, Graham and Gormley, Matthew R.},
            booktitle = {Findings of EMNLP},
            year = {2022},
    }
  • Revisiting text decomposition methods for NLI-based factuality scoring of summaries.
    John Glover, Federico Fancellu, Vasudevan Jagannathan, Matthew R. Gormley, Thomas Schaaf.
    GEM Workshop at EMNLP. 2022.
    [paper] [bibtex]
    @inproceedings{glover_revisiting_2022,
            title = {Revisiting text decomposition methods for {NLI}-based factuality scoring of summaries},
            url = {http://arxiv.org/abs/2211.16853},
            author = {Glover, John and Fancellu, Federico and Jagannathan, Vasudevan and Gormley, Matthew R. and Schaaf, Thomas},
            booktitle = {GEM Workshop at EMNLP},
            year = {2022},
    }
  • AdaFocal: Calibration-aware Adaptive Focal Loss.
    Arindam Ghosh, Thomas Schaaf, Matthew R. Gormley.
    NeurIPS. 2022.
    [paper] [bibtex]
    @inproceedings{ghosh_adafocal_2022,
            title = {{AdaFocal}: {Calibration}-aware {Adaptive} {Focal} {Loss}},
            url = {http://arxiv.org/abs/2211.11838},
            author = {Ghosh, Arindam and Schaaf, Thomas and Gormley, Matthew R.},
            booktitle = {NeurIPS},
            year = {2022},
    }
  • On Efficiently Acquiring Annotations for Multilingual Models.
    Joel Moniz, Barun Patra, Matthew R. Gormley.
    ACL. 2022.
    [paper] [bibtex]
    @inproceedings{moniz_efficiently_2022,
            address = {Dublin, Ireland},
            title = {On {Efficiently} {Acquiring} {Annotations} for {Multilingual} {Models}},
            url = {https://aclanthology.org/2022.acl-short.9},
            doi = {10.18653/v1/2022.acl-short.9},
            booktitle = {Proceedings of the 60th {Annual} {Meeting} of the {Association} for {Computational} {Linguistics} ({Volume} 2: {Short} {Papers})},
            publisher = {Association for Computational Linguistics},
            author = {Moniz, Joel and Patra, Barun and Gormley, Matthew},
            month = may,
            year = {2022},
            pages = {69--85},
    }

2021

  • Effective Convolutional Attention Network for Multi-label Clinical Document Classification.
    Yang Liu, Hua Cheng, Russell Klopfer, Matthew R. Gormley, and Thomas Schaaf.
    EMNLP. 2021.
    [paper] [bibtex]
    @inproceedings{liu-etal-2021-effective,
        title = "Effective Convolutional Attention Network for Multi-label Clinical Document Classification",
        author = "Liu, Yang  and
          Cheng, Hua  and
          Klopfer, Russell  and
          Gormley, Matthew R.  and
          Schaaf, Thomas",
        booktitle = "Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing",
        month = nov,
        year = "2021",
        address = "Online and Punta Cana, Dominican Republic",
        publisher = "Association for Computational Linguistics",
        url = "https://aclanthology.org/2021.emnlp-main.481",
        doi = "10.18653/v1/2021.emnlp-main.481",
        pages = "5941--5953",
    }
  • Leveraging Pretrained Models for Automatic Summarization of Doctor-Patient Conversations.
    Longxiang Zhang, Renato Negrinho, Arindam Ghosh, Vasudevan Jagannathan, Hamid Reza Hassanzadeh, Thomas Schaaf, and Matthew R. Gormley.
    Findings of EMNLP. 2021.
    [paper] [bibtex]
    @inproceedings{zhang-etal-2021-leveraging-pretrained,
      title = "Leveraging Pretrained Models for Automatic Summarization of Doctor-Patient Conversations",
      author = "Zhang, Longxiang and Negrinho, Renato and Ghosh, Arindam and Jagannathan, Vasudevan and Hassanzadeh, Hamid Reza and Schaaf, Thomas and Gormley, Matthew R.",
      booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
      month = nov,
      year = "2021",
      address = "Punta Cana, Dominican Republic",
      publisher = "Association for Computational Linguistics",
      url = "https://aclanthology.org/2021.findings-emnlp.313",
      doi = "10.18653/v1/2021.findings-emnlp.313",
      pages = "3693--3712",
    }
  • Comparative Error Analysis in Neural and Finite-state Models for Unsupervised Character-level Transduction.
    Maria Ryskina, Eduard Hovy, Taylor Berg-Kirkpatrick, and Matthew R. Gormley.
    SIGMORPHON Workshop at ACL-IJCNLP. 2021.
    [paper] [bibtex]
    @inproceedings{ryskina_comparative_2021,
        author = {Ryskina, Maria and Hovy, Eduard and Berg-Kirkpatrick, Taylor and Gormley, Matthew R.},
        title = {Comparative Error Analysis in Neural and Finite-state Models for Unsupervised Character-level Transduction},
        booktitle = {SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology},
        year = {2021},
    }
  • Limitations of Autoregressive Models and Their Alternatives.
    Chu-Cheng Lin, Aaron Jaech, Xin Li, Matthew R. Gormley, Jason Eisner.
    NAACL. 2021.
    [paper] [bibtex]
    @inproceedings{lin_limitations_2021,
      author =      {Chu-Cheng Lin and Aaron Jaech and Xin Li and Matthew R.
                    Gormley and Jason Eisner},
      title =       {Limitations of Autoregressive Models and Their
                    Alternatives},
      booktitle =   {Proceedings of {NAACL-HLT}},
      year =        {2021},
    }

2020

  • Training for Gibbs Sampling on Conditional Random Fields with Neural Scoring Factors.
    Sida Gao, Matthew R. Gormley.
    EMNLP. 2020.
    [paper] [code] [bibtex]
    @inproceedings{gao_training_2020,
      address = {Online},
      title = {Training for {Gibbs} {Sampling} on {Conditional} {Random} {Fields} with {Neural} {Scoring} {Factors}},
      url = {https://www.aclweb.org/anthology/2020.emnlp-main.406},
      booktitle = {Proceedings of the 2020 {Conference} on {Empirical} {Methods} in {Natural} {Language} {Processing} ({EMNLP})},
      author = {Gao, Sida and Gormley, Matthew R.},
      year = {2020},
    }
  • An Empirical Investigation of Beam-Aware Training in Supertagging.
    Renato Negrinho, Matthew R. Gormley, Geoff Gordon.
    Findings of EMNLP. 2020.
    [paper] [bibtex]
    @inproceedings{negrinho_empirical_2020,
      title = {An {Empirical} {Investigation} of {Beam}-{Aware} {Training} in {Supertagging}},
      url = {https://www.aclweb.org/anthology/2020.findings-emnlp.406},
      booktitle = {Findings of the {Association} for {Computational} {Linguistics}: {EMNLP} 2020},
      author = {Negrinho, Renato and Gormley, Matthew R. and Gordon, Geoff},
      year = {2020},
    }
  • Phonetic and Visual Priors for Decipherment of Informal Romanization.
    Maria Ryskina, Matthew R. Gormley, and Taylor Berg-Kirkpatrick.
    ACL. 2020.
    [paper] [code+data] [bibtex]
    @inproceedings{ryskina_phonetic_2020,
        author = {Ryskina, Maria and Gormley, Matthew R. and Berg-Kirkpatrick, Taylor},
        title = {Phonetic and {Visual} {Priors} for {Decipherment} of {Informal} {Romanization}},
        booktitle = {Proceedings of {ACL}},
        year = {2020},
    }

2019

  • Towards modular and programmable architecture search.
    Renato Negrinho, Darshan Patil, Nghia Le, Daniel Ferreira, Matthew R. Gormley, Geoffrey Gordon.
    NeurIPS. 2019.
    [paper] [code] [bibtex]
    @inproceedings{negrinho_towards_2019,
        author = {Renato Negrinho and Darshan Patil and Nghia Le and Daniel Ferreira and Matthew R. Gormley and Geoffrey Gordon},
        title = {Towards modular and programmable architecture search},
        booktitle = {Proceedings of {NeurIPS}},
        year = {2019},
    }
  • Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces.
    Barun Patra, Joel Ruben Antony Moniz, Sarthak Garg, Matthew R. Gormley, Graham Neubig.
    ACL. 2019.
    [paper] [bibtex]
    @inproceedings{patra_bilingual_2019,
        author = {Barun Patra and Joel Ruben Antony Moniz and Sarthak Garg and Matthew R. Gormley and Graham Neubig},
        title = {Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces},
        booktitle = {Proceedings of {ACL}},
        year = {2019},
    }
  • Neural Finite-state Transducers: Beyond Rational Relations.
    Chu-Cheng Lin, Hao Zhu, Matthew R. Gormley, Jason Eisner.
    NAACL. 2019.
    [paper] [supplementary] [bibtex]
    @inproceedings{lin_neural_2019,
        author = {Chu-Cheng Lin and Hao Zhu and Matthew R. Gormley and
               Jason Eisner},
        title = {Neural finite-state transducers: Beyond rational relations},
        booktitle = {Proceedings of {NAACL}},
        year = {2019},
    }

2018

  • Learning Beam Search Policies via Imitation Learning.
    Renato Negrinho, Matthew R. Gormley, Geoffrey J. Gordon.
    NeurIPS. 2018.
    [paper] [bibtex]
    @inproceedings{negrinho_learning_2018,
        author = {Negrinho, Renato and Gormley, Matthew R. and Gordon, Geoffrey J.},
        title = {Learning Beam Search Policies via Imitation Learning},
        booktitle = {Proceedings of {NeurIPS}},
        year = {2018},
    }
  • Neural Factor Graph Models for Cross-lingual Morphological Tagging.
    Chaitanya Malaviya, Matthew R. Gormley, Graham Neubig.
    ACL. 2018.
    [paper] [bibtex]
    @inproceedings{malaviya_neural_2018,
        author = {Malaviya, Chaitanya and Gormley, Matthew R. and Neubig, Graham},
        title = {Neural {Factor} {Graph} {Models} for {Cross}-lingual {Morphological} {Tagging}},
        booktitle = {Proceedings of {ACL}},
        year = {2018},
    }

2017

  • Semantic Proto-Role Labeling.
    Adam Teichert, Adam Poliak, Benjamin Van Durme, Matthew R. Gormley.
    AAAI. 2017.
    [paper] [bibtex]
    @inproceedings{teichart_semantic_2017,
        author = {Adam Teichert and Adam Poliak and Benjamin Van Durme and Matthew R. Gormley},
        title = {Semantic Proto-Role Labeling},
        booktitle = {Proceedings of {AAAI} Conference on Artificial Intelligence},
        year = {2017},
    }

2016

  • Embedding Lexical Features via Low-rank Tensors.
    Mo Yu, Mark Dredze, Raman Arora, Matthew R. Gormley.
    NAACL. 2016.
    [paper] [bibtex]
    @inproceedings{yu_embedding_2016,
        author = {Mo Yu and Mark Dredze and Raman Arora and Matthew R. Gormley},
        title = {Embedding Lexical Features via Low-rank Tensors},
        booktitle = {Proceedings of {NAACL}},
        year = {2016},
    }

2015

  • Graphical Models with Structured Factors, Neural Factors, and Approximation-Aware Training.
    Matthew R. Gormley.
    Ph.D. Thesis. Johns Hopkins University. 2015.
    [thesis (official format)] [thesis (single-spaced)] [bibtex]
    @thesis{gormley_graphical_2015,
            location = {Baltimore, {MD}},
            title = {Graphical Models with Structured Factors, Neural Factors, and Approximation-Aware Training},
            institution = {Johns Hopkins University},
            type = {phdthesis},
            author = {Gormley, Matthew R.},
            date = {2015}
    }
  • Improved Relation Extraction with Feature-rich Compositional Embedding Models.
    Matthew R. Gormley*, Mo Yu*, Mark Dredze.
    (*The first two authors contributed equally.)
    EMNLP. 2015.
    [paper+supplement] [slides] [data] [erratum] [bibtex]

    There is a mistake in our description of the ACE 2005 dataset. Section 6.1 states that the number of relations in the training set (bn+nw) with all pairs was 43,518, but the actual number was 43,497. Likewise, Appendix A states the number was 35,990 for the Plank & Moschitti (2013) setting, but the actual number was 34,669.

    Thanks to Thien Nguyen for helping us to find the error.

    @inproceedings{gormley_improved_2015,
        author = {Matthew R. Gormley and Mo Yu and Mark Dredze},
        title = {Improved Relation Extraction with Feature-rich Compositional Embedding Models},
        booktitle = {Proceedings of {EMNLP}},
        year = {2015},
    }
  • Approximation-Aware Dependency Parsing by Belief Propagation.
    Matthew R. Gormley, Mark Dredze, and Jason Eisner.
    TACL. 2015.
    [paper] [slides] [bibtex]
    @article{gormley_approximation-aware_2015,
        author = {Matthew R. Gormley and Mark Dredze and Jason Eisner},
        title = {Approximation-aware Dependency Parsing by Belief Propagation},
        journal = {Transactions of the Association for Computational Linguistics (TACL)},
        year = {2015}
    }
  • A Concrete Chinese NLP Pipeline.
    Nanyun Peng, Francis Ferraro, Mo Yu, Nicholas Andrews, Jay DeYoung, Max Thomas, Matthew R. Gormley, Travis Wolfe, Craig Harman, Benjamin Van Durme, and Mark Dredze.
    NAACL Demonstration Session. 2015.
    [paper] [bibtex]
    @inproceedings{peng_concrete_2015,
        author = {Nanyun Peng and Francis Ferraro and Mo Yu and Nicholas Andrews and Jay DeYoung and Max Thomas and Matthew R. Gormley and Travis Wolfe and Craig Harman and Benjamin Van Durme and Mark Dredze},
        title = {A {Concrete} {Chinese} {NLP} Pipeline},
        booktitle = {Proceedings of the {NAACL} Demonstration Session},
        year = {2015}
    }
  • Combining Word Embeddings and Feature Embeddings for Fine-grained Relation Extraction.
    Mo Yu, Matthew R. Gormley, and Mark Dredze.
    NAACL. 2015.
    [paper] [bibtex]
    @inproceedings{yu_combining_2015,
        author = {Yu, Mo and Gormley, Matthew R. and Dredze, Mark},
        title = {Combining Word Embeddings and Feature Embeddings for Fine-grained Relation Extraction},
        booktitle = {Proceedings of {NAACL}},
        year = {2015}
    }

2014

  • Factor-based Compositional Embedding Models.
    Mo Yu, Matthew R. Gormley, and Mark Dredze.
    The NeurIPS 2014 Learning Semantics Workshop. 2014.
    [paper] [code] [bibtex]
    @inproceedings{yu_factor-based_2014,
        author = {Yu, Mo and Gormley, Matthew R. and Dredze, Mark},
        title = {Factor-based Compositional Embedding Models},
        booktitle = {The {NeurIPS} 2014 Learning Semantics Workshop},
        month = {December},
        year = {2014}
    }
  • Concretely Annotated Corpora.
    Francis Ferraro, Max Thomas, Matthew R. Gormley, Travis Wolfe, Craig Harman, and Benjamin Van Durme.
    The NeurIPS 2014 AKBC Workshop. 2014.
    [paper] [data+code] [bibtex]
    @inproceedings{ferraro_concretely_2014,
        author = {Ferraro, Francis and Thomas, Max and Gormley, Matthew R. and Wolfe, Travis and Harman, Craig and Van Durme, Benjamin},
        title = {Concretely Annotated Corpora},
        booktitle = {The {NeurIPS} 2014 {AKBC} Workshop},
        month = {December},
        year = {2014}
    }
  • Low-Resource Semantic Role Labeling.
    Matthew R. Gormley, Margaret Mitchell, Benjamin Van Durme, Mark Dredze.
    ACL. 2014.
    [paper] [slides (CLSP Seminar)] [code] [bibtex]
    @inproceedings{gormley_low-resource_2014,
        author    = {Gormley, Matthew R. and Mitchell, Margaret and {Van Durme}, Benjamin and Dredze, Mark},
        title     = {Low-Resource Semantic Role Labeling},
        booktitle = {Proceedings of {ACL}},
        month     = {June},
        year      = {2014},
    }

2013

  • Nonconvex Global Optimization for Latent-Variable Models.
    Matthew R. Gormley, Jason Eisner.
    ACL. 2013.
    [paper] [slides (ACL)] [slides (CLSP Seminar)] [bibtex]
    @inproceedings{gormley_nonconvex_2013,
        author    = {Gormley, Matthew R. and Eisner, Jason},
        title     = {Nonconvex Global Optimization for Latent-Variable Models},
        booktitle = {Proceedings of {ACL}},
        month     = {August},
        year      = {2013},
    }
  • Topic Models and Metadata for Visualizing Text Corpora.
    Justin Snyder, Rebecca Knowles, Mark Dredze, Matthew R. Gormley, Travis Wolfe.
    NAACL Demonstration Session. 2013.
    [paper] [bibtex]
    @inproceedings{snyder_topic_2013,
        author    = {Snyder, Justin  and  Knowles, Rebecca  and  Dredze, Mark  and  Gormley, Matthew  and  Wolfe, Travis},
        title     = {Topic Models and Metadata for Visualizing Text Corpora},
        booktitle = {Proceedings of the 2013 {NAACL} {HLT} Demonstration Session},
        month     = {June},
        year      = {2013},
    }

2012

  • Shared Components Topic Models.
    Matthew R. Gormley, Mark Dredze, Benjamin Van Durme, Jason Eisner.
    NAACL. 2012.
    [paper] [slides] [bibtex]
    @inproceedings{gormley_shared_2012,
      author    = {Gormley, Matthew R.  and Dredze, Mark and {Van Durme}, Benjamin and Eisner, Jason},
      title     = {Shared Components Topic Models},
      booktitle = {Proceedings of {NAACL}},
      month     = {June},
      year      = {2012},
    }
  • Annotated Gigaword.
    Courtney Napoles, Matthew Gormley, Benjamin Van Durme.
    AKBC-WEKEX workshop at NAACL. 2012.
    [paper] [code] [bibtex]
    @inproceedings{napoles_annotated_2012,
      author    = {Napoles, Courtney and Gormley, Matthew and {Van Durme}, Benjamin},
      title     = {Annotated Gigaword},
      booktitle = {{AKBC-WEKEX} Workshop at {NAACL} 2012},
      month     = {June},
      year      = {2012},
    }
  • Entity Clustering Across Languages.
    Spence Green, Nicholas Andrews, Matthew R. Gormley, Mark Dredze, Christopher D. Manning.
    NAACL. 2012.
    [paper] [bibtex]
    @inproceedings{green_entity_2012,
      author      = {Spence Green and Nicholas Andrews and Matthew R. Gormley and Mark Dredze and Christopher D. Manning},
      title       = {Entity Clustering Across Languages},
      booktitle   = {Proceedings of {NAACL}},
      month       = {June},
      year        = {2012},
    }

2011

  • Shared Components Topic Models with Application to Selectional Preference.
    Matthew R. Gormley, Mark Dredze, Benjamin Van Durme, Jason Eisner.
    Learning Semantics Workshop at NeurIPS. 2011.
    [extended abstract] [bibtex]
    @inproceedings{gormley_shared_2011,
      author    = {Gormley, Matthew R.  and Dredze, Mark and {Van Durme}, Benjamin and Eisner, Jason},
      title     = {Shared Components Topic Models with Application to Selectional Preference},
      booktitle = {Proceedings of Learning Semantics Workshop at {NeurIPS} 2011},
      month     = {December},
      year      = {2011},
    }
  • Cross-lingual Coreference Resolution: A New Task for Multilingual Comparable Corpora.
    Spence Green, Nicholas Andrews, Matthew R. Gormley, Mark Dredze, Christopher D. Manning.
    Technical Report 6. HLTCOE, Johns Hopkins University. 2011.
    [paper] [bibtex]
    @techreport{green_cross-lingual_2011,
      author      = {Spence Green and Nicholas Andrews and Matthew R. Gormley and Mark Dredze and Christopher D. Manning},
      title       = {Cross-lingual Coreference Resolution: A New Task for Multilingual Comparable Corpora},
      institution = {{HLTCOE}, Johns Hopkins University},
      number      = {6},
      year        = {2011}
    }

2010

  • Non-Expert Correction of Automatically Generated Relation Annotations.
    Matthew R. Gormley, Adam Gerber, Mary Harper, Mark Dredze.
    Proceedings of the NAACL HLT 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk. 2010.
    [paper] [code+data] [bibtex]
    @inproceedings{gormley_non-expert_2010,
      author    = {Gormley, Matthew R.  and  Gerber, Adam  and  Harper, Mary  and  Dredze, Mark},
      title     = {Non-Expert Correction of Automatically Generated Relation Annotations},
      booktitle = {Proceedings of the {NAACL} {HLT} 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk},
      month     = {June},
      year      = {2010},
    }

Tutorials


Teaching

Carnegie Mellon University
Johns Hopkins University

Students

Current Students
Former Students

Education

M.S.E. in Computer Science, Johns Hopkins University, 2009 - 2011.
B.S. in Computer Science with a double major in Cognitive Science, Carnegie Mellon University, 2003 - 2006.

Work Experience

Assistant Teaching Professor, Machine Learning Department, CMU, 2016-present.
Consultant, 3M | M*Modal, 2019-present.
Postdoctoral Researcher, Human Language Technology Center of Excellence, Fall 2015.
Google, Software Engineering Intern, Summer 2012.
Endeca Technologies, Software developer, 2007-2009.
Microsoft's Speech and Natural Language Group, Intern, Summer 2006.
Scone Knowledge Base Group at CMU, Research assistant for Scott Fahlman, 2005-2007.