Siddharth Dalmia

Language Technologies Institute · CMU · sdalmia[at]cs.cmu.edu

Hi, I am a third-year graduate student at the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University.

I currently work on building multilingual and cross-lingual/cross-domain speech recognizers for low-resource languages, with the goal of detecting disaster-related incidents in audio files in languages for which we have no transcribed speech data. I am fortunate to be advised by Dr. Florian Metze and Dr. Alan W Black as part of the LORELEI project.

Previously, I worked with the MULTISPEECH team at INRIA, Nancy with Dr. Emmanuel Vincent and Dr. Irina Illina. I received my undergraduate degree in Computer Science from Birla Institute of Technology and Science, Pilani (Hyderabad Campus) in 2016.

News & Activities

Jan 2020: Our paper on Multilingual Allophone Speech Recognition System got accepted at ICASSP'20
Nov 2019: Our paper Towards Zero-shot Automatic Phonemic Transcription got accepted at AAAI'20
Summer 2019: Interning with Abdelrahman Mohamed at Facebook AI Research, Seattle.
June 2019: Our paper on Corpus Relatedness Sampling for Multilingual ASR got accepted at InterSpeech'19 [PDF]
June 2019: Our paper on Cross-Attention End-to-End ASR for Conversations got accepted at InterSpeech'19 [PDF]
May 2019: Our paper on Gated Embeddings for End-to-End Speech Recognition got accepted at ACL'19 [PDF]
Feb 2019: Our paper on Phoneme Language Models for Low Resource ASR got accepted at ICASSP'19 [PDF]
Dec 2018: Our paper on Rapid Domain Robust Low Resource ASR Development got accepted at SLT'18 [PDF]
Sept 2018: Presented a poster on our submission to the CHiME-5 challenge [Poster]
Apr 2018: Gave a talk about our paper on Sequence-Based Multi-lingual ASR at ICASSP'18 [PDF] [Slides]
Jan 2018: Our paper on Sequence-Based Multi-lingual ASR got accepted at ICASSP'18 [PDF] [Code]
Dec 2017: Our paper on Epitran: Precision G2P on Many Languages got accepted at LREC'18 [PDF] [Code]

Publications 

[Google Scholar]
Universal Phone Recognition with a Multilingual Allophone System
Xinjian Li, Siddharth Dalmia, Juncheng Li, Patrick Littell, Matthew Lee, Jiali Yao, Antonios Anastasopoulos, David Mortensen, Graham Neubig, Alan W Black, Florian Metze

45th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2020).

2020
Towards Zero-shot Learning for Automatic Phonemic Transcription
Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W Black, Florian Metze

34th AAAI Conference on Artificial Intelligence (AAAI 2020).

2020
Multilingual Speech Recognition with Corpus Relatedness Sampling
Xinjian Li, Siddharth Dalmia, Alan W Black, Florian Metze

20th Annual Conference of the International Speech Communication Association (InterSpeech 2019).

2019
Cross-Attention End-to-End ASR for Two-Party Conversations
Suyoun Kim, Siddharth Dalmia, Florian Metze

20th Annual Conference of the International Speech Communication Association (InterSpeech 2019).

2019
SANTLR: Speech Annotation Toolkit for Low Resource Language
Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W Black, Florian Metze

20th Annual Conference of the International Speech Communication Association (InterSpeech 2019). Show and Tell Track.

2019
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion
Suyoun Kim, Siddharth Dalmia, Florian Metze

57th Annual Meeting of the Association for Computational Linguistics (ACL 2019).

2019
Phoneme Level Language Models for Sequence Based Low Resource ASR
Siddharth Dalmia, Xinjian Li, Alan W Black, Florian Metze

44th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2019).

2019
Situation Informed End-to-End ASR for CHiME-5 Challenge
Suyoun Kim*, Siddharth Dalmia*, Florian Metze

5th International Workshop on Speech Processing in Everyday Environments (CHiME 2018).

2018
Domain Robust Feature Extraction for Rapid Low Resource ASR Development
Siddharth Dalmia*, Xinjian Li*, Florian Metze, Alan W. Black

7th IEEE Workshop on Spoken Language Technology (SLT 2018).

2018
Sequence-based Multi-lingual Low Resource Speech Recognition
Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W. Black

43rd IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018).

2018
Epitran: Precision G2P for Many Languages
David R. Mortensen, Siddharth Dalmia, Patrick Littell

11th International Conference on Language Resources and Evaluation (LREC 2018).

2018
An approach for self-training audio event detectors using web data
Benjamin Elizalde*, Ankit Shah*, Siddharth Dalmia*, Min Hun Lee*, Rohan Badlani*, Anurag Kumar*, Bhiksha Raj, Ian Lane

25th European Signal Processing Conference (EUSIPCO 2017).

2017
Robust ASR using neural network based speech enhancement and feature simulation
Sunit Sivasankaran, Aditya Arie Nugraha, Emmanuel Vincent, Juan A Morales-Cordovilla, Siddharth Dalmia, Irina Illina, Antoine Liutkus

14th IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU 2015).

2015

(*) - Equal Contribution

This page was last modified on: 10/20/2018 13:31:34 EST