Anurag Kumar

PhD Candidate
Machine Learning and Signal Processing Group
Google Scholar

Contact:
GHC 5509
Language Technologies Institute
Carnegie Mellon University
Pittsburgh, PA, USA
Email: alnu [AT] andrew [DOT] cmu [DOT] edu
            alnu [AT] cs [DOT] cmu [DOT] edu

Theory is the first term in the Taylor series expansion of practice. - Thomas Cover

I have had my results for a long time, but I do not yet know how I am to arrive at them. - Carl Friedrich Gauss


Hi! This is my dugout on the web.

I am currently a PhD student in the Language Technologies Institute, School of Computer Science, at Carnegie Mellon University. I joined CMU in Fall 2013 and am advised by Prof. Bhiksha Raj, who leads the Machine Learning and Signal Processing group. The name of my research group, Machine Learning and Signal Processing, more or less gives away my broad research interests. I primarily work on Acoustic Intelligence, or Machine Perception of Sounds. It involves content analysis of audio recordings in terms of sound events, as well as natural language understanding for sounds. Large-scale Audio Event Detection using weakly labeled and web data is an important part of my research. You can find more information about my research by looking at my publications.
Before joining CMU, I did my undergraduate studies at the Indian Institute of Technology, Kanpur, where I obtained a B.Tech-M.Tech Integrated Dual Degree in Electrical Engineering in 2013.

I interned with Christian Fuegen in the Speech and Audio Team at Facebook Research during the summer of 2017. My work focused on video understanding using Audio Event Detection, relying primarily on weakly labeled data.
Previously, in the summer of 2015, I worked with Dinei Florencio in the Multimedia, Interaction, and Communication (MIC) group at Microsoft Research, Redmond. My research there focused on Speech Enhancement using Deep Neural Networks.

PUBLICATIONS

arXiv Preprints

Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data
Anurag Kumar, Bhiksha Raj [arXiv]

Classifier Risk Estimation under Limited Labeling Resources
Anurag Kumar, Bhiksha Raj [arXiv]

Features and Kernels for Audio Event Detection
Anurag Kumar, Bhiksha Raj [arXiv]


Published

Audio Content based Geotagging in Multimedia
Anurag Kumar, Benjamin Elizalde, Bhiksha Raj [arXiv Version]
in Interspeech, 2017.

An Approach for Self-Training Audio Event Detectors Using Web Data
Ankit Shah, Rohan Badlani, Anurag Kumar, Benjamin Elizalde, Bhiksha Raj [arXiv Version]
in 25th European Signal Processing Conference (EUSIPCO), 2017.

Audio Event and Scene Recognition: A Unified Approach using Strongly and Weakly Labeled Data
Anurag Kumar, Bhiksha Raj [arXiv Version]
IEEE International Joint Conference on Neural Networks (IJCNN), 2017.

Discovering Sound Concepts and Acoustic Relations In Text
Anurag Kumar, Bhiksha Raj, Ndapandula Nakashole [arXiv Version]
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017. Companion Webpage here

Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recording
Benjamin Elizalde, Anurag Kumar, et al. [arXiv Version]
Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2016.

Audio Event Detection using Weakly Labeled Data
Anurag Kumar, Bhiksha Raj
in 24th ACM International Conference on Multimedia (ACM Multimedia), 2016. [arXiv Version]

Speech Enhancement In Multiple-Noise Conditions using Deep Neural Networks
Anurag Kumar, Dinei Florencio [arXiv Version]
in Interspeech, 2016. Companion Webpage here.

Weakly Supervised Scalable Audio Content Analysis
Anurag Kumar, Bhiksha Raj
IEEE International Conference on Multimedia & Expo (ICME), 2016. [arXiv Version]

A Novel Ranking Method For Multiple Classifier Systems
Anurag Kumar, Bhiksha Raj
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015.

Informedia @ TRECVID 2014 MED and MER
CMU Aladdin MED Team

Unsupervised Fusion Weight Learning in Multiple Classifier Systems
Anurag Kumar, Bhiksha Raj [arXiv]

Detecting Sound Objects In Audio Recordings
Anurag Kumar, Rita Singh, Bhiksha Raj
22nd European Signal Processing Conference (EUSIPCO), 2014.

Monaural Speaker Segregation Using Group Delay Spectral Matrix Factorization
Karan Nathwani, Anurag Kumar and Rajesh Hegde
20th National Conference on Communications (NCC), 2014.

Event Detection in Short Duration Audio Using Gaussian Mixture Model and Random Forest Classifier
Anurag Kumar, Rajesh Hegde, Rita Singh and Bhiksha Raj
21st European Signal Processing Conference (EUSIPCO), 2013.

Audio event detection from acoustic unit occurrence patterns
Anurag Kumar, Pranay Dighe, Rita Singh, Sourish Chaudhuri and Bhiksha Raj
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012.


OTHERS - Other academic information