Bhiksha Raj's Projects Page


Languages of the world


Privacy Preserving Voice Processing


Machine Learning for Signal Processing

Discrete data such as text are often modelled as having been generated by draws from a discrete random variable (RV). Continuous-valued data such as images and sound spectra, on the other hand, are commonly modelled as draws from a continuous-valued RV. But what about data that lie between the two?

In this project we investigate this intermediate space. We model the discrete-valued support of the continuous-valued RV as a discrete-valued RV, and the continuous value at each point of the support as a normalized count of the number of draws of that discrete element.

For example, the spectrogram of a speech signal shows the energy at a discrete set of frequencies, at a discrete number of time indices. In our model, time and frequency are treated as RVs, and the value of the spectrogram at any time-frequency point is treated as the count of the number of draws of that time-frequency pair from a discrete random process.
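The count interpretation can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the project's code: the toy spectrogram values and the helper names spectrogram_to_distribution and draw_count_histogram are hypothetical. The sketch normalizes a non-negative spectrogram into a joint distribution over (frequency, time) pairs, draws pairs from it, and shows that the normalized counts approach the normalized spectrogram.

```python
import numpy as np

def spectrogram_to_distribution(S):
    """Normalize a non-negative spectrogram so that its entries sum to 1."""
    S = np.asarray(S, dtype=float)
    total = S.sum()
    if total <= 0:
        raise ValueError("spectrogram must have positive total energy")
    return S / total

def draw_count_histogram(P, n_draws, seed=0):
    """Draw n_draws (frequency, time) pairs from P; return normalized counts."""
    rng = np.random.default_rng(seed)
    # One multinomial draw over all time-frequency cells gives the counts.
    counts = rng.multinomial(n_draws, P.ravel()).reshape(P.shape)
    return counts / n_draws

# Toy 3-frequency x 4-frame "spectrogram" (made-up values, for illustration only).
S = np.array([[4.0, 1.0, 0.0, 1.0],
              [2.0, 6.0, 1.0, 0.0],
              [0.0, 1.0, 3.0, 1.0]])

P = spectrogram_to_distribution(S)          # joint distribution over (f, t)
H = draw_count_histogram(P, n_draws=200_000)  # normalized histogram of draws
max_gap = np.abs(H - P).max()               # shrinks as n_draws grows
```

With enough draws the histogram H is close to P, which is the sense in which the spectrogram can be read as a scaled count of draws.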

This model has some surprising properties. It yields remarkably simple algorithms for tasks such as monaural source separation and the determination of atomic units from sounds, images, video and text, and even potential solutions to problems such as deblurring of images and deconvolution of sounds.

Mathematically, it can be shown to be identical to the popular technique of non-negative matrix factorization. However, it also provides us with a simple framework for applying various priors, and enables us to employ statistical models and methods that have been developed for discrete data such as text. Conversely, the techniques we develop, particularly the model that obtains sparse overcomplete decompositions, are observed to be effective models for discrete data.
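To make the link to matrix factorization concrete, here is a minimal sketch of an EM procedure for a latent variable decomposition. The exact form of the model is an assumption on our part for this illustration: P(f,t) = sum_z P(z) P(f|z) P(t|z). The function name plca, its parameters, and the toy data are hypothetical. Under this assumption the columns of W hold P(f|z), the rows of H hold P(t|z), and the product (W * pz) @ H plays the role of the non-negative factorization of the normalized data.

```python
import numpy as np

def plca(V, n_components, n_iter=200, seed=0, eps=1e-12):
    """EM sketch for the assumed model P(f,t) = sum_z P(z) P(f|z) P(t|z)."""
    rng = np.random.default_rng(seed)
    V = np.asarray(V, dtype=float)
    P = V / V.sum()                                  # data as a joint distribution
    F, T = P.shape
    W = rng.random((F, n_components))
    W /= W.sum(axis=0)                               # columns: P(f|z)
    H = rng.random((n_components, T))
    H /= H.sum(axis=1, keepdims=True)                # rows: P(t|z)
    pz = np.full(n_components, 1.0 / n_components)   # P(z)
    kl_history = []
    for _ in range(n_iter):
        approx = (W * pz) @ H                        # current model of P(f,t)
        R = P / np.maximum(approx, eps)              # data-to-model ratio
        # E and M steps folded together: posterior-weighted counts per component.
        W_new = W * pz * (R @ H.T)
        H_new = (pz[:, None] * H) * (W.T @ R)
        pz = W_new.sum(axis=0)
        W = W_new / np.maximum(W_new.sum(axis=0), eps)
        H = H_new / np.maximum(H_new.sum(axis=1, keepdims=True), eps)
        pz = pz / pz.sum()
        # Track the KL divergence between the data and the model.
        approx = (W * pz) @ H
        mask = P > 0
        kl = np.sum(P[mask] * np.log(P[mask] / np.maximum(approx[mask], eps)))
        kl_history.append(float(kl))
    return W, pz, H, kl_history

# Toy non-negative data built from two components (made-up values).
toy_rng = np.random.default_rng(1)
V = toy_rng.random((6, 2)) @ toy_rng.random((2, 8))
W, pz, H, kl = plca(V, n_components=2)
```

Because every factor is a probability distribution, priors can in principle be attached to W, H or pz inside the update step; that extension is not shown here.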



Speech Recognition Systems
Unusual Secondary Sensors
Speech Recognition with Spectro-Temporal Models
Knowledge-Base-Augmented Speech Recognition
Sub projects in search of a home: