Tap Detection & Synthesis

This example presents code to detect tap events. Both the time and the amplitude (strength) of each tap are estimated. The data can then be used for synthesis. This analysis/synthesis approach is particularly suited to producing complex but natural-sounding rhythmic sequences.

Analysis

A WAV file, test.wav, is provided as an example input signal. The analysis code, tap-detect.sal, takes this input sound and computes its RMS (a low-sample-rate estimate of the average energy in the signal). The RMS signal is then searched for peaks.

The peak detection is not a "beat detector" for music. Instead, it is intended to detect tapping. The idea is to detect peaks while ignoring the very small peaks that naturally occur in the background. To do this, the background noise is estimated by taking a long-term average of the RMS signal. The analysis then looks for regions of the input signal that are significantly stronger than the background noise. Each such region is assumed to contain one peak, and the analysis takes the maximum value in the region as the strength of the tap. The time of the tap is taken to be the point where the tap signal first exceeds the background noise.
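The region search described above can be sketched in Python as follows. This is an illustration, not the code in tap-detect.sal: the function name find_taps, its parameters, and the factor threshold standing in for "significantly stronger" are all assumptions made for the example.

```python
def find_taps(rms, background, rate, factor=2.0):
    """Return (onset_time, peak_value) pairs, one per region above the background.

    rms        -- list of RMS values
    background -- list of background-noise estimates, same length as rms
    rate       -- sample rate of the RMS signal in Hz
    factor     -- hypothetical multiplier defining "significantly stronger"
    """
    taps = []
    in_region = False
    onset = 0
    peak = 0.0
    for i, (value, noise) in enumerate(zip(rms, background)):
        above = value > factor * noise
        if above and not in_region:
            # First sample above the background: this is the tap time.
            in_region = True
            onset = i
            peak = value
        elif above:
            # Still inside the region: track its maximum.
            peak = max(peak, value)
        elif in_region:
            # Region ended: record one tap for it.
            in_region = False
            taps.append((onset / rate, peak))
    if in_region:
        # The signal ended while still inside a region.
        taps.append((onset / rate, peak))
    return taps
```

Each region yields exactly one tap, so a tap that produces several local maxima before decaying is still reported once, with its onset at the first sample above the threshold.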

If the input sound has a sample rate of 44100 Hz, the RMS signal has a sample rate of 196 Hz, computed with a 50% window overlap. The long-term average of this RMS signal is then taken with a two-second window and a step of one sample. The average is scaled and subtracted from the RMS. If there are no peaks, the RMS and the average RMS will be very similar, so an additional constant is subtracted to eliminate false peaks due to noise. The code then searches the resulting samples for points where the signal crosses zero in the positive direction. Each such crossing is saved as an onset time, and the peak that follows it is found. The onset times and maximum values are saved as pairs in a list and returned.
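The steps in this paragraph can be sketched in plain Python as shown below. This is a sketch under stated assumptions, not a port of tap-detect.sal: the function names, the scale and offset parameters, the trailing (rather than centered) averaging window, and the choice to read the peak from the RMS signal rather than from the difference signal are all hypothetical. For reference, 44100 / 196 = 225, so a 196 Hz RMS signal would correspond to a hop of 225 input samples, which with 50% overlap implies a 450-sample window.

```python
import math

def rms_signal(samples, window, hop):
    """RMS of each window; hop = window // 2 gives a 50% overlap."""
    out = []
    for start in range(0, len(samples) - window + 1, hop):
        frame = samples[start:start + window]
        out.append(math.sqrt(sum(x * x for x in frame) / window))
    return out

def moving_average(signal, width):
    """Trailing average over up to `width` samples, advancing one sample at a time."""
    out = []
    total = 0.0
    for i, x in enumerate(signal):
        total += x
        if i >= width:
            total -= signal[i - width]  # drop the sample that left the window
        out.append(total / min(i + 1, width))
    return out

def detect(rms, rate, avg_width, scale, offset):
    """Return (onset_time, peak_value) pairs from positive-going zero crossings.

    scale and offset are hypothetical tuning parameters: the scaled average
    and the constant offset are both subtracted from the RMS signal.
    """
    avg = moving_average(rms, avg_width)
    diff = [r - scale * a - offset for r, a in zip(rms, avg)]
    taps = []
    i = 1
    while i < len(diff):
        if diff[i - 1] <= 0 < diff[i]:
            onset = i
            peak = rms[i]
            # Scan the positive region; the peak is read from the RMS (assumption).
            while i < len(diff) and diff[i] > 0:
                peak = max(peak, rms[i])
                i += 1
            taps.append((onset / rate, peak))
        else:
            i += 1
    return taps
```

With a 196 Hz RMS signal, a two-second averaging window would be avg_width = 392 samples. A call might then look like detect(rms_signal(samples, 450, 225), 196.0, 392, scale, offset), with scale and offset chosen by experiment for the recording at hand.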