
Time/Frequency Transformation

Nyquist provides functions for FFT and inverse FFT operations on streams of audio data. Because sounds can be of any length, but an FFT operates on a fixed amount of data, FFT processing is typically done in short blocks or windows that move through the audio. Thus, a stream of samples is converted into a sequence of FFT frames representing short-term spectra.

Nyquist does not have a special data type corresponding to a sequence of FFT frames. This would be nice, but it would require creating a large set of operations suitable for processing frame sequences. Another approach, and perhaps the most “pure,” would be to convert a single sound into a multichannel sound, with one channel per bin of the FFT.

Instead, Nyquist violates its “pure” functional model and resorts to objects for FFT processing. A sequence of frames is represented by an XLISP object. Whenever you send the selector :next to the object, you get back either NIL, indicating the end of the sequence, or an array of FFT coefficients.

The Nyquist function snd-fft (mnemonic, isn't it?) returns one of the frame sequence generating objects. You can pass any frame sequence generating object to another function, snd-ifft, and turn the sequence back into audio.

With snd-fft and snd-ifft, you can create all sorts of interesting processes. The main idea is to create intermediate objects that both accept and generate sequences of frames. These objects can operate on the frames to implement the desired spectral-domain processes. Examples of this can be found in the file nyquist/lib/fft/fft_tutorial.htm, which is part of the standard Nyquist release. The documentation for snd-fft and snd-ifft follows.
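The idea of intermediate objects that both accept and generate frame sequences can be sketched outside Nyquist. The following Python analogy is not Nyquist code: the class names ListFrameSource and ScaleFrames and the method name next are hypothetical stand-ins for XLISP objects that answer :next, and None plays the role of NIL.

```python
class ListFrameSource:
    """Hypothetical stand-in for a frame sequence generating object:
    hands out one frame per call and None (NIL) at the end."""

    def __init__(self, frames):
        self._frames = list(frames)
        self._index = 0

    def next(self):
        if self._index >= len(self._frames):
            return None
        frame = self._frames[self._index]
        self._index += 1
        return frame


class ScaleFrames:
    """Hypothetical intermediate object: pulls a frame from `source`,
    scales every coefficient, and passes the result on."""

    def __init__(self, source, gain):
        self._source = source
        self._gain = gain

    def next(self):
        frame = self._source.next()
        if frame is None:  # propagate the end of the sequence
            return None
        return [self._gain * x for x in frame]


source = ListFrameSource([[1.0, 2.0], [3.0, 4.0]])
scaled = ScaleFrames(source, 0.5)

frames = []
while True:
    frame = scaled.next()
    if frame is None:
        break
    frames.append(frame)
print(frames)
```

Because each stage only asks its source for a frame when it is itself asked, several such objects can be stacked, and the last one in the chain would be the object handed to snd-ifft.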

snd-fft(sound, length, skip, window) [SAL]
(snd-fft sound length skip window) [LISP]
This function performs an FFT on the first samples in sound and returns a Lisp array of FLONUMs. The function modifies the sound, violating the normal rule that sounds are immutable in Nyquist, so it is advised that you copy the sound using snd-copy if there are any other references to sound. The length of the FFT is specified by length, a FIXNUM (integer) which must be a power of 2. After each FFT, the sound is advanced by skip samples, also of type FIXNUM. Overlapping FFTs, where skip is less than length, are allowed. If window is not NIL, it must be a sound. The first length samples of window are multiplied by length samples of sound before performing the FFT. When there are no more samples in sound to transform, this function returns NIL. The coefficients in the returned array, in order, are the DC coefficient, the first real, the first imaginary, the second real, the second imaginary, etc. The last array element corresponds to the real coefficient at the Nyquist frequency.
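The way length, skip and window interact can be illustrated with a small Python sketch (not Nyquist code). The function name windowed_blocks is hypothetical, and two details are assumptions rather than documented snd-fft behavior: the final partial block is padded with zeros, and skip is required to be positive so that the loop terminates.

```python
def windowed_blocks(samples, length, skip, window=None):
    """Yield successive blocks of `length` samples, advancing by `skip`."""
    if length < 1 or length & (length - 1) != 0:
        raise ValueError("length must be a power of 2")
    if skip < 1:
        raise ValueError("skip must be positive in this sketch")
    if window is not None and len(window) < length:
        raise ValueError("window must supply at least `length` samples")
    start = 0
    while start < len(samples):
        block = list(samples[start:start + length])
        # Assumption: pad the last, partial block with zeros.
        block += [0.0] * (length - len(block))
        if window is not None:
            # Only the first `length` window samples are used.
            block = [w * x for w, x in zip(window[:length], block)]
        yield block
        start += skip


samples = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
blocks = list(windowed_blocks(samples, 4, 2))
print(blocks)
```

With length 4 and skip 2, each block shares half of its samples with the previous one. Each block is what would be handed to the FFT.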

snd-ifft(time, srate, iterator, skip, window) [SAL]
(snd-ifft time srate iterator skip window) [LISP]
This function performs an IFFT on a sequence of spectral frames obtained from iterator and returns a sound. The start time of the sound is given by time. Typically, this would be computed by calling (local-to-global 0). The sample rate is given by srate. Typically, this would be *sound-srate*, but it might also depend upon the sample rate of the sound from which the spectral frames were derived. To obtain each frame, the function sends the message :next to the iterator object, using XLISP's primitives for objects and message passing. The object should return an array in the same format as obtained from snd-fft, and the object should return NIL when the end of the sound is reached. After each frame is inverse transformed into the time domain, it is added to the resulting sound. Each successive frame is added with a sample offset specified by skip relative to the previous frame. This must be an integer greater than zero and less than the frame (FFT) size. If window is not NIL, it must be a sound. This window signal is multiplied by the inverse transformed frame before the frame is added to the output sound. The length of each frame should be the same power of 2. The length is implied by the first array returned by iterator, so it does not appear as a parameter. This length is also the number of samples used from window. Extra samples are ignored, and window is padded with zeros if necessary, so be sure window is the right length. The resulting sound is computed on demand as with other Nyquist sounds, so :next messages are sent to iterator only when new frames are needed. One should be careful not to reuse or modify iterator once it is passed to snd-ifft.
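The overlap-add step described above can be sketched in Python (not Nyquist code). The function name overlap_add is hypothetical, and the frames are assumed to be already inverse transformed into the time domain, so no IFFT is performed here. The window handling follows the description: extra window samples are ignored and a short window is padded with zeros.

```python
def overlap_add(frames, skip, window=None):
    """Sum time-domain frames, each offset by `skip` samples from the last."""
    if not frames:
        return []
    size = len(frames[0])
    if not 0 < skip < size:
        raise ValueError("skip must be greater than zero and less than the frame size")
    if window is None:
        window = [1.0] * size
    # Ignore extra window samples and pad a short window with zeros.
    window = (list(window) + [0.0] * size)[:size]
    out = [0.0] * (skip * (len(frames) - 1) + size)
    for i, frame in enumerate(frames):
        start = i * skip
        for j in range(size):
            out[start + j] += window[j] * frame[j]
    return out


frames = [[1.0, 1.0, 1.0, 1.0]] * 3
result = overlap_add(frames, 2)
print(result)
```

With a skip of half the frame size, interior samples receive contributions from two frames, which is why the choice of window and skip determines whether the overlapped frames sum back to the original level.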

Spectral Processing

There are a number of functions defined to make spectral processing easier in XLISP and SAL. The general approach, as described above, is to create an iterator object that returns spectral frames. To avoid using the XLISP object system directly, a more functional interface is defined, especially for SAL users. The sa-init function creates an iterator, and sa-next retrieves spectral frames. Various functions are also provided to transform these into amplitude (magnitude) spectra, plot them and perform other operations.

Some examples that use these spectral processing functions can be found in the Nyquist extension “fftsal” (use the NyquistIDE's Window : Nyquist Extensions menu item to download it; it will then be in your nyquist/lib/fftsal directory). You can find descriptions of the examples in nyquist/lib/fftsal/spectral-process.lsp and nyquist/lib/fftsal/spectral-process.sal.

sa-init(resolution: hz, fft-dur: dur, skip-period: skip, window: window-type, input: input) [SAL]
(sa-init :resolution hz :fft-dur dur :skip-period skip :window window-type :input input) [LISP]
Creates a spectral-analysis object that can be used to obtain spectral data from a sound. All keyword parameters are optional except input. The resolution keyword parameter gives the width of each spectral bin in Hz. It may be nil or not specified, in which case the resolution is computed from fft-dur. The actual resolution may be finer than the specified resolution because FFT sizes are rounded to a power of 2. The fft-dur is the width of the FFT window in seconds. The actual FFT size will be rounded up to the nearest power of two in samples. If nil, fft-dur will be calculated from resolution. If both fft-dur and resolution are nil or not specified, the default value is 1024 samples, corresponding to a duration of 1024 / signal-sample-rate. If both resolution and fft-dur are specified, the resolution parameter will be ignored. Note that fft-dur and resolution are reciprocals. The skip-period specifies the time interval in seconds between successive spectra (FFT windows). Overlapping FFTs are possible. The default value overlaps windows by 50%. Non-overlapped and widely spaced windows that ignore samples by skipping over them entirely are also acceptable. The window specifies the type of window. The default is a raised cosine (Hann or "Hanning") window. Options include :hann, :hanning, :hamming, :none or nil, where :none and nil mean a rectangular window. The input can be a string (which specifies a sound file to read) or a Nyquist SOUND to be analyzed. The return value is an XLISP object that can be called to obtain parameters as well as a sequence of spectral frames. Normally, you will set a variable to this result and pass the variable to sa-next, described below.
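The rules relating resolution, fft-dur and the FFT size can be worked through in a short Python sketch (not Nyquist code). The function name fft_size is hypothetical, and the exact rounding Nyquist applies when the requested size falls on or very near a power of 2 is an assumption.

```python
import math


def fft_size(sample_rate, resolution=None, fft_dur=None):
    """Return an FFT size in samples following the rules described above."""
    if fft_dur is not None:
        # fft-dur takes precedence; resolution is ignored if both are given.
        samples = fft_dur * sample_rate
    elif resolution is not None:
        # fft-dur and resolution are reciprocals: dur = 1 / resolution.
        samples = sample_rate / resolution
    else:
        return 1024  # documented default
    needed = math.ceil(samples)
    size = 1
    while size < needed:  # round up to a power of 2
        size *= 2
    return size


sample_rate = 44100.0
size = fft_size(sample_rate, resolution=10.0)
print(size, sample_rate / size)
```

Requesting a 10 Hz resolution at 44100 Hz needs 4410 samples, which rounds up to 8192, so the actual bin width comes out finer than the 10 Hz requested.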

sa-info(sa-obj) [SAL]
(sa-info sa-obj) [LISP]
Prints information about an sa-obj, which was created by sa-init (see above). The return value is nil, but information is printed.

sa-next(sa-obj) [SAL]
(sa-next sa-obj) [LISP]
Fetches the next spectrum from sa-obj, which was created by sa-init (see above). The return value is an array of FLONUMs representing the discrete complex spectrum.

sa-magnitude(frame) [SAL]
(sa-magnitude frame) [LISP]
Computes the magnitude (amplitude) spectrum from a frame returned by sa-next. The ith bin is stored at index i. The size of the array is the FFT size / 2 + 1.
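To see why the result has FFT size / 2 + 1 entries, here is a Python sketch (not Nyquist code) that computes magnitudes from a frame in the coefficient layout documented for snd-fft: DC, then real and imaginary pairs, then the real Nyquist coefficient. The function name magnitude is hypothetical, the frame values are made up, and any scaling that sa-magnitude itself may apply is not modeled.

```python
import math


def magnitude(frame):
    """Magnitudes from [DC, re1, im1, ..., reK, imK, nyquist]."""
    n = len(frame)
    if n < 2 or n % 2 != 0:
        raise ValueError("frame length must be even and at least 2")
    mags = [abs(frame[0])]  # DC term is purely real
    for i in range(1, n // 2):
        mags.append(math.hypot(frame[2 * i - 1], frame[2 * i]))
    mags.append(abs(frame[n - 1]))  # Nyquist term is purely real
    return mags


frame = [-2.0, 3.0, 4.0, 0.0, 1.0, 6.0, 8.0, -5.0]  # FFT size 8
mags = magnitude(frame)
print(mags)
```

An FFT of size 8 yields 5 magnitudes: one for DC, three for the complex pairs, and one for the Nyquist frequency.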

sa-normalize(frame [, max]) [SAL]
(sa-normalize frame [max]) [LISP]
Normalize a copy of frame, a magnitude (amplitude) spectrum returned by sa-magnitude. If max (a FLONUM) is provided, the spectrum will be normalized to have a maximum value of max, which defaults to 1.

sa-plot(sa-obj, frame) [SAL]
(sa-plot sa-obj frame) [LISP]
Plots a magnitude (amplitude) spectrum from frame returned by sa-magnitude. The sa-obj parameter should be the same value used to obtain the frame.

sa-print(file, sa-obj, frame, cutoff: cutoff, threshold: threshold) [SAL]
(sa-print file sa-obj frame :cutoff cutoff :threshold threshold) [LISP]
Prints an ASCII plot of frame, a magnitude (amplitude) spectrum returned by sa-magnitude (or sa-normalize). The file is either a file opened for writing or T to print to the console. The caller is responsible for closing the file (eventually). The sa-obj parameter should be the same value used to obtain the frame. If cutoff, a FLONUM, is provided, only the spectrum below cutoff (Hz) will be printed. If threshold, a FLONUM, is provided, the output may elide bins with values below the threshold.

sa-get-bin-width(sa-obj) [SAL]
(sa-get-bin-width sa-obj) [LISP]
Returns the width of a frequency bin as a FLONUM in Hz (also the separation of bin center frequencies). The center frequency of the ith bin is i * bin-width.
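The relation between bin index and frequency can be checked with a short Python sketch (not Nyquist code). The function name nearest_bin is hypothetical, and the bin width is computed here as sample rate / FFT size rather than read from an sa-obj.

```python
def nearest_bin(freq_hz, bin_width, fft_size):
    """Index of the bin whose center frequency is closest to freq_hz."""
    i = round(freq_hz / bin_width)
    # Valid bins run from 0 (DC) to fft_size / 2 (Nyquist frequency).
    return min(max(i, 0), fft_size // 2)


sample_rate = 44100.0
fft_size = 1024
bin_width = sample_rate / fft_size

i = nearest_bin(440.0, bin_width, fft_size)
center = i * bin_width  # center frequency of the ith bin
print(i, center)
```

With these values a 440 Hz tone falls closest to bin 10, whose center is about 430.66 Hz, so a finer resolution (larger FFT) would be needed to locate it more precisely.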

sa-get-fft-size(sa-obj) [SAL]
(sa-get-fft-size sa-obj) [LISP]
Returns a FIXNUM, the size of the FFT, a power of 2.

sa-get-fft-dur(sa-obj) [SAL]
(sa-get-fft-dur sa-obj) [LISP]
Returns a FLONUM, the duration of the FFT window in seconds.

sa-get-fft-window(sa-obj) [SAL]
(sa-get-fft-window sa-obj) [LISP]
Returns a symbol representing the type of window used, :hann, :hamming or :none.

sa-get-skip-period(sa-obj) [SAL]
(sa-get-skip-period sa-obj) [LISP]
Returns the skip size in seconds (a FLONUM).

sa-get-fft-skip-size(sa-obj) [SAL]
(sa-get-fft-skip-size sa-obj) [LISP]
Returns the skip size in samples (a FIXNUM).

sa-get-sample-rate(sa-obj) [SAL]
(sa-get-sample-rate sa-obj) [LISP]
Returns the sample rate of the sound being analyzed (a FLONUM) in Hz.
