PROJECT THESIS



An Interactive Personal Audio Database for Musicians

(Dawen Liang's Thesis)


As high-capacity data storage devices (e.g., portable hard drives and USB flash drives) become available to everyone, musicians are able to record all of their rehearsals and save them digitally. Given a large amount of unlabeled and unorganized rehearsal recordings, manually organizing them can be a huge task. Therefore, automatically managing music audio databases for practicing musicians is a new and interesting challenge.

This thesis describes a systematic investigation to provide useful capabilities to musicians both in rehearsal and when practicing alone. The goal is to allow musicians to automatically record, organize, and retrieve rehearsal (and other) audio to facilitate review and practice (for example, playing along with difficult passages). In order to accomplish this task, three separate problems should be solved:

The thesis addresses each of these three problems and provides an implementation that works in practice. Some future work is also described.



Extracting Commands From Gestures: Gesture Spotting and Recognition for Real-Time Music Performance

(Jiuqiang Tang's Thesis)


During a music performance, violinists, trumpeters, conductors, and many other musicians must actively use both hands. This makes it impossible to also interact with a computer by pressing buttons or moving faders, so musicians must often rely on offstage sound engineers and other staff acting on a predetermined schedule. However, some “natural” gestures, such as nodding the head or pointing, offer an alternative way for musicians to control the output sounds in the performance.

The goal of my work is to create an interactive music system able to spot and recognize “command” gestures from musicians in real time. These gestures trigger new sound events and control the output sounds. The system allows the musician greater control over the sound heard by the audience and the flexibility to make changes during the performance itself. In my thesis, I design and evaluate a gesture recognition strategy especially for music performance, based on the dynamic time warping algorithm and using an F-measure evaluation process to obtain the best feature and threshold combination. The proposed strategy will select features by searching over all feature combinations, obtain the optimal threshold for each gesture pattern of each feature combination in terms of the F-measure, and automatically generate a gesture recognizer.
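The training strategy described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: it assumes pre-segmented candidate gestures rather than spotting in a continuous stream, a single template per gesture, and Euclidean frame distances, and all function names (dtw_distance, best_threshold, train_recognizer) are hypothetical. It shows the general idea only: compute dynamic time warping distances for each feature subset, choose the acceptance threshold that maximizes the F-measure, and keep the best combination.

```python
from itertools import combinations

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature tuples."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between frames (an assumption of this sketch).
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def best_threshold(distances, labels):
    """Return (F-measure, threshold); a candidate is accepted when distance <= threshold."""
    best = (0.0, None)
    for t in sorted(set(distances)):
        tp = sum(1 for d, y in zip(distances, labels) if d <= t and y)
        fp = sum(1 for d, y in zip(distances, labels) if d <= t and not y)
        fn = sum(1 for d, y in zip(distances, labels) if d > t and y)
        score = f_measure(tp, fp, fn)
        if score > best[0]:
            best = (score, t)
    return best

def project(seq, dims):
    """Keep only the selected feature dimensions of every frame."""
    return [tuple(frame[k] for k in dims) for frame in seq]

def train_recognizer(template, candidates, labels, n_features):
    """Search all feature subsets; return (F-measure, feature indices, threshold)."""
    best = (0.0, None, None)
    for r in range(1, n_features + 1):
        for dims in combinations(range(n_features), r):
            tmpl = project(template, dims)
            dists = [dtw_distance(tmpl, project(c, dims)) for c in candidates]
            score, thr = best_threshold(dists, labels)
            if score > best[0]:
                best = (score, dims, thr)
    return best
```

A call such as train_recognizer(template, candidates, labels, n_features=2) returns the winning feature subset and its threshold; a new gesture would then be accepted when its DTW distance to the template, computed on that subset, does not exceed the threshold. The exhaustive subset search grows exponentially with the number of features, so this sketch is only practical for small feature sets.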

Video demonstration: Extracting Commands from Continuous Gestures



Formal Semantics for Music Notation Control Flow

(Zeyu Jin's Thesis)


Music notation includes a specification of control flow, which governs the order in which the score is read using constructs such as repeats and endings. Music theory provides only an informal description of control flow notation and its interpretation, but interactive music systems need unambiguous models of the relationships between the static score and its performance. In this work, a framework is introduced to describe music control flow semantics using theories of formal languages and compilers. A formalization of control flow answers several critical questions: Are the control flow indications in a score valid? What do the control flow indications mean? What is the mapping from performance location to static score location? The framework can be used to describe extended control flow notation beyond conventional practice, especially nested repeats and arrangement. With the introduction of SDL, the Score Description Language, and DCL, the Dynamic Control Language, the framework can be extended to describe scores notated with word instructions and real-time controls. To demonstrate the correctness and effectiveness of this framework, a score compiler and a score model manager are implemented and evaluated using case-based tests. A software application, Live Score Display, is built on this framework and is offered as a component for interactive music display.
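As a rough illustration of the mapping from performance location to static score location, the following Python sketch unfolds a score with simple repeats and first/second endings into performance order. It is not SDL, DCL, or the thesis's formal framework: the dictionary keys (repeat_start, repeat_end, ending) and the function name unfold are hypothetical, nested repeats are not handled, and every repeated section after the first is assumed to open with an explicit repeat_start.

```python
def unfold(measures):
    """Return the performance order as a list of static measure indices.

    Each measure is a dict with optional, hypothetical keys:
      "repeat_start": True  -> a repeated section begins here
      "repeat_end": True    -> on the first pass, jump back to the section start
      "ending": 1 or 2      -> play this measure only on that pass
    """
    order = []
    start, pass_no, i = 0, 1, 0
    while i < len(measures):
        m = measures[i]
        if m.get("repeat_start") and i != start:
            start, pass_no = i, 1      # a new repeated section begins
        ending = m.get("ending")
        if ending is not None and ending != pass_no:
            i += 1                     # skip an ending meant for another pass
            continue
        order.append(i)
        if m.get("repeat_end") and pass_no == 1:
            pass_no, i = 2, start      # jump back for the second pass
            continue
        i += 1
    return order

# A five-measure score: repeat with a first ending (measure 2)
# and a second ending (measure 3).
score = [
    {"repeat_start": True},
    {},
    {"ending": 1, "repeat_end": True},
    {"ending": 2},
    {},
]
performance = unfold(score)
```

Here performance[k] is the static measure index shown at performance position k, so the unfolded list itself serves as the performance-to-score mapping for this restricted notation.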

Live Score Display website on SourceForge.