Projects - Fall 2021
I wrote this to share with students and prospective students what I’m working on, and interested in working on, during the next year.
If you are looking for an internship, I cannot offer a salary or cover travel expenses, but I have some funds for minor research expenses.
If you are looking for a Ph.D., CMU is a great place, but I'm unlikely to take on a new student. We have a new endowed chair for a professorship in Music and Computation, so I hope there will soon be a new faculty member looking for computer music Ph.D. students, and I hope to participate in research, possibly as co-advisor.
I receive a lot of requests for internships and supervision. Prospective interns and Ph.D.s should read the sidebar at left. Here’s what I’m doing and thinking about these days.
O2 is a network protocol especially for music control. It is intended to be an OSC “do over” given that even tiny low-cost controllers can communicate using IP, and also given what we’ve learned from experience with OSC (Open Sound Control).
Mainly, O2 introduces discovery so users do not have to type in IP addresses and port numbers. O2 also supports clock synchronization, timed message delivery, and some publish-subscribe capabilities.
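Timed message delivery means a sender can stamp a message with a future time and trust the receiving end to hold it until then. As a concept sketch only (this is not the O2 API; the class and method names here are hypothetical), a minimal timed-delivery queue looks like this:

```python
import heapq

class TimedDelivery:
    """Toy model of timed message delivery: each message carries a
    timestamp and is handed over only when the clock reaches it.
    (Illustrative sketch only -- not the actual O2 API.)"""

    def __init__(self):
        self.queue = []   # min-heap of (time, seq, address, data)
        self.seq = 0      # tie-breaker keeps same-time messages in send order

    def send(self, timestamp, address, data):
        heapq.heappush(self.queue, (timestamp, self.seq, address, data))
        self.seq += 1

    def poll(self, now):
        """Deliver every queued message whose timestamp is <= now."""
        delivered = []
        while self.queue and self.queue[0][0] <= now:
            _, _, address, data = heapq.heappop(self.queue)
            delivered.append((address, data))
        return delivered
```

Combined with clock synchronization, this lets senders schedule events slightly ahead of time so that network jitter does not disturb musical timing.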
I’m currently working on O2lite, where an O2 host acts as proxy to a connected process. O2lite allows the O2 protocol to extend into browsers, shared memory processes (such as audio threads), and microcontrollers with limited memory (such as the ESP32).
There are also some extensions and further work in progress to make O2 even more complete and interesting.
I’m working with Shuqi Dai on automatic composition of popular songs, particularly songs that are similar to existing “seed” songs. One thing that might help is a lot of data. It’s fairly easy to get data from MIDI files, and there are at least 100,000 easy-to-get music files out there, but most of these require analysis to identify the chords, melody, bass lines, and structure that could be valuable for learning and further analysis.
Previously, I worked with Zheng Jiang, who created a system to automatically analyze and label MIDI files. It’s a great start, but I think it is not robust enough to turn loose on 100K files and expect satisfactory results. For one thing, not all MIDI is a popular song or even useful as an example.
Therefore, a project waiting to get done is to push this existing work forward and try to obtain tens of thousands of songs with labeled melody, chords, bass, structure, bar-lines, etc.
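To give a flavor of what "labeling" means here: even a crude heuristic can make a first guess at which tracks carry the melody and the bass line. The sketch below (my illustration, not Zheng Jiang's actual method) assumes the melody tends to be the track with the highest average pitch and the bass the lowest; a robust system would need far more than this.

```python
def label_tracks(tracks):
    """Guess which note track is the melody and which is the bass.

    `tracks` maps a track name to a list of MIDI pitch numbers.
    Heuristic (illustrative only): the melody is usually the track
    with the highest average pitch, the bass line the lowest.
    """
    averages = {name: sum(p) / len(p) for name, p in tracks.items() if p}
    melody = max(averages, key=averages.get)
    bass = min(averages, key=averages.get)
    return {"melody": melody, "bass": bass}
```

Real MIDI files break such heuristics constantly (crossing voices, drum tracks, doubled parts), which is exactly why robust analysis over 100K files is a research problem rather than a script.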
There are some early works in music composition that, rather than relying on sophisticated machine learning, simply implemented very insightful rules or algorithms from music theory and music composition. I've tried to understand what’s going on in these programs, because many of them outperform the so-called “state-of-the-art” methods that have become popular recently. Recreating some of this early work could be very interesting and allow better understanding as well as additional experimentation. This could lead to advances by revealing forgotten secrets of music.
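As a tiny example of the rule-based style (my own illustration, not taken from any particular early system), classical voice-leading rules such as "no parallel perfect fifths" are easy to state as code, and early systems encoded many rules of this kind:

```python
def parallel_fifths(upper, lower):
    """Return indices where two voices move in parallel perfect fifths,
    a motion that classical rule-based systems typically forbid.
    `upper` and `lower` are equal-length lists of MIDI pitch numbers.
    (Illustrative sketch of the rule-based approach only.)"""
    bad = []
    for i in range(1, len(upper)):
        prev = (upper[i - 1] - lower[i - 1]) % 12  # interval class before
        curr = (upper[i] - lower[i]) % 12          # interval class after
        moved = upper[i] != upper[i - 1]           # rule applies only to motion
        if prev == 7 and curr == 7 and moved:      # 7 semitones = perfect fifth
            bad.append(i)
    return bad
```

A whole composition system built this way is, of course, hundreds of such rules plus search, but each rule stays individually inspectable in a way learned models are not.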
I’ve written a number of libraries or frameworks for building interactive real-time music systems. The latest usable one is Aura, which I have used for a number of compositions. I’ve learned a lot from Aura, and I’ve started yet another system that's simpler in some ways and more powerful in others.
The new system, Arco, uses O2 for communication between threads. This is a little simpler than Aura, which used more efficient pre-processing to create a distributed real-time object system, but now I think O2 is fast enough, is simpler, and allows Aura-like systems to run as servers or on multiple machines.
One of the goals of Arco is to be flexible with audio DSP, making it as simple as possible to use everything from low-level unit generators such as “multiply” up to large modules such as off-the-shelf synthesizers and reverberators. I've been working on a representation of DSP objects that allows audio connections to be dynamic, multi-channel, and block-based (which is much more efficient than sample-by-sample processing, e.g., it allows vector instructions), while remaining about as efficient as Aura.
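To see why block-based processing matters, here is a minimal sketch of the idea in Python/NumPy (these class names are hypothetical, not Arco's actual DSP classes): each unit generator computes a whole block of samples per call, which amortizes per-object overhead and lets the inner loop vectorize.

```python
import numpy as np

BLOCK = 32  # samples computed per call; real systems often use 32-64

class Const:
    """Constant-valued signal, one block per run() call."""
    def __init__(self, value):
        self.value = value
    def run(self):
        return np.full(BLOCK, self.value, dtype=np.float32)

class Sine:
    """Fixed-frequency sine oscillator producing one block per run()."""
    def __init__(self, freq, sr=44100.0):
        self.freq, self.sr, self.phase = freq, sr, 0.0
    def run(self):
        t = self.phase + np.arange(BLOCK)
        self.phase += BLOCK
        return np.sin(2 * np.pi * self.freq * t / self.sr).astype(np.float32)

class Mult:
    """Toy block-based "multiply" unit generator: the product of its
    two upstream generators, computed elementwise over a whole block."""
    def __init__(self, a, b):
        self.a, self.b = a, b
    def run(self):
        return self.a.run() * self.b.run()
```

For example, `Mult(Sine(440.0), Const(0.5)).run()` yields one block of a 440 Hz tone at half amplitude; making such connections dynamic and multi-channel while keeping this efficiency is the hard part.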
I think audio objects could be specified in FAUST, which would allow most existing work in FAUST to be immediately brought into the Arco framework.
I have a fairly extensive course on computer music that is about 90% online and ready to open to the world. I finished a textbook last year. The current work is getting the course onto a new server; then I will manage the system in its initial offering to students.
I've mapped out a number of interactive scenarios such as call-and-response, follow-the-leader, alternating group and solo play, etc., and we've done some prototypes and testing, so I believe it is possible to make drumming online an enjoyable experience.
From here, I envision a 3-step implementation: First, create an interactive system with a human expert who calls the shots and leads the drum circle. Second, learn from this to create an automated AI drum circle leader. Third, scale up to multiple drum circles around the world that run 24/7 and where people can join and leave, and depending on how many participants are online, drum circles can split and merge.
Our hypothesis is that by identifying patterns and repetition in music we can create better models for music generation and listening. Our approach is based on prediction: We rate models on their ability to predict the next element in a sequence (of pitches, durations, intervals, or whatever), and we measure this quantitatively in terms of entropy.
The actual work here consists of gathering and pre-processing music in machine-readable form to create datasets, writing and debugging models, and running experiments to evaluate different models and parameters on different datasets.
Many students write to say they know all about machine learning and would love to come as interns. I can understand the excitement and enthusiasm. Unfortunately, my experience is that by the time students “tool up” and get enough experience to tackle some real problems, most of a summer or semester (or two) has gone by, and there’s no time to make any advances. I would not say this is a bad area for research, but it seems that most of the obvious things are already done (and a great deal more). When the low-hanging fruit is gone, you really need a ladder or some secret advantage, whether it is a supercomputer, experience and insight, or just a good novel idea. I do not feel I can offer that now to undergrads in search of a quick but rewarding research experience. Many of the other topics listed above have some potential for completing something interesting and even publishable in a couple of months, but if you are only excited by machine learning applications, you should follow your heart and passion. That is where you will find the greatest happiness and accomplishments.