by Cyrus Omar
The clip above is from The Diving Bell and the Butterfly, a film that tells the story of Jean-Dominique Bauby, a man left nearly completely paralyzed after a stroke. He communicates by blinking when his assistant reads the next letter he wishes to convey.
Today, many patients in situations like Jean-Do's have replaced human assistants with prosthetic devices. The well-known physicist Stephen Hawking, for instance, uses such a device to help him muse about the cosmos. Click play below to get a better sense for how his device works.
As you can see, it takes Prof. Hawking several minutes to answer a question about alien life. This delay is a consequence of the fact that the bandwidth between his mind and the device is very low -- his movements can convey just a single bit every few seconds.
Ensuring that every bit carries as much information as possible is key. Information theory is the branch of applied mathematics that covers how to communicate information as efficiently as possible. Let's use insights from information theory to see how we can design these kinds of devices better!
A basic rule of thumb in information theory is that we can use fewer bits to convey a message by giving the receiver prior knowledge about the statistical structure of incoming messages. This is how many data compression schemes work, in fact.
Jean-Do's assistant makes use of this fact, reading out the letters in order of frequency. But this isn't the best we can do! When Jean-Do selects "Q", for example, we know that "U" is almost certainly next -- so the assistant should read it out first. This kind of predictive model is known as a first-order Markov model. More complex models, like Prediction by Partial Matching (PPM), look further back, using sophisticated heuristics to make the necessary statistical calculations more tractable.
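To make the first-order Markov idea concrete, here is a minimal sketch in Python. It counts which letter follows which in a tiny made-up corpus and then reorders the alphabet for the assistant. The function names and the corpus are illustrative assumptions, not anything from the film or the paper, and a real system would train on far more text.

```python
from collections import Counter, defaultdict

def train_bigram_counts(corpus):
    """Count how often each letter follows each other letter within words."""
    counts = defaultdict(Counter)
    for word in corpus.upper().split():
        for prev, nxt in zip(word, word[1:]):
            counts[prev][nxt] += 1
    return counts

def reading_order(counts, prev, alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ"):
    """Letters sorted so the ones most likely to follow `prev` come first."""
    following = counts.get(prev, Counter())
    # Ties (including letters never seen after `prev`) stay alphabetical.
    return sorted(alphabet, key=lambda c: (-following[c], c))

# Toy corpus, purely illustrative.
corpus = "the quick queen quietly quit the quiz"
counts = train_bigram_counts(corpus)
order = reading_order(counts, "Q")
print(order[0])  # the letter the assistant should read out first after "Q"
```

With this corpus every "Q" is followed by "U", so "U" moves to the front of the reading order.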
How much better could we do, in theory? Claude Shannon, the father of information theory, ran experiments with human predictors and conjectured that the best compression ratio we can hope for is about 1.5 bits per character (bpc). Modern compression schemes like PPM are starting to approach this "maximum", hovering around 2 bpc today.
OK, let's assume we have a predictor that achieves this maximum ratio of about 1.5 bpc. Have we created an optimal communication prosthetic? Well, not quite. As it turns out, asking a series of questions of the form
Blink if the next letter is "X".
is not the best strategy at all!
To see why, consider again our "Q" example. According to my trusty Scrabble dictionary, there are in fact a few very rare words where a "U" does not follow the "Q". This means that we must consume at least 1 bit from the user to confirm that he or she really intended to convey a "U". However, this bit doesn't carry much information because we already know with high probability that it will be 1.
What we really want is a question for which the answer is 0 or 1 with roughly equal probability, so that when we get the answer our uncertainty about the user's overall intent is reduced as much as possible.
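A quick way to see this is the binary entropy function, which gives the expected information in a yes/no answer. The sketch below assumes, purely for illustration, that "U" follows "Q" with probability 0.99; that number is not taken from any dictionary.

```python
import math

def binary_entropy(p):
    """Expected information, in bits, of an answer that is 1 with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain answer tells us nothing new
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

p_u_after_q = 0.99  # assumed probability, for illustration only
print(binary_entropy(p_u_after_q))  # a small fraction of a bit
print(binary_entropy(0.5))          # a full bit
```

Under that assumption the "Is it U?" question yields only about 0.08 bits on average, while an evenly split question yields a full bit for the same effort from the user.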
Turns out that the following question fits the bill:
Blink if your intended string alphabetically precedes "MEDIAN".
where "MEDIAN", as its name suggests, is the string for which the probabilities of all strings preceding it alphabetically add up to 0.5.
How do we compute the probability of a string? Well, we begin with a prior probability -- that is, a language model, as above. After each piece of new evidence comes in -- the bits from the user -- we update our model of the user's intent, forming a new posterior probability distribution over all strings. We show the user the median of this distribution. You might recognize this as a form of Bayesian inference.
Now we can indeed say that this policy is the best we can possibly do -- given a language model, you can't devise a better series of questions to ask! For the data compression buffs, this idea is strongly connected to arithmetic coding.
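The median question and the posterior update can be sketched over a small, finite set of candidate strings. This is a simplification: the toy prior below is invented, every candidate is assumed to have positive probability, and the real system works over all strings using a language model rather than a fixed list.

```python
def median_string(posterior):
    """Candidate whose alphabetical predecessors carry total mass closest to 0.5."""
    best, best_gap, mass_before = None, None, 0.0
    for s in sorted(posterior):
        gap = abs(mass_before - 0.5)
        if best is None or gap < best_gap:
            best, best_gap = s, gap
        mass_before += posterior[s]
    return best

def update(posterior, median, blinked):
    """Noiseless Bayes update: keep the strings consistent with the answer."""
    kept = {s: p for s, p in posterior.items() if (s < median) == blinked}
    total = sum(kept.values())
    return {s: p / total for s, p in kept.items()}

def spell(prior, intended):
    """Ask median questions until a single candidate remains."""
    posterior, questions = dict(prior), 0
    while len(posterior) > 1:
        m = median_string(posterior)
        # The simulated user blinks if the intended string precedes the median.
        posterior = update(posterior, m, intended < m)
        questions += 1
    return next(iter(posterior)), questions

# Hypothetical prior over whole messages, for illustration only.
prior = {"HELLO": 0.4, "HELP": 0.3, "WATER": 0.2, "YES": 0.1}
print(spell(prior, "WATER"))
```

Likely messages are reached in fewer questions than unlikely ones, which is the behavior we want from a good code.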
Great! But what about patients who have lost all muscle movement, even eye blinks and cheek twitches? Well, there is a high-tech solution for them as well -- a direct neural interface.
Despite the suggestive name, mind reading devices these are not. In fact, they operate in the same basic way as eyeblink or cheek twitch detectors. Rather than physically moving, the patient imagines a movement and an electrode cap picks up the electrical signature of the resulting neural activity -- his or her brainwaves.
Rui wearing an EEG headset and using a brain-computer interface.
What distinguishes a neural interface from a standard rehabilitative interface is the presence of significant amounts of noise. The brain itself is noisy, the skull introduces further noise, the sensors are imperfect and the classifier used to determine which movement was imagined is not always correct.
Does this throw a wrench into our works? Well, turns out that if we use the policy described above, we can handle noise quite elegantly. Rather than modeling the evidence as a known binary variable, we model it as a random variable. We can then use Bayes' rule to incorporate even noisy estimates of the evidence into our system straightforwardly.
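Here is a minimal sketch of such a noisy update. It assumes the simplest possible noise model, in which each observed bit is flipped with a fixed, known probability `error_rate`. That model is an assumption made for illustration and is not necessarily the one used in the paper.

```python
def noisy_update(posterior, median, observed_bit, error_rate):
    """Bayes' rule when the observed bit is wrong with probability `error_rate`."""
    updated = {}
    for s, p in posterior.items():
        true_bit = s < median  # the answer the user would intend if s were the target
        likelihood = (1 - error_rate) if true_bit == observed_bit else error_rate
        updated[s] = p * likelihood
    total = sum(updated.values())
    return {s: p / total for s, p in updated.items()}

# Two equally likely candidates; the classifier reports "precedes the median".
posterior = {"HELLO": 0.5, "WATER": 0.5}
posterior = noisy_update(posterior, "WATER", observed_bit=True, error_rate=0.2)
print(posterior)
```

Instead of discarding the strings that disagree with the observed bit, the update only makes them less probable, so a single misclassified bit can be corrected by later evidence. Setting `error_rate` to 0 recovers the noiseless case.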
The user doesn't have to change his or her strategy significantly in the presence of noise because we use noiseless feedback to tell the user where the median is. Though the forward channel, from the patient to the computer, is noisy and has low bandwidth, the feedback channel, from the computer to the patient -- a computer monitor -- is effectively noiseless and has extremely high bandwidth.
In a paper in the International Journal of Human-Computer Interaction this month, my colleagues and I describe all of this in much greater (read: more mathematical) detail. We also develop a full brain-computer interface based on this formulation and analyze its performance with several subjects.
Here is a video of the interface in action:
In grey you see a prompt. This is for experimental purposes only -- during normal operation, it would not appear. Below the prompt appears the (truncated) median of the current posterior distribution. Every time the subject, in this case Rui Ma, one of the coauthors of the paper, conveys a bit by imagining a left or right hand movement, this distribution is updated, and a new median appears. We highlight letters in black when the system believes with very high certainty that the subject intended to convey those letters.
Rui is "typing" at a clip of about 5 characters per minute. This is state-of-the-art when using a non-invasive system conveying binary information like the one pictured. What's interesting is that we can now say for the first time that we can't do any better without improving the signal processing algorithms (see paper) or language model (PPM).
Thus far we have discussed spelling using the English alphabet. But in fact, any alphabet that is ordered and for which a reasonable prior probability model exists can be used.
In the paper, we also describe a path specification task. In this case, the alphabet is a set of short arcs. A sequence of these arcs forms a curved path in 2D space. One can imagine a patient using this capability to navigate on his or her own.
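To picture how a sequence of arcs becomes a path, here is a rough sketch. It assumes each symbol is a constant-curvature arc of fixed length, with the curvature values serving as the ordered alphabet. This representation and the function name are my own simplification for illustration; see the paper for the actual formulation.

```python
import math

def follow_arcs(curvatures, arc_length=1.0, start=(0.0, 0.0, 0.0)):
    """Trace constant-curvature arcs from (x, y, heading); return the (x, y) endpoints."""
    x, y, heading = start
    points = [(x, y)]
    for k in curvatures:
        if k == 0:
            # Straight segment along the current heading.
            x += arc_length * math.cos(heading)
            y += arc_length * math.sin(heading)
        else:
            # Circular arc of radius 1/|k|; positive k turns left.
            new_heading = heading + k * arc_length
            x += (math.sin(new_heading) - math.sin(heading)) / k
            y += (math.cos(heading) - math.cos(new_heading)) / k
            heading = new_heading
        points.append((x, y))
    return points

# Four quarter-circle left turns should bring us back to the start.
loop = follow_arcs([1.0, 1.0, 1.0, 1.0], arc_length=math.pi / 2)
print(loop[-1])
```

Because the curvature values are ordered numbers, sequences of them can be compared just like strings of letters, which is all the median question requires.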
Some of my coauthors have actually done one better, connecting this interface to an unmanned aircraft. Yes, you read that correctly -- we are helping locked-in patients fly planes. Here is a video using a flight simulator:
Abdullah, another coauthor, has also started to expand upon the artistic angle taken in Diving Bell, leveraging this alphabet to allow a subject to make freeform drawings.
And many more possibilities exist!
The beautiful theme music from The Diving Bell and the Butterfly.
Imagine waking up one day locked inside your head, unable to move or communicate, but otherwise fully aware. Even the thought can be terrifying. Unfortunately, thousands of people find themselves in this situation every year.
The stories of people like Jean-Dominique Bauby and Stephen Hawking remind us that these people are still capable of living full, loving lives, accomplishing great things and expressing themselves in fun and creative ways if given the right tools. I hope this essay has given you a taste of the technologies and theory behind these tools.
The things I've described are the result of several years of hard work by the Beckman Brain-Machine Interface group at the University of Illinois at Urbana-Champaign.
The group website has a nice video summarizing what I talked about here in even simpler terms. Abdullah's website also has a lot more details on the fun applications described above.
Omar, C., Akce, A., Johnson, M., Bretl, T., Ma, R., Maclin, E., McCormick, M. and Coleman, T. P. A Feedback Information-Theoretic Approach to the Design of Brain-Computer Interfaces, International Journal of Human-Computer Interaction, 27(1), 5-23 (2011). *
Akce, A., Johnson, M. and Bretl, T. Remote teleoperation of an unmanned aircraft with a brain-machine interface: Theory and preliminary results, IEEE International Conference on Robotics and Automation (ICRA), May 2010.
Akce, A. and Bretl, T. A probabilistic language model for hand drawings, International Conference on Pattern Recognition (ICPR), August 2010.
* Author Posting. (c) 'Taylor & Francis Group, LLC', 2010. This is the author's version of the work. It is posted here by permission of 'Taylor & Francis Group, LLC' for personal use, not for redistribution. The definitive version was published in International Journal of Human-Computer Interaction, Volume 27 Issue 1, January 2011. doi:10.1080/10447318.2011.535749 (http://dx.doi.org/10.1080/10447318.2011.535749)