JKanji: Wavelet-based Kanji Recognition

Researchers: Robert Stockton, Rahul Sukthankar

Abstract

JKanji is an interactive character completion system that provides stroke-order-independent recognition of complex handwritten glyphs such as Japanese kanji or Chinese hanzi. As the user enters each stroke, JKanji offers a menu of likely completions, generated by a robust multiscale matching algorithm augmented with a statistical language model. Drawbacks of traditional wavelet-based approaches are addressed by a redundant, phase-shifted basis that is insensitive to variations of the input character across quadrant boundaries. Unlike many existing systems, JKanji can incrementally incorporate new training examples, either to adapt to the idiosyncrasies of a particular user or to increase its vocabulary. On a kanji input task with a vocabulary of 6369 kanji and English characters, JKanji has demonstrated 93% to 96% recognition accuracy and up to 80% reduction in the number of input strokes. JKanji is computationally efficient, processing images at 5-10 Hz on an inexpensive portable computer, and is well suited for integration into personal digital assistants (PDAs) as an input method. JKanji's recognition system also processes low-quality digital camera images and has been integrated into a prototype tourist's guide that interprets unfamiliar kanji in the environment.
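The idea of a redundant, phase-shifted multiscale description combined with a language-model prior can be illustrated with a toy sketch. This is not JKanji's actual algorithm: block averages stand in for wavelet coefficients, the shift is a simple cyclic half-cell offset, and every name here (block_means, cyclic_shift, redundant_features, rank_candidates, lm_weight) is hypothetical, chosen only for illustration.

```python
import math

def block_means(img, cells):
    """Mean ink in each cell of a cells x cells grid over a square image."""
    n = len(img)
    size = n // cells
    out = []
    for r in range(cells):
        for c in range(cells):
            total = sum(img[y][x]
                        for y in range(r * size, (r + 1) * size)
                        for x in range(c * size, (c + 1) * size))
            out.append(total / (size * size))
    return out

def cyclic_shift(img, d):
    """Shift the image d pixels down and right, wrapping at the edges."""
    n = len(img)
    return [[img[(y - d) % n][(x - d) % n] for x in range(n)]
            for y in range(n)]

def redundant_features(img, levels=2):
    """Block means at several scales, each computed twice: once on the
    original grid and once on a grid offset by half a cell (assumed
    stand-in for a phase-shifted basis). Image side must be divisible
    by 2 ** (levels + 1) so the half-cell offset is a whole pixel count."""
    n = len(img)
    feats = []
    for level in range(1, levels + 1):
        cells = 2 ** level
        half = (n // cells) // 2
        feats += block_means(img, cells)
        feats += block_means(cyclic_shift(img, half), cells)
    return feats

def rank_candidates(img, templates, prior, lm_weight=1.0):
    """Order labels by match quality plus a weighted log prior.
    templates maps label -> feature list; prior maps label -> probability."""
    query = redundant_features(img)
    scored = []
    for label, feats in templates.items():
        dist = sum((a - b) ** 2 for a, b in zip(query, feats))
        scored.append((-dist + lm_weight * math.log(prior[label]), label))
    return [label for _, label in sorted(scored, reverse=True)]
```

In this toy setting, a stroke that drifts by one pixel across a cell boundary changes the unshifted features sharply but leaves the half-cell-shifted features nearly unchanged, which is the kind of sensitivity a redundant basis is meant to reduce. The prior term plays the role of the statistical language model in reordering the completion menu.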

Rahul Sukthankar (rahuls@cs.cmu.edu)