Shinjae Yoo
Faculty Advisor Lori Levine

Title: A Smart Optical Character Recognition for Inupiaq

   
     
Short Bio
 

Shinjae Yoo is a sixth-year Ph.D. student at Carnegie Mellon University's Language Technologies Institute, where he is working on his thesis on personalized email prioritization, advised by Yiming Yang. His research interests include text classification, clustering, and their real-world applications. Shinjae received his undergraduate degree from Soongsil University in Korea and his Master's degree from Seoul National University.

     
Project Synopsis
 

Inupiaq is one of the endangered Alaskan languages, and research on it, including computational linguistics research, is quite limited due to the lack of soft-copy materials. Having soft copies of language resources is the first step toward computational linguistic research and real-world applications such as grammar checkers, machine translation, and information retrieval. This research should also make Inupiaq easier to use, for example through better web search and error correction in text editors. However, Inupiaq is a polysynthetic language, in which words are composed of many morphemes, so the number of possible word forms is far too large for a dictionary to cover. Dictionary-based error correction, which usually works well for English, will therefore not work in this case. To overcome this problem, we propose a character n-gram based error correction method, which we believe will improve OCR (Optical Character Recognition) accuracy.
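To illustrate the general idea of character n-gram based correction, here is a minimal sketch in Python. It is not the project's actual method. It assumes a character trigram model with add-one smoothing, a hand-written table of OCR character confusions, and at most one substituted character per word. The names `CharNgramModel`, `candidates`, `correct`, and `confusions` are hypothetical, and the training words are placeholder strings, not Inupiaq data.

```python
import math
from collections import Counter


class CharNgramModel:
    """Character n-gram model with add-one smoothing (illustrative only)."""

    def __init__(self, n=3):
        self.n = n
        self.ngrams = Counter()    # counts of full n-grams
        self.contexts = Counter()  # counts of (n-1)-character prefixes
        self.alphabet = set()

    def _pad(self, word):
        # "^" marks the word start and "$" the word end.
        return "^" * (self.n - 1) + word + "$"

    def train(self, words):
        for word in words:
            padded = self._pad(word)
            self.alphabet.update(padded)
            for i in range(len(padded) - self.n + 1):
                gram = padded[i:i + self.n]
                self.ngrams[gram] += 1
                self.contexts[gram[:-1]] += 1

    def log_prob(self, word):
        padded = self._pad(word)
        vocab = len(self.alphabet) + 1  # one extra slot for unseen characters
        total = 0.0
        for i in range(len(padded) - self.n + 1):
            gram = padded[i:i + self.n]
            total += math.log(
                (self.ngrams[gram] + 1) / (self.contexts[gram[:-1]] + vocab)
            )
        return total


def candidates(word, confusions):
    """Return the word itself plus every single-character substitution."""
    result = {word}
    for i, ch in enumerate(word):
        for alt in confusions.get(ch, ()):
            result.add(word[:i] + alt + word[i + 1:])
    return result


def correct(word, model, confusions):
    """Pick the candidate the character model scores as most likely."""
    return max(sorted(candidates(word, confusions)), key=model.log_prob)


# Placeholder training data and confusion table (assumptions, not real data).
model = CharNgramModel(n=3)
model.train(["banana", "bandana", "cabana"])
confusions = {"m": ["n"], "n": ["m"]}
print(correct("bamana", model, confusions))  # banana
```

Because the model scores character sequences rather than looking up whole words, it can prefer a plausible spelling even for a word form it has never seen, which is the property that matters for a polysynthetic language. Restricting candidates to same-length substitutions keeps the scores comparable. Handling inserted or deleted characters would need length normalization.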