Nguyen Bach (Bạch Hưng Nguyên)
Graduate Research Assistant
Currently at CMU, I am exploring methods for preserving cohesion across languages. The questions I want to answer are: 1) How can dependency structures be incorporated into machine translation frameworks? 2) How can dependency structures be exploited in reordering models? 3) How can parallel dependency structures be used? My working languages are Arabic, Iraqi, Farsi, Pashto, Dari, Spanish, Chinese, Japanese, Vietnamese, and English. I also work on speech-to-speech translation and named-entity translation.

I have been a member of the CMU SMT Team in the DARPA TRANSTAC Evaluations (2007, 2008, and 2009), the DARPA GALE "Go-NoGo" Evaluations (2006, 2007, 2008, and 2009), the NIST MT Evaluations (2006, 2008, and 2009), and the IWSLT Evaluations (2006 and 2007). I also worked on NSA STEEM, which focused on summarization and translation of multimedia data to improve the reliability and usefulness of machine translation.

I was in the NLP Lab at Johns Hopkins University, where I obtained a Master of Computer Science degree in May 2005. I worked closely with Gideon Mann on the Personal Name Disambiguation project. I also did independent research on Vietnamese-English statistical machine translation with Prof. David Yarowsky, and presented that work in a poster session at the VEF Conference at the National Academy of Sciences in December 2004. If anyone needs a bilingual corpus, a dictionary, or advice, please contact me and I will try to help.


In the summer of 2004, I attended the JHU Summer School on Human Language Technology, supported by NSF and NAACL.

Before I came to Hopkins, I worked at the Institute of Information Technology of Vietnam in the Department of Pattern Recognition & Knowledge Discovery. I was involved in a speech synthesis and recognition project funded by the Vietnamese government. I mainly worked with Dr. Luong Chi Mai and was in charge of studying Fujisaki’s model. I also participated in building a Vietnamese speech corpus from radio broadcast resources (VOV corpus) and CSLU’s speech corpus. My colleagues and I devised a method that automatically downloads, splits, and labels speech wave files at the Vietnamese phoneme level. All of these activities helped our group establish strong connections with Prof. Fujisaki, Prof. Mixdorff, and Prof. Hosom.




