Bing Zhao
Language Technologies Institute, School of Computer Science
Carnegie Mellon University, Pittsburgh, PA 15213
(412) 320-0377 (Cell), (412) 268-4546 (Office)
bzhao [at] CS [dot] CMU [dot] EDU
http://www.cs.cmu.edu/~bzhao
EDUCATION
Ph.D. in Language Technologies, Computer Science, Carnegie Mellon University, 2007
"Statistical Alignment Models for Translational Equivalence"
M.S. in Language Technologies, May 2003.
Advisors: Alex Waibel, Eric P. Xing, and Stephan Vogel.
M.S. in Pattern Recognition and AI, Institute of Automation, Chinese Academy of Sciences, July 2001
"A Continuous Chinese Digit Speech Recognition System: Acoustic Modeling, Speaker Adaptation, and Decoding"
Advisors: Taiyi Huang and Bo Xu.
B.S. in Electronic Engineering, University of Science and Technology of China, July 1998
"A Wavelet Transformation based Compression Algorithm for Seismogram"
Advisor: Zhengkai Liu
RESEARCH INTERESTS
SKILLS
SUMMARY
PROFESSIONAL EXPERIENCE
Statistical Alignment Models for Translational Equivalence:
§ Robust statistical alignment models for machine translation;
§ Bilingual Topic AdMixture models for machine translation;
§ Key contributor to the CMU team for projects including GALE and TIDES.
Intern at IBM, supervised by Dr. Kishore Papineni and Dr. Niyu Ge, working on:
§ Inner-outer bracket models for word alignment;
§ Detailed experiments comparing different alignment approaches.
1998/8–2001/7
Research Assistant, Chinese Academy of Sciences, China
Graduate research assistant at National Laboratory of Pattern Recognition.
§ Mandarin continuous digit recognition system with Nokia;
§ Acoustic model adaptation: several MLLR-based algorithms; MRF-based MAP, SMAP;
§ Chinese trigram language modeling; Chinese pinyin-to-character conversion.
2001/4–2001/7
Visiting Student, Microsoft Research Asia, China
Visiting student at MSRA, supervised by Eric Chang, working on:
§ Discriminative training for large-scale continuous speech recognition;
§ Discriminative training for vowel recognition.
1993/9–1998/7
Research Assistant, Univ. of Science & Technology of China
Undergraduate research assistant at Information Processing Center of USTC.
TEACHING EXPERIENCE
§ 2005, CMU undergraduate course 15-381, Artificial Intelligence
Assisted with the design and grading of homework, exams, and projects. Held office hours.
§ 2005, CMU graduate course 11-751, Speech Recognition and Understanding
Design and grading of homework, exams, and projects. Held office hours.
PUBLICATIONS
Refereed Papers:
[1] Bing Zhao, Nguyen Bach, Ian Lane, and Stephan Vogel, "A Log-linear Block Transliteration Model based on Bi-Stream HMMs", to appear in HLT/NAACL-2007.
[2] Bing Zhao and Eric P. Xing, “BiTAM: Bilingual Topic AdMixture Models for Word Alignment”, in the proceedings of the Joint Conference of the International Committee on Computational Linguistics and the Association for Computational Linguistics (COLING/ACL 2006), July 2006.
[3] Muntsin Kolss, Bing Zhao, Stephan Vogel, Ashish Venugopal, and Ying Zhang, “The ISL Statistical Machine Translation System for the TC-STAR Spring 2006 Evaluation”, TC-Star Workshop on Speech-to-Speech Translation, TC-STAR-WS 2006, Barcelona, Spain, 2006.
[4] Matthias Eck, Ian Lane, Nguyen Bach, Sanjika Hewavitharana, Muntsin Kolss, Bing Zhao, Almut Silja Hildebrand, Stephan Vogel and Alex Waibel, “The UKA/CMU Statistical Machine Translation System for IWSLT 2006”, in the proceedings of IWSLT 2006.
[5] Sanjika Hewavitharana, Bing Zhao, Almut Silja Hildebrand, Matthias Eck, Chiori Hori, Stephan Vogel and Alex Waibel, “The CMU Statistical Machine Translation System for IWSLT 2005”, in the proceedings of International Workshop on Spoken Language Translation (IWSLT 2005), Sept. 2005.
[6] Bing Zhao and Alex Waibel, “Learning a Log-Linear Model with Bilingual Phrase-Pair Features for Statistical Machine Translation”, in the proceedings of Fourth SigHan workshop on Chinese Language Processing (SigHan 2005), October, 2005.
[7] Bing Zhao, Niyu Ge, and Kishore Papineni, “Inner-Outer Bracket Models for Word Alignment using Hidden Blocks”, in the proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005), Oct. 2005.
[8] Bing Zhao and Stephan Vogel, “A Generalized Alignment-Free Phrase Extraction”, in the proceedings of ACL 2005 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond (ACL WPT-05), June 2005.
[9] Bing Zhao, Eric P. Xing, and Alex Waibel, “Bilingual Word Spectral Clustering for Statistical Machine Translation”, in the proceedings of ACL 2005 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond (ACL WPT-05), June 2005.
[10] Bing Zhao, Stephan Vogel, and Alex Waibel, “Phrase Pair Rescoring with Term Weightings for Statistical Machine Translation”, in the proceedings of Conference on Empirical Methods in Natural Language Processing (EMNLP 2004), July 2004.
[11] Bing Zhao, Matthias Eck, and Stephan Vogel, “Language Model Adaptation for Statistical Machine Translation with Structured Query Models”, in the proceedings of the 20th International Conference on Computational Linguistics (Coling 2004), Aug. 2004.
[12] Bing Zhao, Klaus Zechner, Stephan Vogel, and Alex Waibel, “Efficient Optimization for Bilingual Sentence Alignment based on Linear Regression”, in the proceedings of HLT/NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond, May 2003.
[13] Bing Zhao and Stephan Vogel, “Word Alignment Based on Bilingual Bracketing”, in the proceedings of HLT/NAACL 2003 Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond (HLT/NAACL WPT-03), May 2003.
[14] Stephan Vogel, Ying Zhang, Alicia Tribble, Fei Huang, Ashish Venugopal, Bing Zhao, and Alex Waibel, “The CMU Statistical Translation System”, in the proceedings of MT Summit IX, New Orleans, LA, September 2003.
[15] Ying Zhang, Bing Zhao, Jie Yang, and Alex Waibel, “Automatic SIGN Translation”, in the proceedings of International Conference on Spoken Language Processing (ICSLP 2002), Aug. 2002.
[16] Bing Zhao and Stephan Vogel, “Full-text Story Alignment Models for Chinese-English Bilingual News Corpora”, in the proceedings of International Conference on Spoken Language Processing (ICSLP 2002), 2002.
[17] Bing Zhao and Stephan Vogel, “Adaptive Parallel Sentences Mining From Web Bilingual News Collection”, in the proceedings of the 2002 IEEE International Conference on Data Mining (ICDM 2002), December 2002.
[18] Yun Zhou, Chengqing Zong, and Bing Zhao, “The corpus oriented analysis of Chinese spoken dialog understanding”, in the proceedings of International Symposium of Chinese Spoken Language Processing (ISCSLP 2000), July 2000.
[19] Bing Zhao and Bo Xu, “MLLR Speaker Adaptation using Acoustic Correlation Information”, in the proceedings of the National Conference on Man-Machine Speech Communication (NCMMSC 2000), 2000.
[20] Sheng Gao, Bo Xu, Hong Zhang, Bing Zhao, Chengrong Li and Taiyi Huang, “Updated Progress of SINOHEAR: Advanced Mandarin LVCSR System at NLPR”, in the proceedings of International Conference on Spoken Language Processing (ICSLP 2000), 2000.
[21] Bing Zhao and Bo Xu, “Incorporating HMM State Sequence Confusion for Rapid MLLR Adaptation to New Speakers”, in the proceedings of International Conference on Spoken Language Processing (ICSLP 2000), 2000.
Technical Report:
[22] Bing Zhao, Nguyen Bach, Ian Lane, and Stephan Vogel, “A Log-linear Block Transliteration Model based on Bi-Stream HMMs”, CMU-LTI-06-007, Technical Report 2006 Fall (Conference version in HLT/NAACL-07).
Thesis:
[23] Bing Zhao, “Statistical Alignment Models for Translational Equivalence”, Expected in May 2007
SELECTED HONORS AND AWARDS
2001-present Computer science graduate fellowship, Carnegie Mellon University.
1999 Tung's Oriental Scholarship, Chinese Academy of Sciences.
1998 Excellent Bachelor Thesis Award, University of Science and Technology of China (USTC).
1995-1997 “Yu-Cai”, “P&G” and “Ding Xin” Scholarships at USTC.
1993-1994 Excellent Student Scholarships at USTC.
RESEARCH
Machine Translation
2002–
Statistical Alignment Models for Translational Equivalence, CMU
In my current work, I focus on bilingual topic AdMixture
(BiTAM) translation models leveraging bilingual document-level context.
The parallel sentence-pairs within a document-pair are assumed to
constitute a mixture of hidden topics; each observed word-pair follows a
topic-specific translation lexicon. With such topical information
inferred from document level context, the translation models are
expected to be sharper and the word alignment process less ambiguous.
Traditional IBM translation models are word-mixture models, which simply
ignore the parallel document boundaries in their generative processes.
2001–
Hands-on Experience in Statistical Machine Translation, CMU
I have about five years of experience covering many aspects of
statistical machine translation. I designed and implemented
cross-lingual bag-of-words models to align parallel documents from
comparable data, and a dynamic programming algorithm to extract parallel
sentence pairs from the aligned document pairs. I applied these
techniques to align the 10-year XinHua news comparable corpora and
generated the collection released under LDC2002E18. I implemented an SMT
beam search decoder. I also designed models for the transliteration of
unknown Arabic words. I participated directly in a number of
international machine translation evaluations, including GALE, NIST,
IWSLT and TC-STAR.
REFERENCES
Available upon request