Workshop Theme


Over the past decade there has been some progress on the computational processing of Semitic languages.  Several workshops in recent years – both regional and affiliated with international conferences – have addressed the spectrum of issues relating to the processing of Arabic and other Semitic languages.  The progress of recent years has opened the door to advanced computational applications such as machine translation. Research on machine translation of Semitic languages is, however, still in its early stages. Accurate translation of Arabic, Hebrew and other Semitic languages requires treatment of unique linguistic characteristics, some of which are common to all Semitic languages, others specific to each of these individual languages and their dialects.


The goal of this workshop was to bring together researchers and research specifically concerned with issues pertaining to machine translation to, from, and among Semitic languages. The workshop was well attended and succeeded in bringing together work on a range of topics and languages.


Workshop Papers and Presentations


In accordance with the policy of the Association for Machine Translation in the Americas (AMTA), we are making available on-line the papers presented at the workshop. Many of the presenters have also been kind enough to provide us with the presentation slides.



Invited Talk


Semitic Linguistic Phenomena and Variations, by Nizar Habash, University of Maryland


Panel Presentations


In the last few years, DARPA has sponsored an annual competition under the TIDES program for machine translation from specific languages into English. In 2002 and 2003 one of the languages was Arabic. We asked four participants in the 2003 competition to present their work in a panel format. Stephan Vogel (CMU) graciously agreed to give an introductory presentation on Statistical MT (SMT) and the DARPA-TIDES competition.


Introduction: SMT – TIDES – and all that, Stephan Vogel, Carnegie Mellon University


A. Tribble and S. Vogel, The CMU Arabic-to-English Statistical MT System, Carnegie Mellon University


A. Fraser, Issues in Arabic MT, University of Southern California/Information Sciences Institute


Y. Al-Onaizan, [No Slides Yet], IBM


C. Schafer, [No Slides Yet], Johns Hopkins University


Paper Presentations

J. Dichy & A. Farghaly, Roots & Patterns vs. Stems: on what grounds should a multilingual database centred on Arabic be built?, Université Lumière-Lyon 2 & SYSTRAN
[Paper]    [Slides]

A. Farghaly & J. Senellart, Intuitive Coding of the Arabic Lexicon, SYSTRAN
[Paper]    [Slides]

S. Fissaha & J. Haller, Application of Corpus-based Techniques to Amharic Texts, University of Saarland, Saarbrücken
[Paper]   [No Slides] 

B. Haddad & M. Yaseen, Semantic Composition of Arabic: A Unification Based Approach, Amman University
[Paper]   [No Slides] 

A. Itai & E. Segal, A Corpus Based Morphological Analyzer for Unvocalized Modern Hebrew, The Technion
[Paper]    [Slides]

E. Othman, K. Shaalan & A. Rafea, A Chart Parser for Analyzing Modern Standard Arabic Sentence, Cairo University
[Paper]    [Slides]

C. Schafer & D. Yarowsky, A Two-Level Syntax-Based Approach to Arabic-English Statistical Machine Translation, Johns Hopkins University
[Paper]    [No Slides Yet]

S. Wintner & S. Yona, Resources for Processing Israeli Hebrew, Haifa University
[Paper]    [Slides]




Violetta Cavalli-Sforza, Carnegie Mellon University
Alon Lavie, Carnegie Mellon University
Nizar Habash, University of Maryland



