The Arabic language poses a number of challenges for any speech translation system. The first problem is the wide range of dialects of the language. Just as Jamaican and Glaswegian speakers may find it difficult to understand each other's dialect of English, Arabic speakers of different dialects may find it impossible to communicate.
Modern Standard Arabic (MSA) is well defined and widely understood by educated speakers across the Arab world, but it is principally a written rather than a spoken language. Our interest was in a dialect in everyday spoken use, and we chose Egyptian Arabic: speakers of that dialect were readily accessible to us, and media influences have made it perhaps the most broadly understood of the regional dialects.
Another feature of Arabic is that the written form, except in specific rare cases, does not include vowels. For speech recognition and synthesis, this makes it hard to derive pronunciations from the written text. Solutions have been tested for recognition in which the vowels are not modeled explicitly but are modeled implicitly by context. This would not work well for synthesis, so we have defined an internal romanization, based on the CallHome romanization, from which full phonetic forms can easily be derived. This romanization is suitable for both recognition and synthesis systems, and can easily be transformed into the Arabic script for display.
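
The idea of a single romanized form that yields both a phonetic form and a displayable script form can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the symbol table `TOY_TABLE`, the phoneme labels, and the function names are hypothetical and are not the CallHome romanization or the internal romanization described above. The sketch also simplifies the orthography by treating long vowels as written letters and short vowels as unwritten, and it ignores word-initial vowels and other special cases.

```python
# Hypothetical toy table: romanized symbol -> (phoneme label, Arabic letter).
# Short vowels map to an empty string because this sketch assumes the
# script omits them. None of these symbols come from the original system.
TOY_TABLE = {
    "aa": ("a:", "ا"),
    "ii": ("i:", "ي"),
    "uu": ("u:", "و"),
    "a": ("a", ""),
    "i": ("i", ""),
    "u": ("u", ""),
    "b": ("b", "ب"),
    "t": ("t", "ت"),
    "k": ("k", "ك"),
    "m": ("m", "م"),
}


def tokenize(romanized):
    """Split a romanized word into table symbols, longest match first."""
    symbols = sorted(TOY_TABLE, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(romanized):
        for sym in symbols:
            if romanized.startswith(sym, i):
                tokens.append(sym)
                i += len(sym)
                break
        else:
            raise ValueError(f"no symbol matches at position {i} in {romanized!r}")
    return tokens


def to_phonemes(romanized):
    """Full phonetic form: every symbol, vowels included."""
    return [TOY_TABLE[sym][0] for sym in tokenize(romanized)]


def to_script(romanized):
    """Display form: Arabic letters, with short vowels dropped."""
    return "".join(TOY_TABLE[sym][1] for sym in tokenize(romanized))
```

Under these assumptions, `to_phonemes("kitaab")` keeps the short vowel that a synthesizer needs, while `to_script("kitaab")` drops it and returns the unvowelled display form.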