

The challenges of transcribing spoken Arabic have already been described: without an official voweling to fall back on, speakers must rely on their own intuitions about which vowels are being pronounced, and these intuitions vary from speaker to speaker.

To remove some of the influence of the written language, transcribers worked with a Roman alphabet. Transcription conventions were based on the LDC conventions for CALLHOME, with some extensions.

Maintaining inter-coder consistency in transcription was very difficult and required multiple iterations of hand-checking and of converting to the Arabic script for verification.

Alan W Black 2003-10-27