LVCSR-BASED LANGUAGE IDENTIFICATION

Tanja Schultz, Ivica Rogina, Alex Waibel

Interactive Systems Laboratories
University of Karlsruhe (Germany), Carnegie Mellon University (USA)

Published at: ICASSP 96

Automatic language identification is an important problem in building multilingual speech recognition and understanding systems. While building a language identification module for four languages, we studied how different levels of knowledge (phonetic, phonotactic, lexical, and syntactic-semantic) influence a large vocabulary continuous speech recognition (LVCSR) approach. The resulting language identification (LID) module can identify the language of spontaneous speech input and can be used as a front-end for our multilingual speech-to-speech translation system JANUS-II. A comparison of five LID systems showed that incorporating lexical and linguistic knowledge reduces the language identification error on the two-language tests by up to 50%. Based on these results, we built an LID module for German, English, Spanish, and Japanese, which yields an 84% identification rate on the Spontaneous Scheduling Task (SST).
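
The decision step of a recognizer-based LID front-end can be illustrated with a minimal sketch. It assumes one scorer per language and picks the language whose scorer rates the utterance highest. This decision rule is an assumption for illustration and is not stated in the abstract. The names `identify_language`, `make_toy_scorer`, and `TOY_LEXICONS` are hypothetical, and the toy scorer only counts lexicon hits on text, standing in for a real LVCSR score computed on audio.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical type: a scorer maps an utterance to a score (higher is better).
Scorer = Callable[[str], float]

# Toy word lists, invented for illustration only; a real system would use
# full acoustic, lexical, and language models for each language.
TOY_LEXICONS: Dict[str, Set[str]] = {
    "German": {"guten", "tag", "termin", "montag"},
    "English": {"good", "morning", "meeting", "monday"},
    "Spanish": {"buenos", "dias", "reunion", "lunes"},
}


def make_toy_scorer(lexicon: Set[str]) -> Scorer:
    """Return a scorer that counts how many tokens occur in the lexicon."""
    def score(utterance: str) -> float:
        tokens = utterance.lower().split()
        return float(sum(1 for token in tokens if token in lexicon))
    return score


def identify_language(
    utterance: str, scorers: Dict[str, Scorer]
) -> Tuple[str, Dict[str, float]]:
    """Score the utterance with every language's scorer and return the
    best-scoring language together with all scores."""
    if not scorers:
        raise ValueError("at least one language scorer is required")
    scores = {lang: scorer(utterance) for lang, scorer in scorers.items()}
    best = max(scores, key=scores.get)  # ties go to the first language listed
    return best, scores


scorers = {lang: make_toy_scorer(lex) for lang, lex in TOY_LEXICONS.items()}
language, scores = identify_language("good morning meeting on monday", scorers)
print(language, scores)
```

In this sketch, adding a knowledge source would amount to replacing the toy scorer with one that combines more models. The selection step itself stays the same.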