Error Region Analysis (ERA) program input file format.

Lin Chase
Carnegie Mellon University
20 July 1995

(Modified December 21 1995 to cover input phone segmentations
for both REFerence and HYPothesis information.)

-----
<integer> UTTERANCES
UTT
REF <integer> SEGMENTS
. . .
. . .
HYP <integer> SEGMENTS
. . .
. . .
UTT
REF . . .
HYP . . .
EOF
-------------

Notes:

1. The integer in front of the token "UTTERANCES" indicates how many
   "UTT", "REF" and "HYP" entries there will be in the file.

2. For each REF, the integer in front of the token "SEGMENTS" indicates
   the number of word segmentations that should be included before the
   next instance of the "HYP" token is encountered.

3. For each HYP, the integer in front of the token "SEGMENTS" indicates
   the number of word segmentations that should be included before the
   next instance of the "REF" token is encountered.

4. "language_score_source" strings can be used to indicate the
   algorithmic origin of a language model score, such as the branch of
   the Katz backoff algorithm used. Use the blank string "" to skip
   this field.

5. The start frames of the REF and HYP sequences must be the same.
   The end frames of the REF and HYP sequences must be the same.
   The start frame of each segment within a REF/HYP sequence must be
   exactly one greater than the end frame of the previous segment in
   the sequence.

6. The phone segmentation start frame for the initial phone in a word
   must be exactly the same as the start frame of its parent word.

7. The phone segmentation end frame for the final phone in a word
   must be exactly the same as the end frame of its parent word.

8. Phone segmentations within a word must not overlap and must
   completely partition the set of frames that make up the parent word.
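
The frame constraints in notes 5 through 8 can be sketched as a small
checker. This is only an illustration in Python, not part of the ERA
program. The names Phone, Word, check_contiguous and check_utterance are
hypothetical. The sketch assumes that frames are inclusive integer
indices, that every segment has start <= end, and that a word may be
given without a phone segmentation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Phone:
    start: int  # first frame of the phone (inclusive)
    end: int    # last frame of the phone (inclusive)


@dataclass
class Word:
    start: int  # first frame of the word (inclusive)
    end: int    # last frame of the word (inclusive)
    phones: List[Phone] = field(default_factory=list)


def check_contiguous(segments) -> List[str]:
    """Each segment must start exactly one frame after the previous one ends."""
    errors = []
    for prev, cur in zip(segments, segments[1:]):
        if cur.start != prev.end + 1:
            errors.append(
                f"segment starting at {cur.start} does not follow end frame {prev.end}"
            )
    return errors


def check_utterance(ref: List[Word], hyp: List[Word]) -> List[str]:
    """Return a list of violations of notes 5-8 (an empty list means valid)."""
    if not ref or not hyp:
        # Assumption of this sketch: an empty sequence is reported, not accepted.
        return ["REF and HYP must each contain at least one segment"]

    errors = []
    # Note 5: REF and HYP cover the same span of frames.
    if ref[0].start != hyp[0].start:
        errors.append("REF and HYP start frames differ")
    if ref[-1].end != hyp[-1].end:
        errors.append("REF and HYP end frames differ")

    for name, words in (("REF", ref), ("HYP", hyp)):
        # Note 5: word segments follow one another without gaps or overlap.
        errors += [f"{name}: {e}" for e in check_contiguous(words)]
        for w in words:
            if not w.phones:
                continue  # assumption: phone segmentation may be absent
            # Note 6: the first phone starts where its word starts.
            if w.phones[0].start != w.start:
                errors.append(
                    f"{name}: first phone starts at {w.phones[0].start}, "
                    f"but its word starts at {w.start}"
                )
            # Note 7: the last phone ends where its word ends.
            if w.phones[-1].end != w.end:
                errors.append(
                    f"{name}: last phone ends at {w.phones[-1].end}, "
                    f"but its word ends at {w.end}"
                )
            # Note 8: contiguous phones with matching endpoints partition the word.
            errors += [f"{name}: phone {e}" for e in check_contiguous(w.phones)]
    return errors
```

For example, a REF of Word(0, 9) followed by Word(10, 19) and a HYP of
a single Word(0, 19) satisfy note 5, while a HYP of Word(0, 9) followed
by Word(11, 19) is reported because frame 10 is not covered.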