
Homework 2
Out: Dec-20 Due: Jan-4 Wednesday night (12:00)

Important note:
* Please use Janus/Tcl commands only! No Perl, shell scripts, etc.
* If you have a question about the homework,
feel free to ask Stan at scjou@cs.cmu.edu

* Making a demi-syllable Mandarin dictionary

In the Mandarin language, every character's pronunciation is a syllable,
which is often decomposed as an initial-final (I-F) pair.
You can imagine the I-F structure as the consonant-vowel structure,
but they are not the same thing.
We call the I and F units demi-syllables.
For example, the syllable 'zhong' can be decomposed into
two demi-syllables, 'zh' and 'ong',
while 'biao' can be decomposed into 'b' and 'iao'.

Since Mandarin is a tonal language,
we often attach a tone marker to each syllable.
For example, 'zhong3' means the syllable 'zhong' with the 3rd tone.
It is commonly assumed that the tonal information can be ignored
in the initial part,
so we decompose 'zhong3' into 'zh' and 'ong3', instead of 'zh3' and 'ong3'.
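
As a minimal sketch of the tone convention above, the tone digit can be
peeled off the end of a syllable with a regular expression. The proc name
splitTone is hypothetical, and the sketch assumes the tone is always a
single trailing digit from 1 to 5, as in the examples on this page.

```tcl
# Split a toned syllable such as "zhong3" into its base and tone digit.
# Returns a two-element list {base tone}; tone is "" when no trailing
# digit 1-5 is present. (splitTone is an illustrative name, not a Janus command.)
proc splitTone {syl} {
    if {[regexp {^(.*)([1-5])$} $syl -> base tone]} {
        return [list $base $tone]
    }
    return [list $syl ""]
}

puts [splitTone zhong3]   ;# prints: zhong 3
```

The tone returned here belongs to the final only; the initial carries no
tone information.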

- Task 1.a: write a Janus Tcl script to
convert the syllable-based raw dictionary
into a demi-syllable-based Janus-format dictionary

Input: /project/Class-11-753/data/CH/dict/train-orig.syl.dict
For example, an entry of the input raw dictionary reads
ge4zhong3 ge4 zhong3
The first column is a Mandarin word in romanized form (Pinyin),
and the rest are the syllables of the word.
Your Janus Tcl script should convert the entry to the Janus dictionary format
using a Dictionary object (say, 'dict') with the 'add' command:
dict add ge4zhong3 { {g WB} {e T4} zh {ong WB T3} }

Note that here we have six tags:
WB for word boundary, which appears at the beginning and the end of the word,
and T1 to T5 for the five tones, respectively.
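
The tagging rule can be sketched in plain Tcl as follows. The proc name
tagPhones and its input format (one {initial final tone} triple per
syllable) are assumptions made for illustration. The tag order within a
phone (WB before the tone tag) simply mirrors the 'ge4zhong3' example above.

```tcl
# Build the tagged phone list for one word.
# units: list of {initial final tone} triples, one per syllable;
#        the initial may be "" for a syllable that has none.
# (tagPhones is an illustrative name, not a Janus command.)
proc tagPhones {units} {
    # Step 1: flatten into {phone tags} pairs; only finals get a tone tag.
    set phones {}
    foreach unit $units {
        foreach {ini fin tone} $unit break
        if {[string length $ini] > 0} {
            lappend phones [list $ini {}]
        }
        set tags {}
        if {[string length $tone] > 0} {
            lappend tags T$tone
        }
        lappend phones [list $fin $tags]
    }
    # Step 2: mark the first and the last phone of the word with WB.
    set last [expr {[llength $phones] - 1}]
    set out {}
    for {set i 0} {$i <= $last} {incr i} {
        foreach {name tags} [lindex $phones $i] break
        if {$i == 0 || $i == $last} {
            set tags [linsert $tags 0 WB]
        }
        if {[llength $tags] == 0} {
            lappend out $name
        } else {
            lappend out [concat [list $name] $tags]
        }
    }
    return $out
}

puts [tagPhones {{g e 4} {zh ong 3}}]   ;# prints: {g WB} {e T4} zh {ong WB T3}
```

The returned list is what would go between the outer braces of the
'dict add' command.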

To separate a Mandarin syllable into initial and final,
just cut the syllable to the left of the first occurrence of
any character in the set: { a, e, i, o, u, v, - }
For example:
zhong -> zh ong
biao -> b iao
sh-i -> sh -i
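
The cutting rule above can be sketched with a single regexp. The proc
name splitSyllable is hypothetical. Returning an empty initial for a
syllable that starts with one of the listed characters, and an empty final
when none of them occurs, is an assumption about cases this page does not
spell out.

```tcl
# Cut a syllable to the left of the first character from { a e i o u v - }.
# Returns {initial final}; the initial is "" if the syllable starts with
# one of those characters. (splitSyllable is an illustrative name.)
proc splitSyllable {syl} {
    if {[regexp {^([^aeiouv-]*)([aeiouv-].*)$} $syl -> ini fin]} {
        return [list $ini $fin]
    }
    # No character of the set found: treat the whole syllable as the initial.
    return [list $syl ""]
}

foreach syl {zhong biao sh-i} {
    foreach {ini fin} [splitSyllable $syl] break
    puts "$syl -> $ini $fin"
}
# prints:
# zhong -> zh ong
# biao -> b iao
# sh-i -> sh -i
```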

Note that you should also add an entry for
the 'silence word', as described on the Session 2 web page.

- Task 1.b: generate a phonesSet description file
based on the demi-syllables you found from Task 1.a

The phonesSet description file should contain at least
four classes: PHONES SILENCE INITIAL FINAL
For example:
PHONES @ SIL zh ong b iao sh -i
SILENCE SIL
INITIAL zh b sh
FINAL ong iao -i
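
One way to gather the four classes is sketched below in plain Tcl. The
proc names addUnique and phonesSetLines are hypothetical. The sketch only
prepares the class lists from {initial final} pairs; handing them to the
Janus PhonesSet object is left out because its exact commands are described
on the Session 2 web page, not here.

```tcl
# Append item to the list in variable listVar unless it is already there.
proc addUnique {listVar item} {
    upvar 1 $listVar lst
    if {[lsearch -exact $lst $item] < 0} {
        lappend lst $item
    }
}

# pairs: list of {initial final} pairs collected while building the dictionary.
# Returns the four class lines, with phones in first-seen order.
# (Both proc names are illustrative, not Janus commands.)
proc phonesSetLines {pairs} {
    set phones {@ SIL}
    set initials {}
    set finals {}
    foreach pair $pairs {
        foreach {ini fin} $pair break
        if {[string length $ini] > 0} {
            addUnique phones $ini
            addUnique initials $ini
        }
        if {[string length $fin] > 0} {
            addUnique phones $fin
            addUnique finals $fin
        }
    }
    return [list "PHONES $phones" "SILENCE SIL" \
                 "INITIAL $initials" "FINAL $finals"]
}

puts [join [phonesSetLines {{zh ong} {b iao} {sh -i} {zh ong}}] \n]
# prints:
# PHONES @ SIL zh ong b iao sh -i
# SILENCE SIL
# INITIAL zh b sh
# FINAL ong iao -i
```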

Use a Janus PhonesSet object to do this task.

Hint:
Tcl has some powerful commands:
string - Manipulate strings
regexp - Match a regular expression against a string
See http://tcl.activestate.com/man/tcl8.3/TclCmd/contents.htm
for the Tcl command reference, and try googling for command usage examples.

Submission of Task 1.a and 1.b:
Send to Stan (scjou@cs.cmu.edu) the following NFS paths (not the files!):
1. the Tcl script for 1.a and 1.b:
Stan should be able to reproduce your result by running
UNIX> Janus YourScript.tcl
2. the generated demi-syllable dictionary and phonesSet description file.

* Making a Janus database

Here we want to process the data at
/project/Class-11-753/data/CH
so we first need to generate a Janus database
for data interpretation and organization.
Under the aforementioned directory,
the sub-directory adc/ is where the waveform (adc) files are located,
rmn/ is where the romanized transcripts are located.

- Task 2: Use the method described in the Session 2 web page
to generate a Janus dbase containing the following information:
+ A dbase key which is a unique ID. Usually we use the utterance ID.
+ Speaker ID: SPKID
+ Utterance ID: UTTID
+ Waveform path: ADCPATH
+ Waveform filename: ADCFILE
+ Utterance start time: FROM . The FROM value in this task is always '0'.
+ Utterance end time: TO . The TO value in this task is always 'last'.
+ Transcript: TEXT

For example, with the file
/project/Class-11-753/data/CH/rmn/CH094.rmn
we know the SPKID from either the filename or the first line of the file:
";SprecherID 094".
The following lines alternate between the utterance ID number and
the romanized transcript, line by line.

Your script should therefore do something like:
db add spk094_utt1 { {SPKID spk094} {UTTID spk094_utt1} {ADCPATH /project/Class-11-753/data/CH/adc/094/} {ADCFILE CH094_1.adc.shn} {FROM 0} {TO last} {TEXT wai4jiao1bu4 fa1yan2ren2 da1 ji4zhe3 wen4 zhong1mei3 zhi1shi5chan3quan2 cuo1shang1 da2cheng2 yi1zhi4 you3li4 yu2 shuang1bian1 guan1xi5 gai3shan4 he2 fa1zhan3} }

Run this "db add" command for every speaker-utterance pair.
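
The per-file processing could be sketched as below, in plain Tcl without
any Janus object. The proc name rmnToEntries is hypothetical. The sketch
assumes that each utterance ID line starts with ';' and contains the
utterance number (the "; 1:" line in the demo is made up), and that the
ADCPATH and ADCFILE patterns follow the single example above. Check both
against the real .rmn and adc files.

```tcl
# Turn the lines of one .rmn file into a list of {key info} pairs.
# lines:   the lines of the file, first line like ";SprecherID 094"
# adcRoot: directory that holds the per-speaker adc sub-directories
# (rmnToEntries is an illustrative name, not a Janus command.)
proc rmnToEntries {lines adcRoot} {
    set spk ""
    set uttNum ""
    set entries {}
    foreach line $lines {
        set line [string trim $line]
        if {[string length $line] == 0} { continue }
        if {[regexp {^;SprecherID\s+(\d+)} $line -> id]} {
            set spk $id
        } elseif {[string match {;*} $line]} {
            # Assumed format: an ID line starts with ';' and holds the number.
            regexp {(\d+)} $line -> uttNum
        } elseif {[string length $uttNum] > 0} {
            # Transcript line belonging to the ID line seen just before it.
            set key spk${spk}_utt$uttNum
            set info [list [list SPKID spk$spk] [list UTTID $key] \
                [list ADCPATH $adcRoot/$spk/] \
                [list ADCFILE CH${spk}_$uttNum.adc.shn] \
                {FROM 0} {TO last} [concat TEXT $line]]
            lappend entries [list $key $info]
            set uttNum ""
        }
    }
    return $entries
}

# Demo with in-memory lines instead of a real file.
set demo [list ";SprecherID 094" "; 1:" "wai4jiao1bu4 fa1yan2ren2"]
foreach entry [rmnToEntries $demo /project/Class-11-753/data/CH/adc] {
    foreach {key info} $entry break
    puts "db add $key {$info}"
}
```

In the real script, the lines would come from each file returned by glob
on the rmn/ directory, and each {key info} pair would be passed to the
dbase object instead of being printed.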

Hint:
Tcl has some powerful commands:
glob - Return names of files that match patterns
file - Manipulate file names and attributes
string - Manipulate strings
regexp - Match a regular expression against a string
See http://tcl.activestate.com/man/tcl8.3/TclCmd/contents.htm
for the Tcl command reference, and try googling for command usage examples.

Submission of Task 2:
Send to Stan (scjou@cs.cmu.edu) the following NFS paths (not the files!):
1. the Tcl script:
Stan should be able to reproduce your result by running
UNIX> Janus YourScript.tcl
2. the dbase files

Last modified: Tue Dec 20 18:23:09 EST 2005
Maintainer: scjou@cs.cmu.edu.
