
Multi-level data structures

In recognition of these shortcomings, many current systems have abandoned string-based processing and now use multi-level data structures (MLDS). The most famous of these systems, known as Delta, was developed by Hertz [7], but many other systems, such as Chatr [4], the Bell Labs system [8], Polyglot [5], and early versions of Festival [2], use similar formalisms. In the multi-level formalism, different types of linguistic information are held in separate streams, which are linear lists or arrays of linguistic items. For example, we may have a word stream, a phone stream and a syllable stream. Some systems have a fixed set of streams, while others allow an arbitrary number.

Algorithms often need to know which phones are related to a given word, and hence the streams must be co-indexed. There are two main types of co-indexing. In Delta, streams are aligned by the edges of items. To find the phones in a word, one goes to the beginning of the word, traces the edge ``down'' to the phone stream, and then progresses along the phone stream until the edge relating to the end of the word is found. An alternative strategy is to align by the ``centres'' of items. In this case, a word contains a set of links to the phones that are related to it, and the phones in a word can be found by following these links.
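The two co-indexing strategies can be illustrated with a minimal Python sketch. This is not the representation used by Delta or any of the cited systems: the class names, the integer edge numbering and the phone labels (including ``ou'' and ``pau'') are assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeItem:
    """Item in an edge-aligned stream (hypothetical layout)."""
    name: str
    start: int  # index of the item's left edge, shared across streams
    end: int    # index of the item's right edge


def phones_by_edges(word, phone_stream):
    """Trace the word's left edge down to the phone stream, then walk
    along the phones until the word's right edge is reached."""
    phones = []
    inside = False
    for phone in phone_stream:
        if phone.start == word.start:
            inside = True
        if inside:
            phones.append(phone.name)
            if phone.end == word.end:
                break
    return phones


@dataclass
class LinkedItem:
    """Item in a centre-linked stream: it holds explicit links to
    related items in other streams (hypothetical layout)."""
    name: str
    links: list = field(default_factory=list)


def phones_by_links(word):
    """Follow the word's explicit links to its phones."""
    return [phone.name for phone in word.links]


# Edge alignment: "hello" spans edges 0..4, followed by a pause.
phone_stream = [
    EdgeItem("h", 0, 1), EdgeItem("e", 1, 2),
    EdgeItem("l", 2, 3), EdgeItem("ou", 3, 4),
    EdgeItem("pau", 4, 5),
]
hello_edges = EdgeItem("hello", 0, 4)

# Centre linking: the word item points directly at its phones.
hello_links = LinkedItem(
    "hello", links=[LinkedItem(n) for n in ("h", "e", "l", "ou")]
)

edge_result = phones_by_edges(hello_edges, phone_stream)
link_result = phones_by_links(hello_links)
```

Both lookups return the same four phones for ``hello''. The difference lies in where the relationship is stored: in shared edge positions for the first strategy, and in per-item links for the second.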

While multi-level data structures are far preferable to string structures, they still have serious drawbacks. The main drawback is that the formalism forces all information to be represented by linear structures: other types of structure, specifically trees, are very hard to represent. Partly because of this, the number of streams in a system can become considerable, which leads to difficulties in co-indexing the items in the streams. With Delta-style co-indexing, it is often the case that a given item in one stream has no corresponding item in other streams. For example, a pause is represented by an item in the phone stream, but there is no equivalent item in the syllable or word streams. Thus a ``hole'' must be created in these streams to ensure the co-indexing works. With a large number of streams, the number of holes can become considerable, which makes processing awkward. In the centre-linking paradigm, the hole problem is absent, but because each item must be explicitly linked to items in other streams, the number of links can become very large. Furthermore, if a new stream is added, one would have to link every existing item to the items in the new stream to ensure full connectivity. As this is not feasible in practice, the streams are often left partially connected. This can cause confusion when writing modules, as one may be unsure which streams are linked to which.
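The hole problem under edge alignment can be made concrete with a small Python sketch. The stream layout, the `HOLE` marker, the placeholder syllable labels and the function names are all assumptions for illustration, not the data model of any cited system.

```python
HOLE = "<hole>"  # hypothetical marker for a placeholder item


def append_pause(streams, phone_stream="phone"):
    """Append a pause to the phone stream and a hole to every other
    stream, so that all streams keep sharing the same edges."""
    last_edge = max(end for items in streams.values() for _, _, end in items)
    for name, items in streams.items():
        label = "pau" if name == phone_stream else HOLE
        items.append((label, last_edge, last_edge + 1))


def count_holes(streams):
    """Count the placeholder items across all streams."""
    return sum(
        label == HOLE for items in streams.values() for label, _, _ in items
    )


# Each item is (label, left_edge, right_edge); labels are illustrative.
streams = {
    "word": [("hello", 0, 4)],
    "syllable": [("syl1", 0, 3), ("syl2", 3, 4)],
    "phone": [("h", 0, 1), ("e", 1, 2), ("l", 2, 3), ("ou", 3, 4)],
}

append_pause(streams)
holes_after_one_pause = count_holes(streams)
```

In this sketch a single pause costs one hole in each of the other streams, so with n streams every pause adds n - 1 placeholder items that later processing has to skip.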

Another, more subtle problem occurs in multi-level structures due to a lack of clarity as to what an item really represents. Often (though not always) each item has a single value, typically a name. While it is obvious that a suitable phone name could be something like /h/ or /e/, and a suitable word name ``hello'', what should the name of a syllable be? Commonly, syllables are regarded as organisational units which serve to group together phones. As such, they don't have a distinct name. One could group together the names of the phones and make that the syllable name (e.g. ``/h e l/'' for the first syllable of ``hello''), but this is redundant, somewhat artificial and likely to cause errors if the phone representation is changed.


Alan W Black
1999-03-20