Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!warwick!pipex!uknet!festival!leeds.ac.uk!news
From: csxdtm@scs.leeds.ac.uk (D T Modd)
Subject:  please help - Lattice Corpus training sets requested
Message-ID: <1993Dec10.145529.1238@gps.leeds.ac.uk>
Originator: csxdtm@csgi45
Keywords: word lattices, speech, handwriting and OCR recognition
Sender: nntp@gps.leeds.ac.uk
Organization: The University of Leeds, School of Computer Studies
Date: Fri, 10 Dec 1993 14:55:29 GMT
Lines: 30


My Computer Science final-year project involves collecting a wide 
range of word-hypothesis recognition lattices, as output from large-vocabulary
speech and handwriting recognition systems. These word-candidate lattices look
something like this:

        stephen stiffen stiffens
        left    lift
        school  scowl   scull
        lest    last
        yearn   your    year
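
As a minimal sketch of how a lattice in this layout could be read into a
program: the code below assumes one line per word position, with the
candidate words separated by whitespace. That layout is only an assumption
drawn from the example above, not an agreed format, and the names
parse_lattice and count_paths are purely illustrative.

```python
def parse_lattice(text):
    """Return one list of candidate words per non-blank line.

    Assumes (hypothetically) that each line is one word position and
    that candidates on a line are separated by whitespace.
    """
    return [line.split() for line in text.splitlines() if line.strip()]


def count_paths(lattice):
    """Count the word sequences obtained by picking one candidate per position."""
    total = 1
    for candidates in lattice:
        total *= len(candidates)
    return total


EXAMPLE = """
        stephen stiffen stiffens
        left    lift
        school  scowl   scull
        lest    last
        yearn   your    year
"""

lattice = parse_lattice(EXAMPLE)
for position, candidates in enumerate(lattice):
    print(position, candidates)
print("possible word sequences:", count_paths(lattice))
```

For the example above this gives five word positions and
3 x 2 x 3 x 2 x 3 = 108 possible word sequences. Real recogniser output may
also carry scores or timing information, which this sketch ignores.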

The collected lattices will constitute a standard Lattice Corpus which, 
hopefully, could be used as an evaluation resource for research in linguistic
constraint models for English speech and handwriting recognition systems.

Initially, I need to compare the range of word-lattice formats used by 
language modelling researchers to arrive at a standard representation
format.

If your research is in this area, I would be very grateful if you could send 
me one or more example lattices (preferably as an ASCII text file). Any 
information about the format of the lattices (e.g. documentation, references,
etc.) would also be welcome.

Thanks for your help,

Dan Modd
Centre for Computer Analysis of Speech and Language,
School of Computer Studies, University of Leeds.
                                                 csxdtm@scs.leeds.ac.uk

