Newsgroups: comp.ai.nat-lang
Subject: Re: Character based language models?
Reply-To: lieske@di.epfl.ch
From: lieske@lith.di.epfl.ch (Christian Lieske)
Message-ID: <33c5da22.0@dinews.epfl.ch>
Date: 11 Jul 97 07:00:50 GMT
NNTP-Posting-Host: disunms-sidi.epfl.ch
Path: cantaloupe.srv.cs.cmu.edu!nntp.club.cc.cmu.edu!eecs-usenet-02.mit.edu!news.kei.com!nntprelay.mathworks.com!europa.clark.net!dispatch.news.demon.net!demon!rill.news.pipex.net!pipex!join.news.pipex.net!pipex!oleane!in2p3.fr!news-ge.switch.ch!news-zh.switch.ch!epflnews.epfl.ch!lieske
Lines: 40


> In article <5pajjk$iuo@netserv.waikato.ac.nz>, wjt@cs.waikato.ac.nz (Bill Teahan) writes:
> |> Does anybody know of any work with character based
> |> language models i.e. using bigraphs, trigraphs etc?
> |> There is plenty of work done on n-gram and n-pos
> |> language models, but finding references on n-graph
> |> (i.e. character gram) approaches seems to be more
> |> difficult for some reason...
> |>
> |> n-graph models are ideal for problems which
> |> involve character processing e.g. cryptography,
> |> spell-checking, OCR, handwriting recognition,
> |> language identification etc.
> |>
> |> So does anybody know of some references on the
> |> subject?
> |>
> |> Bill Teahan
> |>
> 
> Bill

It seems that Peter Ingels' PhD thesis on lexical error recovery with
Hidden Markov Models falls into this area. The thesis is available from
the Computation and Language E-Print Archive as entry 9702003.

Hope this helps,
Christian

-----------------------------------------------------------------------------
 Christian Lieske                   |  E-mail: Christian.Lieske@di.epfl.ch
 DI-LITH (Project ROTA)             |  Phone : ++41 21 693 25 89
 EPFL                               |  Fax   : ++41 21 693 52 78 
 CH-1015 Lausanne (Switzerland)     |  URL   : http://lithwww.epfl.ch/~lieske
-----------------------------------------------------------------------------
