Newsgroups: comp.ai.nat-lang
From: colin@cogsci.ed.ac.uk (Colin Matheson)
Subject: Research on Tokenisation
Message-ID: <DwFyyo.9z6.1.dun@cogsci.ed.ac.uk>
Reply-To: colin@cogsci.ed.ac.uk
Organization: Centre for Cognitive Science, Edinburgh, UK
Date: Tue, 20 Aug 1996 14:59:11 GMT

I'd be very grateful for any pointers to recent research on
tokenisation, by which I mean the initial stages of text processing
in which `words' (or other basic units) are identified and labelled.

Part of the Language Technology Group at the University of Edinburgh
is about to start work on a two-year project that aims to produce a
general tokenisation tool, to be made available to the research
community; hence our interest.

Please reply by email unless your response is of general interest.

Colin
-- 
Colin Matheson                         | Human Communication Research Centre
Phone: +44 131 650 4656                | University of Edinburgh
Fax:   +44 131 650 4587                | 2 Buccleuch Place
Email: Colin.Matheson@ed.ac.uk         | Edinburgh EH8 9LW
