Newsgroups: comp.speech
Path: cantaloupe.srv.cs.cmu.edu!rochester!udel!gatech!swrinde!pipex!uknet!bcc.ac.uk!phonetics.ucl.ac.uk!mark
From: mark@phonetics.ucl.ac.uk (Mark Huckvale)
Subject: Re: ASCII representations for IPA
Sender: news@ucl.ac.uk (Usenet News System)
Message-ID: <mark.72.00127407@phonetics.ucl.ac.uk>
Date: Fri, 5 May 1995 17:27:04 GMT
References:  <1995Apr25.143032.18275@lmpsbbs.comm.mot.com>
Organization: University College London
X-Newsreader: Trumpet for Windows [Version 1.0 Rev A]
Lines: 231

In article <1995Apr25.143032.18275@lmpsbbs.comm.mot.com> corrigan@corp.mot.com (JerryCorrigan) writes:

>I am looking for a means of representing the International Phonetic Alphabet
>in ASCII.  Does anyone know of any work that has been done on this?  Is there
>a representation that has wide acceptance?

I offer you this posting:

Subject: Computer-coding the IPA

Coding the IPA

John Wells, Department of Phonetics and Linguistics, University
College London <wells@phon.ucl.ac.uk>

What follows is a proposed keyboard-compatible coding for the
entire set of IPA symbols. It covers everything on the 1993 IPA
Chart, including diacritics and tone marks, and is put forward
as a proposed standard way to transmit IPA-transcribed material
by e-mail and for similar purposes.

It is an extension of the SAMPA standard, with which colleagues
may be familiar. The most frequently used symbols are mapped
onto single keystrokes in the ASCII range 33..126. Less
frequently used symbols are mapped onto a single keystroke plus
\. Diacritics (other than those already catered for in SAMPA)
are mapped onto a keystroke with a preceding _. Thus for example
the voiced velar fricative (gamma) becomes G, the voiced uvular
plosive G\, and the velarization diacritic _G (velarized d =
d_G). Note that upper-case must be distinguished from
lower-case, but that there is no need to separate successive
symbols by spaces.
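
As a rough illustration of the three rules just described (plain
keystroke, keystroke plus \, and _ plus keystroke), here is a minimal
Python sketch that splits an unspaced string into its codes. The name
tokenize and the regular expression are my own assumptions and not part
of the proposal. The sketch treats ` as a token of its own and makes no
attempt to tell the tie bar (a bare _) apart from the _ that introduces
a diacritic.

```python
import re

# A token is either
#   a diacritic: "_" + one printable ASCII character (33..126) + optional "\"
#   a symbol:    one printable ASCII character (33..126) + optional "\"
# Spaces fall outside 33..126 and are simply skipped.
TOKEN = re.compile(r"_[!-~]\\?|[!-~]\\?")

def tokenize(text):
    """Split a coded string into symbol and diacritic tokens."""
    return TOKEN.findall(text)

print(tokenize("d_G"))     # ['d', '_G']   velarized d
print(tokenize("GG\\"))    # G followed by G\ (backslash escaped for Python)
print(tokenize("k@Ud"))    # ['k', '@', 'U', 'd']
```

Upper and lower case stay distinct because the pattern never folds
case, so G and g come out as different tokens.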

These proposals are fully set out with a reasoned explanation in
my 7000-word draft article "Computer-coding the IPA: a proposed
extension of SAMPA". This is available as a PostScript file and
can be downloaded by anonymous ftp from <pitch.phon.ucl.ac.uk>
(Internet Address: 128.40.52.11) in directory </pub/sam>, file
name <ipasam-x.ps>. Log in with username ftp, password ftp. The
file should be fetched in ASCII mode and sent to a PostScript
printer.

Reactions from colleagues will be very welcome. Feel free to
pass this file on to anyone interested. ju k@n si: D@t Di:z
pr@p@Uzlz IneIbl @s t@ k@Ud ImpreS@nIstIkli @z wel @z
sIst@m{tIkli, {l@fQnIkli @z wel @z f@ni:mIkli, p{T@lQdZIkl
m@tI@ri@l @z wel @z nO:ml, @nd f@r eni l{NgwIdZ wi wIS.

This summary is in the form of two columns. In the first is a
phonetic label (since this is a simple ASCII file, I can't show
phonetic symbols); in the second is the proposed coding, which
we can refer to as X-SAMPA (extended SAMPA). The listing follows
the order of the Chart, and should be read in conjunction with
it.

1. IPA symbols belonging to the ordinary Roman alphabet (e.g. u,
x) remain the same. They are not listed below.

2. Consonants (pulmonic)

retroflex plosive, voiceless		t`	(` = ASCII 096)
retroflex plosive, voiced		d`

labiodental nasal			F
retroflex nasal			n`
palatal nasal				J
velar nasal				N
uvular nasal				N\

bilabial trill				B\
uvular trill				R\
alveolar tap				4
retroflex flap				r`
bilabial fricative, voiceless		p\
bilabial fricative, voiced		B
dental fricative, voiceless		T
dental fricative, voiced		D
postalveolar fricative, voiceless	S
postalveolar fricative, voiced	        Z
retroflex fricative, voiceless	        s`
retroflex fricative, voiced		z`
palatal fricative, voiceless		C
palatal fricative, voiced		j\
velar fricative, voiced		        G
uvular fricative, voiceless		X
uvular fricative, voiced		R
pharyngeal fricative, voiceless	        X\
pharyngeal fricative, voiced	        ?\
glottal fricative, voiced		h\

alveolar lateral fricative, vl.	        K
alveolar lateral fricative, vd.	        K\

labiodental approximant		        P (or v\)
alveolar approximant		        r\
retroflex approximant		        r\`
velar approximant			M\

retroflex lateral approximant	        l`
palatal lateral approximant	        L
velar lateral approximant		L\

2. Clicks

bilabial				O\	(O = capital letter)
dental					|\
(post)alveolar			        !\
palatoalveolar			        =\
alveolar lateral			|\|\

3. Vowels

close central unrounded		        1
close central rounded		        }
lax i					I
lax y					Y
lax u					U

close-mid front rounded		        2
close-mid central unrounded	        @\
close-mid central rounded		8
close-mid back unrounded		7

schwa					@

open-mid front unrounded		E
open-mid front rounded		        9
open-mid central unrounded	        3
open-mid central rounded		3\
open-mid back unrounded		        V
open-mid back rounded		        O

ash (ae digraph)			{
open schwa (turned a)		        6

open front rounded			&
open back unrounded		        A
open back rounded			Q
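
For readers who want to turn these vowel codes back into printed
symbols, the following Python sketch pairs each code in the list above
with a Unicode code point. The code points, the table name VOWELS and
the function name vowel_to_unicode are my assumptions; the posting
itself names no character encoding. Escapes are used so that the
example stays plain ASCII.

```python
import unicodedata

# X-SAMPA vowel code -> Unicode character (code points are an assumption
# of this sketch, not part of the proposal).
VOWELS = {
    "1": "\u0268",    # close central unrounded
    "}": "\u0289",    # close central rounded
    "I": "\u026a",    # lax i
    "Y": "\u028f",    # lax y
    "U": "\u028a",    # lax u
    "2": "\u00f8",    # close-mid front rounded
    "@\\": "\u0258",  # close-mid central unrounded
    "8": "\u0275",    # close-mid central rounded
    "7": "\u0264",    # close-mid back unrounded
    "@": "\u0259",    # schwa
    "E": "\u025b",    # open-mid front unrounded
    "9": "\u0153",    # open-mid front rounded
    "3": "\u025c",    # open-mid central unrounded
    "3\\": "\u025e",  # open-mid central rounded
    "V": "\u028c",    # open-mid back unrounded
    "O": "\u0254",    # open-mid back rounded
    "{": "\u00e6",    # ash
    "6": "\u0250",    # open schwa (turned a)
    "&": "\u0276",    # open front rounded
    "A": "\u0251",    # open back unrounded
    "Q": "\u0252",    # open back rounded
}

def vowel_to_unicode(code):
    """Look up a vowel code; anything not in the table is returned unchanged."""
    return VOWELS.get(code, code)

# Show the code point and the official Unicode name for a few codes.
for code in ("@", "{", "V"):
    ch = vowel_to_unicode(code)
    print(code, "U+%04X" % ord(ch), unicodedata.name(ch))
```

Ordinary Roman vowel letters such as a or u are not in the table and
pass through unchanged, in line with point 1 above.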

4. Other symbols
voiceless labial-velar fricative	W
voiced labial-palatal approx.	        H
voiceless epiglottal fricative	        H\
voiced epiglottal fricative		<\
epiglottal plosive			>\

alveolo-palatal fricative, vl. 	        s\
alveolo-palatal fricative, voiced	z\
alveolar lateral flap			l\
simultaneous S and x		        x\
tie bar					_

5. Suprasegmentals
primary stress			        "
secondary stress			%
long					:
half-long				:\
extra-short				_X
linking mark				-\

6. Tones & word accents
level extra high			_T
level high				_H
level mid				_M
level low				_L
level extra low			        _B
downstep				!
upstep				        ^

contour, rising			        _R
contour, falling			_F
contour, high rising			_H_T
contour, low rising			_B_L
contour, rising-falling		        _R_F
(NB Instead of being written as diacritics with _, all prosodic
marks can alternatively be placed in a separate tier, set off by
<>, as recommended for the next two symbols.)
global rise				<R>
global fall				<F>
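
The separate-tier option mentioned in the NB above can be sketched in
Python as follows. The function name split_tiers is hypothetical, and I
am assuming that a tier mark is one or more letters between < and >.
That restriction keeps the epiglottal codes <\ and >\ from being
mistaken for tier brackets. The sketch discards the position of each
mark, which a real application would probably want to keep.

```python
import re

# A tier mark: letters enclosed in <>, where the closing > is not
# itself the start of the code >\ (epiglottal plosive).
TIER_MARK = re.compile(r"<([A-Za-z]+)>(?!\\)")

def split_tiers(text):
    """Return (segmental string, list of prosodic marks found in <>)."""
    marks = TIER_MARK.findall(text)
    segmental = TIER_MARK.sub("", text)
    return segmental, marks

print(split_tiers("nO:ml<F>"))   # ('nO:ml', ['F'])
```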

7. Diacritics
voiceless				_0	(0 = figure), e.g. n_0
voiced					_v
aspirated				_h
more rounded			        _O	(O = letter)
less rounded				_c
advanced				_+
retracted				_-
centralized				_"
syllabic				=	(or _=) e.g. n= (or n_=)
non-syllabic				_^
rhoticity				`

breathy voiced			        _t
creaky voiced			        _k
linguolabial				_N
labialized				_w
palatalized				'	(or _j) e.g. t' (or t_j)
velarized				_G
pharyngealized			        _?\

dental					_d
apical					_a
laminal				        _m
nasalized				~	(or _~) e.g. A~ (or A_~)
nasal release				_n
lateral release			        _l
no audible release			_}

velarized or pharyngealized	        _e
velarized l, alternatively		5
raised					_r
lowered				        _o
advanced tongue root		        _A
retracted tongue root		        _q
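
To show how a coded symbol with diacritics might be read back, here is
a small Python sketch that separates the base symbol from its
diacritics and names them. It covers only a handful of entries from the
list above; the names DIACRITIC_NAMES, SHORTHAND and describe are my
own, and the handling of the short forms = ' ~ is one possible reading
of the table rather than anything the proposal prescribes. The
rhoticity mark ` is left out.

```python
import re

# A partial table taken from the diacritics list above.
DIACRITIC_NAMES = {
    "_0": "voiceless",
    "_v": "voiced",
    "_h": "aspirated",
    "_w": "labialized",
    "_j": "palatalized",
    "_G": "velarized",
    "_?\\": "pharyngealized",
    "_d": "dental",
    "_=": "syllabic",
    "_~": "nasalized",
    "_}": "no audible release",
}

# Short forms that the list allows without the leading underscore.
SHORTHAND = {"=": "_=", "'": "_j", "~": "_~"}

BASE = re.compile(r"[!-~]\\?")             # one keystroke, optional "\"
PART = re.compile(r"_[!-~]\\?|[='~]")      # full diacritic or short form

def describe(symbol):
    """Split one coded symbol into (base, [diacritic names])."""
    m = BASE.match(symbol)
    if m is None:
        raise ValueError("empty or non-ASCII symbol: %r" % symbol)
    names = []
    for part in PART.findall(symbol[m.end():]):
        key = SHORTHAND.get(part, part)
        names.append(DIACRITIC_NAMES.get(key, "unknown diacritic " + key))
    return m.group(0), names

print(describe("n_0"))   # ('n', ['voiceless'])
print(describe("t'"))    # ('t', ['palatalized'])
```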

=====
*  Prof. J.C. Wells
*  Dept. of Phonetics & Linguistics, UCL, Gower Street, London WC1E 6BT
*  Tel. +44 (0)171 380 7175        Fax +44 (0)171 383 4108
  


+-----------------------------------------------------+
| M.A.Huckvale, Director,                             |
| MSc Programme in Speech and Hearing Sciences,       |
| Phonetics & Linguistics, University College London, |
| Gower Street, London, WC1E 6BT, U.K.                |
| Tel + 44 71 387 7050, Fax + 44 71 383 0752          |
+-----------------------------------------------------+
