Newsgroups: comp.lang.smalltalk
Path: cantaloupe.srv.cs.cmu.edu!rochester!cornellcs!uw-beaver!nntp.cs.ubc.ca!unixg.ubc.ca!van-bc!news.mindlink.net!sol.ctr.columbia.edu!spool.mu.edu!usenet.eel.ufl.edu!news.mathworks.com!uunet!in1.uu.net!rcogate.rco.qc.ca!nic.mtl.hookup.net!vertex.tor.hookup.net!loki.tor.hookup.net!newsserv.mtnlake.com!news
From: Rachel Shaw-Ng <noelng@mail.cibc.com>
Subject: Re: phonetic search
Content-Type: text/plain; charset=us-ascii
Sender: news@mtnlake.com (Root)
Content-Transfer-Encoding: 7bit
Nntp-Posting-Host: 207.61.173.132
Organization: Mountain Lake Software Corporation
Message-ID: <3249FDE3.2F08@mail.cibc.com>
References: <32462BFC.7F65@mail.knipp.de> <52b4fi$s5k@grimsel.zurich.ibm.com>
X-Mailer: Mozilla 2.01 (Win95; I)
Mime-Version: 1.0
Date: Thu, 26 Sep 1996 03:52:03 GMT
Lines: 37

Paul_Gover@uk.ibm.com wrote:
> 
> In <32462BFC.7F65@mail.knipp.de>, debis user <debis@mail.knipp.de> writes:
> > ...
> >I'm looking for a "phonetic search" algorithm.
> > ...
> 
> Michael, I think the most commonly used algorithm is called
> "Soundex".  It used to be patented, but I expect the patent
> has expired (someone please correct me if I am wrong here!).
> I don't have it on-line.
> 
> You group letters into classes (plosives, fricatives,
> sibilants, vowels etc), remove the vowels and duplicates, and
> use just the classes, not the original characters, except you
> retain the first character unclassed and truncate after three codes.
> It suffers from some large problems:
> 
> 1) The classification of letters is difficult: the algorithm
> has no way to handle the different pronunciations of "ough"
> as in "cough, through, hiccough, ought" and so forth.
> 2) The classes have to be compromises: for example you have to
> group S, C and K together.
> 3) It's hopeless when the native language is not English.
> 
> It's not too smart, but anything better is a lot more complex,
> and I guess needs a dictionary.
> 
> Paul Gover
> IBM Warwick Development Group
> Mumbling for myself, not IBM

PC Magazine May 30, 1995 has an article and a listing of the Soundex 
algorithm (in "C"). If you can't find it anywhere, email me and I'll 
snailmail you a copy.
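
In case the magazine is hard to track down, here is a rough sketch of
the scheme Paul describes, in Python rather than C. It is not the
PC Magazine listing. The letter classes are the ones from the commonly
published American Soundex variant, which is an assumption on my part
since Paul's post does not spell them out, and the function name
soundex is just my own choice. Treat it as an illustration, not a
reference implementation.

```python
# Letter classes of the commonly published American Soundex variant.
# Assumption: the thread does not list the classes, so these are borrowed
# from that variant. Note that S, C and K all land in class 2.
_CLASSES = {
    "1": "BFPV",
    "2": "CGJKQSXZ",
    "3": "DT",
    "4": "L",
    "5": "MN",
    "6": "R",
}
_CODE = {letter: digit for digit, letters in _CLASSES.items() for letter in letters}


def soundex(name: str) -> str:
    """Return the first letter plus three class digits, zero-padded."""
    # Keep only the plain letters A-Z; everything else is ignored.
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = _CODE.get(first, "")  # class of the first letter, for duplicate checks
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W are dropped and do not split a run of one class
        code = _CODE.get(ch, "")  # vowels and Y have no class
        if code and code != prev:
            digits.append(code)
        prev = code  # a vowel resets prev, so the same class may repeat after it
    # Keep the first letter unclassed, pad with zeros, truncate to four characters.
    return (first + "".join(digits) + "000")[:4]


if __name__ == "__main__":
    for example in ("Robert", "Rupert", "Smith", "Smyth"):
        print(example, soundex(example))
```

Names that sound alike should come out with the same code, so a
phonetic search amounts to storing the code next to each name and
matching on the code instead of the spelling. Paul's caveats still
apply: the classes are tuned for English names.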

Noel Ng
