Newsgroups: comp.ai.nat-lang
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!oitnews.harvard.edu!purdue!lerc.nasa.gov!magnus.acs.ohio-state.edu!math.ohio-state.edu!howland.reston.ans.net!news.cac.psu.edu!news.tc.cornell.edu!newsserver.sdsc.edu!nic-nac.CSU.net!charnel.ecst.csuchico.edu!csusac!csus.edu!netcom.com!ludemann
From: ludemann@netcom.com (Peter Ludemann)
Subject: Re: Help : Translate natural language into SQL
Message-ID: <ludemannDCHo2r.BAL@netcom.com>
Organization: NETCOM On-line Communication Services (408 261-4700 guest)
References: <3v4hpj$dnd@transfer.stratus.com>
Date: Sat, 29 Jul 1995 17:47:14 GMT
Lines: 42
Sender: ludemann@netcom9.netcom.com

In article <3v4hpj$dnd@transfer.stratus.com>,
Luke Tung <ltun@skippy.au.stratus.com> wrote:
> Currently, I'm stuck on the design of the database
> front-end and would like to hear from anybody who has
> any idea regarding translating the plain English questions
> into Structured Query Language (SQL) to access the
> relational database.

IBM sells (sold?) a product called Language Access that generates SQL
from queries written in various natural languages.  Perhaps you could
get a manual from them.  A major piece of customization was setting up
the vocabulary and normalizing the databases; if I correctly recall a
conversation I had with one of the developers, unnormalized databases
make generating queries very difficult.

There is (was?) also a company near Berkeley, California, called
Natural Language or something like that.  Again, vocabulary
customization is a large part of the set-up process.

To understand why vocabulary customization is so important, consider
the question "who earns more than $50,000?"  The system needs to infer
that "who" here refers to the employee table and that "earn" refers to
salary.  It then has to build a join query between the employee table
and the salary table (this is the easy part).
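
To make that concrete, here is a minimal sketch in Python of what a
hand-built vocabulary might look like for that one question.  The
table names, column names, join condition and the translate() helper
are all my own assumptions for illustration; this is not how Language
Access or any other product does it, and it only handles questions of
the exact form "who earns more than $N?".

```python
import sqlite3

# Hypothetical vocabulary: each English word maps to a schema element.
VOCABULARY = {
    "who": {"table": "employee", "select": "employee.name"},
    "earns": {"table": "salary", "column": "salary.amount"},
}

# Hypothetical join condition linking the two tables.
JOINS = {("employee", "salary"): "employee.id = salary.employee_id"}


def translate(question):
    """Translate 'who earns more than $N?' into (sql, params)."""
    words = question.lower().rstrip("?").split()
    subject = VOCABULARY[words[0]]   # "who"   -> employee table
    verb = VOCABULARY[words[1]]      # "earns" -> salary column
    if words[2:4] != ["more", "than"]:
        raise ValueError("only 'more than' comparisons are handled")
    amount = float(words[4].lstrip("$").replace(",", ""))
    join = JOINS[(subject["table"], verb["table"])]
    sql = (f"SELECT {subject['select']} FROM {subject['table']} "
           f"JOIN {verb['table']} ON {join} "
           f"WHERE {verb['column']} > ?")
    return sql, (amount,)


def demo():
    """Run the translated query against a tiny in-memory database."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE salary (employee_id INTEGER, amount REAL);
        INSERT INTO employee VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO salary VALUES (1, 60000), (2, 40000);
    """)
    sql, params = translate("who earns more than $50,000?")
    return [row[0] for row in db.execute(sql, params)]
```

Every entry in VOCABULARY and JOINS has to be written by someone who
knows both the schema and the users' wording, which is where the
customization effort goes.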

IBM has a product that runs on OS/2 (and Windows?) for building
vocabularies (I think the name is Thesaurus/2).

Good luck; this is a *big* problem.

I would suggest taking a different approach: guiding people in
building queries.  Check out Syllog, described in Adrian Walker's book
on Prolog (2nd edition; I can't remember the exact title off-hand).
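
As a rough idea of what "guiding" could mean, here is a small Python
sketch in which the user picks from fixed menus and the program fills
in a query template.  This is only my own illustration of the general
idea, not a description of how Syllog works; the menus, the
single-table schema and the function names are all assumptions.

```python
# Hypothetical menus: each choice shown to the user maps to a fixed
# SQL fragment, so no free-form English has to be parsed.
SUBJECTS = {"employees": ("employee", "employee.name")}
ATTRIBUTES = {"salary": "employee.salary", "age": "employee.age"}
COMPARISONS = {"more than": ">", "less than": "<", "exactly": "="}


def options():
    """Return the choices to present at each step, in order."""
    return [sorted(SUBJECTS), sorted(ATTRIBUTES), sorted(COMPARISONS)]


def build_query(subject, attribute, comparison, value):
    """Assemble a parameterized query from menu choices only."""
    table, column = SUBJECTS[subject]
    sql = (f"SELECT {column} FROM {table} "
           f"WHERE {ATTRIBUTES[attribute]} {COMPARISONS[comparison]} ?")
    return sql, (value,)
```

Choosing "employees", "salary", "more than" and entering 50000 gives a
query equivalent to the "who earns more than $50,000?" example, and a
choice outside the menus simply raises a KeyError instead of being
misread.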

By the way, for a research project, have you considered using plain
Prolog instead of SQL?  It makes building prototypes much easier, and
there are Prolog-to-SQL translators for when you want to access the
database (info in the comp.lang.prolog FAQ, I think).


-- 
Peter Ludemann                      ludemann@netcom.com
