From gps0@harvey.gte.com Wed Mar 2 18:19:21 EST 1994
Article: 20911 of comp.ai
Newsgroups: comp.ai
From: gps0@harvey.gte.com (Gregory Piatetsky-Shapiro)
Subject: Data Mining / Knowledge Discovery references
Keywords: Data Mining, Knowledge Discovery, Databases
Organization: GTE Laboratories, Inc.
Date: Wed, 2 Mar 1994 15:05:42 GMT

Several people have recently requested references on Knowledge Discovery /
Data Mining. I enclose a brief list of recent references. You can stay up
to date on these topics by subscribing to the KDD Nuggets list (e-mail to
kdd-request@gte.com).

-- Gregory Piatetsky-Shapiro (gps@gte.com)
================================================================
Gregory Piatetsky-Shapiro, Ph.D.
Principal Member of Technical Staff
GTE Laboratories, MS-45          e-mail: gps@gte.com
40 Sylvan Road                   fax:   (617) 466-2960
Waltham MA 02154-1120 USA        phone: (617) 466-4236
====================================================

--------- Overview Articles -----------

C. Matheus, P. Chan, G. Piatetsky-Shapiro, "Systems for Knowledge
Discovery in Databases," Special Issue on Learning and Discovery in
Databases, IEEE Transactions on Knowledge and Data Engineering, Vol. 5,
No. 6, Dec. 1993.

W. Frawley, G. Piatetsky-Shapiro, and C. Matheus, 1992. "Knowledge
Discovery in Databases: An Overview." AI Magazine, Fall 1992. Reprint of
the introductory chapter of the Knowledge Discovery in Databases
collection, AAAI/MIT Press, 1991.

"Data Mining: Intelligent Technology Gets Down to Business," PC AI
(Nov-Dec 1993).

------ Collections and Books ------

IEEE Transactions on Knowledge and Data Engineering, special issue on
Learning and Discovery in Databases, N. Cercone and M.
Tsuchiya, guest editors, Vol. 5, No. 6, Dec. 1993.

Machine Learning Journal, special issue on Machine Discovery, Jan Zytkow,
guest editor, 12(1-3), 1993.

KDD-93: Proceedings of the AAAI-93 Knowledge Discovery in Databases
workshop, G. Piatetsky-Shapiro, editor, AAAI Press technical report
WS-02, July 1993.

K. Parsaye and M. Chignell, 1993. Intelligent Database Tools &
Applications, John Wiley.

Special issue on Knowledge Discovery in Databases and Knowledge Bases,
International Journal of Intelligent Systems, Vol. 7, No. 7, Sep 1992,
G. Piatetsky-Shapiro, guest editor. An edited selection of the best
papers from the KDD-91 workshop.

G. Piatetsky-Shapiro and W. Frawley, 1991. Editors, Knowledge Discovery
in Databases, Cambridge, Mass.: AAAI/MIT Press. A collection of
state-of-the-art research papers.

W. H. Inmon and S. Osterfelt, 1991. Understanding Data Pattern
Processing: The Key to Competitive Advantage. QED Technical Publishing
Group, Wellesley, MA. A business-oriented, nontechnical book.

--
Gregory Piatetsky-Shapiro
GTE Laboratories, MS-45          email: gps0@gte.com
40 Sylvan Road                   fax:   (617) 466-2960
Waltham MA 02254 USA             phone: (617) 466-4236

========================================================================

Article 21055 of comp.ai:
Newsgroups: comp.ai
From: sherry@iti.gov.sg (Long Ai Sin Sherry)
Subject: Summary of Replies for "Help: Data Mining Tools"
Reply-To: sherry@iti.gov.sg
Organization: Information Technology Institute, National Computer Board, S'pore
Date: Fri, 11 Mar 1994 11:16:59 GMT

Hello all,

Due to popular demand, here is a summary of replies for the subject
"Help: Data Mining Tools", which I posted two weeks ago. I would like to
take this opportunity to thank all netters who responded.
I hope you will be able to find some useful information here. Any further
replies on this subject are welcome.

Questions posted:

>Hi!
>
>I'm currently doing some evaluation of data mining tools.
>I would appreciate it very much if anyone could:
>
>1) refer me to a list of data mining tools available on the
>   market; or
>
>2) recommend some good data mining tools; or
>
>3) recommend some data mining tools that are capable of
>   doing unsupervised learning; or
>
>4) provide me with pointers to any tool evaluation/comparison
>   reports.

***********************************************************************
REPLY 1 (consolidated version)
***********************************************************************

Overview report:
  Data Mining - The Search for Knowledge in Databases
  Marcel Holsheimer, Arno Siebes
  Univ. of Amsterdam

available via anonymous ftp:
  ftp ftp.cwi.nl
  cd pub/CWIreports/AA
  get CS.R9406.ps.Z

***********************************************************************
REPLY 2
***********************************************************************

From: tgorb@rrc.chevron.com (Joe Gorberg)
Organization: Chevron, Richmond, California

A couple of suggestions:

1. Contact HNC, Inc. in San Diego. They developed a tool called Database
Mining. I think they even registered the name as a trademark or
something. Anyway, check out what they have to offer.

2. I just purchased and received IDIS from IntelligenceWare, Inc. I can't
recommend the package yet, as I have only used it for a few hours. Not
too impressed so far, but I really need to understand what it's doing and
how to interpret the results. It develops a set of rules to define
correlations and cause-effect relationships based on one or more goals
which you set.

3. On the Mac side, a good visualization tool I like and recommend is
Data Desk (you can get it from Egghead and MacWarehouse). It's a
statistics package with excellent graphics for x-y-z rotating plots,
histograms, and much more.
It really has helped me get value out of neural nets and understanding
the data.

Good luck. If you come across anything else, please let me know.

Joe Gorberg
Chevron Research and Technology Co.
tgorb@chevron.com
(510) 242-2378

***********************************************************************
REPLY 3
***********************************************************************

From: saswss@hotellng.unx.sas.com (Warren Sarle)
Organization: SAS Institute Inc.

In article <16023@lhdsy1.lahabra.chevron.com>, tgorb@rrc.chevron.com
(Joe Gorberg) writes:
|> ...
|> 2. I just purchased and received IDIS from IntelligenceWare, Inc. I
|> can't recommend the package yet as I have only used it for a few
|> hours. Not too impressed so far, but I really need to understand what
|> it's doing and how to interpret the results. It develops a set of
|> rules to define correlations and cause-effect based on one or more
|> goals which you set.

Cause and effect cannot be established without running an experiment (a
real experiment, not some simulation) in which the potential causes are
experimentally manipulated. Any AI software or stat software that claims
otherwise is lying.

|> 3. On the mac side a good visualization tool I like and recommend is
|> Data Desk (you can get it from Egghead and MacWarehouse). Its a stat.
|> package with excellent graphics for x-y-z rotating plots, histograms
|> and much more. It really has helped me get value out of neural nets
|> and understanding the data.

JMP is also good.

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513       those of SAS Institute.

***********************************************************************
REPLY 4
***********************************************************************

From A.N.Pryke@computer-science.birmingham.ac.uk Fri Mar 11 03:37:13 1994

Hi.
I've not got much info on tools; what I have got is a posting by Sandra
Oudshoff summarizing replies to her request for info on tools. I guess
you've got this already, but I'll send it anyway. I've also got a short
article on dblearn, which I'll include.

Andy.

-----------------------------------------------------------------------
Start Enclosure 1
-----------------------------------------------------------------------

Published by The Centre For Systems Science
Simon Fraser University, Burnaby, BC, Canada V5A 1S6
604-291-3455
Editor: Barry Shell  shell@sfu.ca

************************************************************************
Data Mining
************************************************************************

New computer programs can probe vast databases searching for patterns.
They promise to extract useful knowledge from the rapidly mounting store
of boring data created by the information age.
========================================================================

We are drowning in a flood of computer information: raw data from banks,
hospitals, and credit card transactions; digitized images from space,
geographical information systems, and computer scanners; mailing lists,
gene maps, on-line news, market reports, and demographic surveys. "We've
got lots and lots of data in computer databases. It's everywhere!" says
Jiawei Han, CSS member and associate professor of computing science at
SFU. "But people are getting bored of searching the raw data." Sure, it's
easy to find John Doe's bank balance, but what general knowledge can be
drawn from all the bank's data?
Han, together with Nick Cercone and associates, has created a computer
program called dblearn that, if unleashed on a bank's database, might
ferret out the fact that 38% of the largest savings accounts are
maintained by little old ladies living within two kilometres of the bank.
Now the data starts to get interesting.

Easy to program and designed for the most common type of database in use
(so-called relational databases, because they store information in
related tables), dblearn is starting to attract attention. The group has
invitations to speak at the 1st International Conference on Knowledge and
Information Management in Baltimore, MD and at Computer World '92 in
Kobe, Japan. "People are excited," says Han, "because the new technology
can influence policy making. You can now get precise general information
that was originally buried in tons of little details. Now, in minutes,
you can get what you need to put forth good arguments about policy."

How it works

Data tables in a relational database are organised in columns and rows,
each column holding one attribute, like a person's age in years, while
each row might correspond to a different person. dblearn works its magic
in three phases. First the database must be preconditioned by a computer
science professional called a knowledge engineer, who creates a framework
for mining the information. Han explains that many data attributes can be
clarified through the formation of conceptual hierarchies. For example,
one conceptual hierarchy for age might look like this: 0-1 -> infant,
2-5 -> preschool, 6-12 -> primary school, 13-19 -> teenager, 20-30 ->
young adult, 31-65 -> middle age, 66+ -> senior citizen. Depending on the
nature of the data, the hierarchy could be further refined. For instance,
if most of the people in the database were middle aged and older, the
knowledge engineer might add the following information:
{infant, preschool, primary school, teenager} -> child.
This tells dblearn that it can lump anyone under 19 into the category
"child", adding a further level to the conceptual hierarchy.

Once the database is prepared, a minimally trained clerk can enter a
learning request to dblearn. This small English-like program defines how
to extract the knowledge from the raw data (see text box, next page).
Finally, dblearn sets about the task by applying what Han calls internal
learning strategies to the data.

dblearn "learns" from a database in three steps while it creates a
simplified temporary version of the data in memory. First it checks all
the attributes to see if any can be decomposed into smaller units. For
example, a birthdate might actually contain three pieces of information:
day, month, and year. "We must be careful not to throw away anything
potentially useful during the learning process," says Han. This is
especially important because the next step involves removing attribute
columns that are of no use. For instance, people's last names are mostly
unique, so they won't yield any general rules. After decomposition and
removal, dblearn tries to reduce the complexity of the data by
substituting actual values with the more general terms defined in its
conceptual hierarchy. It might replace "Manitoba" with "Prairies". Then
it counts up rows with matching values and copies them over to become one
row in the temporary table. The program also creates a new column where
it stores this count. If it can move up the conceptual hierarchy to
generalize the data further, it does more substitution, counting and
copying. Eventually a table of information emerges, revealing previously
hidden facts.

In the future, Han and Cercone feel concept hierarchies could be
generated by automatic statistical analysis of a column's contents. With
easy-to-use interfaces that understand plain English, such data-mining
programs will become commonplace in the information society.
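The substitute/count/merge step described above can be sketched in a few
lines. The concept hierarchy and sample rows below are invented for
illustration; dblearn's actual data structures and language are not shown
in the article.

```python
# A minimal sketch of the generalize-then-merge step of
# attribute-oriented induction. The hierarchy and rows are
# invented examples, not dblearn's real data.
from collections import Counter

# concept hierarchy: specific value -> more general concept
hierarchy = {
    "Alberta": "Prairies", "Saskatchewan": "Prairies", "Manitoba": "Prairies",
    "Nova Scotia": "Maritimes", "New Brunswick": "Maritimes",
}

def generalize(rows, hierarchy):
    """Replace each value with its parent concept, then merge identical
    rows, appending a count column (the 'vote' of each generalized row)."""
    counts = Counter(
        tuple(hierarchy.get(v, v) for v in row) for row in rows
    )
    return [(*row, n) for row, n in counts.items()]

rows = [
    ("Alberta", "20_40K"),
    ("Manitoba", "20_40K"),
    ("Nova Scotia", "1_20K"),
]
for generalized in generalize(rows, hierarchy):
    print(generalized)
# the two Prairie rows merge into ("Prairies", "20_40K", 2)
```

Moving further up the hierarchy (e.g. Prairies -> Western Canada) would
simply mean applying the same step again with a coarser mapping.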
TEXT BOX
========================================================================
FINDING GOLD IN A MOUNTAIN OF INFORMATION

NSERC (the Natural Sciences and Engineering Research Council of Canada)
gave the group access to their Grants Information database, containing
information about all the research grants awarded in 1990-91. The central
relation table, award, contains 10,087 tuples (rows) with eleven
attributes (columns). The dblearn software extracts knowledge in three
steps, as follows:

Phase I: The Conceptual Hierarchy

For attribute province:
  {Alberta, Saskatchewan, Manitoba} -> Prairies
  {New Brunswick, Nova Scotia, Newfoundland, Prince Edward Island} -> Maritimes
For attribute amount:
  1-19,999 -> 1_20K, 20,000-39,999 -> 20_40K, etc.
For attribute disc_code (discipline code):
  26000-26499 -> AI

Phase II: The Learning Request

  learn characteristic rule for disc_code = "AI"
  from award
  where grant_code = "Operating_Grants"
  in relevance to amount, province, prop(vote), prop(amount)

("prop" is an internal function that gives the percentage of total)

Phase III: Internal Learning and Results

Attribute-oriented induction based on the above conceptual hierarchy and
learning request yields interesting results. We discover, among other
things, that for operating grants in AI between $20,000 and $40,000, BC
beats lion's-share grant-winner Ontario; or that Quebec AI research
funding clusters at the low and high ends.
========================================================================
-----------------------------------------------------------------------
End Enclosure 1
-----------------------------------------------------------------------

-----------------------------------------------------------------------
Start Enclosure 2
-----------------------------------------------------------------------

Article: 5687 in comp.ai
Newsgroups: comp.ai
From: oudshoff@sun019.research.ptt.nl (Sandra Oudshoff)
Subject: Summary: tools for information harvesting
Message-ID: <1993Oct8.142256.11343@spider.research.ptt.nl>
Keywords: information harvesting, data mining, tools
Organization: PTT Research, Groningen, The Netherlands
Date: Fri, 8 Oct 1993 14:22:56 GMT

Hi all,

This posting summarizes the information sent to me by several netters in
reply to my post asking about commercial software tools for information
harvesting (or data mining). I hope you will find some useful information
in here.

------------------------------------------------------------------------
>From kdd%eureka@gte.com Thu Sep 23 15:38:28 1993

Some Commercially Available Products for Intelligent Discovery in Databases

Gregory Piatetsky-Shapiro (gps0@gte.com)
GTE Laboratories, 40 Sylvan Road, Waltham MA 02154
Last updated: July 1993

Here I will discuss only the products with AI-related approaches. Other
tools, such as statistical and forecasting methods or scientific
visualization packages, are not discussed. This is an informal list,
representing only MY PERSONAL opinions, and not the opinions of GTE or
GTE Laboratories. It is not intended to be a complete survey or an
endorsement of any kind. I also do not have any financial interest in any
of the companies below. Ads for other intelligent tools can be found in
AI Magazine, AI Expert, IEEE Expert, PC AI, Expert Systems, and similar
magazines.
Index to tools (listed alphabetically by tool name):

  AIM from AbTech
  AUTOCLASS from NASA
  Database Mining software from HNC
  Datalogic/R from Reduct Systems
  Information Harvesting from Ryan Associates
  IXL/IDIS from IntelligenceWare
  KnowledgeSeeker from FirstMark Technologies
  NEXTRA from Neuron Data
  PC-MARS from Data Patterns
  RECON for Data Mining from Lockheed

Detailed descriptions:

------------------------
AIM
from: AbTech, 700 Harris Street, Charlottesville, VA 22901.
(804) 977-0686.

It automatically synthesizes network solutions from databases of
examples. It uses 1-, 2-, and 3-dimensional polynomials.

-------------------------
AUTOCLASS from NASA

"AutoClass: A Bayesian Classification System", Peter Cheeseman, James
Kelly, Matthew Self, John Stutz, Will Taylor, Don Freeman. Presented at
the Fifth International Conference on Machine Learning.

WHAT IS AUTOCLASS: AutoClass is an unsupervised Bayesian classification
system for independent data. It seeks a maximum posterior probability
classification. Inputs consist of a database of attribute vectors and a
class model defined by a parametric class probability function and
corresponding parameter priors. Models are constructed from a specified
set of terms appropriate to both discrete and real-valued attributes.
AutoClass attempts to find the set of classes that is maximally probable
with respect to the data and model. The output is a set of classes given
as instances of the model with specific parameters. There are facilities
for reporting on the classes, the influence of the attributes on the
classes, and the probability weighting of the data over the classes.

Running AutoClass requires a Common Lisp environment. It has been
successfully run on Symbolics and Explorer Lisp machines, on the Franz
and Sun/Lucid Lisp implementations on the Sun and similar Un*x platforms,
and on the Macintosh personal computer.
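The core idea behind this kind of unsupervised probabilistic
classification, soft class memberships estimated from the data alone, can
be illustrated with a toy two-class Gaussian mixture fitted by EM.
AutoClass's actual model (parameter priors, a search over the number of
classes, mixed discrete/real attributes) is far richer; everything below
is an invented minimal example, not AutoClass code.

```python
# Toy sketch: fit a two-class 1-D Gaussian mixture by EM and recover
# soft class-membership probabilities. Invented illustration only;
# AutoClass's real model and search are much more elaborate.
import math
import random

def em_mixture(data, iters=50):
    data = sorted(data)
    # crude initialization: one mean per quartile
    mu = [data[len(data) // 4], data[3 * len(data) // 4]]
    sigma = [1.0, 1.0]
    w = [0.5, 0.5]  # mixture weights
    for _ in range(iters):
        # E-step: probability of each point's membership in each class
        resp = []
        for x in data:
            p = [w[k] / (sigma[k] * math.sqrt(2 * math.pi))
                 * math.exp(-0.5 * ((x - mu[k]) / sigma[k]) ** 2)
                 for k in range(2)]
            s = sum(p)
            resp.append([pk / s for pk in p])
        # M-step: re-estimate parameters from the soft assignments
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, data)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)
            w[k] = nk / len(data)
    return mu, sigma, w

random.seed(0)
data = [random.gauss(0, 1) for _ in range(200)] + \
       [random.gauss(5, 1) for _ in range(200)]
mu, sigma, w = em_mixture(data)
print(sorted(mu))  # two class means, near 0 and 5
```

The per-point membership probabilities computed in the E-step are the
analogue of AutoClass's "probability weighting of the data over the
classes" mentioned above.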
The most recent release I could find is AutoClass III (Version 3.0.3);
you should be able to locate your nearest server using

  archie -s autoclass

If you don't have an archie client installed, telnet to archie.ans.net
and log in as archie.

-------------------------
RECON for Data Mining
from Lockheed. Advertised at AAAI-93.

From the RECON brochure: Capabilities: Pattern Discovery, Pattern
Validation, Summarization, Decision Support. Top-down and bottom-up data
mining.

Contact: Dr. Evangelos Simoudis, Lockheed AI Center, 3251 Hanover Street,
Palo Alto CA 94304. Voice: (415) 354-5271  Fax: (415) 424-3425

------------------------
IXL/IDIS Discovery Machine
from: IntelligenceWare, 5933 West Century Blvd., Los Angeles, CA 90045,
(213) 216-6177.

IXL is a sophisticated product, with a fancy screen layout and many
features. IXL finds the most interesting rules in data, using a symbolic
learning approach. I have used it and it is nice. IntelligenceWare also
has several other related products, including a Data Visualization Tool,
a package for automatically generating 2D and 3D graphs from data --
pattern discovery using human visual abilities.

------------------------
KnowledgeSeeker
from: FirstMark Technologies Ltd, 14 Concourse Gate, Suite 600, Ottawa,
Ontario, Canada K2E 7S8, 613-723-8020.

It automatically builds a decision tree for your concept, and does many
other interesting things. I have used it, and it has a good user
interface.

--------------------
NEXTRA
from: Neuron Data, 156 University Ave., Palo Alto, CA 94301.
1-800-876-4900.

It is an impressive tool able to synthesize rules from user preferences.
Nice graphical abilities.

--------------------
Database Mining Software (Last Updated 7-28-93)
from: HNC (San Diego, CA), 1-800-HNC-EXPR, 1-619-546-8877 (ask for Scott
Crispie).

It uses a more classical neural net approach. After training a net to
recognize a concept, it uses a patented method to extract rules that
correspond to the net.
(A partial description of that method is in the paper by Steve Gallant,
"Connectionist Expert Systems", Communications of the ACM 31(2):153-168,
1988.)

-------------------------
PC-MARS
from: Data Patterns, 528 S. 45th Street, Philadelphia, PA 19104,
(215) 387-1844. 495 (Dec 92).

A software package for developing models of non-linear multivariable
processes from past input/output data, useful for predicting future
outputs. Advertised as an alternative to neural networks; helps the user
to understand the process being modelled. Provides graphical tools. IBM
PC and compatibles.

--------------------
Datalogic/R (formerly DataQuest)
from: Reduct Systems, Regina, Canada. (306) 586-9408, fax (306) 586-9442.

Software for data mining using a rough set approach. (See AI Expert,
March 1993, for an ad.)

------------------------------------------------------------------------
RECON software

Contact: Dr. Evangelos Simoudis, Lockheed AI Center, 3251 Hanover Street,
Palo Alto, CA 94304. simoudis@aic.lockheed.com

Indeed, Recon is able to perform the type of tasks you are interested in
accomplishing with information harvesting. Today I will send you a video
showing how two of Recon's components can be used to extract rule-based
models from a database with data about stocks. I will also send you other
documentation that describes some of the applications we have developed,
as well as pricing information. I hope you find the information useful.

I was glad to hear you received the information I sent you. Lockheed has
offices throughout Europe, including one in Brussels and a representative
in the Netherlands. We are currently negotiating with three European
software firms to provide Recon support, in addition to the support we
provide from the United States. Thus far, our work on data mining has
been performed through large contracts with large companies and the
federal government.
For this reason, we have been able to provide support from our home base
in California as well as by traveling directly to the customer's site, if
the situation warranted it. Given that all of our customers are on the
East Coast of the United States, 4,000 km from California, I hope you can
appreciate that we can deliver support anywhere we need to.

The tape that I sent you mentions that Recon includes neural and
statistical modules. For example, Lockheed has developed the
Probabilistic Neural Network and General Regression Neural Network, which
have been recognized as providing the best results among competing neural
network algorithms. Of course, we are also working with the more
traditional neural network algorithms, such as back propagation and its
variants. The operation of these modules is not shown in the tape.
Furthermore, the tape does not show the data visualization module of
Recon.

Our approach to work on data mining is the following:

1. We know that a single mining technique will not be appropriate for
*every* type of data. For example, neural networks can deal with certain
data sets that statistics cannot. Similarly, symbolic learning techniques
can work better than neural networks on other data sets. For this reason
we have developed a toolbox of techniques to perform top-down and
bottom-up data mining.

2. We evaluate the customer's data and work with the customer to define
the types of data mining operations that the customer will need to
perform. Lockheed then recommends to the customer the components of the
toolbox that will be most appropriate to the customer's data and the type
of operations that must be performed.

3. We tailor the Recon system to include only the techniques that the
customer and Lockheed have agreed upon. In this way, the customer
achieves the best possible results from the data mining operation each
time.

The video tape I sent you demonstrates the operation of an actual system
we delivered to a financial company.
This customer did not want any visualization capabilities in the version
of Recon we delivered. As a result, the visualization component of our
toolbox was not delivered. Of course, being a market-driven group, we
will be willing to discuss other possible configurations of the Recon
system which will be of interest to your company. Please do not hesitate
to contact me for any other information you may want on Recon's
capabilities that will help you in your evaluation.

Thank you and regards,
Evangelos Simoudis

-------------------------------------------------------------------------
AUTOCLASS

>From schmid@bastille.berkeley.edu Thu Sep 9 12:01:25 1993
Organization: University of California, Berkeley

Check out the Knowledge Discovery in Databases proceedings. Check out
AutoClass, an unsupervised Bayesian classification system which learns
classifications from data. Developed by Peter Cheeseman et al. at NASA
Ames (cheesem@ptolemy.arc.nasa.gov). This has done some pretty impressive
things. Lots of papers on it if you want background. Sorry that I'm only
familiar with probabilistic approaches.

scott.

I am sorry to say that our latest version of the program, AutoClass X, is
not available internationally. However, an earlier version, AutoClass
III, is available from COSMIC; see below. Also, below that is a list of
references.

Will Taylor
Recom Technologies  (415) 604-3364
Artificial Intelligence Research Branch - Code FIA
NASA Ames Research Center MS 269-2, Moffett Field, CA 94035-1000
taylor@ptolemy.arc.nasa.gov

AutoClass III is the official released implementation of AutoClass,
available from COSMIC (NASA's software distribution agency):

  COSMIC
  University of Georgia
  382 East Broad Street
  Athens, GA 30602 USA
  voice: (706) 542-3265   fax: (706) 542-4807
  telex: 41- 190 UGA IRC ATHENS
  e-mail: cosmic@uga.bitnet or service@cossack.cosmic.uga.edu

Request "AutoClass III - Automatic Class Discovery from Data (ARC-13180)".
----------------------------------------------------------------------
ARC-13180 - AutoClass: Automatic Class Discovery from Data
----------------------------------------------------------------------

The standard approach to classification in much of artificial
intelligence and statistical pattern recognition research involves
partitioning the data into separate subsets, known as classes. AUTOCLASS
III, from NASA Ames Research Center, uses the Bayesian approach, in which
classes are described by probability distributions over the attributes of
the objects, specified by a model function and its parameters. The
calculation of the probability of each object's membership in each class
provides a more intuitive classification than absolute partitioning
techniques.

AUTOCLASS III is applicable to most data sets consisting of independent
instances, each described by a fixed-length vector of attribute values.
An attribute value may be a number, one of a set of attribute-specific
symbols, or omitted. The user specifies a class probability distribution
function by associating attribute sets with supplied likelihood function
terms. AUTOCLASS then searches the space of class numbers and parameters
for the maximally probable combination. It returns the set of class
probability function parameters, and the class membership probabilities
for each data instance.

AUTOCLASS III, ARC-13180, is written in Common Lisp, and is designed to
be platform independent. This program has been successfully run on
Symbolics and Explorer Lisp machines. It has been successfully used with
the following implementations of Common Lisp on the Sun: Franz Allegro
CL, Lucid Common Lisp, and Austin Kyoto Common Lisp, and on similar UNIX
platforms; under the Lucid Common Lisp implementations on VAX/VMS v5.4,
VAX/Ultrix v4.1, and MIPS/Ultrix v4, rev. 179; and on the Macintosh
personal computer. The minimum Macintosh required is the IIci. This
program will not run under CMU Common Lisp or VAX/VMS DEC Common Lisp.
A minimum of 8Mb of RAM is required for Macintosh platforms and 16Mb for
workstations.

The standard distribution medium for this program is a .25 inch streaming
magnetic tape cartridge in UNIX tar format. It is also available on a 3.5
inch diskette in UNIX tar format and a 3.5 inch diskette in Macintosh
format. An electronic copy of the documentation is included on the
distribution medium.

Domestic pricing is $900 for the program and $21 for the documentation --
there is a 50% educational discount. International pricing is $1800 for
the program and $42 for the documentation -- there is *no* educational
discount.

REFERENCES

P. Cheeseman, et al. "AutoClass: A Bayesian Classification System",
Proceedings of the Fifth International Conference on Machine Learning,
pp. 54-64, Ann Arbor, MI, June 12-14, 1988.

P. Cheeseman, et al. "Bayesian Classification", Proceedings of the
Seventh National Conference on Artificial Intelligence (AAAI-88),
pp. 607-611, St. Paul, MN, August 22-26, 1988.

J. Goebel, et al. "A Bayesian Classification of the IRAS LRS Atlas",
Astron. Astrophys. 222, L5-L8 (1989).

P. Cheeseman, et al. "Automatic Classification of Spectra from the
Infrared Astronomical Satellite (IRAS)", NASA Reference Publication 1217
(1989).

P. Cheeseman, "On Finding the Most Probable Model", in Computational
Models of Discovery and Theory Formation, ed. Jeff Shrager and Pat
Langley, Morgan Kaufmann, Palo Alto, 1990, pp. 73-96.

R. Hanson, J. Stutz, P. Cheeseman, "Bayesian Classification with
Correlation and Inheritance", Proceedings of the 12th International Joint
Conference on Artificial Intelligence, Sydney, Australia, August 24-30,
1991.

-----------------------------------------------------------------------------
DATA MARINER
My sponsoring company, Logica Cambridge Ltd, markets a product called
`Data Mariner' which is based, among other things, on the IDn algorithms.
So far as I'm familiar with it, it carries out similarity-driven
induction to abstract regularities from large data sets. I've no idea
about prices and so on, I'm afraid. The e-mail address for the site is
logcam.co.uk, but I don't know who you'd need to contact. Jim Kennedy is
technical manager of the group dealing with knowledge-based products, and
could doubtless refer you to marketing personnel, if you can reach him.
Try jimk or postmaster. The one user name I do know is Marc Foote (marcf)
who works under Jim; he might also be able to help.

Hope this can help!

Tony Griffiths
tony@minster.york.ac.uk

_________________________________________________________________________
RTWORKS

RTworks has no built-in tools for extracting useful information / rules /
patterns from (large) amounts of data. One could use such a tool to
generate rules for RTworks. There is a book, "C4.5: Programs for Machine
Learning" by J. Ross Quinlan, which comes with a C program to generate
rules/decision trees from large data sets. You should look into that.

| Tom Laffey                    phone: (415) 965-8050      |
| Talarian Corporation          fax:   (415) 965-9077      |
| 444 Castro St., Suite 140     E-mail: tom@talarian.com   |
| Mtn. View, CA 94041           uunet!talarian!tom         |

RTworks: a family of products for large-scale, distributed, time-critical
systems. RTworks is a family of independent software modules developed
for intelligent real-time data acquisition, data analysis, data
archiving, data playback, data distribution, and message/data display.
RTworks offers a number of sophisticated problem-solving strategies,
including knowledge-based systems, a point-and-click graphical user
interface, temporal and statistical reasoning, and the ability to
distribute an application over a network.
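The C4.5 program mentioned above grows decision trees by repeatedly
choosing the attribute that best separates the class labels. The
split-selection criterion at its heart can be sketched as follows; the
weather-style data is an invented illustration, not Quinlan's code, and
real C4.5 additionally handles continuous attributes, gain ratio, and
pruning.

```python
# Toy sketch of the information-gain criterion used by ID3/C4.5-style
# decision tree induction. Invented example data; not Quinlan's code.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Expected reduction in label entropy from splitting on attr."""
    total = entropy(labels)
    subsets = {}
    for row, y in zip(rows, labels):
        subsets.setdefault(row[attr], []).append(y)
    remainder = sum(len(s) / len(labels) * entropy(s)
                    for s in subsets.values())
    return total - remainder

# invented data: 'outlook' perfectly predicts the label, 'windy' does not
rows = [
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "rain",  "windy": "yes"},
    {"outlook": "rain",  "windy": "no"},
]
labels = ["play", "play", "stay", "stay"]
print(info_gain(rows, labels, "outlook"))  # 1.0 (a perfect split)
print(info_gain(rows, labels, "windy"))    # 0.0 (no information)
```

A tree builder would split on the highest-gain attribute, then recurse on
each subset until the labels are pure; the resulting branches read off
directly as rules.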
--------------------------------------------------------------------------
IXL

A research project I was associated with tried IXL, by IntelligenceWare.
Two of us (with many years of computing experience) never managed to make
it cope with our data (about 380 records of about 90 fields). But then we
couldn't get it to cope with some of the test data distributed with it
either, in spite of trying several different machines with better than
the recommended configuration and memory management system! The
promotional blurb suggested that our size of data was well within its
capabilities. The kinds of errors we got were memory errors which caused
system crashes. The reason for these failures was never established.
IntelligenceWare assured us that all their test data worked fine for
them. Eventually, after discussions via the university lawyer, they
refunded us.

Jane Hesketh (hesketh@ed.ac.uk)
jane@aisf.edinburgh.ac.uk

------------------------------------------------------------------------
NLToolset

Our NLToolset is capable of performing the functions you describe. I am
passing your note on to someone who hopefully will be able to provide you
with the pricing information you requested.

Regards,
Lisa Rau
Visiting Assistant Professor
Dept of Computer and Information Science, University of Pennsylvania
200 South 33rd Street, Philadelphia, PA 19104-6389
lrau@cis.upenn.edu   FAX: (215) 898-0587   Phone: (215) 573-2815

--------------------------------------------------------------------------
EXPLORA

Dear Sandra Oudshoff!

We have developed a system for discovery in databases (Explora) which is
generally available and can be run on the Macintosh. If you are
interested in using the system, you can get it via anonymous ftp from
ftp.gmd.de: open a connection to "ftp.gmd.de" and transfer the file
"Explora.sit.hqx" from the directory "gmd/explora". The file "README"
describes the installation of Explora. A user manual is included.
If you need some further support, please contact me.

Best wishes,
Willi Kloesgen

Willi Kloesgen, GMD, D-53757 Sankt Augustin
Phone ++49/2241-14-2723, Fax ++49/2241-14-2618
E-mail: kloesgen@gmd.de

-----------------------------------------------------------------------
End Enclosure 2
-----------------------------------------------------------------------

---
Andy Pryke
Email: A.N.Pryke@cs.bham.ac.uk

******************************************************************************

That's all for the summary. Hope you found some of the info useful!

Sherry Long
sherry@iti.gov.sg