Newsgroups: sci.math.stat,comp.ai
Path: cantaloupe.srv.cs.cmu.edu!nntp.club.cc.cmu.edu!miner.usbm.gov!news.er.usgs.gov!stc06.ctd.ornl.gov!fnnews.fnal.gov!uwm.edu!cs.utexas.edu!swrinde!news-peer.gsl.net!news.gsl.net!news.sgi.com!enews.sgi.com!ames!eos!kronos.arc.nasa.gov!kronos.arc.nasa.gov!taylor
From: taylor@ptolemy.arc.nasa.gov (Will Taylor)
Subject: Version 2.8 of AutoClass C Bayesian Classifier
Message-ID: <TAYLOR.96Sep23155841@muir.arc.nasa.gov>
Followup-To: sci.math.stat
Sender: usenet@ptolemy-ethernet.arc.nasa.gov (usenet@ptolemy.arc.nasa.gov)
Nntp-Posting-Host: muir.arc.nasa.gov
Reply-To: taylor@ptolemy.arc.nasa.gov
Organization: NASA/Ames Information Sciences
Date: Mon, 23 Sep 1996 22:58:41 GMT
Lines: 57

Announcing the release of version 2.8 of AutoClass C, the Bayesian
classifier which seeks a maximum posterior probability classification.

Key features:
 - determines the number of classes automatically;
 - can use mixed discrete and real valued data;
 - can handle missing values;
 - processing time is roughly linear in the amount of data;
 - cases have probabilistic class membership;
 - allows correlation between attributes within a class;
 - generates reports describing the classes found; and 
 - predicts "test" case class memberships from a "training"
   classification.

Inputs consist of a database of attribute vectors (cases), either real
or discrete valued, and a class model.  Default class models are provided.
AutoClass finds the set of classes that is maximally probable with
respect to the data and model.  The output is a set of class descriptions,
and partial membership of the cases in the classes.
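
To make "partial membership" concrete, here is a minimal sketch in C of
how a case's class probabilities can be computed by Bayes' rule once a
set of class descriptions is in hand.  It is not taken from the AutoClass
sources: the names class_desc and class_membership, the fixed sizes, and
the assumption that discrete attributes are independent within a class
are all illustrative.  AutoClass's own class models may differ (for
example, they can allow correlated attributes within a class).

```c
#include <assert.h>

#define N_CLASSES 2   /* hypothetical: two classes */
#define N_ATTRS   2   /* hypothetical: two discrete attributes */
#define N_VALUES  3   /* hypothetical: each attribute takes values 0..2 */

/* Hypothetical class description: a mixing weight plus one
   independent multinomial distribution per attribute. */
typedef struct {
    double weight;                      /* prior probability of the class */
    double p_value[N_ATTRS][N_VALUES];  /* P(attribute k = v | class) */
} class_desc;

/* Fill membership[j] with P(class j | case) using Bayes' rule.
   Returns 0 on success, or -1 if the case has zero probability
   under every class (membership is then left unnormalized). */
int class_membership(const class_desc classes[N_CLASSES],
                     const int case_values[N_ATTRS],
                     double membership[N_CLASSES])
{
    double total = 0.0;

    for (int j = 0; j < N_CLASSES; j++) {
        /* Joint probability of class j and the observed case. */
        double joint = classes[j].weight;
        for (int k = 0; k < N_ATTRS; k++) {
            assert(case_values[k] >= 0 && case_values[k] < N_VALUES);
            joint *= classes[j].p_value[k][case_values[k]];
        }
        membership[j] = joint;
        total += joint;
    }

    if (total <= 0.0)
        return -1;

    /* Normalize so the memberships sum to one. */
    for (int j = 0; j < N_CLASSES; j++)
        membership[j] /= total;
    return 0;
}
```

Each case thus receives a probability for every class rather than a
single hard label, which is what "partial membership" refers to above.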

The initial release was on 19 April 1995.  Changes in the current release:

   Version: 2.8    03 Sep 96
    - add search parameter "read_compact_p", which directs AutoClass
      to read the "results" and "checkpoint" files in either binary
      or ASCII format;
    - redefine make files with -I and -L parameters for SunOS 4.1.3;
    - change make file naming conventions;
    - prevent corruption of discrete data translation tables when
      translations are longer than 40 characters;
    - increase VERY_LONG_STRING_LENGTH from 3000 to 20000 to handle
      very large datum lines;
    - increase DATA_ALLOC_INCREMENT from 100 to 1000 for reading
      very large datasets;
    - add the DATA_ALLOC_INCREMENT logic of READ_DATA to
      XREF_GET_DATA, which prevents segmentation faults encountered
      when reading very large .db2 files into the reports processing
      function of AutoClass;
    - in FORMAT_DISCRETE_ATTRIBUTE, do not process attributes with
      warning or error messages, which prevents segmentation faults;
    - in XREF_GET_DATA, free database allocated memory after it is
      transferred into report data structures, which reduces the
      memory required when generating reports for very large
      databases and prevents running out of memory;
    - in all functions calling malloc/realloc for dynamic memory
      allocation, add checks that notify the user if memory is
      exhausted; and
    - port the "make" file to the HP-UX operating system using the
      bundled "cc" compiler.
   (See "autoclass-c/version-2-8.text")
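
For readers unfamiliar with the pattern behind the DATA_ALLOC_INCREMENT
and malloc/realloc items, the following is a rough sketch in C of a
buffer that grows in fixed increments, with a check that reports memory
exhaustion.  Only the name DATA_ALLOC_INCREMENT and its value of 1000
come from the notes above; row_buffer, checked_realloc,
row_buffer_append and row_buffer_free are hypothetical names, and the
actual READ_DATA and XREF_GET_DATA code may be organized quite
differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define DATA_ALLOC_INCREMENT 1000  /* value quoted in the 2.8 notes */

/* Hypothetical growable array of data values. */
typedef struct {
    double *rows;
    size_t n_used;       /* number of values stored */
    size_t n_allocated;  /* number of values there is room for */
} row_buffer;

/* realloc wrapper that tells the user when memory is exhausted
   instead of silently returning NULL. */
void *checked_realloc(void *ptr, size_t n_bytes, const char *caller)
{
    void *new_ptr = realloc(ptr, n_bytes);
    if (new_ptr == NULL && n_bytes > 0) {
        fprintf(stderr, "ERROR: %s: out of memory requesting %lu bytes\n",
                caller, (unsigned long) n_bytes);
        exit(EXIT_FAILURE);
    }
    return new_ptr;
}

/* Append one value, growing by DATA_ALLOC_INCREMENT when full. */
void row_buffer_append(row_buffer *buf, double value)
{
    assert(buf != NULL);
    if (buf->n_used == buf->n_allocated) {
        buf->n_allocated += DATA_ALLOC_INCREMENT;
        buf->rows = checked_realloc(buf->rows,
                                    buf->n_allocated * sizeof *buf->rows,
                                    "row_buffer_append");
    }
    buf->rows[buf->n_used++] = value;
}

/* Release the storage once the data has been copied elsewhere. */
void row_buffer_free(row_buffer *buf)
{
    assert(buf != NULL);
    free(buf->rows);
    buf->rows = NULL;
    buf->n_used = 0;
    buf->n_allocated = 0;
}
```

A larger increment means fewer realloc calls when reading a large
dataset, at the cost of up to one increment of unused space.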

For information on how to get this public domain software, see the following
WWW page,
   http://ic-www.arc.nasa.gov:
     /ic/projects/bayes-group/group/autoclass/autoclass-c-program.html
or send e-mail to taylor@ptolemy.arc.nasa.gov

<<<------------------------------------------------------------------->>>
--
Will Taylor > RECOM Technologies, Computational Sciences Div., Code IC
NASA Ames Research Center - voice:(415)604-3364, fax:(415)604-3594
MS 269-2, Moffett Field, CA 94035-1000  taylor@ptolemy.arc.nasa.gov
