        Article: 6792 in comp.ai
        Date: Tue, 18 Jan 1994 16:28:39 GMT
        Organization: CWI, Amsterdam
        Subject: Data Mining report available
        From: marcel@cwi.nl (Marcel Holsheimer)

The following report can be obtained by ftp:

_____________________________________________________________________

                            DATA MINING

                The Search for Knowledge in Databases

                    Marcel Holsheimer, Arno Siebes


                              Abstract
Data mining is the search for relationships and global patterns that
exist in large databases, but are `hidden' among the vast amounts of
data, such as a relationship between patient data and their medical
diagnosis. These relationships represent valuable knowledge about the
database and the objects in it and, if the database is a faithful
mirror, about the real world it registers.

One of the main problems for data mining is that the number of
possible relationships is very large, thus prohibiting the search for
the correct ones by simply validating each of them. Hence, we need
intelligent search strategies, as taken from the area of machine
learning.

Another important problem is that information in data objects is often
corrupted or missing. Hence, statistical techniques should be applied
to estimate the reliability of the discovered relationships.

The report provides a survey of current data mining research. It
presents the main underlying ideas, such as inductive learning, and
the search strategies and knowledge representations used in data
mining systems. Furthermore, it describes the most important problems
and their solutions, and provides a survey of research projects.

CR subject classification (1991):
Database applications (H.2.8),
Information search and retrieval (H.3.3),
Learning (I.2.6) concept learning, induction, knowledge acquisition,
Clustering (I.5.3)

keywords: database applications, machine learning, inductive learning,
knowledge acquisition, data summarization
_____________________________________________________________________
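
To give a feel for the abstract's remark that the number of possible
relationships is too large to validate one by one, here is a rough
counting sketch. The rule language is an assumption made for
illustration only and is not taken from the report: a candidate
relationship is a conjunction of attribute = value tests, where each
attribute is either fixed to one of its values or left unconstrained.
The function name and the example figures (20 attributes, 5 values
each) are likewise hypothetical.

```python
def count_conjunctive_rules(values_per_attribute):
    """Count conjunctions of attribute = value tests.

    Assumed rule language (illustrative only): each attribute is either
    constrained to exactly one of its k values or left unconstrained,
    giving k + 1 choices per attribute.
    """
    total = 1
    for k in values_per_attribute:
        total *= k + 1
    return total


# Hypothetical table: 20 attributes with 5 possible values each.
print(count_conjunctive_rules([5] * 20))
```

Even this restricted language yields 6**20 candidates for a modest
table, which is why the abstract argues for guided search rather than
exhaustive validation.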

The report can be obtained by anonymous ftp:

& ftp ftp.cwi.nl
Name: ftp
331 Guest login ok, send ident (your e-mail address) as password.
Password:
ftp> binary
ftp> cd pub/CWIreports/AA
ftp> get CS-R9406.ps.Z
ftp> bye
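
For those who prefer a script to an interactive session, the steps
above can be sketched with Python's standard ftplib module. This is a
minimal sketch, not a tested recipe for this server: the host,
directory and file name are copied from the session above and may not
stay valid, the function names are hypothetical, and the e-mail
address is a placeholder to be replaced with your own.

```python
from ftplib import FTP

HOST = "ftp.cwi.nl"               # from the session above; availability is assumed
REMOTE_DIR = "pub/CWIreports/AA"  # from the session above
FILENAME = "CS-R9406.ps.Z"        # from the session above


def fetch_report(ftp, remote_dir=REMOTE_DIR, filename=FILENAME, local_path=None):
    """Fetch one file in binary mode using an already logged-in FTP object."""
    local_path = local_path or filename
    ftp.cwd(remote_dir)                       # ftp> cd pub/CWIreports/AA
    with open(local_path, "wb") as out:
        # retrbinary transfers in binary (image) mode, like "ftp> binary"
        ftp.retrbinary("RETR " + filename, out.write)
    return local_path


def main(email="you@example.org"):
    """Anonymous login: user 'ftp', e-mail address as password (placeholder)."""
    with FTP(HOST) as ftp:                    # the connection is closed on exit
        ftp.login(user="ftp", passwd=email)
        fetch_report(ftp)
```

Calling main() with your own address performs the whole transfer. The
resulting file is a compressed PostScript file, so it still needs to
be uncompressed before viewing or printing.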

________________________________________________________________________
Marcel Holsheimer     | Centre for Mathematics and Computer Science (CWI)
phone +31 20 592 4134 | Kruislaan 413, Amsterdam, The Netherlands

