Newsgroups: comp.ai,sci.stat.math,comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!nntp.sei.cmu.edu!news.psc.edu!hudson.lm.com!news.math.psu.edu!psuvax1!news.eecs.nwu.edu!newsfeed.acns.nwu.edu!news.luc.edu!chi-news.cic.net!uwm.edu!fnnews.fnal.gov!usenet.eel.ufl.edu!tank.news.pipex.net!pipex!news.mathworks.com!newsfeed.internetmci.com!news.sprintlink.net!siemens!murthy
From: murthy@scr.siemens.com (Sreerama Murthy)
Subject: PhD thesis: On Building Better Decision Trees from Data
Message-ID: <DHHoIp.CHu@scr.siemens.com>
Sender: news@scr.siemens.com (NeTnEwS)
Nntp-Posting-Host: sunra.scr.siemens.com
Organization: Siemens Corporate Research, Princeton NJ
Date: Fri, 3 Nov 1995 22:44:48 GMT
Lines: 51
Xref: glinda.oz.cs.cmu.edu comp.ai:34494 sci.stat.math:7758 comp.ai.neural-nets:27753

The following PhD thesis is now available on WWW and FTP. You may
retrieve the PostScript files for the whole thesis or for just the
chapter(s) of interest. There is also an HTML version that allows
retrieval of individual subsections.

Title and Abstract:
-------------------

             On Growing Better Decision Trees from Data

  			Sreerama K. Murthy
	Department of Computer Science, Johns Hopkins University
		Thesis Advisor: Steven L. Salzberg


  This thesis investigates the problem of growing decision trees from
  data, for the purposes of classification and prediction.

  After a comprehensive, multi-disciplinary survey of work on decision
  trees, some algorithmic extensions to existing tree growing methods
  are considered. The implications of using (1) less greedy search and
  (2) less restricted splits at tree nodes are systematically studied.
  Extending the traditional axis-parallel splits to {\it oblique}
  splits is shown to be practical and beneficial for a variety of
  problems.  However, the use of more extensive search heuristics than
  the traditional greedy heuristic is argued to be unnecessary, and
  often harmful.

  Any effort to build good decision trees from real-world data
  involves ``massaging'' the data into a suitable form.  Two forms of
  data massaging, domain-independent and domain-specific, are
  distinguished in this work. A new framework is outlined for the
  former, and the importance of the latter is illustrated in the
  context of two new, complex classification problems in astronomy.
  Highly accurate and small decision tree classifiers are built for
  both these problems through a collaborative effort with astronomers.
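
Illustration (not taken from the thesis): the abstract contrasts
axis-parallel splits, which test one attribute against a threshold,
with oblique splits, which test a linear combination of attributes.
The Python sketch below shows only that difference on a made-up
two-dimensional data set. The function names, the toy points, and the
hand-picked hyperplane weights are all assumptions for illustration;
no tree-growing or search procedure from the thesis is reproduced.

```python
from typing import Sequence


def axis_parallel_split(x: Sequence[float], feature: int, threshold: float) -> bool:
    """Axis-parallel test: compare a single attribute with a threshold."""
    return x[feature] <= threshold


def oblique_split(x: Sequence[float], weights: Sequence[float], threshold: float) -> bool:
    """Oblique test: compare a linear combination of attributes with a threshold."""
    return sum(w * xi for w, xi in zip(weights, x)) <= threshold


def count_correct(predictions: Sequence[bool], labels: Sequence[bool]) -> int:
    """Number of predictions that agree with the true labels."""
    return sum(p == y for p, y in zip(predictions, labels))


# Made-up data: the class boundary is the diagonal line x0 + x1 = 1.
points = [(0.2, 0.3), (0.1, 0.7), (0.6, 0.2), (0.9, 0.4), (0.5, 0.8), (0.3, 0.9)]
labels = [p[0] + p[1] <= 1.0 for p in points]

# Best single axis-parallel split, trying every attribute and every
# observed value as a threshold.
best_axis = max(
    count_correct([axis_parallel_split(p, f, t) for p in points], labels)
    for f in range(2)
    for t in sorted({p[f] for p in points})
)

# One oblique split with hand-picked weights (1, 1) and threshold 1.
oblique_correct = count_correct(
    [oblique_split(p, (1.0, 1.0), 1.0) for p in points], labels
)

print("best axis-parallel split:", best_axis, "of", len(points), "correct")
print("oblique split:           ", oblique_correct, "of", len(points), "correct")
```

On this toy data no single axis-parallel test separates the two
classes, while one oblique test does. An axis-parallel tree would need
several nodes to approximate the same diagonal boundary.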


To retrieve:
------------

World-Wide-Web: http://www.cs.jhu.edu/grad/murthy

FTP: Anonymous ftp to blaze.cs.jhu.edu. Directory pub/murthy.
     The file thesis.ps.gz is the whole thesis.
     For individual chapters, first get contents.ps.gz, which has the
     table of contents. The directory contains a PostScript file for
     each chapter. The filenames should be self-explanatory.
     Don't forget to use binary transfer mode!
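
For those who would rather script the retrieval, here is a minimal
Python sketch using the standard ftplib module. The host, directory,
and file names are the ones given above and may no longer be
reachable. The function name fetch and its ftp_factory parameter are
hypothetical, added only so the sketch can be tried without a live
server. ftplib's retrbinary switches the connection to binary (TYPE I)
mode before transferring, which is what keeps the .gz files intact.

```python
from ftplib import FTP

HOST = "blaze.cs.jhu.edu"   # from the announcement; availability not guaranteed
DIRECTORY = "pub/murthy"    # from the announcement


def fetch(filename, host=HOST, directory=DIRECTORY, ftp_factory=FTP):
    """Download one file by anonymous FTP in binary mode.

    ftp_factory is a hypothetical hook: any callable that takes a host
    name and returns an object with the ftplib.FTP methods used below.
    """
    ftp = ftp_factory(host)
    try:
        ftp.login()            # no arguments means anonymous login
        ftp.cwd(directory)
        with open(filename, "wb") as out:
            # retrbinary sets binary mode, then streams blocks to out.write
            ftp.retrbinary("RETR " + filename, out.write)
    finally:
        ftp.quit()
    return filename
```

Calling fetch("contents.ps.gz") would save the table of contents in
the current directory; fetch("thesis.ps.gz") would save the whole
thesis.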


