		 Machine Learning List: Vol. 1 No. 5
			 Saturday,  August 5, 1989

Contents:
	ML91
        SRI-NIC Name Database for U.S. Internet Users
	Experimentation
	Notes:  Reviewers of IJCAI-89 & CogSci sessions for ML-LIST solicited

		
The Machine Learning List is moderated.  Contributions should be relevant to
the scientific study of machine learning. Mail contributions to ml@ics.uci.edu.
Mail requests to be added or deleted to ml-request@ics.uci.edu.


----------------------------------------------------------------------
Date: Mon, 31 Jul 89 13:28:44 CDT
From: "B. Porter and R. Mooney" <ml90@cs.utexas.EDU>
Subject: ML91

A decision on the location of the Machine Learning Workshop for 1991
will be made soon.  If you are interested in hosting this meeting, you 
should send a bid to either Jaime Carbonell (carbonell@nl.cs.cmu.edu) 
or Bruce Porter and Ray Mooney (ml90@cs.utexas.edu).  Be sure to include:

  1. Institution/dept
  2. Proposed conference site and dates
  3. Chairperson & local committee
  4. ML work in the institution & geographical area
  5. Industrial support (if any)
  6. Accessibility/housing cost of location
  7. Willingness to cooperate openly with ML community in organization

A couple of bids have already been submitted and a final decision will
be made at IJCAI-89.

----------------------------------------------------------------------
Subject: SRI-NIC Name Database for U.S. Internet users
Date: Mon, 31 Jul 89 16:40:37 -0700
From: "David W. Aha" <aha@ICS.UCI.EDU>


I frequently need to look up the e-mail addresses of U.S. researchers in our
community and often turn to the SRI-NIC name database.  Unfortunately, their
names are seldom there.  So I'd like to encourage you to register
yourselves.  Registering means that you send mail to:

	registrar@sri-nic.arpa

and tell them your full name, U.S. mail address, phone number (optional),
and (of course) e-mail address.  If you do this, other people on the net
will be able to query the database and find out how to send you e-mail.
At UCI the command is ``whois.''  Try the following command, and, if you
want researchers elsewhere to be able to reach you, please register.

        % whois aha

Thanks,
    David Aha


----------------------------------------------------------------------
Subject: Experimentation
Date: Sat, 05 Aug 89 10:52:44 -0700
From: Michael Pazzani <pazzani@ICS.UCI.EDU>
Message-ID:  <8908051052.aa18584@ICS.UCI.EDU>


In ML-LIST 1.1, Bernd Nordhausen writes:
>I am interested to hear from other people what they think about the
>subject of experimentation, so let the flames roll.

I feel that experimentation is being overemphasized in current machine
learning research, to the extent that it is often misapplied.  First, I'll
start with some good points about experimentation:
   1.  Experimentation forces the researcher to test his program and
       theory under a wide variety of circumstances, and with examples
       in many different presentation orders.   (In an early version
       of OCCAM, I tested the economic sanctions database in chronological
       order and didn't find several bugs in the program until I tested
       with random orders.)
   2.  Intuitive arguments and anecdotal evidence aren't a firm foundation
       upon which to build future results.   (After reading about ABSTRIPS,
       I thought that searching in an abstract space would reduce search costs.
       More recent work has shown that this result does not generalize to many
       problems.)

I do have several concerns, however:
   1.  Currently, in ML, experimentation is done poorly.  First, many
       experiments are not run to test a particular hypothesis.  Instead,
       they are exploratory post hoc data analyses.   It is much easier to
       establish the statistical significance of results when testing a
       particular hypothesis.   This leads me to my second point: too few
       people are worrying about the statistical significance of their
       results.  I worry that many results may not replicate.
   2.  Experimentation on "real-world" data has little scientific value.  It
       makes nice PR to show people outside of machine learning that one
       program can perform slightly better than another on the soybean data,
       but this alone does not help us understand why one algorithm performs
       better than another.  Experimentation on artificial domains, in which
       the complexity of the hypothesis and other characteristics of the data
       are known and can be systematically varied, is more useful in
       understanding our algorithms.   (Of course, "real-world" data sets are
       not useless; they are very useful in pointing out research topics, etc.)
   3.  Overemphasis on performance measures obscures analysis of why algorithms
       work or fail to work.  An algorithm that cannot learn x-or can
       perform very well even if the "correct" hypothesis requires an x-or.
       Many experiments confound the class of situations on which an algorithm
       will fail with how often that situation occurs in a given data set.  The
       former is more important to machine learning than the latter.
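A minimal sketch of the third point, in modern Python (my illustration, not
part of the original post): a majority-class predictor stands in for any
learner that cannot represent x-or at all, yet it still scores well on a data
set in which the examples that actually exercise the x-or structure are rare.

```python
import random

random.seed(0)

# Target concept: label = x1 XOR x2 (not linearly separable).
# When xor_fraction is small, most examples are the trivial (0, 0) -> 0,
# and only a few actually exercise the x-or structure.
def make_dataset(n, xor_fraction):
    data = []
    for _ in range(n):
        if random.random() < xor_fraction:
            x1, x2 = random.randint(0, 1), random.randint(0, 1)
        else:
            x1, x2 = 0, 0
        data.append(((x1, x2), x1 ^ x2))
    return data

# A learner that always predicts the majority class of its training set:
# it cannot represent x-or, so its accuracy depends only on how often
# the x-or cases occur in the data.
def majority_learner(train):
    ones = sum(label for _, label in train)
    prediction = 1 if ones > len(train) / 2 else 0
    return lambda x: prediction

def accuracy(classifier, test):
    return sum(classifier(x) == y for x, y in test) / len(test)

rare = make_dataset(2000, xor_fraction=0.05)
common = make_dataset(2000, xor_fraction=1.0)

clf_rare = majority_learner(rare[:1000])
clf_common = majority_learner(common[:1000])

print("accuracy, x-or cases rare:  ", accuracy(clf_rare, rare[1000:]))
print("accuracy, x-or cases common:", accuracy(clf_common, common[1000:]))
```

The same broken learner looks excellent on the first data set and no better
than chance on the second, which is exactly the confounding described above:
the class of situations on which it fails is fixed, but the measured
performance depends on how often those situations occur.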


----------------------------------------------------------------------
	
Subject: Reviewers of IJCAI-89 & CogSci sessions for ML-LIST solicited
From: ml-request <ml-request@ICS.UCI.EDU>

People willing to write a short review for ML-LIST of sessions at the
upcoming CogSci & IJCAI conferences are solicited.   If you want to
comment on 3 or 4 papers, send your name and the session you'd like to
comment on to ml-request@ics.uci.edu.

----------------------------------------------------------------------
END of ML-LIST 1.5


