Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!gatech!news.sprintlink.net!news-peer.sprintlink.net!cs.utexas.edu!math.ohio-state.edu!howland.erols.net!newsfeed.internetmci.com!in3.uu.net!nntp.newsfirst.com!nntp.crosslink.net!news.magicnet.net!news.sprintlink.net!news-atl-21.sprintlink.net!interpath!news.interpath.net!news.interpath.net!sas!newshost.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Q: training set distribution
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <E0HIF8.J83@unx.sas.com>
Date: Thu, 7 Nov 1996 05:11:32 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References: <Pine.SOL.3.91.961105003258.6440G-100000@miles> <marzban.847296933@particle>
Organization: SAS Institute Inc.
Lines: 55


In article <marzban.847296933@particle>, marzban@particle.nhn.uoknor.edu (Caren Marzban) writes:
|> Greg Heath <heath@ll.mit.edu> writes:
|> 
|> >The way I see it there are four important sets of prior probabilities:
|> >   1. The population prior: The ratio that the category in question
|> >      occurs in the general population.
|> >   2. The operational prior: The ratio that the category in question
|> >      occurs during operation of the NN.
|> >   3. The design prior: The ratio that is used during the design of the
|> >      NN
|> >   4. The sample prior: The ratio that occurs in the sampled data used
|> >      for design.
|> 
|> 
|> True! But I've been wondering, lately, about another "type" of
|> a-priori probability. Note that the four types that you mention are
|> all ratios (hinting at the frequentist approach to probability).
|> However, since the a-priori probabilities in Bayes' Thm are
|> unspecified quantities anyway, why not determine their values
|> according to some other criterion, e.g. minimization of some measure
|> of error? In other words, why not treat the a-priori probabilities of
|> belonging to the classes as free parameters, independent of the
|> class sizes (or ratios), and simply set them to values that minimize
|> some measure of classification error (based, of course, on posterior
|> probabilities)? 
|> 
|> It's easy to show that for some measures of error, these optimal values
|> of the a-priori probabilities are equal to ratios of the respective
|> class sizes. However, I've shown that for other measures of error
|> (or more precisely, measures of skill), the optimal values of a-priori
|> probabilities occur far from the values of the class-size ratios. 
|> 
|> I do have one paper on this, but the referees failed to even notice my
|> suggestion of treating a-priori probability as a free parameter to be
|> determined from some other criterion; instead they picked on things
|> that I could address very easily. So, I'm hungry for some critical 
|> response. Any ideas?

Well, I'm baffled.  Most measures of classification error are weighted
averages of class-conditional errors. The weights in these weighted
averages should be the prior probabilities of the classes (during
operation of the network--Greg's prior #2). But such an error measure is
linear in those weights, so treating the prior probabilities as free
parameters and minimizing over them drives the solution to a vertex of
the simplex: the class with the lowest conditional error receives a
weight of 1 and all the other classes receive a weight of 0. This result
seems useless. More details, please?
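The degenerate solution described above is easy to check numerically. A
minimal sketch (the class-conditional error rates and the grid search are
purely illustrative, not from any real classifier):

```python
# Sketch: the overall classification error as a weighted average of
# class-conditional errors, minimized over the weights themselves.
# The per-class error rates below are made up for illustration.

def overall_error(priors, cond_errors):
    """Weighted average of class-conditional error rates."""
    return sum(p * e for p, e in zip(priors, cond_errors))

cond_errors = [0.10, 0.25, 0.40]  # hypothetical per-class error rates

# Brute-force search over a grid on the probability simplex.
n = 20  # grid resolution: prior values in steps of 1/20
best_err, best_priors = None, None
for i in range(n + 1):
    for j in range(n + 1 - i):
        p = (i / n, j / n, (n - i - j) / n)
        err = overall_error(p, cond_errors)
        if best_err is None or err < best_err:
            best_err, best_priors = err, p

print(best_priors, best_err)  # -> (1.0, 0.0, 0.0) 0.1
```

Since the objective is linear in the priors, the minimum over the
simplex always lands on a vertex, whatever the grid resolution: all the
"prior" mass goes to the class with the lowest conditional error.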




-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
