Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!nntp.club.cc.cmu.edu!eecs-usenet-02.mit.edu!nntprelay.mathworks.com!europa.clark.net!206.229.87.25!news-peer.sprintlink.net!news-pull.sprintlink.net!news-in-east.sprintlink.net!news.sprintlink.net!Sprint!199.72.1.20!interpath!news.interpath.net!news.interpath.net!sas!newshost.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Unbalanced classes revisited
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <EED0Mq.9zu@unx.sas.com>
Date: Sun, 3 Aug 1997 22:47:14 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References:  <EDzwCw.857@unx.sas.com>
Organization: SAS Institute Inc.
Lines: 47


I wrote:
|> ...
|> There are several ways to twiddle the prior probabilities:
|>  1) Make multiple copies of training cases
|>  2) Weight the error function for each case
|>  3) Adjust the output biases when using a softmax activation function
|>  4) Adjust the posterior probabilities (divide by the design prior,
|>     multiply by the operational prior, renormalize to sum to one)

Greg replied regarding (3):
|> I think Warren included the softmax constraint to insure positive, unit-
|> sum posteriors. However, after the "Posterior Estimation Validity"
|> thread, I think it is accepted that, with the appropriate assumptions
|> w.r.t. God, mother, and country, linear and logistic activations can be
|> used.
|> 
|> Anyway, if the targets are ideal {0,1} posteriors, the average output
|> over the design set yields the design priors. Obviously, twiddling with
|> the output biases can result in a valid prior transformation

I think the algebra only works with softmax, for which (3) and (4) are
algebraically equivalent. For (3), you add to each output bias the log
of the factor used in (4), i.e.:

                              operational prior
   new bias = old bias + log( ----------------- )
                                design prior

If there is any activation function other than softmax for which this
works exactly, please let me know.
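A quick numeric check of the equivalence (not from the original post; the
logits and priors below are made up for illustration): shifting the softmax
biases by log(operational prior / design prior), as in (3), gives the same
posteriors as rescaling and renormalizing them, as in (4).

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())   # subtract max for numerical stability
    return e / e.sum()

logits = np.array([1.2, -0.3, 0.5])       # pre-softmax net outputs (example values)
design = np.array([0.5, 0.3, 0.2])        # class priors in the training (design) set
operational = np.array([0.1, 0.6, 0.3])   # class priors in operation

# Method (3): add log(operational/design) to the output biases
p3 = softmax(logits + np.log(operational / design))

# Method (4): rescale the posteriors by operational/design, renormalize
p = softmax(logits)
p4 = p * operational / design
p4 /= p4.sum()

assert np.allclose(p3, p4)   # the two adjustments agree exactly
```

The equivalence is just exp(z + log r) = r * exp(z): the per-class factor
r passes through the softmax numerator, and the shared denominator is the
renormalization in (4).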

|> In fact, I think I remember Warren suggesting the use of fixed (i.e.,
|> nonlearnable) output biases equal to the design priors with linear and
|> tanh output and hidden node activations, respectively. Then it would
|> seem that making a prior transformation is trivial.

Wasn't me. I don't think fixed output biases would work right unless
the hidden units had fixed means.


-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
* Do not send me unsolicited commercial, political, or religious email *
