Newsgroups: comp.ai.neural-nets
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: More questions about Sigmoid function
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <DxL7x2.DnL@unx.sas.com>
Date: Wed, 11 Sep 1996 21:34:14 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References: <50jrj3$kjd@hecate.umd.edu> <50vn24$s5p@llnews.ll.mit.edu> <5101ai$lcn@delphi.cs.ucla.edu>
Organization: SAS Institute Inc.
Lines: 51


In article <5101ai$lcn@delphi.cs.ucla.edu>, edwin@cs.ucla.edu (E. Robert Tisdale) writes:
|> ...
|> I think Jian-Zheng Zhou is referring to my assertion that sigmoidal functions
|> are only VALID for binary ({0, 1}, {-1, +1}, etc.) outputs.  Warren Sarle
|> took exception to my remark and seemed to imply that he thought that sigmoidal
|> output functions were valid estimators of probability!

Yes, sigmoid output activation functions are routinely used by
statisticians to estimate probabilities, and I am frankly baffled
by Bob's objection to such usage. 

In logistic regression and similar methods with other types of sigmoids,
the form of the sigmoid function is part of the model specification and
should be validated, just as other parts of the model specification such
as linearity and independence are validated. The logistic function is a
particularly useful variety of sigmoid function, since it is the form
taken by the posterior probability in a discriminant analysis of two
multivariate normal populations with equal covariance matrices, and it
has useful interpretations in terms of log-odds. The inverse of the
logistic function is also the canonical link function for a binomial
distribution. So from a statistical point of view, a logistic function
is the most obvious output activation function to use for estimating
probabilities.
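
As a rough numerical check of the discriminant-analysis point above, the
following Python sketch compares the posterior probability computed
directly from Bayes' rule with a logistic function of a linear function
of x, for two bivariate normal populations sharing one covariance
matrix. The means, covariance matrix, prior, and all function names are
illustrative assumptions, not taken from any of the references.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def inv2(S):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = S
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def quad(u, P, v):
    """Bilinear form u' P v for 2-vectors u, v and a 2x2 matrix P."""
    return sum(u[i] * P[i][j] * v[j] for i in range(2) for j in range(2))

def log_density(x, mu, S):
    """Log density of a bivariate normal with mean mu and covariance S."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    diff = [x[0] - mu[0], x[1] - mu[1]]
    maha = quad(diff, inv2(S), diff)
    return -0.5 * (2.0 * math.log(2.0 * math.pi) + math.log(det) + maha)

def posterior_bayes(x, mu0, mu1, S, prior1):
    """P(class 1 | x) computed directly from Bayes' rule."""
    l1 = math.log(prior1) + log_density(x, mu1, S)
    l0 = math.log(1.0 - prior1) + log_density(x, mu0, S)
    return 1.0 / (1.0 + math.exp(l0 - l1))

def posterior_logistic(x, mu0, mu1, S, prior1):
    """P(class 1 | x) written as logistic(w'x + b)."""
    P = inv2(S)
    delta = [mu1[0] - mu0[0], mu1[1] - mu0[1]]
    linear = quad(x, P, delta)                       # w'x, w = S^-1 (mu1 - mu0)
    bias = (-0.5 * (quad(mu1, P, mu1) - quad(mu0, P, mu0))
            + math.log(prior1 / (1.0 - prior1)))     # log prior odds enter here
    return logistic(linear + bias)

# Hypothetical populations, chosen only for illustration.
MU0 = [0.0, 0.0]
MU1 = [1.5, -0.5]
COV = [[2.0, 0.3], [0.3, 1.0]]
PRIOR1 = 0.3
POINTS = [[0.0, 0.0], [1.0, 1.0], [-2.0, 0.5], [3.0, -1.0]]

if __name__ == "__main__":
    for x in POINTS:
        print(x, posterior_bayes(x, MU0, MU1, COV, PRIOR1),
              posterior_logistic(x, MU0, MU1, COV, PRIOR1))
```

The two columns should agree to within rounding error. The quadratic
terms in x cancel only because the covariance matrices are equal; with
unequal covariances the log-odds are quadratic in x, and a logistic
function of a linear predictor is no longer the exact posterior.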

In MLPs, the form of the output activation function is much less
critical due to the universal approximation property of MLPs. It is often
convenient to use a logistic function for the log-odds interpretation,
but many other sigmoid functions will work just as well when you are not
dealing with such convenient situations as multivariate normal
distributions. Non-sigmoidal output activation functions, such as
Gaussians, can also be used to estimate probabilities. It is convenient
to use an output activation function with a range of (0,1) to keep the
log likelihood finite, but this is by no means necessary.
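
To illustrate the remark about keeping the log likelihood finite, here
is a small Python sketch that evaluates a Bernoulli log likelihood for
two output activation functions: a logistic, whose range is the open
interval (0,1), and a Gaussian exp(-z^2), whose range (0,1] includes 1.
The net inputs, targets, and function names are made up for the
example.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def gaussian(z):
    """Gaussian output activation; equals exactly 1 at z = 0."""
    return math.exp(-z * z)

def safe_log(p):
    """log(p), with log(0) treated as -infinity instead of raising."""
    return math.log(p) if p > 0.0 else float("-inf")

def bernoulli_log_likelihood(targets, probs):
    """Sum of y*log(p) + (1-y)*log(1-p) over cases with 0/1 targets."""
    total = 0.0
    for y, p in zip(targets, probs):
        total += safe_log(p) if y == 1 else safe_log(1.0 - p)
    return total

# Hypothetical net inputs and binary targets for three cases.
NET_INPUTS = [0.0, 2.0, -1.5]
TARGETS = [0, 1, 0]

LL_LOGISTIC = bernoulli_log_likelihood(
    TARGETS, [logistic(z) for z in NET_INPUTS])
LL_GAUSSIAN = bernoulli_log_likelihood(
    TARGETS, [gaussian(z) for z in NET_INPUTS])

if __name__ == "__main__":
    print("logistic output:", LL_LOGISTIC)
    print("Gaussian output:", LL_GAUSSIAN)
```

In this made-up data the first case has target 0 but the Gaussian
output is exactly 1, so its log likelihood is minus infinity, while the
logistic output gives a finite value. A closed endpoint is not always
fatal, though: the other two cases contribute finite terms under the
Gaussian output, and the infinity appears only when an output of
exactly 0 or 1 contradicts the observed target. Note also that in
floating-point arithmetic a logistic output can still round to exactly
0 or 1 for net inputs of large magnitude.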

References:

   McCullagh, P. and Nelder, J.A. (1989) Generalized Linear Models,
   2nd ed., London: Chapman & Hall.

   Jordan, M.I. (1995) "Why the logistic function? A tutorial
   discussion on probabilities and neural networks",
   ftp://psyche.mit.edu/pub/jordan/uai.ps.Z



-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
