Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!news.mathworks.com!news.duke.edu!concert!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: K-Means distribution
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <CzoKJz.CEs@unx.sas.com>
Date: Tue, 22 Nov 1994 17:57:35 GMT
References: <kerog-1711941413540001@sectrl> <CzF8FB.7M0@acsu.buffalo.edu> <CzFMM5.8q3@unx.sas.com> <kerog-2111941942370001@sectrl>
Nntp-Posting-Host: hotellng.unx.sas.com
Organization: SAS Institute Inc.
Lines: 45


In article <kerog-2111941942370001@sectrl>, kerog@sp.isl.secom.co (Keith Rogers) writes:
|> In article <CzFMM5.8q3@unx.sas.com>, saswss@hotellng.unx.sas.com (Warren
|> Sarle) wrote:
|> ...
|> > |> Lets say you have n nodes or centres
|> > |> Initialise them to random values
|> >
|> > This type of initialization is prone to degenerate solutions--one or
|> > more clusters are likely to have zero members. It is better to choose a
|> > subset of the training cases as initial centers. This can be done
|> > randomly or systematically. The systematic algorithm used in the
|> > FASTCLUS procedure in the SAS/STAT product is guaranteed to produce a
|> > global optimum if the data contain well-separated clusters. For RBF
|> > applications, however, one would not often expect to have well-separated
|> > clusters.
|>
|> Thanks for the info, but could you please add a little more there?
|> What does one do instead for RBF applications?

I suspect that RBF results are fairly insensitive to the clustering
method as long as you avoid degeneracies and really bad local minima,
neither of which is likely to occur with any decent k-means algorithm.
You are simply trying to divide up the training cases into convenient
cells of roughly the same probability. To get a major improvement over
any of the usual k-means algorithms, you would need to take the target
values into account. But if you take the targets into account somehow
when forming the clusters, then it seems to me that you might as well go
all the way and train the RBF centers.
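To make the case-based initialization concrete, here is a minimal
sketch in Python (plain Lloyd-style k-means, not the FASTCLUS
algorithm; the evenly spaced "systematic" choice of initial cases is
just one illustration):

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd k-means.  `points` is a list of equal-length numeric
    tuples; assumes len(points) >= k."""
    # Initialize centers from a subset of the actual training cases
    # (here evenly spaced; a random subset works too).  Starting from
    # real cases means no cluster can begin empty.
    step = max(1, len(points) // k)
    centers = [list(points[i * step]) for i in range(k)]
    for _ in range(iters):
        # Assign each case to its nearest center (squared Euclidean).
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # Move each center to the mean of its cluster.
        for j, members in enumerate(clusters):
            if members:  # leave an (unlikely) empty cluster's center alone
                centers[j] = [sum(xs) / len(members)
                              for xs in zip(*members)]
    return centers
```

With two well-separated blobs this recovers the blob means, e.g.
`kmeans([(0,0),(0,1),(1,0),(1,1),(10,10),(10,11),(11,10),(11,11)], 2)`
gives centers at (0.5, 0.5) and (10.5, 10.5).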

Counterpropagation is one way of taking the targets into account.  You
can do essentially the same thing with k-means by clustering on both the
inputs and the target values simultaneously. The trouble with this
approach is that you need to weight the inputs and targets rather
carefully, so that the targets exert some influence but not so much that
any clusters are formed mainly on the basis of the targets.  I have
experimented with this a bit and have been unable to get it to work any
better than simply clustering the inputs.
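If you want to experiment with the joint clustering anyway, the
bookkeeping is trivial; the hard part is the weight.  A sketch
(hypothetical helper names; any k-means routine can then be run on the
concatenated cases):

```python
def joint_cases(inputs, targets, w):
    # Scale each target vector by w and append it to its input vector.
    # k-means on the result "sees" the targets, with w controlling how
    # much influence they exert relative to the inputs.
    return [tuple(x) + tuple(w * t for t in y)
            for x, y in zip(inputs, targets)]

def rbf_center(joint_center, n_inputs):
    # After clustering the joint cases, keep only the input part of
    # each center for use as an RBF center.
    return tuple(joint_center[:n_inputs])
```

For example, `joint_cases([(0.0, 1.0)], [(4.0,)], w=0.25)` yields
`[(0.0, 1.0, 1.0)]`; too large a `w` and the clusters form mainly on
the targets, too small and you are back to clustering the inputs alone.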


-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
