Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!nntp.club.cc.cmu.edu!news.duq.edu!newsfeed.pitt.edu!portc02.blue.aol.com!howland.erols.net!news.mathworks.com!newsgate.duke.edu!interpath!news.interpath.net!news.interpath.net!sas!newshost.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: classification with partly labelled data
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <E1nM7u.8Do@unx.sas.com>
Date: Fri, 29 Nov 1996 22:52:42 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References:  <329C2152.5591@esat.kuleuven.ac.be>
Organization: SAS Institute Inc.
Lines: 41


In article <329C2152.5591@esat.kuleuven.ac.be>, Yves Moreau <moreau@esat.kuleuven.ac.be> writes:
|> I have a somewhat unconventional classification task, and I do not know
|> how to handle it best. The problem goes as follows:
|> 
|> * The data belongs to two classes A and B.
|> 
|> * Class A is for normal behavior and class B is for abnormal behavior;
|> class A is much more probable than class B (class A is maybe 100 to
|> 10000 times more probable than class B, I do not know exactly). 
|> 
|> * I have a lot of unlabelled data U available (thus, the vast majority
|> of it is of class A, but not all). 
|> 
|> * But I also have a small data set L with data from class B only (maybe
|> 10000 times smaller than the unlabelled data set).
|> 
|> Thus, since I only have labelled data for one of the two classes, I
|> cannot directly do supervised learning. One possibility would be to
|> discard the small labelled data set and do purely unsupervised learning.
|> But I am not willing to do that if there is any way around it. 

See section 2.6, "Partially Classified Training Data", and subsequent
sections on pp. 37-46 in:

   McLachlan, G.J. (1992) Discriminant Analysis and Statistical Pattern
   Recognition, Wiley: NY.

The usual approach is to fit a mixture model in which the mixture
component is treated as known for each case in data set L and as
missing (to be estimated) for each case in data set U.
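
As a rough illustration only (this is not code from McLachlan's book),
here is a minimal Python sketch of EM for such a partly labelled
mixture. It assumes two univariate Gaussian components, and it assumes
that L was collected separately from U, so the prior probability of
class B is estimated from U alone. The function name
fit_partly_labelled_mixture, the starting value for the prior, and the
synthetic data are all made up for the example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def fit_partly_labelled_mixture(u, l_b, n_iter=100):
    """EM for a two-component 1-D Gaussian mixture (hypothetical helper).

    u   : unlabelled cases, component unknown
    l_b : labelled cases, known to come from component B
    Returns (pi_b, (mean_a, var_a), (mean_b, var_b)).
    """
    # Start B at the labelled cases and A at the unlabelled bulk.
    mean_b = sum(l_b) / len(l_b)
    var_b = max(sum((x - mean_b) ** 2 for x in l_b) / len(l_b), 1e-6)
    mean_a = sum(u) / len(u)
    var_a = max(sum((x - mean_a) ** 2 for x in u) / len(u), 1e-6)
    pi_b = 0.01  # assumed starting guess for the prior of class B

    for _ in range(n_iter):
        # E-step: posterior probability of B, for unlabelled cases only.
        # Labelled cases keep a fixed membership of 1 in B.
        r = []
        for x in u:
            pb = pi_b * normal_pdf(x, mean_b, var_b)
            pa = (1.0 - pi_b) * normal_pdf(x, mean_a, var_a)
            total = pa + pb
            r.append(pb / total if total > 0.0 else pi_b)

        # M-step: weighted moments; L contributes to B with weight 1.
        w_b = sum(r) + len(l_b)
        w_a = max(len(u) - sum(r), 1e-12)
        mean_b = (sum(ri * x for ri, x in zip(r, u)) + sum(l_b)) / w_b
        mean_a = sum((1.0 - ri) * x for ri, x in zip(r, u)) / w_a
        ss_b = sum(ri * (x - mean_b) ** 2 for ri, x in zip(r, u))
        ss_b += sum((x - mean_b) ** 2 for x in l_b)
        ss_a = sum((1.0 - ri) * (x - mean_a) ** 2 for ri, x in zip(r, u))
        var_b = max(ss_b / w_b, 1e-6)
        var_a = max(ss_a / w_a, 1e-6)
        # Prior of B from U only (assumes L is not a random sample
        # from the mixture); kept away from 0 and 1.
        pi_b = min(max(sum(r) / len(u), 1e-6), 1.0 - 1e-6)

    return pi_b, (mean_a, var_a), (mean_b, var_b)


# Synthetic example: A ~ N(0, 1), B ~ N(5, 1), about 2% of U from B.
rng = random.Random(0)
u = [rng.gauss(0.0, 1.0) for _ in range(4900)]
u += [rng.gauss(5.0, 1.0) for _ in range(100)]
l_b = [rng.gauss(5.0, 1.0) for _ in range(20)]

pi_b, (mean_a, var_a), (mean_b, var_b) = fit_partly_labelled_mixture(u, l_b)
print("estimated prior of B:", pi_b)
print("class A mean, variance:", mean_a, var_a)
print("class B mean, variance:", mean_b, var_b)
```

The only difference from ordinary unsupervised EM is that the cases in
L never pass through the E-step: their membership in B stays fixed at
1, so they anchor the B component while U determines the rest.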



-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
 *** Do not send me unsolicited commercial or political email! ***

