Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!howland.reston.ans.net!news.sprintlink.net!redstone.interpath.net!sas!mozart.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: Question:  Training data with unknown values.
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <D2xoM2.n4q@unx.sas.com>
Date: Tue, 24 Jan 1995 23:44:26 GMT
References: <3fgvqj$a5c@canopus.cc.umanitoba.ca> <3g3juk$9uc@aplcomm.jhuapl.edu>
Nntp-Posting-Host: hotellng.unx.sas.com
Organization: SAS Institute Inc.
Lines: 30


In article <3g3juk$9uc@aplcomm.jhuapl.edu>, randy@aplcorejhuapl.edu (Randall C. Poe) writes:
|> In article <3fgvqj$a5c@canopus.cc.umanitoba.ca>, umengbr0@cc.umanitoba.ca (Jonathan Thomas Engbrecht) writes:
|> |> I am working on a backprop NN where lines of input are being added twice daily.
|> |> Problem...  occasionally an input may be missing from the incoming data.  The software
|> |> (Brainmaker Professional) that I am using has no built in method of dealing with
|> |> missing data and with 200+ inputs it seems a shame to discard all of the incoming
|> |> information just because one is missing.  Although it may be possible to make a "best
|> |> guess" at the missing value, it seems to me there should be an easier way to do this.
|> |>
|>
|> Suggestion:  Assign a random value.

That would work for missing inputs if you make several copies of each
training case that has missing inputs and assign a different random
value (sampled from the distribution of the input variable) to each
copy. If you make c copies of a case, you also need to down-weight
each copy by giving it a weight of 1/c instead of 1, so that the case
as a whole still counts once. Instead of sampling randomly, you could
select values systematically, for example at quantiles of the input
distribution.
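
As a rough illustration of the copy-and-down-weight idea, here is a
minimal NumPy sketch. It is not something Brainmaker provides; the
function name expand_missing_inputs, the use of NaN to mark a missing
input, and the use of the observed values in each column as "the
distribution of the input variable" are all assumptions made for the
example. It also assumes every input has at least one observed value.

```python
import numpy as np

def expand_missing_inputs(X, c=5, systematic=False, rng=None):
    """Replace each case that has missing inputs (NaN) by c filled-in copies.

    Returns (X_new, weights, source), where weights is 1 for complete
    cases and 1/c for each copy, and source[k] is the row of X that
    row k of X_new came from (useful for repeating the targets).
    """
    rng = np.random.default_rng() if rng is None else rng
    X = np.asarray(X, dtype=float)
    # Observed values of each input, used as its empirical distribution.
    observed = [X[~np.isnan(X[:, j]), j] for j in range(X.shape[1])]
    # Mid-point quantile levels for the systematic option,
    # e.g. c = 4 gives 0.125, 0.375, 0.625, 0.875.
    levels = (np.arange(c) + 0.5) / c

    rows, weights, source = [], [], []
    for i, x in enumerate(X):
        missing = np.flatnonzero(np.isnan(x))
        if missing.size == 0:
            # Complete case: keep as is, with full weight.
            rows.append(x)
            weights.append(1.0)
            source.append(i)
            continue
        copies = np.tile(x, (c, 1))
        for j in missing:
            if systematic:
                # Quantiles of the input, shuffled so that two missing
                # inputs in the same case are not filled in lockstep.
                copies[:, j] = rng.permutation(np.quantile(observed[j], levels))
            else:
                # Random draws from the observed values of this input.
                copies[:, j] = rng.choice(observed[j], size=c, replace=True)
        rows.extend(copies)
        weights.extend([1.0 / c] * c)   # the c copies together count once
        source.extend([i] * c)
    return np.array(rows), np.array(weights), np.array(source)
```

The targets for the expanded data would then be y[source], and the
weights would multiply each case's error term during training, provided
the training software accepts case weights.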

If you have missing target values, you should just omit those
particular values from training; there is no need to omit the entire
case unless all of the targets are missing.
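
One way to omit individual missing targets, sketched here in NumPy
under the assumption of a sum-of-squares error and NaN as the marker
for a missing target (the function name masked_squared_error is made
up for the example), is to set the error to zero at those output
units so that nothing is back-propagated from them:

```python
import numpy as np

def masked_squared_error(outputs, targets):
    """Sum-of-squares error that skips missing (NaN) targets.

    Returns (loss, grad), where grad is d(loss)/d(outputs) and is
    exactly zero wherever the target is missing.
    """
    outputs = np.asarray(outputs, dtype=float)
    targets = np.asarray(targets, dtype=float)
    present = ~np.isnan(targets)
    # Substitute a dummy 0 for missing targets so no NaN enters the
    # arithmetic, then zero out the error at those positions.
    safe_targets = np.where(present, targets, 0.0)
    errors = np.where(present, outputs - safe_targets, 0.0)
    loss = 0.5 * np.sum(errors ** 2)
    return loss, errors
```

A case whose targets are all missing contributes nothing to either the
loss or the gradient, which matches dropping that case entirely.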

-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
