Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!bb3.andrew.cmu.edu!newsfeed.pitt.edu!scramble.lm.com!news.math.psu.edu!news.cse.psu.edu!uwm.edu!math.ohio-state.edu!howland.reston.ans.net!swrinde!newsfeed.internetmci.com!in2.uu.net!news.interpath.net!sas!newshost.unx.sas.com!saswss
From: saswss@hotellng.unx.sas.com (Warren Sarle)
Subject: Re: testing < training
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <Dr99JA.420@unx.sas.com>
Date: Sat, 11 May 1996 19:16:22 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
References:  <4msn2k$1md@mbox.wins.uva.nl>
Organization: SAS Institute Inc.
Keywords: FFNN,Levenberg-Marquardt,Rolling Force pred.
Lines: 40


Please type carriage returns!!

In article <4msn2k$1md@mbox.wins.uva.nl>, dbrouwer@fwi.uva.nl (Dennis Brouwer) writes:
|> ...
|> To my big surprise I noticed that the SSE on a training set was
|> almost always higher than on a test set. Of course the first thing I
|> thought was that I must have reversed the two quantities in my
|> program, but I couldn't find any bugs. Of course, I was expecting the
|> generalisation on the test set to be slightly worse, or maybe equal,
|> and sometimes maybe better than the training error, but this happens
|> to me every time in 84 experiments with different quantities.
|> 
|> So, I was wondering: has anybody else ever experienced this kind of
|> thing when training neural networks? Or could somebody come up with
|> an explanation of how this strange phenomenon might occur, except for
|> a bug in the software?
|> 
|> Some details: I have 2 inputs, one output and 17000 samples.
|>               I select 4000 random samples from the 17000.
|>               That leaves 13000 samples, and I take the first
|>               12000 of those to check the generalisation.

SSE (Sum of Squared Errors) depends on the number of cases. When you
are comparing results from data sets with different numbers of cases,
you should use the average squared error (SSE/number of target values)
or mean squared error (SSE/degrees of freedom).
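A quick sketch of the point (with made-up residuals, not your data): two
data sets with the same per-case error can have very different SSEs
simply because one set is larger, while the average squared error stays
comparable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical residuals: a 4000-case training set and a 12000-case
# test set drawn from the SAME per-case error distribution.
train_err = rng.normal(scale=1.0, size=4000)
test_err = rng.normal(scale=1.0, size=12000)

# Raw SSE grows with the number of cases, so the larger test set
# gets the larger SSE even though per-case accuracy is identical.
sse_train = np.sum(train_err ** 2)
sse_test = np.sum(test_err ** 2)
print(sse_train, sse_test)

# Average squared error (SSE / number of target values) removes the
# dependence on sample size; both values sit near the true variance.
ase_train = sse_train / train_err.size
ase_test = sse_test / test_err.size
print(ase_train, ase_test)
```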

But if I interpret the above figures correctly, you have more cases
in the test set (12000) than in the training set (4000). So if the
test SSE is less than the training SSE, the test ASE must be less than
1/3 of the training ASE. If that happens consistently, something weird
is going on, but I have no idea what.
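The 1/3 bound is just arithmetic on the sample sizes; with hypothetical
SSE values (the 900 and 890 below are invented for illustration):

```python
n_train, n_test = 4000, 12000

# Suppose, as reported, the test SSE comes out below the training SSE.
sse_train = 900.0  # hypothetical training SSE
sse_test = 890.0   # hypothetical test SSE, slightly smaller

ase_train = sse_train / n_train
ase_test = sse_test / n_test

# Because n_test = 3 * n_train and sse_test < sse_train, dividing by
# the larger n forces ase_test below one third of ase_train.
print(ase_test < ase_train / 3)  # True
```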


-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
