Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!nntp.club.cc.cmu.edu!goldenapple.srv.cs.cmu.edu!das-news2.harvard.edu!fas-news.harvard.edu!oitnews.harvard.edu!news.dfci.harvard.edu!camelot.ccs.neu.edu!news.mathworks.com!newsgate.duke.edu!interpath!news.interpath.net!news.interpath.net!sas!newshost.unx.sas.com!hotellng.unx.sas.com!saswss
From: saswss@unx.sas.com (Warren Sarle)
Subject: changes to "comp.ai.neural-nets FAQ" -- monthly posting
Originator: saswss@hotellng.unx.sas.com
Sender: news@unx.sas.com (Noter of Newsworthy Events)
Message-ID: <nn.changes.posting_859608042@hotellng.unx.sas.com>
Supersedes: <nn.changes.posting_857188841@hotellng.unx.sas.com>
Date: Sat, 29 Mar 1997 04:00:43 GMT
Expires: Sat, 3 May 1997 04:00:42 GMT
X-Nntp-Posting-Host: hotellng.unx.sas.com
Reply-To: saswss@unx.sas.com (Warren Sarle)
Organization: SAS Institute Inc., Cary, NC, USA
Keywords: modifications, new, additions, deletions
Followup-To: comp.ai.neural-nets
Lines: 1281

==> nn1.changes.body <==
*** nn1.oldbody	Fri Feb 28 23:00:15 1997
--- nn1.body	Fri Mar 28 23:00:13 1997
***************
*** 1,4 ****
  Archive-name: ai-faq/neural-nets/part1
! Last-modified: 1997-01-07
  URL: ftp://ftp.sas.com/pub/neural/FAQ.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
--- 1,4 ----
  Archive-name: ai-faq/neural-nets/part1
! Last-modified: 1997-03-28
  URL: ftp://ftp.sas.com/pub/neural/FAQ.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
***************
*** 69,72 ****
--- 69,75 ----
     Who is concerned with NNs?
     How are layers counted?
+    What are cases and variables?
+    What are the population, sample, training set, design set, validation
+    set, and test set?
     How are NNs related to statistical methods?
  
***************
*** 271,275 ****
  The following archives are available for comp.ai.neural-nets: 
  
!  o Deja News 
   o ftp://ftp.cs.cmu.edu/user/ai/pubs/news/comp.ai.neural-nets 
   o http://asknpac.npac.syr.edu 
--- 274,278 ----
  The following archives are available for comp.ai.neural-nets: 
  
!  o Deja News at http://xp8.dejanews.com/ 
   o ftp://ftp.cs.cmu.edu/user/ai/pubs/news/comp.ai.neural-nets 
   o http://asknpac.npac.syr.edu 
***************
*** 414,418 ****
     http://www.mbfys.kun.nl/snn/siena/cases/ 
   o The DTI NeuroComputing Web's Applications Portfolio at 
!    http://www.globalweb.co.uk/nctt/portfolio/ 
   o The Applications Corner, provided by NeuroDimension, Inc., at 
     http://www.nd.com/appcornr/purpose.htm 
--- 417,421 ----
     http://www.mbfys.kun.nl/snn/siena/cases/ 
   o The DTI NeuroComputing Web's Applications Portfolio at 
!    http://www.globalweb.co.uk/nctt/portfolo/ 
   o The Applications Corner, provided by NeuroDimension, Inc., at 
     http://www.nd.com/appcornr/purpose.htm 
***************
*** 495,498 ****
--- 498,660 ----
  ------------------------------------------------------------------------
  
+ Subject: What are cases and variables?
+ ======================================
+ 
+ A vector of values presented at one time to all the input units of a neural
+ network is called a "case", "example", "pattern", "sample", etc. The term
+ "case" will be used in this FAQ because it is widely recognized,
+ unambiguous, and requires less typing than the other terms. A case may
+ include not only input values, but also target values and possibly other
+ information. 
+ 
+ A vector of values presented at different times to a single input unit is
+ often called an "input variable" or "feature". To a statistician, it is a
+ "predictor", "regressor", "covariate", "independent variable", "explanatory
+ variable", etc. A vector of target values associated with a given output
+ unit of the network during training will be called a "target variable" in
+ this FAQ. To a statistician, it is usually a "response" or "dependent
+ variable". 
+ 
+ A "data set" is a matrix containing one or (usually) more cases. In this
+ FAQ, it will be assumed that cases are rows of the matrix, while variables
+ are columns. 
+ 
+ Note that the often-used term "input vector" is ambiguous; it can mean
+ either an input case or an input variable. 
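
The row/column convention above can be sketched with a toy data set (a
minimal illustration; the values and variable roles are made up):

```python
# A toy data set: 4 cases (rows) by 3 variables (columns).
# The first two columns are input variables; the third is a target variable.
data = [
    [0.2, 1.5, 0.0],
    [0.7, 0.3, 1.0],
    [0.1, 2.2, 0.0],
    [0.9, 0.8, 1.0],
]

case = data[0]                         # one case: a row, presented to all input units at once
input_var = [row[1] for row in data]   # one input variable: a column, one unit over many cases
target_var = [row[2] for row in data]  # target variable for a single output unit
```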
+ 
+ ------------------------------------------------------------------------
+ 
+ Subject: What are the population, sample, training set,
+ =======================================================
+ design set, validation set, and test set?
+ =========================================
+ 
+ There seems to be no term in the NN literature for the set of all cases that
+ you want to be able to generalize to. Statisticians call this set the
+ "population". Neither is there a consistent term in the NN literature for
+ the set of cases that are available for training and evaluating an NN.
+ Statisticians call this set the "sample". The sample is usually a subset of
+ the population. 
+ 
+ In NN methodology, the sample is often subdivided into "training",
+ "validation", and "test" sets. The distinctions among these subsets are
+ crucial, but the terms "validation" and "test" sets are often confused.
+ There is no book in the NN literature more authoritative than Ripley (1996),
+ from which the following definitions are taken (p. 354): 
+ 
+ Training set: 
+    A set of examples used for learning, that is to fit the parameters
+    [weights] of the classifier. 
+ Validation set: 
+    A set of examples used to tune the parameters of a classifier, for
+    example to choose the number of hidden units in a neural network. 
+ Test set: 
+    A set of examples used only to assess the performance [generalization] of
+    a fully-specified classifier. 
+ 
+ Bishop (1995), another indispensable reference on neural networks, provides
+ the following explanation (p. 372): 
+ 
+    Since our goal is to find the network having the best performance on
+    new data, the simplest approach to the comparison of different
+    networks is to evaluate the error function using data which is
+    independent of that used for training. Various networks are trained
+    by minimization of an appropriate error function defined with respect
+    to a training data set. The performance of the networks is then
+    compared by evaluating the error function using an independent 
+    validation set, and the network having the smallest error with
+    respect to the validation set is selected. This approach is called
+    the hold out method. Since this procedure can itself lead to some
+    overfitting to the validation set, the performance of the selected
+    network should be confirmed by measuring its performance on a third
+    independent set of data called a test set. 
+ 
+ The crucial point is that a test set, by definition, is never used to choose
+ among two or more networks, so that the error on the test set provides an
+ unbiased estimate of the generalization error (assuming that the test set is
+ representative of the population, etc.). Any data set that is used to choose the
+ best of two or more networks is, by definition, a validation set, and the error of
+ the chosen network on the validation set is optimistically biased. 
+ 
+ There is a problem with the usual distinction between training and validation
+ sets. Some training approaches, such as early stopping, require a validation
+ set, so in a sense, the validation set is used for training. Other approaches,
+ such as maximum likelihood, do not inherently require a validation set. So the
+ "training" set for maximum likelihood might encompass both the "training" and
+ "validation" sets for early stopping. Greg Heath has suggested the term
+ "design" set be used for cases that are used solely to adjust the weights in a
+ network, while "training" set be used to encompass both design and validation
+ sets. There is considerable merit to this suggestion, but it has not yet been
+ widely adopted. 
+ 
+ But things can get more complicated. Suppose you want to train nets with 5, 10,
+ and 20 hidden units using maximum likelihood, and you want to train nets with
+ 20 and 50 hidden units using early stopping. You also want to use a validation
+ set to choose the best of these various networks. Should you use the same
+ validation set for early stopping that you use for the final network choice, or
+ should you use two separate validation sets? That is, you could divide the
+ sample into 3 subsets, say A, B, C and proceed as follows: 
+ 
+  o Do maximum likelihood using A. 
+  o Do early stopping with A to adjust the weights and B to decide when to stop
+    (this makes B a validation set). 
+  o Choose among all 3 nets trained by maximum likelihood and the 2 nets
+    trained by early stopping based on the error computed on B (the validation
+    set). 
+  o Estimate the generalization error of the chosen network using C (the test
+    set). 
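
The three-subset scheme above can be sketched as a simple random partition
(an illustrative sketch; the function name and split fractions are made up,
not part of the FAQ):

```python
import random

def split_sample(sample, frac_a=0.5, frac_b=0.25, seed=0):
    # Divide a sample into subsets A (training), B (validation), and
    # C (test). The fractions are arbitrary illustrative choices.
    cases = list(sample)
    random.Random(seed).shuffle(cases)   # fixed seed for reproducibility
    n = len(cases)
    n_a = int(n * frac_a)
    n_b = int(n * frac_b)
    a = cases[:n_a]                      # used to adjust the weights
    b = cases[n_a:n_a + n_b]             # used to choose among networks
    c = cases[n_a + n_b:]                # used only to estimate generalization error
    return a, b, c

a, b, c = split_sample(range(100))
```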
+ 
+ Or you could divide the sample into 4 subsets, say A, B, C, and D and proceed
+ as follows: 
+ 
+  o Do maximum likelihood using A and B combined. 
+  o Do early stopping with A to adjust the weights and B to decide when to stop
+    (this makes B a validation set with respect to early stopping). 
+  o Choose among all 3 nets trained by maximum likelihood and the 2 nets
+    trained by early stopping based on the error computed on C (this makes C a
+    second validation set). 
+  o Estimate the generalization error of the chosen network using D (the test
+    set). 
+ 
+ Or, with the same 4 subsets, you could take a third approach: 
+ 
+  o Do maximum likelihood using A. 
+  o Choose among the 3 nets trained by maximum likelihood based on the error
+    computed on B (the first validation set). 
+  o Do early stopping with A to adjust the weights and B (the first validation
+    set) to decide when to stop. 
+  o Choose among the best net trained by maximum likelihood and the 2 nets
+    trained by early stopping based on the error computed on C (the second
+    validation set). 
+  o Estimate the generalization error of the chosen network using D (the test
+    set). 
+ 
+ You could argue that the first approach is biased towards choosing a net
+ trained by early stopping. Early stopping involves a choice among a potentially
+ large number of networks, and therefore provides more opportunity for
+ overfitting the validation set than does the choice among only 3 networks
+ trained by maximum likelihood. Hence if you make the final choice of networks
+ using the same validation set (B) that was used for early stopping, you give an
+ unfair advantage to early stopping. If you are writing an article to compare
+ various training methods, this bias could be a serious flaw. But if you are using
+ NNs for some practical application, this bias might not matter at all, since you
+ obtain an honest estimate of generalization error using C. 
+ 
+ You could also argue that the second and third approaches are too wasteful in
+ their use of data. This objection could be important if your sample contains 100
+ cases, but will probably be of little concern if your sample contains
+ 100,000,000 cases. For small samples, there are other methods that make more
+ efficient use of data; see "What are cross-validation and bootstrapping?" 
+ 
+ References: 
+ 
+    Bishop, C.M. (1995), Neural Networks for Pattern Recognition, Oxford:
+    Oxford University Press. 
+ 
+    Ripley, B.D. (1996) Pattern Recognition and Neural Networks, Cambridge:
+    Cambridge University Press. 
+ 
+ ------------------------------------------------------------------------
+ 
  Subject: How are NNs related to statistical methods? 
  =====================================================
***************
*** 505,516 ****
  statistics. Some neural networks do not learn (e.g., Hopfield nets) and
  therefore have little to do with statistics. Some neural networks can learn
! successfully only from noise-free data (e.g., ART or the perceptron rule)
! and therefore would not be considered statistical methods. But most neural
! networks that can learn to generalize effectively from noisy data are
! similar or identical to statistical methods. For example: 
  
   o Feedforward nets with no hidden layer (including functional-link neural
!    nets and higher-order neural nets) are basically generalized linear
!    models. 
   o Feedforward nets with one hidden layer are closely related to projection
     pursuit regression. 
--- 667,677 ----
  statistics. Some neural networks do not learn (e.g., Hopfield nets) and
  therefore have little to do with statistics. Some neural networks can learn
! successfully only from noise-free data (e.g., ART or the perceptron rule) and
! therefore would not be considered statistical methods. But most neural
! networks that can learn to generalize effectively from noisy data are similar or
! identical to statistical methods. For example: 
  
   o Feedforward nets with no hidden layer (including functional-link neural
!    nets and higher-order neural nets) are basically generalized linear models. 
   o Feedforward nets with one hidden layer are closely related to projection
     pursuit regression. 
***************
*** 524,539 ****
  
   o Kohonen's self-organizing maps. 
!  o Reinforcement learning ((although this is treated in the operations
!    research literature on Markov decision processes). 
!  o Stopped training (the purpose and effect of stopped training are similar
!    to shrinkage estimation, but the method is quite different). 
  
  Feedforward nets are a subset of the class of nonlinear regression and
! discrimination models. Statisticians have studied the properties of this
! general class but had not considered the specific case of feedforward neural
! nets before such networks were popularized in the neural network field.
! Still, many results from the statistical theory of nonlinear models apply
! directly to feedforward nets, and the methods that are commonly used for
! fitting nonlinear models, such as various Levenberg-Marquardt and conjugate
  gradient algorithms, can be used to train feedforward nets. 
  
--- 685,700 ----
  
   o Kohonen's self-organizing maps. 
!  o Reinforcement learning (although this is treated in the operations research
!    literature on Markov decision processes). 
!  o Stopped training (the purpose and effect of stopped training are similar to
!    shrinkage estimation, but the method is quite different). 
  
  Feedforward nets are a subset of the class of nonlinear regression and
! discrimination models. Statisticians have studied the properties of this general
! class but had not considered the specific case of feedforward neural nets
! before such networks were popularized in the neural network field. Still, many
! results from the statistical theory of nonlinear models apply directly to
! feedforward nets, and the methods that are commonly used for fitting
! nonlinear models, such as various Levenberg-Marquardt and conjugate
  gradient algorithms, can be used to train feedforward nets. 
  
***************
*** 540,580 ****
  While neural nets are often defined in terms of their algorithms or
  implementations, statistical methods are usually defined in terms of their
! results. The arithmetic mean, for example, can be computed by a (very
! simple) backprop net, by applying the usual formula SUM(x_i)/n, or by
! various other methods. What you get is still an arithmetic mean regardless
! of how you compute it. So a statistician would consider standard backprop,
! Quickprop, and Levenberg-Marquardt as different algorithms for implementing
! the same statistical model such as a feedforward net. On the other hand,
! different training criteria, such as least squares and cross entropy, are
! viewed by statisticians as fundamentally different estimation methods with
! different statistical properties. 
  
  It is sometimes claimed that neural networks, unlike statistical models,
! require no distributional assumptions. In fact, neural networks involve
! exactly the same sort of distributional assumptions as statistical models,
! but statisticians study the consequences and importance of these assumptions
  while most neural networkers ignore them. For example, least-squares
  training methods are widely used by statisticians and neural networkers.
! Statisticians realize that least-squares training involves implicit
! distributional assumptions in that least-squares estimates have certain
! optimality properties for noise that is normally distributed with equal
! variance for all training cases and that is independent between different
! cases. These optimality properties are consequences of the fact that
! least-squares estimation is maximum likelihood under those conditions.
! Similarly, cross-entropy is maximum likelihood for noise with a Bernoulli
! distribution. If you study the distributional assumptions, then you can
! recognize and deal with violations of the assumptions. For example, if you
! have normally distributed noise but some training cases have greater noise
! variance than others, then you may be able to use weighted least squares
! instead of ordinary least squares to obtain more efficient estimates. 
  
  Hundreds, perhaps thousands of people have run comparisons of neural nets
! with "traditional statistics" (whatever that means). Most such studies
! involve one or two data sets, and are of little use to anyone else unless
! they happen to be analyzing the same kind of data. But there is an
! impressive comparative study of supervised classification by Michie,
! Spiegelhalter, and Taylor (1994), and an excellent comparison of
! unsupervised Kohonen networks and k-means clustering by Balakrishnan,
! Cooper, Jacob, and Lewis (1994). 
  
  Communication between statisticians and neural net researchers is often
--- 701,740 ----
  While neural nets are often defined in terms of their algorithms or
  implementations, statistical methods are usually defined in terms of their
! results. The arithmetic mean, for example, can be computed by a (very simple)
! backprop net, by applying the usual formula SUM(x_i)/n, or by various other
! methods. What you get is still an arithmetic mean regardless of how you
! compute it. So a statistician would consider standard backprop, Quickprop, and
! Levenberg-Marquardt as different algorithms for implementing the same
! statistical model such as a feedforward net. On the other hand, different
! training criteria, such as least squares and cross entropy, are viewed by
! statisticians as fundamentally different estimation methods with different
! statistical properties. 
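
The point that the arithmetic mean is the same statistic no matter what
algorithm computes it can be illustrated by computing it two ways (a minimal
sketch; the gradient-descent routine stands in for a trivial one-weight
"backprop net" and its names are made up):

```python
def mean_by_formula(xs):
    # The usual formula SUM(x_i)/n.
    return sum(xs) / len(xs)

def mean_by_gradient_descent(xs, lr=0.1, steps=1000):
    # A "net" with a single weight w and no inputs: minimize the
    # mean squared error sum_i (x_i - w)^2 / n by gradient descent.
    # The minimizer is the arithmetic mean, however it is found.
    w = 0.0
    for _ in range(steps):
        grad = sum(2.0 * (w - x) for x in xs) / len(xs)
        w -= lr * grad
    return w
```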
  
  It is sometimes claimed that neural networks, unlike statistical models,
! require no distributional assumptions. In fact, neural networks involve exactly
! the same sort of distributional assumptions as statistical models, but
! statisticians study the consequences and importance of these assumptions
  while most neural networkers ignore them. For example, least-squares
  training methods are widely used by statisticians and neural networkers.
! Statisticians realize that least-squares training involves implicit distributional
! assumptions in that least-squares estimates have certain optimality
! properties for noise that is normally distributed with equal variance for all
! training cases and that is independent between different cases. These
! optimality properties are consequences of the fact that least-squares
! estimation is maximum likelihood under those conditions. Similarly,
! cross-entropy is maximum likelihood for noise with a Bernoulli distribution. If
! you study the distributional assumptions, then you can recognize and deal with
! violations of the assumptions. For example, if you have normally distributed
! noise but some training cases have greater noise variance than others, then you
! may be able to use weighted least squares instead of ordinary least squares to
! obtain more efficient estimates. 
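
The weighted-least-squares remedy mentioned above can be sketched for the
simplest case, estimating a constant from cases with known but unequal noise
variances (an illustrative sketch; the function names are made up):

```python
def ols_estimate(ys):
    # Ordinary least squares for a constant: just the plain mean.
    return sum(ys) / len(ys)

def wls_estimate(ys, variances):
    # Weighted least squares: weight each case by the reciprocal of
    # its noise variance, so noisier cases count for less.
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
```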
  
  Hundreds, perhaps thousands of people have run comparisons of neural nets
! with "traditional statistics" (whatever that means). Most such studies involve
! one or two data sets, and are of little use to anyone else unless they happen to
! be analyzing the same kind of data. But there is an impressive comparative
! study of supervised classification by Michie, Spiegelhalter, and Taylor (1994),
! and an excellent comparison of unsupervised Kohonen networks and k-means
! clustering by Balakrishnan, Cooper, Jacob, and Lewis (1994). 
  
  Communication between statisticians and neural net researchers is often
***************
*** 612,617 ****
     Constrained and Unconstrained Systems, Springer-Verlag. 
  
!    Michie, D., Spiegelhalter, D.J. and Taylor, C.C. (1994), Machine
!    Learning, Neural and Statistical Classification, Ellis Horwood. 
  
     Ripley, B.D. (1993), "Statistical Aspects of Neural Networks", in O.E.
--- 772,777 ----
     Constrained and Unconstrained Systems, Springer-Verlag. 
  
!    Michie, D., Spiegelhalter, D.J. and Taylor, C.C. (1994), Machine Learning,
!    Neural and Statistical Classification, Ellis Horwood. 
  
     Ripley, B.D. (1993), "Statistical Aspects of Neural Networks", in O.E.
***************
*** 627,633 ****
     Cambridge University Press. 
  
!    Sarle, W.S. (1994), "Neural Networks and Statistical Models," 
!    Proceedings of the Nineteenth Annual SAS Users Group International
!    Conference, Cary, NC: SAS Institute, pp 1538-1550. (
     ftp://ftp.sas.com/pub/neural/neural1.ps) 
  
--- 787,793 ----
     Cambridge University Press. 
  
!    Sarle, W.S. (1994), "Neural Networks and Statistical Models," Proceedings
!    of the Nineteenth Annual SAS Users Group International Conference,
!    Cary, NC: SAS Institute, pp 1538-1550. (
     ftp://ftp.sas.com/pub/neural/neural1.ps) 
  

==> nn2.changes.body <==
*** nn2.oldbody	Fri Feb 28 23:00:21 1997
--- nn2.body	Fri Mar 28 23:00:19 1997
***************
*** 1,4 ****
  Archive-name: ai-faq/neural-nets/part2
! Last-modified: 1997-02-28
  URL: ftp://ftp.sas.com/pub/neural/FAQ2.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
--- 1,4 ----
  Archive-name: ai-faq/neural-nets/part2
! Last-modified: 1997-03-05
  URL: ftp://ftp.sas.com/pub/neural/FAQ2.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
***************
*** 273,277 ****
   o Arnold Neumaier's page on global optimization at 
     http://solon.cma.univie.ac.at/~neum/glopt.html. 
!  o 'Simon Streltsovs page on global optimization at http://cad.bu.edu/go. 
  
  References: 
--- 273,279 ----
   o Arnold Neumaier's page on global optimization at 
     http://solon.cma.univie.ac.at/~neum/glopt.html. 
!  o Simon Streltsov's page on global optimization at http://cad.bu.edu/go. 
!  o Lester Ingber's page on Adaptive Simulated Annealing (ASA), karate, etc.
!    at http://www.ingber.com/ or http://www.alumni.caltech.edu/~ingber/ 
  
  References: 
***************
*** 2043,2051 ****
     Oxford University Press. 
  
-    K. I. Diamantaras, S. Y. Kung (1996) Principal Component Neural
-    Networks: Theory and Applications, NY: Wiley. 
- 
     Deco, G. and Obradovic, D. (1996), An Information-Theoretic Approach to
     Neural Computing, NY: Springer-Verlag. 
  
     Hertz, J., Krogh, A., and Palmer, R. (1991). Introduction to the Theory of
--- 2045,2053 ----
     Oxford University Press. 
  
     Deco, G. and Obradovic, D. (1996), An Information-Theoretic Approach to
     Neural Computing, NY: Springer-Verlag. 
+ 
+    Diamantaras, K.I., and Kung, S.Y. (1996) Principal Component Neural
+    Networks: Theory and Applications, NY: Wiley. 
  
     Hertz, J., Krogh, A., and Palmer, R. (1991). Introduction to the Theory of

==> nn3.changes.body <==
*** nn3.oldbody	Fri Feb 28 23:00:25 1997
--- nn3.body	Fri Mar 28 23:00:23 1997
***************
*** 1,4 ****
  Archive-name: ai-faq/neural-nets/part3
! Last-modified: 1996-12-19
  URL: ftp://ftp.sas.com/pub/neural/FAQ3.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
--- 1,4 ----
  Archive-name: ai-faq/neural-nets/part3
! Last-modified: 1997-03-19
  URL: ftp://ftp.sas.com/pub/neural/FAQ3.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
***************
*** 185,189 ****
  The best way to avoid overfitting is to use lots of training data. If you
  have at least 30 times as many training cases as there are weights in the
! network, you are unlikely to suffer from overfitting. But you can't
  arbitrarily reduce the number of weights for fear of underfitting. 
  
--- 185,189 ----
  The best way to avoid overfitting is to use lots of training data. If you
  have at least 30 times as many training cases as there are weights in the
! network, you are unlikely to suffer from much overfitting. But you can't
  arbitrarily reduce the number of weights for fear of underfitting. 
  
***************
*** 1198,1209 ****
  
  Cross-validation and bootstrapping are both methods for estimating
! generalization error based on "resampling". In k-fold cross-validation, you
! divide the data into k subsets of equal size. You train the net k times,
! each time leaving out one of the subsets from training, but using only the
! omitted subset to compute whatever error criterion interests you. If k
! equals the sample size, this is called leave-one-out cross-validation. A
! more elaborate and expensive version of cross-validation involves leaving
! out all possible subsets of a given size. 
  
  Note that cross-validation is quite different from the "split-sample" or
  "hold-out" method that is commonly used for early stopping in neural nets.
--- 1198,1210 ----
  
  Cross-validation and bootstrapping are both methods for estimating
! generalization error based on "resampling". 
  
+ In k-fold cross-validation, you divide the data into k subsets of equal
+ size. You train the net k times, each time leaving out one of the subsets
+ from training, but using only the omitted subset to compute whatever error
+ criterion interests you. If k equals the sample size, this is called
+ leave-one-out cross-validation. A more elaborate and expensive version of
+ cross-validation involves leaving out all possible subsets of a given size. 
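
The k-fold procedure above can be sketched as follows (an illustrative
sketch; the function name is made up):

```python
def k_fold_indices(n, k):
    # Split case indices 0..n-1 into k roughly equal folds; for each
    # fold, yield (training_indices, held_out_indices). The net is
    # trained k times, each time computing the error criterion only
    # on the held-out fold.
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, held_out in enumerate(folds):
        training = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield training, held_out
```

With k equal to the sample size, each held-out fold contains a single case,
which is leave-one-out cross-validation.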
+ 
  Note that cross-validation is quite different from the "split-sample" or
  "hold-out" method that is commonly used for early stopping in neural nets.
***************
*** 1214,1226 ****
  is not obvious. 
  
! Cross-validation is also easily confused with jackknifing. Both involve
! omitting each training case in turn and retraining the network on the
! remaining subset. But cross-validation is used to estimate generalization
! error, while the jackknife is used to estimate the bias of a statistic. In
! the jackknife, you compute some statistic of interest in each subset of the
! data. The average of these subset statistics is compared with the
! corresponding statistic computed from the entire sample in order to estimate
! the bias of the latter. You can also get a jackknife estimate of the
! standard error of a statistic. 
  
  Leave-one-out cross-validation often works well for continuous error
--- 1215,1230 ----
  is not obvious. 
  
! Leave-one-out cross-validation is also easily confused with jackknifing.
! Both involve omitting each training case in turn and retraining the network
! on the remaining subset. But cross-validation is used to estimate
! generalization error, while the jackknife is used to estimate the bias of a
! statistic. In the jackknife, you compute some statistic of interest in each
! subset of the data. The average of these subset statistics is compared with
! the corresponding statistic computed from the entire sample in order to
! estimate the bias of the latter. You can also get a jackknife estimate of
! the standard error of a statistic. Jackknifing can be used to estimate the
! bias of the training error and hence to estimate the generalization error,
! but this process is more complicated than leave-one-out cross-validation
! (Efron, 1982; Ripley, 1996, p. 73). 
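
The jackknife computations described above can be sketched as follows (a
minimal illustration; the function name is made up):

```python
import math

def jackknife(statistic, sample):
    # Leave-one-out jackknife estimates of the bias and standard error
    # of `statistic` (a function of a list of numbers).
    n = len(sample)
    full = statistic(sample)
    loo = [statistic(sample[:i] + sample[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    bias = (n - 1) * (loo_mean - full)
    se = math.sqrt((n - 1) / n * sum((s - loo_mean) ** 2 for s in loo))
    return bias, se
```

For the arithmetic mean the jackknife bias is zero and the jackknife
standard error reduces to the usual s/sqrt(n), as expected.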
  
  Leave-one-out cross-validation often works well for continuous error
***************
*** 1243,1246 ****
--- 1247,1253 ----
  References (see also http://www.statistics.com/books.html): 
  
+    Efron, B. (1982) The Jackknife, the Bootstrap and Other Resampling
+    Plans, Philadelphia: SIAM. 
+ 
     Efron, B. and Tibshirani, R.J. (1993), An Introduction to the Bootstrap,
     London: Chapman & Hall. 
***************
*** 1251,1254 ****
--- 1258,1264 ----
     Masters, T. (1995) Advanced Algorithms for Neural Networks: A C++
     Sourcebook, NY: John Wiley and Sons, ISBN 0-471-10588-0 
+ 
+    Ripley, B.D. (1996) Pattern Recognition and Neural Networks, Cambridge:
+    Cambridge University Press. 
  
     Weiss, S.M. & Kulikowski, C.A. (1991), Computer Systems That Learn,

==> nn4.changes.body <==
*** nn4.oldbody	Fri Feb 28 23:00:29 1997
--- nn4.body	Fri Mar 28 23:00:28 1997
***************
*** 1,4 ****
  Archive-name: ai-faq/neural-nets/part4
! Last-modified: 1997-02-28
  URL: ftp://ftp.sas.com/pub/neural/FAQ4.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
--- 1,4 ----
  Archive-name: ai-faq/neural-nets/part4
! Last-modified: 1997-03-09
  URL: ftp://ftp.sas.com/pub/neural/FAQ4.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
***************
*** 17,20 ****
--- 17,36 ----
  
     Books and articles about Neural Networks?
+       The Best
+          The best popular introduction to NNs
+          The best elementary textbooks on practical use of NNs
+          The best elementary textbook on using and programming NNs
+          The best elementary textbooks on NN research
+          The best intermediate textbooks on NNs
+          The best advanced textbook covering NNs
+          The best books on image and signal processing with NNs
+          The best book on time-series forecasting with NNs
+          The best book on neurofuzzy systems
+          The best comparison of NNs with other classification methods
+       Books for the Beginner
+       The Classics
+       Introductory Journal Articles
+       Not-quite-so-introductory Literature
+       The Worst
     Journals and magazines about Neural Networks?
     The most important conferences concerned with Neural Networks?
***************
*** 22,28 ****
     Other sources of information about NNs?
     Databases for experimentation with NNs?
        The neural-bench Benchmark collection
        Proben1
-       UCI machine learning database
        NIST special databases of the National Institute Of Standards And
        Technology:
--- 38,44 ----
     Other sources of information about NNs?
     Databases for experimentation with NNs?
+       UCI machine learning database
        The neural-bench Benchmark collection
        Proben1
        NIST special databases of the National Institute Of Standards And
        Technology:
***************
*** 52,57 ****
  American, 267 (September), 144-151. 
  
! The best elementary textbooks on using NNs
! ------------------------------------------
  
  Smith, M. (1993). Neural Networks for Statistical Modeling, NY: Van Nostrand
--- 68,73 ----
  American, 267 (September), 144-151. 
  
! The best elementary textbooks on practical use of NNs
! -----------------------------------------------------
  
  Smith, M. (1993). Neural Networks for Statistical Modeling, NY: Van Nostrand
***************
*** 67,73 ****
  Morgan Kaufmann. ISBN 1 55860 065 5. 
  Briefly covers at a very elementary level feedforward nets, linear and
! nearest-neighbor discriminant analysis, trees, and expert sytems. For a book
! at this level, it has an unusually good chapter on estimating generalization
! error, including bootstrapping. 
  
  The best elementary textbook on using and programming NNs
--- 83,90 ----
  Morgan Kaufmann. ISBN 1 55860 065 5. 
  Briefly covers at a very elementary level feedforward nets, linear and
! nearest-neighbor discriminant analysis, trees, and expert systems,
! emphasizing practical applications. For a book at this level, it has an
! unusually good chapter on estimating generalization error, including
! bootstrapping. 
  
  The best elementary textbook on using and programming NNs
***************
*** 79,83 ****
  are listed below). He combines generally sound practical advice with some
  basic statistical knowledge to produce a programming text that is far
! superior to the competition (see "The Worst" below). 
  
  The best intermediate textbooks on NNs
--- 96,127 ----
  are listed below). He combines generally sound practical advice with some
  basic statistical knowledge to produce a programming text that is far
! superior to the competition (see "The Worst" below). Not everyone likes his
! C++ code (the usual complaint is that the code is not sufficiently OO) but,
! unlike the code in some other books, Masters's code has been successfully
! compiled and run by some readers of comp.ai.neural-nets. 
! 
! The best elementary textbooks on NN research
! --------------------------------------------
! 
! Fausett, L. (1994), Fundamentals of Neural Networks: Architectures,
! Algorithms, and Applications, Englewood Cliffs, NJ: Prentice Hall, ISBN
! 0-13-334186-0. Also published as a Prentice Hall International Edition, ISBN
! 0-13-042250-9 
! Exceptionally clear descriptions of a wide variety of NN architectures and
! algorithms, with more examples than are found in most NN books. The
! algorithms are laid out in sufficient detail (in English and simple
! formulas) that students should be able to program them easily in any good
! programming language. Recommended for classroom use if the instructor
! provides supplementary material on how to get good generalization. Sample
! software (source code listings in C and Fortran) is included in an
! Instructor's Manual. 
! 
! Anderson, J.A. (1995), An Introduction to Neural Networks, Cambridge, MA:
! The MIT Press, ISBN 0-262-01144-1. 
! Anderson provides an accessible introduction to the AI and
! neurophysiological sides of NN research, although the book is weak regarding
! practical aspects of using NNs. Recommended for classroom use if the
! instructor provides supplementary material on how to get good
! generalization. 
  
  The best intermediate textbooks on NNs
***************
*** 89,93 ****
  This is definitely the best book on neural nets for practical applications
  (rather than for neurobiological models). It is the only textbook on neural
! nets that I have seen that is statistically solid.
  "Bishop is a leading researcher who has a deep understanding of the material
  and has gone to great lengths to organize it in a sequence that makes sense.
--- 133,138 ----
  This is definitely the best book on neural nets for practical applications
  (rather than for neurobiological models). It is the only textbook on neural
! nets that I have seen that is statistically solid. Geoffrey Hinton writes in
! the foreword:
  "Bishop is a leading researcher who has a deep understanding of the material
  and has gone to great lengths to organize it in a sequence that makes sense.
***************
*** 104,109 ****
  impressive aspect of this book is that it takes the reader all the way from
  the simplest linear models to the very latest Bayesian multilayer neural
! networks without ever requiring any great intellectual leaps." -Geoffrey
! Hinton, from the foreword. 
  
  Hertz, J., Krogh, A., and Palmer, R. (1991). Introduction to the Theory of
--- 149,153 ----
  impressive aspect of this book is that it takes the reader all the way from
  the simplest linear models to the very latest Bayesian multilayer neural
! networks without ever requiring any great intellectual leaps." 
  
  Hertz, J., Krogh, A., and Palmer, R. (1991). Introduction to the Theory of
***************
*** 110,121 ****
  Neural Computation. Addison-Wesley: Redwood City, California. ISBN
  0-201-50395-6 (hardbound) and 0-201-51560-1 (paperbound)
! "My first impression is that this one is by far the best book on the topic.
! And it's below $30 for the paperback."; "Well written, theoretical (but not
! overwhelming)"; It provides a good balance of model development,
! computational algorithms, and applications. The mathematical derivations are
! especially well done"; "Nice mathematical analysis on the mechanism of
! different learning algorithms"; "It is NOT for mathematical beginner. If you
! don't have a good grasp of higher level math, this book can be really tough
! to get through."
  
  The best advanced textbook covering NNs
--- 154,165 ----
  Neural Computation. Addison-Wesley: Redwood City, California. ISBN
  0-201-50395-6 (hardbound) and 0-201-51560-1 (paperbound)
! Comments from readers of comp.ai.neural-nets: "My first impression is that
! this one is by far the best book on the topic. And it's below $30 for the
! paperback."; "Well written, theoretical (but not overwhelming)"; "It provides
! a good balance of model development, computational algorithms, and
! applications. The mathematical derivations are especially well done"; "Nice
! mathematical analysis on the mechanism of different learning algorithms";
! "It is NOT for mathematical beginner. If you don't have a good grasp of
! higher level math, this book can be really tough to get through."
  
  The best advanced textbook covering NNs
***************
*** 142,152 ****
  and Signal Processing. NY: John Wiley & Sons, ISBN 0-471-930105 (hardbound),
  526 pages, $57.95. 
! "Partly a textbook and partly a research monograph; introduces the basic
! concepts, techniques, and models related to neural networks and
! optimization, excluding rigorous mathematical details. Accessible to a wide
! readership with a differential calculus background. The main coverage of the
! book is on recurrent neural networks with continuous state variables. The
! book title would be more appropriate without mentioning signal processing.
! Well edited, good illustrations."
  
  The best book on time-series forecasting with NNs
--- 186,196 ----
  and Signal Processing. NY: John Wiley & Sons, ISBN 0-471-930105 (hardbound),
  526 pages, $57.95. 
! Comments from readers of comp.ai.neural-nets: "Partly a textbook and partly a
! research monograph; introduces the basic concepts, techniques, and models
! related to neural networks and optimization, excluding rigorous mathematical
! details. Accessible to a wide readership with a differential calculus
! background. The main coverage of the book is on recurrent neural networks
! with continuous state variables. The book title would be more appropriate
! without mentioning signal processing. Well edited, good illustrations."
  
  The best book on time-series forecasting with NNs
***************
*** 169,187 ****
  Neural and Statistical Classification, Ellis Horwood. 
  
! Books for the Beginner:
! +++++++++++++++++++++++
  
  Aleksander, I. and Morton, H. (1990). An Introduction to Neural Computing.
  Chapman and Hall. (ISBN 0-412-37780-2). 
! Comments: "This book seems to be intended for the first year of university
! education."
  
  Beale, R. and Jackson, T. (1990). Neural Computing, an Introduction. Adam
  Hilger, IOP Publishing Ltd : Bristol. (ISBN 0-85274-262-2). 
! Comments: "It's clearly written. Lots of hints as to how to get the adaptive
! models covered to work (not always well explained in the original sources).
! Consistent mathematical terminology. Covers perceptrons,
! error-backpropagation, Kohonen self-org model, Hopfield type models, ART,
! and associative memories."
  
  Caudill, M. and Butler, C. (1990). Naturally Intelligent Systems. MIT Press:
--- 213,231 ----
  Neural and Statistical Classification, Ellis Horwood. 
  
! Books for the Beginner
! ++++++++++++++++++++++
  
  Aleksander, I. and Morton, H. (1990). An Introduction to Neural Computing.
  Chapman and Hall. (ISBN 0-412-37780-2). 
! Comments from readers of comp.ai.neural-nets: "This book seems to be
! intended for the first year of university education."
  
  Beale, R. and Jackson, T. (1990). Neural Computing, an Introduction. Adam
  Hilger, IOP Publishing Ltd : Bristol. (ISBN 0-85274-262-2). 
! Comments from readers of comp.ai.neural-nets: "It's clearly written. Lots of
! hints as to how to get the adaptive models covered to work (not always well
! explained in the original sources). Consistent mathematical terminology.
! Covers perceptrons, error-backpropagation, Kohonen self-org model, Hopfield
! type models, ART, and associative memories."
  
  Caudill, M. and Butler, C. (1990). Naturally Intelligent Systems. MIT Press:
***************
*** 210,224 ****
  Dayhoff, J. E. (1990). Neural Network Architectures: An Introduction. Van
  Nostrand Reinhold: New York. 
! Comments: "Like Wasserman's book, Dayhoff's book is also very easy to
! understand".
! 
! Fausett, L. V. (1994). Fundamentals of Neural Networks: Architectures,
! Algorithms and Applications, Prentice Hall, ISBN 0-13-334186-0. Also
! published as a Prentice Hall International Edition, ISBN 0-13-042250-9.
! Sample softeware (source code listings in C and Fortran) is included in an
! Instructor's Manual.
! "Intermediate in level between Wasserman and Hertz/Krogh/Palmer. Algorithms
! for a broad range of neural networks, including a chapter on Adaptive
! Resonance Theory with ART2. Simple examples for each network."
  
  Freeman, James (1994). Simulating Neural Networks with Mathematica,
--- 254,259 ----
  Dayhoff, J. E. (1990). Neural Network Architectures: An Introduction. Van
  Nostrand Reinhold: New York. 
! Comments from readers of comp.ai.neural-nets: "Like Wasserman's book,
! Dayhoff's book is also very easy to understand".
  
  Freeman, James (1994). Simulating Neural Networks with Mathematica,
***************
*** 244,249 ****
  
  Hecht-Nielsen, R. (1990). Neurocomputing. Addison Wesley. 
! Comments: "A good book", "comprises a nice historical overview and a chapter
! about NN hardware. Well structured prose. Makes important concepts clear."
  
  McClelland, J. L. and Rumelhart, D. E. (1988). Explorations in Parallel
--- 279,285 ----
  
  Hecht-Nielsen, R. (1990). Neurocomputing. Addison Wesley. 
! Comments from readers of comp.ai.neural-nets: "A good book", "comprises a
! nice historical overview and a chapter about NN hardware. Well structured
! prose. Makes important concepts clear."
  
  McClelland, J. L. and Rumelhart, D. E. (1988). Explorations in Parallel
***************
*** 250,258 ****
  Distributed Processing: Computational Models of Cognition and Perception
  (software manual). The MIT Press. 
! Comments: "Written in a tutorial style, and includes 2 diskettes of NN
! simulation programs that can be compiled on MS-DOS or Unix (and they do too
! !)"; "The programs are pretty reasonable as an introduction to some of the
! things that NNs can do."; "There are *two* editions of this book. One comes
! with disks for the IBM PC, the other comes with disks for the Macintosh".
  
  McCord Nelson, M. and Illingworth, W.T. (1990). A Practical Guide to Neural
--- 286,295 ----
  Distributed Processing: Computational Models of Cognition and Perception
  (software manual). The MIT Press. 
! Comments from readers of comp.ai.neural-nets: "Written in a tutorial style,
! and includes 2 diskettes of NN simulation programs that can be compiled on
! MS-DOS or Unix (and they do too !)"; "The programs are pretty reasonable as
! an introduction to some of the things that NNs can do."; "There are *two*
! editions of this book. One comes with disks for the IBM PC, the other comes
! with disks for the Macintosh".
  
  McCord Nelson, M. and Illingworth, W.T. (1990). A Practical Guide to Neural
***************
*** 261,309 ****
  no formulas.
  
! Muller, B., Reinhardt, J., Strickland, M. T. (1995). Neural Networks. An
  Introduction (2nd ed.). Berlin, Heidelberg, New York: Springer-Verlag. ISBN
  3-540-60207-0. (DOS 3.5" disk included.) 
! Comments: The book was developed out of a course on neural-network models
! with computer demonstrations that was taught by the authors to Physics
! students. The book comes together with a PC-diskette. The book is divided
! into three parts: (1) Models of Neural Networks; describing several
! architectures and learing rules, including the mathematics. (2) Statistical
! Physiscs of Neural Networks; "hard-core" physics section developing formal
! theories of stochastic neural networks. (3) Computer Codes; explanation
! about the demonstration programs. First part gives a nice introduction into
! neural networks together with the formulas. Together with the demonstration
! programs a 'feel' for neural networks can be developed.
  
  Orchard, G.A. & Phillips, W.A. (1991). Neural Computation: A Beginner's
  Guide. Lawrence Earlbaum Associates: London. 
! Comments: "Short user-friendly introduction to the area, with a
! non-technical flavour. Apparently accompanies a software package, but I
! haven't seen that yet".
  
  Rao, V.B & H.V. (1993). C++ Neural Networks and Fuzzy Logic. MIS:Press,
  ISBN 1-55828-298-x, US $45 incl. disks. 
! "Probably not 'leading edge' stuff but detailed enough to get your hands
! dirty!"
  
  Wasserman, P. D. (1989). Neural Computing: Theory & Practice. Van Nostrand
  Reinhold: New York. (ISBN 0-442-20743-3) 
! Comments: "Wasserman flatly enumerates some common architectures from an
! engineer's perspective ('how it works') without ever addressing the
! underlying fundamentals ('why it works') - important basic concepts such as
! clustering, principal components or gradient descent are not treated. It's
! also full of errors, and unhelpful diagrams drawn with what appears to be
! PCB board layout software from the '70s. For anyone who wants to do active
! research in the field I consider it quite inadequate"; "Okay, but too
! shallow"; "Quite easy to understand"; "The best bedtime reading for Neural
! Networks. I have given this book to numerous collegues who want to know NN
! basics, but who never plan to implement anything. An excellent book to give
! your manager."
  
! The Classics:
! +++++++++++++
  
  Kohonen, T. (1984). Self-organization and Associative Memory.
  Springer-Verlag: New York. (2nd Edition: 1988; 3rd edition: 1989). 
! Comments: "The section on Pattern mathematics is excellent."
  
  Rumelhart, D. E. and McClelland, J. L. (1986). Parallel Distributed
--- 298,348 ----
  no formulas.
  
! Muller, B., Reinhardt, J., Strickland, M. T. (1995). Neural Networks: An
  Introduction (2nd ed.). Berlin, Heidelberg, New York: Springer-Verlag. ISBN
  3-540-60207-0. (DOS 3.5" disk included.) 
! Comments from readers of comp.ai.neural-nets: "The book was developed out of
! a course on neural-network models with computer demonstrations that was
! taught by the authors to Physics students. The book comes together with a
! PC-diskette. The book is divided into three parts: (1) Models of Neural
! Networks; describing several architectures and learning rules, including the
! mathematics. (2) Statistical Physics of Neural Networks; "hard-core"
! physics section developing formal theories of stochastic neural networks.
! (3) Computer Codes; explanation about the demonstration programs. First part
! gives a nice introduction into neural networks together with the formulas.
! Together with the demonstration programs a 'feel' for neural networks can be
! developed." 
  
  Orchard, G.A. & Phillips, W.A. (1991). Neural Computation: A Beginner's
  Guide. Lawrence Earlbaum Associates: London. 
! Comments from readers of comp.ai.neural-nets: "Short user-friendly
! introduction to the area, with a non-technical flavour. Apparently
! accompanies a software package, but I haven't seen that yet".
  
  Rao, V.B & H.V. (1993). C++ Neural Networks and Fuzzy Logic. MIS:Press,
  ISBN 1-55828-298-x, US $45 incl. disks. 
! Covers a wider variety of networks than Masters, but lacks Masters's insight
! into practical issues of using NNs.
  
  Wasserman, P. D. (1989). Neural Computing: Theory & Practice. Van Nostrand
  Reinhold: New York. (ISBN 0-442-20743-3) 
! Comments from readers of comp.ai.neural-nets: "Wasserman flatly enumerates
! some common architectures from an engineer's perspective ('how it works')
! without ever addressing the underlying fundamentals ('why it works') -
! important basic concepts such as clustering, principal components or
! gradient descent are not treated. It's also full of errors, and unhelpful
! diagrams drawn with what appears to be PCB board layout software from the
! '70s. For anyone who wants to do active research in the field I consider it
! quite inadequate"; "Okay, but too shallow"; "Quite easy to understand"; "The
! best bedtime reading for Neural Networks. I have given this book to numerous
! colleagues who want to know NN basics, but who never plan to implement
! anything. An excellent book to give your manager."
  
! The Classics
! ++++++++++++
  
  Kohonen, T. (1984). Self-organization and Associative Memory.
  Springer-Verlag: New York. (2nd Edition: 1988; 3rd edition: 1989). 
! Comments from readers of comp.ai.neural-nets: "The section on Pattern
! mathematics is excellent."
  
  Rumelhart, D. E. and McClelland, J. L. (1986). Parallel Distributed
***************
*** 310,335 ****
  Processing: Explorations in the Microstructure of Cognition (volumes 1 & 2).
  The MIT Press. 
! Comments: "As a computer scientist I found the two Rumelhart and McClelland
! books really heavy going and definitely not the sort of thing to read if you
! are a beginner."; "It's quite readable, and affordable (about $65 for both
! volumes)."; "THE Connectionist bible".
  
! Introductory Journal Articles:
! ++++++++++++++++++++++++++++++
  
  Hinton, G. E. (1989). Connectionist learning procedures. Artificial
  Intelligence, Vol. 40, pp. 185--234. 
! Comments: "One of the better neural networks overview papers, although the
! distinction between network topology and learning algorithm is not always
! very clear. Could very well be used as an introduction to neural networks."
  
  Knight, K. (1990). Connectionist, Ideas and Algorithms. Communications of
  the ACM. November 1990. Vol.33 nr.11, pp 59-74. 
! Comments:"A good article, while it is for most people easy to find a copy of
! this journal."
  
  Kohonen, T. (1988). An Introduction to Neural Computing. Neural Networks,
  vol. 1, no. 1. pp. 3-16. 
! Comments: "A general review".
  
  Rumelhart, D. E., Hinton, G. E. and Williams, R. J. (1986). Learning
--- 349,376 ----
  Processing: Explorations in the Microstructure of Cognition (volumes 1 & 2).
  The MIT Press. 
! Comments from readers of comp.ai.neural-nets: "As a computer scientist I
! found the two Rumelhart and McClelland books really heavy going and
! definitely not the sort of thing to read if you are a beginner."; "It's
! quite readable, and affordable (about $65 for both volumes)."; "THE
! Connectionist bible".
  
! Introductory Journal Articles
! +++++++++++++++++++++++++++++
  
  Hinton, G. E. (1989). Connectionist learning procedures. Artificial
  Intelligence, Vol. 40, pp. 185--234. 
! Comments from readers of comp.ai.neural-nets: "One of the better neural
! networks overview papers, although the distinction between network topology
! and learning algorithm is not always very clear. Could very well be used as
! an introduction to neural networks."
  
  Knight, K. (1990). Connectionist, Ideas and Algorithms. Communications of
  the ACM. November 1990. Vol.33 nr.11, pp 59-74. 
! Comments from readers of comp.ai.neural-nets: "A good article, while it is
! for most people easy to find a copy of this journal."
  
  Kohonen, T. (1988). An Introduction to Neural Computing. Neural Networks,
  vol. 1, no. 1. pp. 3-16. 
! Comments from readers of comp.ai.neural-nets: "A general review".
  
  Rumelhart, D. E., Hinton, G. E. and Williams, R. J. (1986). Learning
***************
*** 336,353 ****
  representations by back-propagating errors. Nature, vol 323 (9 October), pp.
  533-536. 
! Comments: "Gives a very good potted explanation of backprop NN's. It gives
! sufficient detail to write your own NN simulation."
  
! Not-quite-so-introductory Literature:
! +++++++++++++++++++++++++++++++++++++
  
  Anderson, J. A. and Rosenfeld, E. (Eds). (1988). Neurocomputing:
  Foundations of Research. The MIT Press: Cambridge, MA. 
! Comments: "An expensive book, but excellent for reference. It is a
! collection of reprints of most of the major papers in the field." 
  
  Anderson, J. A., Pellionisz, A. and Rosenfeld, E. (Eds). (1990). 
  Neurocomputing 2: Directions for Research. The MIT Press: Cambridge, MA. 
! Comments: "The sequel to their well-known Neurocomputing book."
  
  Bourlard, H.A., and Morgan, N. (1994), Connectionist Speech Recognition: A
--- 377,397 ----
  representations by back-propagating errors. Nature, vol 323 (9 October), pp.
  533-536. 
! Comments from readers of comp.ai.neural-nets: "Gives a very good potted
! explanation of backprop NN's. It gives sufficient detail to write your own
! NN simulation."
  
! Not-quite-so-introductory Literature
! ++++++++++++++++++++++++++++++++++++
  
  Anderson, J. A. and Rosenfeld, E. (Eds). (1988). Neurocomputing:

==> nn5.changes.body <==
*** nn5.oldbody	Fri Feb 28 23:00:33 1997
--- nn5.body	Fri Mar 28 23:00:32 1997
***************
*** 1,4 ****
  Archive-name: ai-faq/neural-nets/part5
! Last-modified: 1997-01-13
  URL: ftp://ftp.sas.com/pub/neural/FAQ5.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
--- 1,4 ----
  Archive-name: ai-faq/neural-nets/part5
! Last-modified: 1997-03-12
  URL: ftp://ftp.sas.com/pub/neural/FAQ5.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
***************
*** 82,86 ****
  33. nn/xnn 
  34. NNDT 
! 35. Trajan 2.0 Shareware 
  36. Neural Networks at your Fingertips 
  
--- 82,86 ----
  33. nn/xnn 
  34. NNDT 
! 35. Trajan 2.1 Shareware 
  36. Neural Networks at your Fingertips 
  
***************
*** 750,757 ****
     See INSTALL.TXT for more details.
  
! 35. Trajan 2.0 Shareware
  ++++++++++++++++++++++++
  
!    Trajan 2.0 Shareware is a Windows-based Neural
     Network simulation package. It includes support
     for the two most popular forms of Neural
--- 750,757 ----
     See INSTALL.TXT for more details.
  
! 35. Trajan 2.1 Shareware
  ++++++++++++++++++++++++
  
!    Trajan 2.1 Shareware is a Windows-based Neural
     Network simulation package. It includes support
     for the two most popular forms of Neural
***************
*** 759,763 ****
     Propagation and Kohonen networks.
  
!    Trajan 2.0 Shareware concentrates on ease-of-use
     and feedback. It includes Graphs, Bar Charts and
     Data Sheets presenting a range of Statistical
--- 759,763 ----
     Propagation and Kohonen networks.
  
!    Trajan 2.1 Shareware concentrates on ease-of-use
     and feedback. It includes Graphs, Bar Charts and
     Data Sheets presenting a range of Statistical
***************
*** 767,780 ****
     The Registered version of the package can
     support very large networks (up to 128 layers
!    with up to 8,192 connections between successive
!    layers, subject to memory limitations in the
!    machine), and allows simple Cut and Paste
!    transfer of data to/from other Windows-packages,
!    such as spreadsheet programs. The Unregistered
!    version features limited network size and no
!    Clipboard Cut-and-Paste.
  
!    There is also a Commercial version of Trajan
!    2.0, which supports a wider range of network
     models, training algorithms and other features.
  
--- 767,779 ----
     The Registered version of the package can
     support very large networks (up to 128 layers
!    with up to 8,192 units each, subject to memory
!    limitations in the machine), and allows simple
!    Cut and Paste transfer of data to/from other
!    Windows-packages, such as spreadsheet programs.
!    The Unregistered version features limited
!    network size and no Clipboard Cut-and-Paste.
  
!    There is also a Professional version of Trajan
!    2.1, which supports a wider range of network
     models, training algorithms and other features.
  

==> nn6.changes.body <==
*** nn6.oldbody	Fri Feb 28 23:00:37 1997
--- nn6.body	Fri Mar 28 23:00:35 1997
***************
*** 1,4 ****
  Archive-name: ai-faq/neural-nets/part6
! Last-modified: 1997-02-11
  URL: ftp://ftp.sas.com/pub/neural/FAQ6.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
--- 1,4 ----
  Archive-name: ai-faq/neural-nets/part6
! Last-modified: 1997-03-12
  URL: ftp://ftp.sas.com/pub/neural/FAQ6.html
  Maintainer: saswss@unx.sas.com (Warren S. Sarle)
***************
*** 79,83 ****
  30. PREVia 
  31. Neural Bench 
! 32. Trajan 2.0 Neural Network Simulator 
  33. DataEngine 
  
--- 79,83 ----
  30. PREVia 
  31. Neural Bench 
! 32. Trajan 2.1 Neural Network Simulator 
  33. DataEngine 
  
***************
*** 1471,1493 ****
     =========
  
!    Trajan 2.0 is a Windows-based Neural Network that supports a wide range
!    of Neural Network types, training algorithms, and graphical and
!    statistical feedback on Neural Network performance.
  
     Features include: 
!    1. Full 32-bit power. Trajan 2.0 is available in a 32-bit version for
        use on Windows 95 and Windows NT platforms, supporting
        virtually-unlimited network sizes (available memory is a constraint).
!       A 16-bit version (network size limited to 8,192 connections between
!       successive layers) is also available for use on Windows 3.1. 
     2. Network Architectures. Includes Support for Multilayer Perceptrons,
        Kohonen networks, Radial Basis Functions, Linear models, Probabilistic
        and Generalised Regression Neural Networks. Training algorithms
!       include Back Propagation (with time-dependent learning rate and
!       momentum, shuffling and additive noise), Quick Propagation and
!       Delta-Bar-Delta for Multilayer Perceptrons; K-Means, K-Nearest
!       Neighbour and Pseudo-Inverse techniques for Radial Basis Function
!       networks. Error plotting, automatic cross verification and a variety
!       of stopping conditions are also included. 
     3. Custom Architectures. Trajan allows you to select special
        Activation functions and Error functions; for example, to use Softmax
--- 1471,1496 ----
     =========
  
!    Trajan 2.1 Professional is a Windows-based Neural Network package
!    that includes support for a wide range of Neural Network types,
!    training algorithms, and graphical and statistical feedback on
!    Neural Network performance.
  
     Features include: 
!    1. Full 32-bit power. Trajan 2.1 is available in a 32-bit version for
        use on Windows 95 and Windows NT platforms, supporting
        virtually-unlimited network sizes (available memory is a constraint).
!       A 16-bit version (network size limited to 8,192 units per layer) is
!       also available for use on Windows 3.1. 
     2. Network Architectures. Includes Support for Multilayer Perceptrons,
        Kohonen networks, Radial Basis Functions, Linear models, Probabilistic
        and Generalised Regression Neural Networks. Training algorithms
!       include the very fast, modern Levenberg-Marquardt and Conjugate
!       Gradient Descent algorithms, in addition to Back Propagation (with
!       time-dependent learning rate and momentum, shuffling and additive
!       noise), Quick Propagation and Delta-Bar-Delta for Multilayer
!       Perceptrons; K-Means, K-Nearest Neighbour and Pseudo-Inverse
!       techniques for Radial Basis Function networks, Principal Components
!       Analysis and specialised algorithms for Automatic Network Design and
!       Neuro-Genetic Input Selection. Error plotting, automatic cross
!       verification and a variety of stopping conditions are also included. 
     3. Custom Architectures. Trajan allows you to select special
        Activation functions and Error functions; for example, to use Softmax
***************
*** 1503,1517 ****
        Virtually all information can be transferred via the Clipboard to
        other Windows applications such as Spreadsheets. 
!    5. Pre- and Post-processing. Trajan 2.0 supports a range of pre- and
        post-processing options, including Minimax scaling, Winner-takes-all,
        Unit-Sum and Unit-Length vector. Trajan also assigns classifications
!       based on user-specified Accept and Reject thresholds. 
!    6. Embedded Use. The Trajan Dynamic Link Library allows you to call
!       trained networks from within other applications. Trajan 2.0 come
!       complete with sample applications written in 'C' and Visual Basic, and
!       an Excel spreadsheet which calls a Neural Network via spreadsheet
!       formulae.
!    There is also a shareware version of the Software available; please
!    download this to check whether Trajan 2.0 fulfils your needs. 
  
  33. DataEngine
--- 1506,1520 ----
        Virtually all information can be transferred via the Clipboard to
        other Windows applications such as Spreadsheets. 
!    5. Pre- and Post-processing. Trajan 2.1 supports a range of pre- and
        post-processing options, including Minimax scaling, Winner-takes-all,
        Unit-Sum and Unit-Length vector. Trajan also assigns classifications
!       based on user-specified Accept and Reject thresholds.
! 
!    6. Embedded Use. The Trajan Dynamic Link Library gives full
!       programmatic access to Trajan's facilities, including network
!       creation, editing and training. Trajan 2.1 comes complete with sample
!       applications written in 'C' and Visual Basic.
!    There is also a demonstration version of the Software available; please
!    download this to check whether Trajan 2.1 fulfils your needs. 
  
  33. DataEngine

==> nn7.changes.body <==
*** nn7.oldbody	Fri Feb 28 23:00:40 1997
--- nn7.body	Fri Mar 28 23:00:39 1997
***************
*** 29,33 ****
  Thomas Lindblad notes on 96-12-30: 
  
!    The reactive tabu search alogortm has been implemented by the
     Italians, in Trento. ISA and VME and soon PCI boards are available.
     We tested the system with the IRIS and SATIMAGE data and it did
--- 29,33 ----
  Thomas Lindblad notes on 96-12-30: 
  
!    The reactive tabu search algorithm has been implemented by the
     Italians, in Trento. ISA and VME and soon PCI boards are available.
     We tested the system with the IRIS and SATIMAGE data and it did
-- 

Warren S. Sarle       SAS Institute Inc.   The opinions expressed here
saswss@unx.sas.com    SAS Campus Drive     are mine and not necessarily
(919) 677-8000        Cary, NC 27513, USA  those of SAS Institute.
 *** Do not send me unsolicited commercial or political email! ***

