Article 6310 of comp.archives:
Path: cse!cs.utexas.edu!qt.cs.utexas.edu!zaphod.mps.ohio-state.edu!cis.ohio-state.edu!ucbvax!agate!usenet
From: arms@cs.UAlberta.CA (Bill Armstrong)
Newsgroups: comp.archives
Subject: [comp.ai.neural-nets] Re: network paralysis
Message-ID: <1991Aug9.101221.25494@agate.berkeley.edu>
Date: 9 Aug 91 10:12:21 GMT
References: <1991Jul25.122655.21427@eagle.lerc.nasa.gov>
Sender: usenet@agate.berkeley.edu (USENET Administrator)
Followup-To: comp.ai.neural-nets
Organization: University of Alberta, Edmonton, Canada
Lines: 85
Approved: adam@soda.berkeley.edu
X-Original-Date: 25 Jul 91 16:05:49 GMT
X-Original-Newsgroups: comp.ai.neural-nets
Archive-name: auto/comp.ai.neural-nets/network-paralysis
Original-posting-by: arms@cs.UAlberta.CA (Bill Armstrong)
Original-subject: Re: network paralysis
Reposted-by: adam@soda.berkeley.edu

dmr3420@venus.lerc.nasa.gov (David Reed) writes:

> I am attempting to train a 13-20-24 network with approx. 150 test
>cases.  I am using an optimization method to compute a step length and
>direction (modified gradient) at each iteration instead of the standard
>fixed step length and momentum.  After a number of training epochs I reach
>a point where it is getting about 75% of the 150*24 outputs correct.  After
>this it takes a few thousand epochs to increase to 77-78% correct.

Suppose you select the best of the 24 outputs, say 17 of them, such that
they perform satisfactorily, and remove the other seven output nodes.
The remaining seven outputs you then approximate by a separate 13-20-7
net.  The results can't be worse, and would probably be much better than
you could ever get from those 7 outputs of a 13-20-24 net, simply
because the elements can concentrate on solving a smaller task well.

What you lose with that approach is evaluation time, because less of the
network is shared than in the 13-20-24 network.  If high evaluation
speed is what you need, then why not try adaptive logic networks?  They
have speed to burn, especially in hardware (my guess is you could do the
whole computation, once the inputs are in place, in less than 100
nanoseconds), but also in simulation, because of lazy evaluation (also
called parsimony).

Actually, unless you can prove that there is a benefit to sharing parts
of the network, I don't see why you wouldn't try 24 separate 13-20-1
networks, or even 13-10-1, say.  I guess I am missing some of the
"indoctrination" that people go through when they take courses on neural
networks, because I see no point whatsoever in forced sharing of parts
of a network unless you can guide the common part to perform a shared
function.  Sharing is implicit in sending the outputs of an element to
many places.  (That would be fine if all but one or two sinks for a
signal had zero weights attached, but that's not the way it works.)

>                                                                By this
>time some of the weights are as high as 29 (I initialized them to the range
>(-1,+1)).

In that case you have probably already blown it as far as good
generalization is concerned, so there's not much point in further
training.

>At this point the inputs to some of the cells is high
>enough that the derivative approaches zero and learning ceases.

My view: learning *should* cease in those parts of a network (any kind
of network) which are doing their job well.  This is equivalent to
specialization occurring.  That the sigmoid (or tanh) functions don't
allow for cutting off parts of a net that are doing OK is a fault of the
backpropagation technique.  It is also the reason for costly execution
times.
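To put a number on that saturation effect, here is a tiny C fragment
(an illustration only, not part of any package mentioned here) that
just evaluates the standard logistic sigmoid and its derivative at a
few net inputs.  By the time the weighted sum into a unit reaches 29,
the derivative is around 1e-13, so that unit has effectively stopped
learning:

/* Illustration only: the logistic sigmoid and its derivative.
   At a net input of about 29 (the size of weight David mentions),
   the derivative is on the order of 1e-13, so backprop updates
   through that unit effectively stop. */

#include <stdio.h>
#include <math.h>

static double sigmoid(double x)       { return 1.0 / (1.0 + exp(-x)); }
static double sigmoid_deriv(double x) { double s = sigmoid(x); return s * (1.0 - s); }

int main(void)
{
    double net[] = { 0.0, 2.0, 5.0, 10.0, 29.0 };
    int i;

    for (i = 0; i < 5; i++)
        printf("net = %5.1f   sigmoid = %.6f   derivative = %g\n",
               net[i], sigmoid(net[i]), sigmoid_deriv(net[i]));
    return 0;
}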
>    Wasserman in his book Neural Computing suggests applying a sigmoid
>function to the weights that are causing high outputs.  I have been
>experimenting with this idea and a few ideas of my own.

That is applying a bandaid to a terminally ill algorithm.

>                                               Does anyone
>have any suggestions for dealing with this problem without destroying
>the previous training (i.e. the modification to the weights does not
>increase the sum of squares error too much)?

All I can say is: try adaptive logic networks!  (Sorry for the "sales"
pitch, but I can't see any reason for people to work so hard trying to
get a hopelessly complex and slow approach (i.e. backprop networks) to
work, when an alternative exists that they can try out with our demo
software, available via anonymous ftp from menaik.cs.ualberta.ca
[129.128.4.241] in pub/atree.tar.Z.)  The demo software, atree release
1.0, isn't powerful enough for really difficult problems, but yours
doesn't sound that difficult.  In the worst case, atree release 2.0 is
available.

Bill
--
***************************************************
Prof. William W. Armstrong, Computing Science Dept.
University of Alberta; Edmonton, Alberta, Canada T6G 2H1
arms@cs.ualberta.ca Tel(403)492 2374 FAX 492 1071

From arms@cs.ualberta.ca Fri Sep 30 17:36:05 EDT 1994
Article: 19095 of comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news.harvard.edu!news2.near.net!MathWorks.Com!panix!zip.eecs.umich.edu!newsxfer.itd.umich.edu!nntp.cs.ubc.ca!unixg.ubc.ca!quartz.ucs.ualberta.ca!alberta!arms
From: arms@cs.ualberta.ca (Bill Armstrong)
Newsgroups: comp.ai.neural-nets
Subject: ALN information (was: C++ code for ANNs ...)
Date: 22 Sep 1994 14:29:29 GMT
Organization: Computing Science, U of Alberta, Edmonton, Canada
Lines: 242
Message-ID: <35s4c9$i4g@scapa.cs.ualberta.ca>
References: <35ron0$fij@reseau.cict.fr>
NNTP-Posting-Host: spedden.cs.ualberta.ca
Keywords: ANNs C++

mathey@apollo22.eis.enac.dgac.fr (mathey) writes:

>Since I'm currently working on an NN project to be developed in C++,
>I would be much thankful if anyone could give me some ftp sites where I
>could find library units and packages related to this subject.  I don't
>have a lot of time to do what I'm asked for and I definitely don't want to
>re-do something already achieved by others better than I.
>If you have any suggestions, you can e-mail me at:
>    mathey@eis.enac.dgac.fr
>or simply send an answer to this group.

The package you should use will depend on your application.  There is
C++ source code, free for non-commercial use, on ftp.cs.ualberta.ca in
pub/atree/atre27.exe.  It runs under Windows 3.x.  The same learning
engine is in atree2.tar.Z for Unix; the latter is straight C.  The
paradigm is adaptive logic networks (ALNs).

If your problem has analog (i.e. floating point) outputs, then you are
much better off with the new commercial version, Atree 3.0.  It has a
lot of advantages over other ANNs (a rough sketch of what points 1 and
4 mean in code follows this list):

1: the result is a piecewise linear function (simple, understandable)

2: the partial derivatives of the learned function can be bounded,
   based on a priori knowledge (monotonicity supports testing!)

3: convexity can also be imposed as a constraint on the learned
   function (such constraints promote smoothness, and reduce the number
   of training samples needed in general)

4: evaluation on any input requires evaluation of only a few linear
   pieces, which makes it very fast, e.g. 250 microseconds to get an
   output on a 486

5: can learn the 2-spirals problem in about an hour and generalize well

6: can learn arbitrary boolean functions of at least 10 variables
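Here is the promised sketch, in C, of a piecewise linear function built
from a handful of affine pieces.  It assumes the pieces are combined by
a minimum operation, which is only one way such a function can be
organized; it is an illustration, not the Atree 3.0 code:

/* Illustration only (not Atree 3.0 code): a piecewise linear function
   of two inputs formed as the minimum of several affine pieces. */

#include <stdio.h>

#define NPIECES 3
#define NINPUTS 2

/* each piece: y = w0 + w1*x1 + w2*x2 */
static const double piece[NPIECES][NINPUTS + 1] = {
    { 0.5,  1.0,  0.0 },
    { 2.0, -0.5,  0.3 },
    { 1.0,  0.2, -0.2 }
};

static double eval_min_of_pieces(const double x[NINPUTS])
{
    double best = 0.0;
    int i, j;

    for (i = 0; i < NPIECES; i++) {
        double y = piece[i][0];
        for (j = 0; j < NINPUTS; j++)
            y += piece[i][j + 1] * x[j];
        if (i == 0 || y < best)      /* keep the smallest piece value */
            best = y;
    }
    return best;
}

int main(void)
{
    double x[NINPUTS] = { 1.0, 2.0 };
    printf("f(%.1f, %.1f) = %g\n", x[0], x[1], eval_min_of_pieces(x));
    return 0;
}

In a real ALN the pieces would sit in a tree of minimum and maximum
nodes, so most of them can be skipped for any given input; the flat
loop above evaluates every piece only to keep the example short.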
The inclusion of a priori knowledge in Atree 3.0 is an extremely
powerful tool, not available in backprop-type nets.  After you've tried
it, you'll never go back to backprop.  It has been extensively tested
already, and has controlled both mechanical systems and walking
prostheses for spinal-cord-damaged patients.  Recent publications
discuss the superiority of ALNs over standard statistical techniques
and backprop for the classification of beef.

Here is info on the ftp version.  If you are interested in Atree 3.0,
please write me at arms@cs.ualberta.ca.

THE ATREE ADAPTIVE LOGIC NETWORK SIMULATION PACKAGE

***********************
NOTE: The Atree 2.7 package is now obsolete.  It has been replaced by
the commercial package Atree 3.0 ALN Development System, beta release 1,
which is available to beta testers at the present time (August 1994).
Please see the description on ftp.cs.ualberta.ca in
pub/atree/atree30.ps.Z.  This is about 90K bytes.  There is also an
uncompressed version in atree30.ps, which is about 360K bytes.  (There
are a few images in the file, which makes it somewhat large.)
**********************

The atree adaptive logic network (ALN) simulation package, atree
release 2.7, is available via anonymous ftp from ftp.cs.ualberta.ca or
[129.128.4.241] in pub/atree/atre27.exe (ftp in binary mode).  It runs
on IBM PCs and compatibles under Windows 3.x.  (Another version runs
under Unix -- see below.)  Included are documentation and ON-LINE HELP
explaining the basic principles of adaptive logic networks, the atree
source code and the examples.  All C and C++ source code is provided.

The atree package is not a toy, despite the fact that it is used for
demonstration purposes and is non-commercial.  Experimenters are using
it on challenging problems in medicine, physics and the environment.
It has been used to grade beef based on ultrasound images, design
hardware to discriminate particles produced by a high-energy
accelerator, help design walking prostheses for spinal-cord-damaged
patients, and measure the composition of tarsands from spectral data.
It is possible to use inexpensive, off-the-shelf programmable logic
devices to realize trained ALNs in high-speed hardware, though those
facilities are not in the atree 2.7 software.

Please read the license and the warranty (which protects the
developers, not the users).  All neural networks which are "black
boxes" are potentially unsafe, i.e. unexpected outputs can occur at
untested places in the domain of the neural net's mapping.  The current
atree package is no exception; however, ALNs can be made safe by
forcing the learned mappings to be piecewise monotonic according to the
developer's a priori knowledge of the problem.  A commercial version of
atree is planned which will support a safe design methodology.

Atree release 2.7 is available in either of two files in pub/atree/ on
menaik: atre27.exe and a27exe.exe.  The file atre27.exe contains the
full C and C++ sources for those who want to study or modify them.  The
code was developed using Borland C++ 3.1 and Application Frameworks.
The other, smaller file contains just the executables.  The sources
help the user understand the adaptive algorithm in detail (see ALN
Technical Notes/The Learning Algorithm in the On-Line Help).
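For readers who just want the flavor of the evaluation side of the
algorithm before opening the sources, here is a rough sketch in C of
lazy ("parsimonious") evaluation of a small boolean tree of AND/OR
nodes.  The node layout and names are invented for the example; it is
not the atree code:

/* Illustration only (not atree code): lazy evaluation of a boolean
   tree of AND/OR nodes.  An AND node whose left subtree yields 0 never
   looks at its right subtree, and an OR node whose left subtree yields
   1 never looks at its right subtree -- that is the parsimony that
   makes evaluation fast. */

#include <stdio.h>

enum kind { LEAF, AND, OR };

struct node {
    enum kind k;
    int input_index;              /* used when k == LEAF      */
    struct node *left, *right;    /* used when k == AND or OR */
};

static int eval(const struct node *n, const int inputs[])
{
    int v;

    if (n->k == LEAF)
        return inputs[n->input_index];
    v = eval(n->left, inputs);
    if (n->k == AND && v == 0) return 0;   /* right subtree skipped */
    if (n->k == OR  && v == 1) return 1;   /* right subtree skipped */
    return eval(n->right, inputs);
}

int main(void)
{
    struct node x0, x1, x2, a, root;
    int in[3] = { 0, 1, 1 };

    x0.k = LEAF; x0.input_index = 0;
    x1.k = LEAF; x1.input_index = 1;
    x2.k = LEAF; x2.input_index = 2;
    a.k    = AND; a.left    = &x0; a.right    = &x1;  /* x0 AND x1 */
    root.k = OR;  root.left = &a;  root.right = &x2;  /* (x0 AND x1) OR x2 */

    printf("output = %d\n", eval(&root, in));
    return 0;
}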
Everyone should have a look at the OCR demo!  It has been referred to
as "quite impressive" even by experts in the OCR area.  Test yourself
against the trained ALNs, and scribble in your own characters (similar
to the A, L, or N; or pick any language, any alphabet and then train)
to see how noisy and distorted the characters can be, yet still be
recognized by the logic networks.  The demo can be obtained without the
rest in pub/atree/a27ocr.exe.

To set up your software on the PC under Windows 3.x, it is recommended
that you execute atre27.exe in your main directory, whereupon it will
create a subdirectory atree_27 and extract everything into it.  Running
"setup" in the latter directory will create a group of icons you can
use to invoke demos and the facilities for programming adaptive logic
network applications in the lf language.  The "Open" command gives you
access to numerous instructive examples.  Clicking on the Help button
gives you access to explanations of theory and code.

The Unix version, atree release 2.0, is in C, and has been ported to
Macintosh, Amiga, and other machines.  Windows NT will eventually offer
another way to use atree on various platforms.

There is an electronic mailing list for discussions of ALNs.  Mail to
alnl-request@cs.ualberta.ca to subscribe or cancel.  Your comments on
ALN subjects can be emailed to all other subscribers to the list by
mailing to alnl@cs.ualberta.ca.

Welcome to the world of adaptive logic networks!

RECOMMENDED PUBLICATIONS ON ADAPTIVE LOGIC NETWORKS

W. Armstrong, Adaptive Boolean Logic Element, U.S. Patent 3934231,
Feb. 28, 1974 (filings in various countries), assigned to Dendronic
Decisions Limited, 3624 - 108 Street, Edmonton, Alberta, T6J 1B4,
Tel. (403) 438-1103.  N.B. EXPIRED JANUARY 1993, THUS PUTTING DESIGNS
FOR HIGH-SPEED ALN ADAPTIVE HARDWARE INTO THE PUBLIC DOMAIN.

G. v. Bochmann and W. Armstrong, Properties of Boolean Functions with a
Tree Decomposition, BIT 13, 1974, pp. 1-13.

W. Armstrong and G. Godbout, Use of Boolean Tree Functions to Perform
High-Speed Pattern Classification and Related Tasks, Dept. d'IRO,
Universite de Montreal, Doc. de Travail #53, 1974.  (Unpublished,
except in summary form as follows:)

W. Armstrong and G. Godbout, Properties of Binary Trees of Flexible
Elements Useful in Pattern Recognition, IEEE 1975 International Conf.
on Cybernetics and Society, San Francisco, 1975, IEEE Cat. No. 75 CHO
997-7 SMC, pp. 447-449.

W. Armstrong and J. Gecsei, Architecture of a Tree-based Image
Processor, 12th Asilomar Conf. on Circuits, Systems and Computers,
Pacific Grove, Calif., 1978, pp. 345-349.

W. Armstrong and J. Gecsei, Adaptation Algorithms for Binary Tree
Networks, IEEE Trans. on Systems, Man and Cybernetics, 9, 1979,
pp. 276-285.

W. Armstrong, J.-D. Liang, D. Lin and S. Reynolds, Experiments Using
Parsimonious Adaptive Logic, Tech. Rept. TR 90-30, Department of
Computing Science, University of Alberta, Edmonton, Alberta, Canada,
T6G 2H1.  Now available in revised form via anonymous ftp from
menaik.cs.ualberta.ca [129.128.4.241] in pub/atree/atree2.ps.Z (the
title of the revised document is Some Results concerning Adaptive Logic
Networks).

W. Armstrong, A. Dwelly, J.-D. Liang, D. Lin and S. Reynolds, Learning
and Generalization in Adaptive Logic Networks, in Artificial Neural
Networks: Proceedings of the 1991 International Conference on
Artificial Neural Networks (ICANN'91), Espoo, Finland, June 24-28,
1991, T. Kohonen, K. Makisara, O. Simula and J. Kangas, eds., Elsevier
Science Publishing Co. Inc., N.Y., 1991, vol. 2, pp. 1173-1176.
Allen G. Supynuk and William W. Armstrong, Adaptive Logic Networks and
Robot Control, Proc. Vision Interface Conference '92 (also called
AI/VI/GI '92), Vancouver, B.C., May 11-15, 1992, pp. 181-186.

R. B. Stein, A. Kostov, M. Belanger, W. W. Armstrong and D. B. Popovic,
Methods to Control Functional Electrical Stimulation in Walking, First
International FES Symposium, Sendai, Japan, July 23-25, 1992,
pp. 135-140.

Aleksandar Kostov, Richard B. Stein, William W. Armstrong and Monroe
Thomas, Evaluation of Adaptive Logic Networks for Control of Walking in
Paralyzed Patients, 14th Ann. Int'l Conf. IEEE Engineering in Medicine
and Biology Society, Paris, France, Oct. 29 - Nov. 1, 1992, vol. 4,
pp. 1332-1334.

Ian Parsons and W. W. Armstrong, The Use of Adaptive Logic Nets to
Quantify Tar Sands Feed (draft), available via anonymous ftp from
ftp.cs.ualberta.ca or [129.128.4.241] in pub/atree/alntarsands.ps.Z.

W. W. Armstrong, R. B. Stein, A. Kostov, M. Thomas, P. Baudin,
P. Gervais and D. Popovic, Application of Adaptive Logic Networks and
Dynamics to Study and Control of Human Movement, Second Int'l Symp. on
3D Analysis of Human Movement, Poitiers, June 30 - July 3, 1993,
pp. 81-84.

D. B. Popovic, R. B. Stein, K. L. Jovanovic, R. Dai, A. Kostov and
W. W. Armstrong, Sensory Nerve Recording for Closed-Loop Control to
Restore Motor Functions, IEEE Trans. Biomed. Eng., vol. 40, no. 10,
pp. 1024-1031, 1993.

Aleksandar Kostov, Richard B. Stein, Dejan Popovic and W. W. Armstrong,
Improved Methods for Control of FES for Locomotion, Proc. International
Federation of Automatic Control (IFAC) Symposium on Modeling and
Control in Biomedical Systems, Galveston, Texas, March 27-30, 1994.

A. Kostov, B. J. Andrews, D. Popovic, R. B. Stein and W. W. Armstrong,
Machine Learning in Control of Functional Electrical Stimulation
Systems for Locomotion (submitted).

William W. Armstrong and Monroe M. Thomas, Control of a Vehicle Active
Suspension System Model using Adaptive Logic Networks, included in the
appendix to the Proceedings distributed on-site at the World Congress
on Neural Networks, San Diego, California, June 5-9, 1994, and
distributed electronically on the internet with the permission of the
conference managers via ftp from ftp.cs.ualberta.ca in
pub/atree/wcnnpub.ps.  (A missing reference from an earlier version is
in wcnnpub.readme in that directory.)

James Darrell McCauley, Brian Ray Thane and Alan Dale Whittaker, Fat
Estimation in Beef Ultrasound Images using Texture and Adaptive Logic
Networks, Transactions of the ASAE, vol. 37, no. 3, pp. 997-1002, 1994.

Anyone interested in the commercial version of atree should contact:
W. W. Armstrong, President, Dendronic Decisions Limited, 3624 - 108
Street, Edmonton, Alberta, Canada, T6J 1B4, Tel./FAX (403) 438 8285, or
by email: arms@cs.ualberta.ca

--
***************************************************
Prof. William W. Armstrong, Computing Science Dept.
University of Alberta; Edmonton, Alberta, Canada T6G 2H1
arms@cs.ualberta.ca Tel(403)492 2374 FAX 492 1071