Newsgroups: comp.ai.neural-nets
Path: cantaloupe.srv.cs.cmu.edu!das-news2.harvard.edu!news2.near.net!howland.reston.ans.net!ix.netcom.com!netcom.com!park
From: park@netcom.com (Bill Park)
Subject: Re: Neural Nets and Protein Predictions
Message-ID: <parkCyM2qn.BvI@netcom.com>
Keywords: NN, protein, Fourier
Organization: Netcom Online Communications Services (408-241-9760 login: guest)
References: <1994Oct30.175208.1@wkuvx2.wku.edu>
Date: Tue, 1 Nov 1994 23:03:58 GMT
Lines: 60

In article <1994Oct30.175208.1@wkuvx2.wku.edu> camplte1@wkuvx1.wku
writes:

> Hi.
>
> I am a graduate student whose thesis project involves using neural
> nets for prediction of protein function(s) and structure(s) from
> primary structure.  
> 
> What, if any, other uses for neural nets have you found in
> analyzing neural nets other than secondary structure prediction?
>
> Troy Earl Camplin

I presume you meant to write, "... analyzing proteins ...."  Never
heard of any other uses for neural nets in molecular biology, but this
article

Xiru Zhang (Thinking Machines, Inc.), "A Hybrid Algorithm for
  Determining Protein Structure," _IEEE Expert_, Vol. 9, No. 4, August
  1994, pp. 66-71.

mentions that an important problem people are working on is to try to
take into account interactions between amino acids far apart; work on
predicting secondary structures has focused on interactions between
amino acids separated by about 15 others at most.
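
To make the "local interactions only" point concrete, here is a minimal
sketch of how a window-based predictor might look at a sequence.  This is
my own illustration, not something taken from Zhang's article: the
function name, the padding character, and the half_width value are all
assumptions.  Each residue is presented together with a fixed number of
neighbors on either side, so anything farther away along the chain is
invisible to the predictor.

```python
def residue_windows(sequence, half_width=7, pad="-"):
    """Return one fixed-width window of residues centered on each position.

    half_width is a hypothetical parameter: the number of neighbors kept on
    each side.  Positions past either end of the chain are filled with pad.
    """
    padded = pad * half_width + sequence + pad * half_width
    width = 2 * half_width + 1
    return [padded[i:i + width] for i in range(len(sequence))]


# A made-up ten-residue sequence, used only to show the shape of the output.
windows = residue_windows("MKTAYIAKQR", half_width=2)
print(windows[0])   # window centered on the first residue
print(windows[4])   # window centered on the fifth residue
```

Widening the window is the obvious way to let in longer-range
interactions, but it also increases the number of inputs the predictor
has to learn from the same number of known structures.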

Abstract: By combining a neural network, a statistical module, and a
memory-based reasoner, this hybrid system improves its ability to
determine how amino acid sequences fold into three-dimensional protein
structures.

The system described predicts secondary structures (e.g., alpha helix,
beta sheet, beta turn, coil) with 66.4% accuracy.  They used 107
protein structures, with 8-way cross validation (divide them randomly
into 8 groups, take one group at a time as the test set, the other 7
as the training set).  They got a large variation in accuracy, which
Zhang says, "strongly argues against using a small test set for this
kind of study."
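
The cross-validation procedure described above can be sketched roughly as
follows.  This is only my reading of the description, not the code used
in the article: the function name and the random seed are assumptions,
and the proteins are represented by index numbers only.

```python
import random


def k_fold_splits(n_items, k=8, seed=0):
    """Yield (train_indices, test_indices) for each of k random groups."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so group sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        # Every group except the i-th goes into the training set.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 107 structures and 8 groups, as in the article.
splits = list(k_fold_splits(107, k=8))
fold_sizes = sorted(len(test) for _, test in splits)
print(fold_sizes)
```

With 107 structures each test group holds only 13 or 14 proteins, which
goes some way toward explaining why accuracy on a single group can swing
so much from one group to the next.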

One sidebar gives a brief summary of molecular biology.  Another
sketches four other approaches to secondary structure prediction: the
Chou-Fasman algorithm, the GOR (authors' initials) algorithm,
pattern-matching algorithms, and Qian and Sejnowski's
backpropagation experiments.

They note that the highest prediction accuracy seems to be about 60%.

Tertiary structure (overall protein shape) and quaternary structure
(how multiple protein subunits combine) would seem to be far more
difficult to predict from the amino acid sequence alone.  A molecular
biologist explained to me that the surface tension of the water
molecules all around the peptide chain as it folds up, and the action
of various catalytic proteins are very important factors.  If you
denature (uncoil) some proteins by heating them, they often do not go
back to their original shape, even if you anneal the solution (cool it
slowly) to try to give the peptide chains time to find their
minimum-energy configurations through thermal agitation.

Bill Park
=========
-- 
Grandpaw Bill's High Technology Consulting & Live Bait, Inc.
