Newsgroups: comp.speech
Path: pavo.csi.cam.ac.uk!doc.ic.ac.uk!uknet!pipex!sunic!uunet!spool.mu.edu!agate!pasteur!bregler
From: bregler@ICSI.Berkeley.EDU (Chris Bregler)
Subject: Re: Entropy of English text?
Message-ID: <1993Jul27.064646.27728@pasteur.Berkeley.EDU>
Sender: Christoph Bregler <bregler@icsi.berkeley.edu>
Nntp-Posting-Host: icsib70.icsi.berkeley.edu
Organization: International Computer Science Institute, Berkeley, CA, U.S.A.
References: <CAs7pD.HCC@newcastle.ac.uk> <CAt6vu.8q5@noose.ecn.purdue.edu>
Date: Tue, 27 Jul 1993 06:46:46 GMT
Lines: 21

Here's a relevant reference for your question:

"An Estimate of an Upper Bound for the Entropy of English" by P. Brown et al.
(IBM Yorktown), Computational Linguistics, 18(1), 31-40, 1992. They report an
upper bound of 1.75 bits per character, obtained from a word trigram model.

-Chris

In article <CAt6vu.8q5@noose.ecn.purdue.edu> helz@ecn.purdue.edu (Randall A Helzerman) writes:
>In article <CAs7pD.HCC@newcastle.ac.uk>, d.j.e.nunn@durham.ac.uk (Douglas Nunn) writes:
>|> Can some kind soul give me a reasonable figure for the average 
>|> entropy of typical English text?
>|> 
>|> I'll leave any assumptions up to you.
>|> 
>
>Well, The Man himself, Claude E. Shannon, in his paper "A Mathematical
>Theory of Communication" says that English is about 50% redundant
>(redundancy = 1 - relative entropy, i.e. 1 - H/H_max).
>
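
To make the entropy/redundancy relation concrete, here is a rough sketch in
Python. It only estimates the single-character (unigram) entropy, so it
ignores all context between letters and will overestimate the true entropy
(and underestimate the redundancy) of English. The 27-symbol alphabet
(letters plus space) and the function names are my own assumptions for
illustration, not anything taken from the papers cited.

```python
import math
from collections import Counter

# Assumed alphabet: 26 lowercase letters plus space (27 symbols).
ALPHABET = "abcdefghijklmnopqrstuvwxyz "

def unigram_entropy(text):
    """Entropy in bits per character of the single-character distribution."""
    # Fold to lowercase and drop anything outside the assumed alphabet.
    symbols = [c for c in text.lower() if c in ALPHABET]
    if not symbols:
        raise ValueError("text contains no symbols from ALPHABET")
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def redundancy(entropy_bits, alphabet_size=len(ALPHABET)):
    """Redundancy = 1 - H / H_max, where H_max = log2(alphabet_size)."""
    return 1.0 - entropy_bits / math.log2(alphabet_size)

if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog"
    h = unigram_entropy(sample)
    print(f"H = {h:.3f} bits/char, redundancy = {redundancy(h):.3f}")
```

A short sample like the one above says little about English as a whole; a
meaningful estimate needs a large corpus and a model that uses context, which
is what the Brown et al. paper does.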


