www.GetXFactor.com

shannon's entropy equation: which log base? 2 or e?
bendy
Guest
Posted: Wed Aug 06, 2003 2:43 am    Post subject: shannon's entropy equation: which log base? 2 or e?

hiyer,

this may be a bit of a thick question, but i have contradictory
information on it, so i'd really like to have it clarified please.

shannon's well-known entropy / information equation:

H = - E P(i) log P(i)

(with E representing sigma / sum of)

the log. is it log e or log2? which base is it in?

thanks.

Dr Chaos
Guest
Posted: Wed Aug 06, 2003 7:13 am    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

On Wed, 6 Aug 2003 02:43:08 +0000 (UTC), bendy <x@x.x> wrote:
[quote]hiyer,

this may be a bit of a thick question, but i have contradictory
information on it, so i'd really like to have it clarified please.

shannon's well-known entropy / information equation:

H = - E P(i) log P(i)

(with E representing sigma / sum of)
[/quote]
Actually, it is an expectation value over the distribution P(i): H is the
expected value of -log P(i).

[quote]
the log. is it log e or log2? which base is it in?
[/quote]
It doesn't matter as long as you are consistent in your units.

If it is log base two then the output will be measured in the
more natural "bits" (entropy rate in bits per symbol).

But analysis works better with the natural logarithm, as you take
derivatives the, well, natural way.

That's why Shannon didn't make a big point about distinguishing
it.
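
To make the point about units concrete, here is a minimal Python sketch. The function name `entropy` and the three-symbol distribution are illustrative assumptions, not something from the thread. It evaluates the same H with base 2, base e and base 10 (the base-10 unit is commonly called the hartley); the three results should differ only by a constant factor.

```python
import math

def entropy(probs, base=2.0):
    """Shannon entropy H = -sum p*log(p); the log base sets the unit."""
    # Terms with p == 0 contribute nothing and are skipped.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# Hypothetical distribution, chosen only for illustration.
probs = [0.5, 0.25, 0.25]

h_bits = entropy(probs, 2)         # base 2  -> bits
h_nats = entropy(probs, math.e)    # base e  -> nats
h_hartleys = entropy(probs, 10)    # base 10 -> hartleys (decimal digits)

print(h_bits, h_nats, h_hartleys)
# Changing base is only a rescaling: nats = bits * ln(2)
print(math.isclose(h_nats, h_bits * math.log(2)))
```

Whichever base is used, the ranking of distributions by entropy stays the same; only the number attached to each one is rescaled.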

David A. Scott
Guest
Posted: Wed Aug 06, 2003 8:12 am    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

bendy <x@x.x> wrote in news:060820030342542486%x@x.x:

[quote]hiyer,

this may be a bit of a thick question, but i have contradictory
information on it, so i'd really like to have it clarified please.

shannon's well-known entropy / information equation:

H = - E P(i) log P(i)

(with E representing sigma / sum of)

the log. is it log e or log2? which base is it in?

thanks.

[/quote]
TWO

David A. Scott

Vedat Hallac
Guest
Posted: Wed Aug 06, 2003 1:14 pm    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

On Wed, 06 Aug 2003 07:13:36 +0000, Dr Chaos wrote:

[quote]snip
It doesn't matter as long as you are consistent in your units.

If it is log base two then the output will be measured in the
more natural "bits" (entropy rate in bits per symbol).
[/quote]
... and if it is base e, the unit is 'nats'. Here's a piece of news for
me: if it is base 10, it is 'hartleys'. Never heard of it before. :-)

Dale King
Guest
Posted: Thu Aug 07, 2003 8:42 am    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

In article <060820030342542486%x@x.x>, x@x.x says...
[quote]hiyer,

this may be a bit of a thick question, but i have contradictory
information on it, so i'd really like to have it clarified please.

shannon's well-known entropy / information equation:

H = - E P(i) log P(i)

(with E representing sigma / sum of)

the log. is it log e or log2? which base is it in?
[/quote]
Quoting Shannon:

"The choice of a logarithmic base corresponds to the choice of a unit for
measuring information. If the base 2 is used the resulting units may be
called binary digits, or more briefly bits....If the base 10 is used the
units may be called decimal digits."

Basically what it says is that the base depends on what your basic unit
is.

Take for example the case of an equal distribution of 256 different
values. P(i) is the same for all i, so the equation can be simplified to:

H = - 256 P(i) log P(i)

We know that P(i) = 1 / 256 and substituting that in gives:

H = - log( 1/256 ) = log( 256 )

The choice of bases depends on what unit you want to measure in. If you
want it in bits you use base 2:

H = log2( 256 ) = 8 bits.

The information content of a symbol is 8 bits, which means you need an
average of 8 bits to encode one of the values.

Let's say we are going to encode it as hex characters. How many hex
characters do we need to encode it? In that case you use base 16:

H = log16( 256 ) = 2 hex characters.

How many octal digits do we need? We use base 8:

H = log8( 256 ) = 2 2/3 octal digits.

How many decimal digits?

H = log10( 256 ) = 2.408 decimal digits.

Let's say we are using some communication method that has three states.
How many trinary digits or "trits" would it take on average to encode one
of these 256 values? We use base 3 in that case:

H = log3( 256 ) = 5.047 trits

So generally today, it will be base 2 because we usually care about bits,
but in general the base is determined by the symbol set you are using for
encoding.

When Shannon wrote his paper (1948) the world had not exactly settled on
binary as the standard. There is actually quite a history for trinary
computers extending even into the sixties (see
http://www.americanscientist.org/Issues/Comsci01/Compsci2001-11.html).
Even today there are those advocating trinary (http://www.trinary.cc).
Shannon himself published a paper in 1950 discussing trinary.

But even with binary being the standard we still have communication
systems that have more than two states. QPSK has 4 states, 8-VSB has 8
states, QAM-64 has 64 and QAM-256 has 256. These systems are all in use
currently for digital TV.

And remember that it is only a constant multiplier to convert from one
base to another, so as long as you identify what base you used by
specifying what units the result is in, it is trivial to convert to
another unit. For example:

8 bits / log2( 10 ) = 2.408 decimal digits
--
Dale King
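
The figures in the post above can be checked with a short Python sketch. The helper names `uniform_entropy` and `convert` are hypothetical, introduced only for this illustration; the 256-value uniform case and the list of bases come from the post.

```python
import math

def uniform_entropy(n_values, base):
    """Entropy of n equally likely values, in digits of the given base."""
    return math.log(n_values) / math.log(base)

def convert(h, from_base, to_base):
    """Rescale an entropy from base-`from_base` units to base-`to_base` units."""
    return h / math.log(to_base, from_base)

units = [(2, "bits"), (16, "hex characters"), (8, "octal digits"),
         (10, "decimal digits"), (3, "trits")]
for base, unit in units:
    print(f"log{base}(256) = {uniform_entropy(256, base):.3f} {unit}")

# 8 bits expressed in decimal digits: 8 / log2(10)
print(f"{convert(8, 2, 10):.3f} decimal digits")
```

The conversion helper divides by a single constant, which is the "constant multiplier" the post refers to.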

Thomas Richter
Guest
Posted: Mon Aug 11, 2003 3:58 pm    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

[quote]On Wed, 06 Aug 2003 07:13:36 +0000, Dr Chaos wrote:
[/quote]

[quote]If it is log base two then the output will be measured in the
more natural "bits" (entropy rate in bits per symbol).

... and if it is base e, the unit is 'nats'. Here's a piece of news for
me: if it is base 10, it is 'hartleys'. Never heard of it before. :-)
[/quote]
And I thought for base e the unit would be called "bins"...

So long,
Thomas

bendy
Guest
Posted: Tue Aug 12, 2003 7:00 pm    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

In article <slrnbj1al0.hac.mbkennelSPAMBEGONE@lyapunov.ucsd.edu>, Dr
Chaos <mbkennelSPAMBEGONE@NOSPAMyahoo.com> wrote:

[quote]On Wed, 6 Aug 2003 02:43:08 +0000 (UTC), bendy <x@x.x> wrote:
hiyer,

this may be a bit of a thick question, but i have contradictory
information on it, so i'd really like to have it clarified please.

shannon's well-known entropy / information equation:

H = - E P(i) log P(i)

(with E representing sigma / sum of)

actually expectation value over the distribution P(i)
[/quote]
i'm confused on that. in the equation, where the E is in the typed
version above there is that bent-looking E, which i believe is the greek
letter sigma. and i further have the impression that a sigma symbol like
that means "the sum of"? am i incorrect there? or is it a
bit different because it's referring to a probability rather than
something that's not to do with probability?
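
For what it's worth, the two readings agree. A short derivation, assuming X denotes a symbol drawn with probability P(i) (this notation is an addition for illustration and does not appear in the thread):

```latex
H \;=\; -\sum_{i} P(i)\,\log P(i)
  \;=\; \sum_{i} P(i)\,\bigl[-\log P(i)\bigr]
  \;=\; \mathbb{E}\bigl[-\log P(X)\bigr]
```

So the sigma is an ordinary sum, and because each term is weighted by P(i), that same sum is also the expectation of -log P(X) over the distribution.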


[quote]

the log. is it log e or log2? which base is it in?

It doesn't matter as long as you are consistent in your units.

If it is log base two then the output will be measured in the
more natural "bits" (entropy rate in bits per symbol).

But analysis works better with natural logarithm as you take
derivatives the, well, natural way.

That's why Shannon didn't make a big point about distinguishing
it.
[/quote]
i see, so base e input would mean using log e. binary input means using
log2. decimal input, log10. (i'm now having trouble imagining how or
what input in base e would look like, but in any case..)

but that would certainly explain why i have conflicting log bases with
shannon's equation. so if someone wrote shannon's equation with log e,
you couldn't say "that's incorrect" because it entirely depends on the
input.

great. thanks for the explanation

bendy
Guest
Posted: Tue Aug 12, 2003 7:01 pm    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

In article <MPG.199a56c6f486491b989721@netnews.insightBB.com>, Dale
King <kingd@tmicha.net> wrote:

[quote]snip

And remember that it is only a constant multiplier to convert from one
base to another, so as long as you identify what base you used by
specifying what units the result is in, it is trivial to convert to
another unit. For example:

8 bits / log2( 10 ) = 2.408 decimal digits
[/quote]
thanks very much for your explanation there. some of it's a little over
my head but i get it basically.

it seems that in actual fact the choice of base that you use is a
fairly inconsequential part of it. so long as you stick to the same
base throughout it doesn't make too much difference. (i'm talking
generally (or maybe mathematically) rather than from a specific
compression on a computer perspective).

what seems far more pivotal is the number of possibilities, and what
probabilities they have - or that you give them. i read most of the
'entropy of a sequence' thread that's in this newsgroup and how the
probabilities of the message in question were being obtained was never
really mentioned. several people said 'it depends on which model you're
using' and i guess that's very much wrapped up with how the probabilities
are obtained - how the message is treated, etc. that must be the model,
right?

thinking about it, the model, if i'm correct on what that is, is *far*
more important than shannon's theory. and how you get those
probabilities and treat the messages - how you split them up, or don't
as the case may be - is very important and was not covered, i don't think,
by shannon's paper. so the data that you feed shannon's theory is very
important - that's what i'm saying. i guess that's pretty obvious, but
then it didn't seem to be from that thread. i haven't actually read
shannon's paper - the maths part trips me up every time i try, but
i've read warren weaver's more english descriptive follow-up and that
gives a pretty good overview of it i think.

in fact that's another thing from that thread that i mentioned -
someone mentioned parts of shannon's paper referring to page numbers
something like 370 or something. i've got both the pdf and the book
"mathematical theory of communication" and in the book shannon's paper
goes up to page 91. are there two different papers? the one in the book
is the famous / main one i think. is there another "shannon's paper" as
it were?

rep_movsd
Guest
Posted: Tue Aug 19, 2003 12:19 am    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

bendy <x@x.x> wrote in message news:<120820032000329944%x@x.x>...
[quote]snip

what seems far more pivotal is the number of possibilities, and what
probabilities they have - or that you give them. i read most of the
'entropy of a sequence' thread that's in this newsgroup and how the
probabilities of the message in question were being obtained was never
really mentioned. several people said 'it depends on which model you're
using' and i guess that's very much wrapped up with how the probabilities
are obtained - how the message is treated, etc. that must be the model,
right?

snip
[/quote]
A model is essentially a view of the data through which probabilities
are measured. For example, if you were compressing a 24-bit color
image, the probabilities for the set of symbols will deviate from
uniform (allowing compression) if you take 3 bytes as a symbol. If you
measured the frequencies of 2-byte symbols, without knowing the nature
of the data as being a 24 bits per pixel image, you might not be able
to compress as well.

Another example: if you knew that you are compressing English text,
the probability of a 'u' is quite low, but if you take into account one
previous character, it turns out that a 'q' is almost always followed
by a 'u', so you can compress much better.

There is no perfect model, just many :)
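
The 'q' followed by 'u' example can be sketched in Python. This is only an illustration under stated assumptions: the function names `order0_entropy` and `order1_entropy` and the sample string are made up for this sketch, and the probabilities are simple frequency counts from the sample itself rather than any particular compressor's model.

```python
import math
from collections import Counter

def order0_entropy(text):
    """Bits per character using single-character frequencies only."""
    counts = Counter(text)
    n = len(text)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

def order1_entropy(text):
    """Bits per character given the one previous character (conditional entropy)."""
    pairs = Counter(zip(text, text[1:]))   # (previous, next) pair counts
    prev = Counter(text[:-1])              # how often each character has a successor
    n = len(text) - 1
    return -sum(c / n * math.log2(c / prev[a]) for (a, b), c in pairs.items())

# Hypothetical sample in which every 'q' is followed by 'u'.
sample = "the quick queen quietly quit the quiz " * 4

h0 = order0_entropy(sample)
h1 = order1_entropy(sample)
print(f"order-0 model: {h0:.3f} bits/char")
print(f"order-1 model: {h1:.3f} bits/char")
```

Same data, same equation, two different models: the one that looks at the previous character reports fewer bits per character on this sample.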

David A. Scott
Guest
Posted: Tue Aug 19, 2003 4:00 am    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

rep_movsd@yahoo.co.in (rep_movsd) wrote in
news:85e8883c.0308181119.615c0fe0@posting.google.com:

[quote]
Another example: if you knew that you are compressing English text,
the probability of a 'u' is quite low, but if you take into account one
previous character, it turns out that a 'q' is almost always followed
by a 'u', so you can compress much better.

There is no perfect model. just many :)


[/quote]
English is a non-dead language. What you say may have been
true a few years ago. But now I see "a" following "q" much
more than one would have thought possible a few years ago.
So when compressing English text, it depends not only on
who wrote it but when.



David A. Scott
--
SCOTT19U.ZIP NOW AVAILABLE WORLD WIDE "OLD VERSION"
http://www.jim.com/jamesd/Kong/scott19u.zip old version
My Crypto code http://radiusnet.net/crypto/archive/scott/
My Compression code http://bijective.dogma.net/
**TO EMAIL ME drop the roman "five" **
Disclaimer:I am in no way responsible for any of the statements
made in the above text. For all I know I might be drugged.
As a famous person once said "any cryptographic
system is only as strong as its weakest link"

ben
Guest
Posted: Thu Aug 21, 2003 6:44 pm    Post subject: Re: shannon's entropy equation: which log base? 2 or e?

In article <85e8883c.0308181119.615c0fe0@posting.google.com>, rep_movsd
<rep_movsd@yahoo.co.in> wrote:

....
[quote]how you get those
probabilities and treat the messages - how you split them up, or don't
as the case may be - is very important and was not covered, i don't think,
by shannon's paper. so the data that you feed shannon's theory is very
important
....[/quote]

[quote]A model is essentially a view of the data through which probabilities
are measured. For example, if you were compressing a 24-bit color
image, the probabilities for the set of symbols will deviate from
uniform (allowing compression) if you take 3 bytes as a symbol. If you
measured the frequencies of 2-byte symbols, without knowing the nature
of the data as being a 24 bits per pixel image, you might not be able
to compress as well.

[/quote]



[quote]Another example: if you knew that you are compressing English text,
the probability of a 'u' is quite low, but if you take into account one
previous character, it turns out that a 'q' is almost always followed
by a 'u', so you can compress much better.
[/quote]
yeah, this minutely touches on the sort of thing i'm getting at with
probability assignment and message splitting, and how important how you
do that is (and i think more important than shannon's equation). the
example you've used there is operating on a character by character
basis -

if you're dealing with language then not treating the messages at least
as blocks of characters (seeing as that's how they're used by us)
misses a lot. i don't think you're just reducing it to a nice simple
manageable state to deal with there, i think you're actually discarding
and ignoring important information. cutting your data severely short.
that's what i think you're doing if you choose to feed shannon's
equation with data derived from individual character probabilities.
(this is all from a general language point of view btw).

it seems to me that you should take the data preparation much more
seriously before you make use of shannon's theory. more seriously than
the equation itself.

the first main thing with data for a shannon equation is how you're
going to split the whole full messages - char by char, whole message
(so not split at all), or somewhere in between, or even some other
idea, something that wouldn't be quite so accurately described as
neatly splitting. maybe multiple different ways, overlapping, with hazy
edges? (i don't know, i'm just imagining).

then having decided how to split your whole message (which is quite a
thing in itself) into smaller messages, there are two main things you
need to ascertain for each split: the number of possible messages per
message (the width, eg: 256 with an 8-bit char set), and the
probabilities of each possible message within that width. and then
repeat this paragraph part for each split.
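
The split-then-count procedure described above can be sketched in Python for the simplest case, fixed-size non-overlapping blocks. The function name `block_stats` and the sample message are hypothetical, and the probabilities are just observed frequencies within the one message, which is an assumption of this sketch and not something Shannon's equation requires.

```python
import math
from collections import Counter

def block_stats(message, block_size):
    """Split into non-overlapping blocks; return (distinct blocks seen,
    entropy in bits per block, entropy in bits per character)."""
    blocks = [message[i:i + block_size]
              for i in range(0, len(message) - block_size + 1, block_size)]
    counts = Counter(blocks)
    total = len(blocks)
    h_block = -sum(c / total * math.log2(c / total) for c in counts.values())
    return len(counts), h_block, h_block / block_size

# Hypothetical message, used only to show how the numbers move with the split.
message = "to be or not to be that is the question " * 8
for size in (1, 2, 4):
    distinct, h_block, h_char = block_stats(message, size)
    print(f"block size {size}: {distinct} distinct blocks, "
          f"{h_block:.3f} bits/block, {h_char:.3f} bits/char")
```

One caveat on the design: as the block size grows there are fewer blocks to count, so frequencies taken from a short message become a poor estimate of the underlying probabilities.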

what i'm trying to get at is there's some potentially *stunning*
complexity in all that. those two last paragraphs to me seem nearly
impossible, certainly a very tall order. having decided how you're
going to split, you then need to have a rough idea of the possible
number of versions of each split. then you're going to need to give
each of those possible versions a probability! this is why i think that
maybe shannon's theory, using it in general with language, is a tiny
part of the story. i'm not actually belittling it, i'm enlarging,
drawing attention to another part: the data dealing / splitting /
handling / probability deriving part that comes before using shannon's
equation, which i think is way more complicated and important than
shannon's equation.

you say "If you knew that you are compressing english text". you're
never going to be able to know that for sure in advance. never ever.
it's that type of thought that'll unnecessarily and unnaturally, i think,
restrict your data, method and results. so it's the type of thought you
should avoid. just like doing it on a char by char basis. it's far too
narrowing.

but none of this is shannon's fault. he never had language and
semantics etc in mind for his equation. it's other people who've had
that idea about his work, which i don't think is an incorrect thought.
there's just a lot missing i think. shannon thought of using it for,
well like on a character by character basis (teletype or something) and
what the messages were made up of - the language/semantics etc, were
irrelevant to him.
All times are GMT

 