
Taniyah Estrada

Answered question

2022-06-16

Interpreting Entropy
All you data scientists will probably know the entropy equation:
$H(p) = -\sum_{i=1}^{n} p_i \log_2 p_i$
And, using this, I was messing around with some compression, and calculated the entropy for a set of probabilities { 0.3 , 0.2 , 0.2 , 0.1 }, which came out as about 2.246.
This doesn't make sense to me, because if Entropy ∝ 1/Compression, then I've done the impossible by compressing data with these proportions.
I find myself confused as to how to interpret this value any other way. Is it bits per arbitrary unit? Am I simply wrong?
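As a sanity check, here is a minimal Python sketch of that calculation, using the probability list exactly as written in the question (note that these four values sum to 0.8, not 1, so they are not a complete distribution as given):

import math

def entropy(probs):
    # Shannon entropy H(p) = -sum_i p_i * log2(p_i), in bits.
    return -sum(p * math.log2(p) for p in probs)

probs = [0.3, 0.2, 0.2, 0.1]                          # the values from the question
print(f"sum of probabilities = {sum(probs):.2f}")     # 0.80
print(f"H = {entropy(probs):.4f} bits")               # about 1.7821, cf. the answer below

Whatever produced 2.246, it does not come from plugging these four numbers into the formula above, which is consistent with the answer's point that the calculation went wrong.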

Answer & Explanation

Anika Stevenson

Beginner · 2022-06-17 · Added 19 answers

Your calculation is wrong: for the four given probabilities, the entropy comes out to about 1.7820.
The correct interpretation of the entropy is as follows.
Assume you have to encode an infinite sequence of independent, identically distributed random variables, each taking its values (symbols) with the given probabilities. Split the stream of symbols emitted by the source into disjoint blocks of length N and encode each block with some faithful (losslessly decodable) coding method. Let the random variable $X_{N,n}$ be the code length of the $n$-th block of length N. Then, for any such method, the average code length per source symbol is greater than or equal to the entropy:
$\frac{1}{N}\,\mathbb{E}[X_{N,n}] = \frac{1}{N}\,\mathbb{E}[X_{N,1}] \ge H.$
(As $N \to \infty$, the optimal average code length per symbol tends to the entropy.)
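To see that statement concretely, here is a small Python sketch (an illustration only, not part of the original answer) that builds a Huffman code on blocks of length N and compares the average code length per symbol with the entropy. The four-symbol distribution [0.4, 0.3, 0.2, 0.1] is an assumed example, since the probabilities in the question do not sum to 1:

import heapq
import itertools
import math

def entropy(probs):
    # Shannon entropy in bits per symbol.
    return -sum(p * math.log2(p) for p in probs)

def huffman_lengths(probs):
    # Code lengths of an optimal (Huffman) prefix code for two or more probabilities.
    heap = [(p, i, [i]) for i, p in enumerate(probs)]   # (prob, tiebreak id, leaf indices)
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    next_id = len(probs)
    while len(heap) > 1:
        p1, _, leaves1 = heapq.heappop(heap)
        p2, _, leaves2 = heapq.heappop(heap)
        for leaf in leaves1 + leaves2:
            lengths[leaf] += 1          # each merge adds one bit to every leaf below it
        heapq.heappush(heap, (p1 + p2, next_id, leaves1 + leaves2))
        next_id += 1
    return lengths

# Assumed four-symbol source distribution (sums to 1).
probs = [0.4, 0.3, 0.2, 0.1]
H = entropy(probs)

for N in (1, 2, 3):
    # Probability of every block of N independent symbols from the source.
    block_probs = [math.prod(block) for block in itertools.product(probs, repeat=N)]
    lengths = huffman_lengths(block_probs)
    avg = sum(p * l for p, l in zip(block_probs, lengths)) / N
    print(f"N={N}: {avg:.4f} bits/symbol   (entropy H = {H:.4f})")

The per-symbol average never drops below H, and coding longer blocks lets it approach H, which is exactly the inequality above.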
