Nebu comments on Perceptual Entropy and Frozen Estimates - Less Wrong Discussion

10 Post author: Davidmanheim 03 June 2015 07:27PM

Comments (20)

Comment author: Nebu 10 October 2015 03:09:09AM, 1 point

Feedback:

Need an example? Sure! I have two dice, and they can each land on any number, 1-6. I’m assuming they are fair, so each has probability of 1/6, and the logarithm (base 2) of 1/6 is about -2.585. There are 6 states, so the total is 6* (1/6) * 2.585 = 2.585. (With two dice, I have 36 possible combinations, each with probability 1/36, log(1/36) is -5.17, so the entropy is 5.17. You may have notices that I doubled the number of dice involved, and the entropy doubled – because there is exactly twice as much that can happen, but the average entropy is unchanged.) If I only have 2 possible states, such as a fair coin, each has probability of 1/2, and log(1/2)=-1, so for two states, (-0.5*-1)+(-0.5*-1)=1. An unfair coin, with a ¼ probability of tails, and a ¾ probability of heads, has an entropy of 0.81. Of course, this isn’t the lowest possible entropy – a trick coin with both sides having heads only has 1 state, with entropy 0. So unfair coins have lower entropy – because we know more about what will happen.

I've had to calculate information entropy for a data compression course, so I felt like I already knew the concepts you were trying to explain here, but I was not able to follow your explanation at all.

the logarithm (base 2) of 1/6 is about -2.585. There are 6 states, so the total is 6* (1/6) * 2.585 = 2.585.

The total what? Total entropy for the two dice that you have? For just one of those two dice? log(1/6) is a negative number, so why do I not see any negative numbers used in your equation? There are 6 states, so I guess that sort of explains why you're multiplying some figure by 6, but why are you dividing by 6?

If I only have 2 possible states, such as a fair coin, each has probability of 1/2, and log(1/2)=-1, so for two states, (-0.5*-1)+(-0.5*-1)=1.

Why do you suddenly switch from the notation 1/2 to the notation 0.5? Is that significant (do they refer to different concepts that coincidentally happen to have equal values)? If they actually refer to the same value, why do we have the positive value 1/2, but the negative value -0.5?

Suggestion:

  • Do the fair coin first, then the fair die, then the trick coin.
  • Point out that a fair coin has 2 outcomes when flipped, each with equal probability, so it has entropy [-1/2 log2(1/2)] + [-1/2 log2(1/2)] = (1/2) + (1/2) = 1.
  • Point out that a traditional fair die has 6 outcomes when rolled, each of equal probability, and so it has entropy ∑n=1 to 6 of -1/6 log2(1/6) ≈ 6 * -1/6 * -2.585 = 2.585.
  • Point out that a trick coin that always comes up heads has 1 outcome when flipped, so it has entropy -1 log2(1/1) = 0.
  • Point out that an unfair coin that comes up heads 75% of the time has entropy [-3/4 log2(3/4)] + [-1/4 log2(1/4)] ≈ 0.311 + 0.5 = 0.811.
  • Consistently use the same notation for each example (I sort of got lazy and used ∑ for the dice to avoid writing out a value 6 times). In contrast, do not use 6 * (1/6) * 2.585 = 2.585 for one example (where all the factors are positive) and then (-0.5*-1)+(-0.5*-1)=1 for another example (where we rely on pairs of negative factors to become positive).
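
For anyone who wants to check the arithmetic in the list above, here is a minimal Python sketch that applies the same formula (the sum of -p * log2(p) over all outcomes) to every example. The function name entropy_bits and the variable names are my own, chosen just for illustration; they do not come from the original post.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits: the sum of -p * log2(p) over all outcomes."""
    # Outcomes with p == 0 contribute nothing; skipping them avoids log2(0).
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


# Each example is just a list of outcome probabilities that sums to 1.
examples = {
    "fair coin": [1 / 2, 1 / 2],
    "fair die": [1 / 6] * 6,
    "two fair dice": [1 / 36] * 36,
    "two-headed coin": [1.0],
    "unfair coin (3/4 heads)": [3 / 4, 1 / 4],
}

for name, probabilities in examples.items():
    print(f"{name}: {entropy_bits(probabilities):.3f} bits")
```

Rounded to three decimal places, the results agree with the figures worked out by hand above: 1 for the fair coin, 2.585 for the die, 0 for the two-headed coin, and 0.811 for the unfair coin. Because every case goes through the same function, the sign convention is the same in every example.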