Friday, August 2, 2019

Entropy

Entropy is a measure of uncertainty. High entropy means the data has high variance and thus contains a lot of information and/or noise. For instance, a constant function where f(x) = 4 for all x has no entropy: it is perfectly predictable, carries little information, contains no noise and can be represented very compactly. A function that is only approximately constant, f(x) ≈ 4, has some entropy, while f(x) = random_number has very high entropy due to noise.
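To make that concrete, here is a small Python sketch that estimates entropy from observed frequencies (the sample size of 1000 and the 0–255 range for the random values are arbitrary choices for illustration):

    import math
    import random
    from collections import Counter

    def empirical_entropy(samples):
        # Estimate entropy (in bits) from the observed frequency of each value
        counts = Counter(samples)
        n = len(samples)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    constant = [4] * 1000                                   # f(x) = 4 for all x
    noisy = [random.randint(0, 255) for _ in range(1000)]   # f(x) = random_number

    print(empirical_entropy(constant))  # 0.0 -- perfectly predictable
    print(empirical_entropy(noisy))     # roughly 7-8 bits -- dominated by noise

The constant samples all land in a single bin, so the estimate is exactly 0, while the random samples spread over many bins and come out close to the 8-bit maximum for 256 equally likely values.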

Information entropy is a concept from information theory. It tells how much information there is in an event. In general, the more certain or deterministic an event is, the less information it contains. Put differently, the less probable (more surprising) an event is, the more information it conveys. The concept of information entropy was introduced by the mathematician Claude Shannon.

Generally speaking, information entropy is the average amount of information conveyed by an event, when considering all possible outcomes.
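As a small sketch of that definition (the helper name entropy and the use of the base-2 logarithm, which measures information in bits, are choices made here for illustration), the average can be computed directly from a list of outcome probabilities:

    import math

    def entropy(probabilities):
        # Shannon entropy in bits: H = -sum(p * log2(p)) over all possible outcomes
        return -sum(p * math.log2(p) for p in probabilities if p > 0)

    # A fair coin toss conveys 1 bit of information on average
    print(entropy([0.5, 0.5]))  # 1.0

Outcomes with zero probability are skipped, since they contribute nothing to the average.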

Example:
we have 3 bags:

  • 1st with 4 red balls
  • 2nd with 3 red balls and 1 green ball
  • 3rd with 2 red and 2 green balls

Entropy and predictability are opposites: the more possible arrangements of colors a bag allows, the more entropy it has. Consider the probability of each color when one ball is drawn from a bag (a small sketch after the list computes the exact values):

  • 1st bag has a 100% probability of red, so this bag has the least entropy
  • 2nd bag has a 75% probability of red and a 25% probability of green, so it has medium entropy
  • 3rd bag has a 50% probability of red and a 50% probability of green, so it has the greatest entropy
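
Plugging the three bags into the same kind of Python helper (the bag labels are only for printing) confirms that ordering:

    import math

    def entropy(probs):
        # Shannon entropy in bits, same helper as sketched above
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # Probability of each color when one ball is drawn from each bag
    bags = {
        "1st (4 red)":          [1.0],
        "2nd (3 red, 1 green)": [0.75, 0.25],
        "3rd (2 red, 2 green)": [0.5, 0.5],
    }

    for name, probs in bags.items():
        print(f"{name}: {entropy(probs):.3f} bits")
    # Prints 0.000, 0.811 and 1.000 bits for the 1st, 2nd and 3rd bag respectively

The 1st and 3rd bags sit at the two extremes, 0 bits and 1 bit, while the 2nd bag lands in between at about 0.811 bits.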

