Joint and conditional entropy are analogous to joint and conditional probability.
Definition 32–5. Let the random variable X take values from some set { x1, ... xn }. The value xi occurs with probability p(X = xi), where
p(X = xi) = 1. Let the random variable Y take values from some set { y1, ... ym }. The value yj occurs with probability p(Y = yj), where
p(Y = yj) = 1. The joint entropy of X and Y is
H(X, Y) =
![]()
p(X = xi, Y = yj) lg p(X = xi, Y = yj)
|
EXAMPLE: Let X be a random variable representing the roll of one die, and let Y be a random variable representing the flip of a coin. (See the second example in Section 32.1.) Assuming that both die and coin are fair, the entropy of X and Y taken jointly is
|
Definition 32–6. Let the random variable X take values from some set { x1, ... xn }. The value xi occurs with probability p(X = xi), where
p(X = xi) = 1. Let the random variable Y take values from some set { y1, ... ym }. The value yj occurs with probability p(Y = yj), where
p(Y = yj) = 1. The conditional entropy, or equivocation, of X given Y = yj is
H(X | Y = yj) =
p(X = xi | Y = yj) lg p(X = xi | Y = yj)
The conditional entropy of X given Y is
H(X | Y) =
p(Y = yj)[
p(X = xi | Y = yj) lg p(X = xi | Y = yj)]
The latter is the weighted mean of the conditional entropies of X given Y = yj for the possible values of Y.
|
EXAMPLE: Let X represent the roll of a red die, and let Y represent the sum of the values from rolling the red die and a blue die. Then the conditional entropy of X given Y = 2 is
because p(X = 1 | Y = 2) = 1 and p(X = i | Y = 2) = 0 for i = 2, ..., 6. However, the conditional entropy of X given Y = 7 is
Intuitively, these results make sense. If the total of the red and blue dice comes up 2, both must be 1, and so in the first case the conditional entropy is 0 because, given the value of Y, there is no uncertainty in the value of X. In the second case, the red die may take on any of its six possible values, so, assuming that it is fair, the uncertainty corresponds to each possible value of X being equally likely. The conditional entropy of X given Y is
|
Perfect secrecy arises when knowing the ciphertext does not decrease the uncertainty of the plaintext. More formally:
Definition 32–7. Let M be a random variable that takes values from the set of messages m1, …mn. The cipher C = E(M) achieves perfect secrecy if H(M | C) = H(M).
|
EXAMPLE: The one-time pad (see Section 9.2.2.2) meets this requirement. Let M be a random variable representing a message selected from a set of n four-letter messages. The uncertainty of this variable is H(M). An attacker intercepts C = AAVG. Given this, the uncertainty of M is H(M | C). However, the attacker has gleaned no more information than was initially available, because the key is four randomly chosen letters. Any message could produce the intercepted ciphertext. Hence, H(M | C) = H(M), and the cipher achieves perfect secrecy. |
| Top |