Previous section   Next section

32.3 Joint and Conditional Entropy

Joint and conditional entropy are analogous to joint and conditional probability.

32.3.1 Joint Entropy

Definition 32–5. Let the random variable X take values from some set { x1, ... xn }. The value xi occurs with probability p(X = xi), where graphics/nsigma.gif p(X = xi) = 1. Let the random variable Y take values from some set { y1, ... ym }. The value yj occurs with probability p(Y = yj), where graphics/msigma.gif p(Y = yj) = 1. The joint entropy of X and Y is

H(X, Y) = graphics/msigman.gif graphics/nsigma.gif p(X = xi, Y = yj) lg p(X = xi, Y = yj)

EXAMPLE: Let X be a random variable representing the roll of one die, and let Y be a random variable representing the flip of a coin. (See the second example in Section 32.1.) Assuming that both die and coin are fair, the entropy of X and Y taken jointly is

H(X, Y)

=

graphics/msigman.gif graphics/nsigma.gif p(X = xi, Y = yj) lg p(X = xi, Y = yj)

 

=

–2 [ 6 [ (1/12) lg (1/12) ] ] = lg 12 Ý 3.59

32.3.2 Conditional Entropy

Definition 32–6. Let the random variable X take values from some set { x1, ... xn }. The value xi occurs with probability p(X = xi), where graphics/nsigma.gif p(X = xi) = 1. Let the random variable Y take values from some set { y1, ... ym }. The value yj occurs with probability p(Y = yj), where graphics/msigma1.gif p(Y = yj) = 1. The conditional entropy, or equivocation, of X given Y = yj is

H(X | Y = yj) = graphics/nsigman.gif p(X = xi | Y = yj) lg p(X = xi | Y = yj)

The conditional entropy of X given Y is

H(X | Y) = graphics/msigman.gif p(Y = yj)[graphics/nsigma.gif p(X = xi | Y = yj) lg p(X = xi | Y = yj)]

The latter is the weighted mean of the conditional entropies of X given Y = yj for the possible values of Y.

EXAMPLE: Let X represent the roll of a red die, and let Y represent the sum of the values from rolling the red die and a blue die. Then the conditional entropy of X given Y = 2 is

H(X | Y = 2) = graphics/nsigman.gif p(X = xi | Y = 2) lg p(X = xi | Y = 2) = 0

because p(X = 1 | Y = 2) = 1 and p(X = i | Y = 2) = 0 for i = 2, ..., 6. However, the conditional entropy of X given Y = 7 is

H(X | Y = 7) = graphics/nsigman.gif p(X = xi | Y = 7) lg p(X = xi | Y = 7) = lg 6 Ý 2.58

Intuitively, these results make sense. If the total of the red and blue dice comes up 2, both must be 1, and so in the first case the conditional entropy is 0 because, given the value of Y, there is no uncertainty in the value of X. In the second case, the red die may take on any of its six possible values, so, assuming that it is fair, the uncertainty corresponds to each possible value of X being equally likely.

The conditional entropy of X given Y is

H(X | Y)

=

graphics/12sigma.gif p(Y = yj)[graphics/6sigma.gif p(X = xi | Y = yj) lg p(X = xi | Y = yj)]

 

=

p(Y = 2)[graphics/6sigma.gif p(X = xi | Y = 2) lg p(X = xi | Y = 2)] + … + p(Y = 12)[graphics/6sigma.gif p(X = xi | Y = 12) lg p(X = xi | Y = 12)] Ý 1.8955

32.3.3 Perfect Secrecy

Perfect secrecy arises when knowing the ciphertext does not decrease the uncertainty of the plaintext. More formally:

Definition 32–7. Let M be a random variable that takes values from the set of messages m1, …mn. The cipher C = E(M) achieves perfect secrecy if H(M | C) = H(M).

EXAMPLE: The one-time pad (see Section 9.2.2.2) meets this requirement. Let M be a random variable representing a message selected from a set of n four-letter messages. The uncertainty of this variable is H(M). An attacker intercepts C = AAVG. Given this, the uncertainty of M is H(M | C). However, the attacker has gleaned no more information than was initially available, because the key is four randomly chosen letters. Any message could produce the intercepted ciphertext. Hence, H(M | C) = H(M), and the cipher achieves perfect secrecy.


  Previous section   Next section
Top