In this paper, the relationship between entropy and guesswork is investigated in
detail. Since the original measure of guesswork is not completely accurate, guesswork
is first redefined, and the relationship is then stated in two theorems. The first theorem
states that the redefined guesswork is equal to the cross entropy, and the second
theorem states, as a consequence of the first, that the redefined guesswork is equal to
the sum of the entropy and the relative entropy.
The rest of the paper is organized as follows. In Section 2, guessing strategies for
entropy and guesswork are presented. The relationship between entropy and guesswork
is investigated in Section 3. Finally, Section 4 concludes the paper.
2 Entropy, Guesswork, and Guessing
Guessing the correct value of a random variable X can be seen as a game between two
players. Player one chooses a secret value from a given set of possible values, and
player two tries to guess the correct value using a strategy. From the known information
about the game, such as the probability distribution of the search space or the conditions
of the guessing process, a set of strategies, or actions, is possible. In what follows,
the probability distribution of the search space is assumed to be known. Furthermore,
from the set of strategies we normally want to use an optimal guessing strategy, one
that minimizes the number of questions needed to find the value of X. This is the focus of
game theory [6], i.e., how to best play the game.
In order to compare the efficiency of different strategies, possibly having different
information about the game, we need measures that give the expected number of guesses
required to find the correct value. Two such measures are entropy and guesswork.
Entropy gives the minimum expected number of questions when we may ask questions
of the form Q₁ = ”Is X ∈ A?”, for any subset A of the search space. A variant of this
question, used for example in the bisection method to find a root of a continuous
function in an interval, is ”Is X > a?”. Guesswork, on the other hand, gives the
minimum expected number of questions when we may ask questions of the form
Q₂ = ”Is X = xᵢ?”.
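The two question types can be illustrated numerically. The following sketch (not from the paper; the helper names `entropy` and `guesswork` are our own) computes the expected number of questions under each strategy for a small distribution:

```python
import math

def entropy(p):
    """Expected number of Q1-type ("Is X in A?") questions for an
    optimal binary-search strategy: H(p) = -sum_i p_i * log2(p_i)."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def guesswork(p):
    """Expected number of Q2-type ("Is X = x_i?") questions when the
    values are tested in non-increasing probability order:
    G(p) = sum_i i * p_(i), with p_(1) >= p_(2) >= ..."""
    ordered = sorted(p, reverse=True)
    return sum(i * q for i, q in enumerate(ordered, start=1))

p = [0.5, 0.25, 0.125, 0.125]
print(entropy(p))    # 1.75
print(guesswork(p))  # 1.875
```

As expected, linear search (guesswork) needs more questions on average than binary search (entropy) on this distribution.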
For guesswork, the optimality (minimum expected number of questions) comes from the
fact that we can arrange the values xᵢ in non-increasing probability order and then test
them one by one. For entropy, the optimality comes from the fact that entropy gives
the minimum average code length for compression [7], and that a sequence of yes or
no questions is equivalent to a binary code. One way to construct such a set of optimal
questions is to use the Huffman algorithm [7]. In the following, we use guesswork and
entropy both as the name of the measure and as the optimal strategy connected to the
measure.
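As a minimal sketch of this construction (ours, not the paper's), the expected number of questions under a Huffman strategy can be computed without building the question tree explicitly: each merge of the two least probable nodes contributes its merged probability to the expected depth.

```python
import heapq

def huffman_expected_length(p):
    """Expected code length, i.e., the expected number of optimal
    yes/no (Q1-type) questions, via the Huffman algorithm.  Each
    merge of the two least probable nodes adds the merged probability
    to the expected length."""
    heap = list(p)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b
        heapq.heappush(heap, a + b)
    return total

p = [0.5, 0.25, 0.125, 0.125]
print(huffman_expected_length(p))  # 1.75
```

For this dyadic distribution the Huffman strategy attains the entropy exactly; in general it attains it to within one question.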
The difference between guesswork and entropy resides in the information of the
two questions, Q₁ and Q₂. For Q₁ we are allowed to group several values into a set
of values and test whether the correct value is in that set. For Q₂ we are only allowed to
test one value at a time. Hence, Q₁ uses the divide-and-conquer strategy, binary search,
and Q₂ uses the one-at-a-time strategy, linear search. Furthermore, Q₂ is actually a
special case of Q₁, since Q₂ can be rewritten as ”Is X ∈ A = {xᵢ}?” for any set