tracing based on Formula 1 can deterministically identify ONE traitor in a coalition of nine after recovering 256 pirated content/keys, and it takes a similar number of pirated copies to detect the second and each subsequent traitor. In contrast, for the same coalition size, the new algorithm based on detecting the entire coalition can detect all active traitors using only 56 pirated content/keys, with a false positive rate as low as 0.0001%.
Of course, the attackers may use a scapegoat strategy, in which one device is used heavily and receives a high score, for example 9 or 10. The traditional approach can correctly identify that device, but it has difficulty finding the lightly used devices and the true coalition size. The new tracing can nonetheless find the other members of the coalition.
Again, the ultimate goal is to detect and disable all traitors as fast as possible. We believe the traditional traitor tracing definition does not lead to the design of a tracing scheme that achieves this goal efficiently.
3.2 Assume Coalition Size vs. Deduce Coalition Size? Or Deterministic vs. Probabilistic?
We believe it is not practical to assume a maximum coalition size and perform deterministic tracing based on that assumed size, because the tracing agency rarely knows exactly how many devices are involved in the attack. As a result, the answers it gets are always qualified. For example, an answer might be as follows: "If N devices are involved, it must be exactly these N. However, different innocent coalitions of N + M devices may have produced the same result." We will walk readers through a simple example to show how the actual tracing is done based on forensic evidence.
Suppose each content/key comes with 256 variations and there are 255 content/keys in the sequence, so each device is assigned 255 content/keys with one variation of each. The assignment can be done using a systematic approach such as an error-correcting code, for example a Reed-Solomon code. This approach guarantees that any two devices differ in at least 252 content/key assignments. This assignment can support 1 billion devices in the system. For any given content, $1/256$ of the devices (about 4 million devices) encode the content the same way. For any three given pieces of content, only $1/256^3$ of the devices (about 60 players) encode them the same way. For any four given pieces of content, no two devices encode all four the same way. That is the essential property of the Reed-Solomon code assignment.
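The agreement property can be checked exhaustively on a scaled-down code. The following is a minimal sketch, assuming a toy prime field of size 11 in place of the 256 variations and a polynomial-evaluation form of Reed-Solomon with four coefficients per device; the names `P`, `K`, `codeword`, and `max_agreement` are illustrative and do not come from the scheme itself.

```python
from itertools import product

P = 11                 # toy field size (assumption; the example above uses 256)
K = 4                  # coefficients per device, so P**K devices are supported
POINTS = range(1, P)   # one evaluation point per content/key in the sequence

def codeword(coeffs):
    """Variation assigned to a device at each position (Horner's rule mod P)."""
    word = []
    for x in POINTS:
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % P
        word.append(acc)
    return tuple(word)

def max_agreement():
    """Largest number of positions on which two distinct devices coincide."""
    # The code is linear, so two codewords agree exactly where their
    # difference (itself a nonzero codeword) is zero.
    worst = 0
    for coeffs in product(range(P), repeat=K):
        if any(coeffs):
            worst = max(worst, codeword(coeffs).count(0))
    return worst

print(max_agreement(), "of", len(POINTS), "positions")
```

In this toy setting two distinct devices share at most `K - 1` positions, which mirrors the statement that devices may coincide on three content/keys but never on four.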
Let us take the case of an attack in which only a single device X is being used. After recovering a single content/key, the license agency has four million devices that are potential candidates, including X. After the second and third recovered pirated content/keys, the number of candidates is reduced, but it is not until the fourth pirated content/key is recovered that the guilty device X is positively identified, BUT only if it is known that only a single device is involved. Millions of pairs of devices could also have produced the four pirated content/keys.
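The narrowing of candidates in the single-device case can be reproduced on a small scale. This is a sketch under stated assumptions: a toy prime field of size 11 rather than 256 variations, devices modeled as polynomials with four coefficients, and a single attacker; `variation` and `candidates_after_each_recovery` are hypothetical helper names.

```python
from itertools import product

P, K = 11, 4                     # toy parameters (assumption), not 256 and 255
POINTS = list(range(1, P))       # positions in the content/key sequence

def variation(coeffs, x):
    """Variation a device holds at position x (polynomial evaluation mod P)."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def candidates_after_each_recovery(guilty, positions):
    """Count devices still consistent with every pirated key seen so far."""
    candidates = list(product(range(P), repeat=K))
    counts = []
    for x in positions:
        seen = variation(guilty, x)          # variation found in the pirated copy
        candidates = [d for d in candidates if variation(d, x) == seen]
        counts.append(len(candidates))
    return counts

print(candidates_after_each_recovery((3, 1, 4, 1), POINTS[:4]))
# → [1331, 121, 11, 1]
```

Each recovered key shrinks the candidate set by a factor equal to the number of variations, and the fourth leaves only the guilty device. As in the text, this conclusion holds only under the assumption that exactly one device is involved.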
By the time nine pirated content/keys have been recovered, the license agency knows there are no possible innocent pairs of devices. (By "innocent", we mean a pair that does not include the actual guilty device X.) An innocent pair could have produced at most six of the pirated content/keys. However, an innocent triplet picked at random could have produced all nine pirated content/keys, each member of the triplet having three content/keys in common with the guilty device. The number of such triplets is:
$$\binom{9}{3} \cdot 60 \cdot \binom{6}{3} \cdot 60 \cdot \binom{3}{3} \cdot 60$$
Among all the $\binom{1{,}000{,}000{,}000}{3}$ triplets, the probability that a triplet picked at random is in the above set is roughly 2 in $10^{18}$. If the license agency is willing to assign a priori probabilities to the different numbers of attackers, and assuming that the attackers cannot deduce the code and therefore must act randomly, the license agency can perform a Bayesian analysis and conclude, based on the observed result, the probability that the indicated device X is, in fact, guilty.
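The arithmetic behind the triplet count and the resulting probability can be checked directly. The sketch below assumes the figures used in the example (one billion devices, about 60 devices sharing any three content/keys with X) and takes the total to be the number of unordered triplets among all devices, a reading that reproduces the "roughly 2 in 10^18" figure; the function names are illustrative.

```python
from math import comb

DEVICES = 1_000_000_000   # system size used in the example
SHARING_THREE = 60        # approx. devices matching X on three given content/keys

def innocent_triplet_count(recovered=9, per_member=3):
    """Ways to split the recovered keys among three innocent devices."""
    count, remaining = 1, recovered
    while remaining > 0:
        # choose which keys this member explains, then which device it is
        count *= comb(remaining, per_member) * SHARING_THREE
        remaining -= per_member
    return count

def random_triplet_probability():
    """Chance that a triplet picked at random could explain all nine keys."""
    return innocent_triplet_count() / comb(DEVICES, 3)

print(innocent_triplet_count())        # → 362880000
print(random_triplet_probability())
```

The same reasoning gives the bound on innocent pairs: since an innocent device shares at most three content/keys with X, two such devices can account for at most six of the nine recovered keys.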
One important caveat is traditionally addressed by defining the tracing problem to be finding a single member of the coalition, not finding the exact membership of the coalition. So the Bayesian analysis really reveals the probability that device X is an attacker, not that it is the sole attacker.
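As a rough illustration of the Bayesian step, the sketch below compares only two hypotheses: that X produced the evidence, and that a randomly chosen innocent triplet did. This two-hypothesis model, the likelihood of 1 under the first hypothesis, and the prior values are all simplifying assumptions made for illustration; the only figure taken from the example is the roughly 2 in 10^18 chance for a random triplet.

```python
def posterior_x_innocent(prior_x, prior_triplet, p_innocent_triplet=2e-18):
    """Posterior probability that X is innocent under a two-hypothesis model."""
    # Hypothesis A: X produced the recovered keys (likelihood assumed to be 1).
    # Hypothesis B: a random triplet excluding X produced them
    #               (likelihood assumed to be about 2 in 10^18).
    evidence = prior_x * 1.0 + prior_triplet * p_innocent_triplet
    return prior_triplet * p_innocent_triplet / evidence

# Hypothetical priors: the agency strongly expects a multi-device attack.
print(posterior_x_innocent(prior_x=0.01, prior_triplet=0.99))
```

Even with a prior heavily weighted toward a three-device attack, the posterior probability that X is innocent stays negligible in this simplified model. A fuller analysis would assign priors across all coalition sizes.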
As one can see from the above simple example, without knowing the actual coalition size, the tracing has to be probabilistic. During tracing, every recovered pirated copy increases the probability that the suspect device is actually guilty. That is the nature of tracing when the actual coalition size is unknown. From the example above, we can also see that, in the process of identifying the traitorous devices, it is possible to deduce the number of active members in the coalition without having to assume a maximum coalition size. We believe that performing probabilistic tracing and deducing the active coalition size fits real-world scenarios much better.
SECRYPT 2008 - International Conference on Security and Cryptography