Improved “Partial Sums”-based Square Attack on AES
Michael Tunstall
Department of Computer Science, University of Bristol, Merchant Venturers Building, Woodland Road, Bristol, U.K.
Keywords:
Cryptanalysis, Square Attack, Advanced Encryption Standard.
Abstract:
The Square attack as a means of attacking reduced round variants of AES was described in the initial description of the Rijndael block cipher. With a relatively small number of chosen plaintext-ciphertext pairs, this attack can be applied to AES reduced to less than six rounds in the case of AES-128 and seven rounds otherwise, and several extensions to this attack have been described in the literature. In this paper we describe new variants of these attacks that have a smaller time complexity than those present in the literature. Specifically, we
demonstrate that the quantity of chosen plaintext-ciphertext pairs can be halved producing the same reduction
in the time complexity. We also demonstrate that the time complexity can be halved again for attacks applied
to AES-128 and reduced by a smaller factor for attacks applied to AES-192. This is achieved by eliminating
hypotheses on-the-fly when bytes in consecutive subkeys are related because of the key schedule.
1 INTRODUCTION
The Advanced Encryption Standard (AES) (FIPS
PUB 197, 2001) was standardized in 2001 from a
proposal by Daemen and Rijmen (Daemen and Rij-
men, 1998). It has since been analyzed with regard
to numerous attacks ranging from purely theoretical
cryptanalysis to attacks that require some extra information, e.g. from some side channel (Mangard et al., 2007), to succeed.
In Daemen and Rijmen’s AES proposal an at-
tack is described that is referred to as the Square at-
tack (Daemen and Rijmen, 1998). This attack was
so-called since it was first presented in the descrip-
tion of the block cipher Square (Daemen et al., 1997).
The Square attack is based on a particular property
arising from the structure of AES. That is, for a set of
256 plaintexts where each byte at an arbitrary index is
a distinct value and all the other bytes are equal, the
XOR sum of the 256 intermediate states after three
rounds of AES will be zero.
Some optimizations to this attack have been pro-
posed in the literature. Ferguson et al. proposed a
way of conducting the Square attack referred to as the
“partial sums” method (Ferguson et al., 2001). This
allowed the Square attack to be conducted with a rel-
atively low time complexity for reduced round vari-
ants of AES. The time complexity of these attacks
was further reduced following an observation made
by Lucks. He noted that given a known last subkey in
AES-192 then information on previous subkeys can
be derived.
Recent cryptanalytical attacks have predomi-
nantly been focused on other properties, such as im-
possible differentials (Bahrak and Aref, 2008; Lu
et al., 2008; Zhang et al., 2007). The use of impos-
sible differentials is related to the Square attack but
allows an attacker to overcome variants of AES with
more rounds. Recently, a marginal attack on AES
has also been proposed that is based on the use of
bicliques (Bogdanov et al., 2011). However, for suit-
ably reduced round variants of AES the “partial sums”
method proposed by Ferguson et al. is currently the
most efficient chosen plaintext attack.
In this paper we describe how the attacks pro-
posed by Ferguson et al. and Lucks can be im-
proved. Specifically, we show that the number of cho-
sen plaintext-ciphertext pairs required to conduct the
Square attack can be halved and therefore halve the
time complexity of the attack. Moreover, we demon-
strate that the time complexity of the Square attack
can be halved again when applied to AES-128, and
reduced to a lesser extent for AES-192, by exploiting
relationships between key bytes as they are derived.
In this paper we restrict ourselves to attacks that re-
quire a relatively small number of chosen plaintext-
ciphertext pairs. The attacks proposed by Ferguson et al., based on the Square attack, that require around $2^{128}$ chosen plaintext-ciphertext pairs are beyond the scope of this paper (Ferguson et al., 2001).
Tunstall M..
Improved “Partial Sums”-based Square Attack on AES.
DOI: 10.5220/0003990300250034
In Proceedings of the International Conference on Security and Cryptography (SECRYPT-2012), pages 25-34
ISBN: 978-989-8565-24-2
Copyright © 2012 SCITEPRESS (Science and Technology Publications, Lda.)
This paper is organized as follows. In Section 2 we
define the notation we use to describe AES. In Sec-
tion 3 we describe the property that the Square attack
is based on. In Section 4 we describe how the Square
attack can be applied to AES-128, and in Section 5 we describe how the Square attack can be applied to AES-192 and AES-256. We summarize our contribution and conclude in Section 6.
2 PRELIMINARIES
In this paper, multiplications in $\mathbb{F}_{2^8}$ are considered to be polynomial multiplications modulo the irreducible polynomial $x^8 + x^4 + x^3 + x + 1$. It should be clear from the context when a mathematical expression contains integer multiplication.
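The field multiplication defined above can be illustrated with a short sketch. The function name `gf_mul` and the constant `AES_POLY` are chosen here for illustration and do not appear in the paper; field elements are assumed to be encoded as integers in the range 0 to 255, with bit i holding the coefficient of $x^i$.

```python
AES_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1


def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of F_{2^8}, each given as an integer in 0..255."""
    result = 0
    while b:
        if b & 1:          # add (XOR) the current multiple of a
            result ^= a
        a <<= 1            # multiply a by x
        if a & 0x100:      # reduce once the degree reaches 8
            a ^= AES_POLY
        b >>= 1
    return result


# Worked example {57} . {83} taken from FIPS PUB 197.
print(hex(gf_mul(0x57, 0x83)))
```

A shift-and-add loop is used rather than a log/antilog table because it follows the definition directly; a table-based version would be faster but hides the reduction step.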
2.1 The Advanced Encryption Standard
Algorithm 1: The AES encryption function.
Input: The 128-bit plaintext block P and 128, 192 or 256-bit secret key K, with N set to 10, 12, 14 respectively.
Output: The 128-bit ciphertext block C.
X ← AddRoundKey(P, K);
for i ← 1 to N do
    X ← SubBytes(X);
    X ← ShiftRows(X);
    if i ≠ N then
        X ← MixColumns(X);
    end
    X ← AddRoundKey(X, K);
end
C ← X;
return C
The structure of the Advanced Encryption Standard (AES), as used to perform encryption, is illustrated in Algorithm 1. In discussing the AES we consider that all intermediate variables of the encryption operation are arranged in a 4 × 4 array of bytes, referred to as the state matrix. For example, the 128-bit plaintext $P = (p_1, p_2, \ldots, p_{16})$, where each $p_i$, $i \in \{1, \ldots, 16\}$, is one byte, is arranged in the following fashion

$$\begin{pmatrix} p_1 & p_5 & p_9 & p_{13} \\ p_2 & p_6 & p_{10} & p_{14} \\ p_3 & p_7 & p_{11} & p_{15} \\ p_4 & p_8 & p_{12} & p_{16} \end{pmatrix}.$$
The encryption itself is conducted by the repeated use
of a round function that comprises the following op-
erations executed in sequence:
The AddRoundKey operation XORs each byte of the array with a byte from a corresponding subkey. In each instance of the AddRoundKey a fresh 16-byte subkey is used from the subkey bytes generated by the key schedule. We describe how this is done in more detail for the different variants of AES in Section 2.2.
The SubBytes operation is the only nonlinear step of the block cipher, consisting of a substitution table applied to each byte of the state. This replaces each byte of the state matrix by its multiplicative inverse, followed by an affine mapping. In the remainder of this paper we will refer to the function S as this substitution table and $S^{-1}$ as its inverse.
The ShiftRows operation is a byte-wise permutation of the state that operates on each row.
The MixColumns operation operates on the state column by column. Each column of the state matrix is considered as a vector where each of its four elements belongs to $\mathbb{F}_{2^8}$. A 4 × 4 matrix M whose elements are also in $\mathbb{F}_{2^8}$ is used to map this column into a new vector. This operation is applied to the four columns of the state matrix. Here M and its inverse $M^{-1}$ are defined as

$$M = \begin{pmatrix} 2 & 3 & 1 & 1 \\ 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 3 \\ 3 & 1 & 1 & 2 \end{pmatrix} \quad \text{and} \quad M^{-1} = \begin{pmatrix} 14 & 11 & 13 & 9 \\ 9 & 14 & 11 & 13 \\ 13 & 9 & 14 & 11 \\ 11 & 13 & 9 & 14 \end{pmatrix}.$$

All the elements in M and $M^{-1}$ are elements of $\mathbb{F}_{2^8}$ expressed in decimal.
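As a minimal sketch of how M and its inverse act on one column, the following applies both matrices to a four-byte vector and checks that they undo each other. The helper names `gf_mul` and `mat_vec` are illustrative only, and the sample column is a commonly quoted MixColumns test vector rather than a value from this paper.

```python
def gf_mul(a, b):
    """Multiplication in F_{2^8} modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


M = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
M_INV = [[14, 11, 13, 9], [9, 14, 11, 13], [13, 9, 14, 11], [11, 13, 9, 14]]


def mat_vec(matrix, column):
    """Multiply a 4x4 matrix by a 4-byte column; additions are XORs."""
    out = []
    for row in matrix:
        acc = 0
        for coeff, byte in zip(row, column):
            acc ^= gf_mul(coeff, byte)
        out.append(acc)
    return out


column = [0xDB, 0x13, 0x53, 0x45]
mixed = mat_vec(M, column)          # MixColumns applied to one column
restored = mat_vec(M_INV, mixed)    # inverse MixColumns
print([hex(b) for b in mixed], restored == column)
```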
In Algorithm 1 we can see that the last round does not include the execution of a MixColumns operation. In all the attacks considered in this paper we will assume that the last round does not include a MixColumns operation. This is important to note since it has been shown that the presence of a MixColumns operation in the last round would affect the security of AES (Dunkelman and Keller, 2010).
2.2 The Key Schedule
The key schedule generates a series of subkeys from
the secret key. There are three variants of the AES
corresponding to the three possible bit lengths of the
secret key used, i.e. 128, 192 or 256 bits. In Algorithm 2 we show how the subkey bytes are generated from an initial secret key K. The function S is the substitution function used in the SubBytes operation described above. The function f is, for the most part, the
identity function. However, when K is a 256-bit key
and j = 4 then f is the substitution function S. RCON
is a round constant that changes for each loop. We
refer the reader to the AES specification for a more
detailed description of the key schedule (FIPS PUB
197, 2001).
SECRYPT2012-InternationalConferenceonSecurityandCryptography
26
Algorithm 2: The AES key schedule function.
Input: X-bit secret key K, with X set to 128, 192 or 256 and N set to 10, 12, 14 respectively, RCON.
Output: W a stream of subkey bytes.
for i ← 0 to X/8 − 1 do W[i] ← K[i];
for i ← 1 to (N + 1) · (X/128) 2 do
    for j ← 0 to 3 do
        W[i·X + j] ← S(W[i·(X − 1) + j]);
    end
    W[i·X] ← W[i·X] ⊕ RCON;
    for j ← 1 to 3 do
        for k ← 0 to X/4 do
            W[i·X + 4j + k] ← W[i(X − 1) + 4j + k] ⊕ f(W[i·X + 4(j − 1) + k]);
        end
    end
end
return W
For AES-128, knowing one subkey will allow the
original key to be derived. For AES-192 and AES-
256 there will still be some ambiguity and two sub-
keys are required to derive the original key.
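The statement that one AES-128 subkey determines the original key can be illustrated by running the key schedule forwards and then backwards. This is a sketch following the word-oriented description in FIPS PUB 197 rather than the byte-indexed Algorithm 2; the function names (`next_subkey`, `previous_subkey`, `g`, `rcon`) are illustrative, and the test key is the example key from the FIPS PUB 197 appendix.

```python
def gf_mul(a, b):
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def make_sbox():
    """Build S: multiplicative inverse in F_{2^8} followed by the affine map."""
    box = []
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse; 0 maps to 0
            inv = gf_mul(inv, x)
        y = inv
        for shift in range(1, 5):     # XOR of four left rotations
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        box.append(y ^ 0x63)
    return box


SBOX = make_sbox()


def rcon(r):
    """Round constant for round r (1-based): x^(r-1) in F_{2^8}."""
    value = 1
    for _ in range(r - 1):
        value = gf_mul(value, 2)
    return value


def g(word, r):
    """Rotate a 4-byte word, apply S to each byte, add the round constant."""
    out = [SBOX[b] for b in word[1:] + word[:1]]
    out[0] ^= rcon(r)
    return out


def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]


def next_subkey(sub, r):
    """Derive subkey r from subkey r-1 (AES-128)."""
    k = [list(sub[4 * i:4 * i + 4]) for i in range(4)]
    n0 = xor(k[0], g(k[3], r))
    n1 = xor(k[1], n0)
    n2 = xor(k[2], n1)
    n3 = xor(k[3], n2)
    return bytes(n0 + n1 + n2 + n3)


def previous_subkey(sub, r):
    """Derive subkey r-1 from subkey r (AES-128)."""
    n = [list(sub[4 * i:4 * i + 4]) for i in range(4)]
    k3 = xor(n[3], n[2])
    k2 = xor(n[2], n[1])
    k1 = xor(n[1], n[0])
    k0 = xor(n[0], g(k3, r))
    return bytes(k0 + k1 + k2 + k3)


key = bytes.fromhex("2b7e151628aed2a6abf7158809cf4f3c")
last = key
for r in range(1, 11):
    last = next_subkey(last, r)
recovered = last
for r in range(10, 0, -1):
    recovered = previous_subkey(recovered, r)
print(last.hex(), recovered == key)
```

For AES-192 and AES-256 a single 16-byte subkey covers only part of the key-schedule window, which is why the text notes that two subkeys are needed there.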
3 THE SQUARE ATTACK
If we consider two plaintexts that have a XOR dif-
ference that is non-zero in one byte, then this differ-
ence will expand in a known manner. After one round
the XOR difference between the intermediate states
would show that one column of the state matrix has
a non-zero difference. This property will then propa-
gate to all the bytes in the state matrix after the second
round by the same reasoning. An example where the
difference in two plaintexts is at index one is shown
in Figure 1.
It is therefore impossible that the XOR difference between two such plaintexts will be zero in any byte after two rounds. This property will persist until the next MixColumns operation, and is used for impossible differential cryptanalysis since the XOR difference after two rounds cannot be zero (Biham and Keller, 1999).
This property is also used to construct an attack referred to as the Square attack (so-called since it was first presented in the description of the block cipher Square (Daemen et al., 1997)), which was first applied to AES in the original description of AES (Daemen and Rijmen, 1998). We consider 256 distinct plaintexts that are equal in fifteen bytes. After computing two rounds of AES the property described above will be valid between all possible pairs, i.e. across all 256 intermediate states the bytes at each index will contain one of each possible value. The XOR sum of the 256 bytes at each index will therefore be equal to zero. There will not be one instance of each possible value across the 256 bytes at each index after the next MixColumns operation. However, the XOR sum of the bytes at each index will still be zero after the MixColumns operation, and this property will remain true until the next SubBytes operation. In the remainder of this paper we will refer to a set of 256 chosen plaintext-ciphertext pairs where the 256 distinct plaintexts are equal in fifteen bytes as a δ-set.
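The property can be checked experimentally with a small sketch. It assumes a column-major state (index = row + 4 · column) and, for brevity, uses independent random subkeys instead of the key schedule, which does not affect the property; all function and variable names are illustrative.

```python
import random


def gf_mul(a, b):
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def make_sbox():
    box = []
    for x in range(256):
        inv = 1
        for _ in range(254):          # x^254 is the inverse; 0 maps to 0
            inv = gf_mul(inv, x)
        y = inv
        for shift in range(1, 5):
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        box.append(y ^ 0x63)
    return box


SBOX = make_sbox()
M = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]


def shift_rows(s):
    # Row r is rotated left by r positions.
    return [s[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


def mix_columns(s):
    out = []
    for c in range(4):
        column = s[4 * c:4 * c + 4]
        for row in M:
            acc = 0
            for coeff, byte in zip(row, column):
                acc ^= gf_mul(coeff, byte)
            out.append(acc)
    return out


def full_rounds(plaintext, subkeys):
    """Initial key addition followed by len(subkeys) - 1 full rounds."""
    state = [a ^ b for a, b in zip(plaintext, subkeys[0])]
    for k in subkeys[1:]:
        state = mix_columns(shift_rows([SBOX[b] for b in state]))
        state = [a ^ b for a, b in zip(state, k)]
    return state


rng = random.Random(2012)
subkeys = [[rng.randrange(256) for _ in range(16)] for _ in range(4)]
base = [rng.randrange(256) for _ in range(16)]
delta_set = [[v] + base[1:] for v in range(256)]   # byte 0 takes every value

after_two = [full_rounds(p, subkeys[:3]) for p in delta_set]
after_three = [full_rounds(p, subkeys) for p in delta_set]

all_values_after_two = all(len({s[i] for s in after_two}) == 256 for i in range(16))
xor_sums = [0] * 16
for state in after_three:
    xor_sums = [a ^ b for a, b in zip(xor_sums, state)]
print(all_values_after_two, xor_sums)
```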
4 APPLYING THE SQUARE
ATTACK TO AES-128
Attacks based on the Square attack applicable to
AES-128 are presented in this section.
4.1 Analyzing Four-round AES
An attack based on the property described in Sec-
tion 3 was originally detailed by Daemen and Rijmen
in their AES proposal (Daemen and Rijmen, 1998).
If we consider one δ-set, the XOR sum of the inter-
mediate states at the end of the third round is equal
to zero. For a four-round variant of AES an attacker
can use this observation to validate hypotheses on the
last subkey byte-by-byte, where an attacker checks
that the XOR sum of the input to the final round is
equal to zero. For each byte this will return the correct subkey byte and one additional incorrect hypothesis per byte with a probability of 255/256. That is, any random sequence of bytes will have an XOR sum equal to zero with a probability of 1/256 and there are 255 such sequences. This will result in an expected total number of key hypotheses for the last subkey of $\left(1 + \frac{255}{256}\right)^{16} \approx 2^{16}$, since the lengths of the lists of hypotheses are mutually independent. One can determine the key if one repeats the analysis, i.e. one takes $2^9$ chosen plaintexts and conducts the above analysis twice. This would have a time complexity of $2^9$ one-round decryptions ($2^7$ encryptions of a four-round AES).
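The byte-by-byte validation of the last subkey can be sketched as follows. The data here is synthetic: instead of a real four-round encryption, 256 bytes with a zero XOR sum are generated at random, passed through S and masked with a hypothetical key byte `true_key`. ShiftRows only relocates bytes and is therefore ignored. All names are illustrative.

```python
import random


def gf_mul(a, b):
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def make_sbox():
    box = []
    for x in range(256):
        inv = 1
        for _ in range(254):
            inv = gf_mul(inv, x)
        y = inv
        for shift in range(1, 5):
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        box.append(y ^ 0x63)
    return box


SBOX = make_sbox()
INV_SBOX = [0] * 256
for x, y in enumerate(SBOX):
    INV_SBOX[y] = x


def candidates(cipher_bytes):
    """Key-byte guesses for which the XOR of S^{-1}(c XOR k) over the set is zero."""
    good = []
    for k in range(256):
        acc = 0
        for c in cipher_bytes:
            acc ^= INV_SBOX[c ^ k]
        if acc == 0:
            good.append(k)
    return good


# Synthetic stand-in for one byte position of a delta-set before the last round.
rng = random.Random(1)
true_key = 0xA7                      # hypothetical subkey byte
balanced = [rng.randrange(256) for _ in range(255)]
last = 0
for v in balanced:
    last ^= v
balanced.append(last)                # forces the XOR sum to zero
cipher = [SBOX[v] ^ true_key for v in balanced]

survivors = candidates(cipher)
print(true_key in survivors, len(survivors))
```

The correct byte always survives; any further survivors are the chance hypotheses discussed above, which a second δ-set would be expected to remove.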
Plaintext:
$$\begin{pmatrix} \zeta & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
First Round:
$$\begin{pmatrix} 2\theta & 0 & 0 & 0 \\ 3\theta & 0 & 0 & 0 \\ \theta & 0 & 0 & 0 \\ \theta & 0 & 0 & 0 \end{pmatrix}$$
Second Round:
$$\begin{pmatrix} 2\alpha & \beta & \gamma & 3\delta \\ 3\alpha & 2\beta & \gamma & \delta \\ \alpha & 3\beta & 2\gamma & \delta \\ \alpha & \beta & 3\gamma & 2\delta \end{pmatrix}$$
Figure 1: Propagation of a one-byte difference across two rounds of AES. A structure in this difference is imposed by the MixColumns operation (see Section 2.1).

Biham and Keller observed that the sum of a sequence of random bytes can be computed by only considering one example of values that occur with an odd-numbered frequency, since values that occur with an even-numbered frequency will have no effect on the XOR sum across all 256 intermediate values (Biham and Keller, 1999). Given 256 bytes taken from the same index from a δ-set, one can remove all values that occur with an even-numbered frequency and keep one example of those that occur with an odd-numbered frequency.
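The reduction to values of odd frequency can be sketched in a few lines. The function names and the stand-in map `f` are illustrative; in an attack, `f` would be the partial decryption under a key hypothesis.

```python
import random
from collections import Counter


def odd_frequency_values(values):
    """Keep one copy of every value that occurs an odd number of times."""
    counts = Counter(values)
    return [v for v, n in counts.items() if n % 2 == 1]


def xor_of(f, values):
    """XOR of f(v) over all v in values."""
    acc = 0
    for v in values:
        acc ^= f(v)
    return acc


def f(v):
    # Arbitrary byte-to-byte map standing in for a partial decryption.
    return (v * 7 + 3) % 256


rng = random.Random(7)
sample = [rng.randrange(256) for _ in range(256)]
reduced = odd_frequency_values(sample)

print(len(sample), len(reduced), xor_of(f, sample) == xor_of(f, reduced))
```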
We define Z as the number of instances of a given value that occur in a sequence of 256 bytes. The probability of observing a given value n times is

$$\Pr(Z = n) = \binom{256}{n}\, 256^{-n} \left(1 - \frac{1}{256}\right)^{256-n} \qquad (1)$$

for $1 \le n \le 256$. The probability of observing an odd number of a given value is therefore $\Pr(Z = 1) + \Pr(Z = 3) + \ldots + \Pr(Z = 255) \approx 0.43$, and therefore the number of values that need to be treated decreases to $256 \times 0.43 \approx 110$. The analysis given by Biham and Keller stops here and we provide a more precise analysis below.
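The figure of roughly 0.43 can be reproduced numerically. The sketch below evaluates the binomial sum of Equation (1) over odd counts in exact rational arithmetic and compares it with the standard closed form for the probability that a binomial variable is odd; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

TRIALS = 256

# Sum of Pr(Z = n) over odd n, with Pr(Z = n) = C(256, n) * 255^(256 - n) / 256^256.
numerator = sum(comb(TRIALS, n) * 255 ** (TRIALS - n) for n in range(1, TRIALS, 2))
p_odd = Fraction(numerator, 256 ** TRIALS)

# Closed form: (1 - (1 - 2p)^N) / 2 with p = 1/256 and N = 256.
closed_form = (1 - Fraction(127, 128) ** TRIALS) / 2

expected_odd_values = 256 * float(p_odd)
print(float(p_odd), expected_odd_values)
```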
We define Y as the number of distinct values that occur in a sequence of 256 bytes. The probability of observing m distinct values is

$$\Pr(Y = m) = \frac{256^{(m)} \left\{ {256 \atop m} \right\}}{256^{256}}, \qquad (2)$$

for $1 \le m \le 256$. We define $r^{(m)} = r(r-1)\ldots(r-m+1)$ and $\left\{ {n \atop i} \right\}$ as a function that returns the Stirling numbers of the second kind. That is, the number of ways of partitioning n elements into i non-empty sets. The expectation of Y is simply $\sum_{i=1}^{256} i \Pr(Y = i) \approx 162$.
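The distribution in Equation (2) can be evaluated exactly with arbitrary-precision integers. The sketch below builds the Stirling numbers of the second kind with the usual recurrence S(n, k) = k·S(n−1, k) + S(n−1, k−1) and checks the expectation against the closed form 256·(1 − (255/256)^256); the function names are illustrative.

```python
from fractions import Fraction

N = 256


def stirling_row(n):
    """Return [S(n, 0), ..., S(n, n)], Stirling numbers of the second kind."""
    row = [1]                                   # S(0, 0) = 1
    for size in range(1, n + 1):
        new = [0] * (size + 1)
        for k in range(1, size + 1):
            same_k = row[k] if k < size else 0  # S(size - 1, k)
            new[k] = k * same_k + row[k - 1]
        row = new
    return row


def falling(r, m):
    """Falling factorial r (r - 1) ... (r - m + 1)."""
    out = 1
    for j in range(m):
        out *= r - j
    return out


stirling = stirling_row(N)
pr_y = [Fraction(falling(N, m) * stirling[m], N ** N) for m in range(N + 1)]

expected_distinct = float(sum(m * p for m, p in enumerate(pr_y)))
closed_form = N * (1 - (255 / 256) ** N)
print(expected_distinct, closed_form)
```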
For a sequence of 256 bytes that consist of m distinct values, the probability distribution will be somewhat similar to that defined by Biham and Keller. Again we define Z as the number of instances of a given value that occur in a sequence of 256 bytes. The probability of observing a given value n times given that there are m distinct values is

$$\Pr(Z = n \mid Y = m) = \binom{256}{n}\, m^{-n} \left(1 - \frac{1}{m}\right)^{256-n} \qquad (3)$$

for $1 \le m, n \le 256$. Again, the probability of observing an odd number of a given value is therefore $\Pr(Z = 1 \mid Y = m) + \Pr(Z = 3 \mid Y = m) + \ldots + \Pr(Z = 255 \mid Y = m)$ for a given m. We define A as the number of distinct values that occur with an odd-numbered frequency. Then the expectation of A will be:

$$E(A) = \sum_{i=1}^{256} i \Pr(Y = i) \sum_{j=0}^{127} \Pr(Z = (2j+1) \mid Y = i) \approx 78 \qquad (4)$$

That is, the sum of the number of distinct values occurring with an odd-numbered frequency $\left(i \sum_{j=0}^{127} \Pr(Z = (2j+1) \mid Y = i)\right)$ multiplied by the probability of it occurring. This would reduce the time complexity of an attack requiring two δ-sets to 156 one-round decryptions (approximately $2^5$ encryptions of a four-round AES).
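Equation (4) can be evaluated numerically as it is stated. The sketch below recomputes Pr(Y = i) from Equation (2) and the inner sum from Equation (3) in exact arithmetic, converting to floating point only when accumulating; it takes the formulas as given rather than deriving them independently, and all names are illustrative.

```python
from fractions import Fraction
from math import comb

N = 256


def stirling_row(n):
    """Return [S(n, 0), ..., S(n, n)], Stirling numbers of the second kind."""
    row = [1]
    for size in range(1, n + 1):
        new = [0] * (size + 1)
        for k in range(1, size + 1):
            same_k = row[k] if k < size else 0
            new[k] = k * same_k + row[k - 1]
        row = new
    return row


def pr_distinct(stirling, m):
    """Equation (2): probability of exactly m distinct values."""
    falling = 1
    for j in range(m):
        falling *= N - j
    return Fraction(falling * stirling[m], N ** N)


def pr_odd_given(m):
    """Sum of Equation (3) over odd n, for a given number m of distinct values."""
    numerator = sum(comb(N, n) * (m - 1) ** (N - n) for n in range(1, N, 2))
    return Fraction(numerator, m ** N)


stirling = stirling_row(N)
expected_a = 0.0
for i in range(1, N + 1):
    expected_a += float(i * pr_distinct(stirling, i) * pr_odd_given(i))

print(expected_a, 2 * expected_a)   # per delta-set, and for two delta-sets
```

The value printed should land close to the 78 quoted in the text, so that two δ-sets give roughly 156 one-round decryptions.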
4.2 Analyzing Five-round AES
An extension to the above attack was also first pre-
sented in the original description of AES (Daemen
and Rijmen, 1998). This attack allowed an extra
round to be analyzed with an increase in the time com-
plexity. Rather than analyzing the final subkey byte-
by-byte, one analyzes the penultimate subkey.
In order to do this one is obliged to guess 32 bits of the final subkey to determine one column of the state matrix before the XOR with the penultimate subkey. One can then compute the MixColumns operation on this column, and validate hypotheses on a byte of a subkey equivalent to the penultimate subkey (one could compute the MixColumns operation on the derived subkey to determine the penultimate subkey). A valid byte would allow the property described in Section 3 to be observed. Each evaluation reduces the potential key space of the five bytes being analyzed by a factor of 256, and one would need to conduct this analysis five times to determine 32 bits of the final subkey (Daemen and Rijmen, 1998).
If we define each of the $2^{32}$ partial decryptions as having a time complexity equivalent to a quarter of a round, analyzing five sets of 256 ciphertexts to determine 32 bits of the last subkey and eight bits of the “penultimate subkey”, can be computed with an effort equivalent to $2^{40}/4$ one-round AES decryptions for one δ-set. Given this is a quarter of the work required for one set of 256 acquisitions, the total complexity to determine a key using five δ-sets would be $5 \cdot 2^{40}$ one-round AES decryptions, or equivalent to $2^{40}$ five-round AES encryption operations.
The cryptanalysis cannot be significantly im-
proved by following the reasoning given in Sec-
tion 4.1. That is, if one partially decrypts a δ-set using
hypotheses on 32 bits of the last subkey, one can then
form hypotheses on individual bytes of the penulti-
mate subkey using one example of distinct values that
occur with an odd-numbered frequency. One could
follow this reasoning with the 32-bit values taken
from the ciphertexts. However, over 256 acquisitions
the probability of observing a value that occurs with
an even-numbered frequency will be too low to have
any significant impact on the time complexity of an
attack.
Ferguson et al. present a way of conducting this attack that is referred to as the “partial sums” method (Ferguson et al., 2001). They observe that conducting the attack involves computing

$$\sum_i S^{-1}\big(S_0(c_{i,0} \oplus k_0) \oplus S_1(c_{i,1} \oplus k_1) \oplus S_2(c_{i,2} \oplus k_2) \oplus S_3(c_{i,3} \oplus k_3) \oplus k_4\big), \qquad (5)$$

where $S_\lambda$, for $\lambda \in \{0, \ldots, 3\}$, are bijective look-up tables that consist of the function S and a multiplication by a field element from $\mathbb{F}_{2^8}$. These are evaluated efficiently by associating a “partial sum” $x_k$ to each ciphertext where $x_k$ is defined as

$$x_k = \sum_{j=0}^{k} S_j(c_j \oplus k_j). \qquad (6)$$

This gives a map from $(c_0, c_1, c_2, c_3) \mapsto (x_k, c_{k+1}, \ldots, c_3)$. In order to conduct an attack one can compute $(x_1, c_2, c_3)$, i.e. $((S_0(c_0 \oplus k_0) \oplus S_1(c_1 \oplus k_1)), c_2, c_3)$, for all the ciphertexts in a δ-set for all possible values of $k_0$ and $k_1$. This will take $2^{24}$ executions of the function S resulting in $2^{24}$ values for $(x_1, c_2, c_3)$. This continues by computing $(x_2, c_3)$ for all possible values of $k_2$ that also requires $2^{24}$ executions of the function S resulting in $2^{32}$ values for $(x_2, c_3)$. Computing the $2^{40}$ values for $x_3$ for all values of $k_3$ will require a further $2^{40}$ executions of the function S. The last step will require $2^{48}$ executions of the function $S^{-1}$. Using the estimate provided by Ferguson et al., that the time complexity of one AES encryption is equivalent to $2^8$ executions of the function S (Ferguson et al., 2001), implementing the attack described above will have a time complexity equivalent to approximately $2^{40}$ five-round AES encryption operations. This has the same time complexity as the straightforward approach described previously. The method reported by Ferguson et al. can only be applied when groups of δ-sets are treated together (see Section 4.3).
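A minimal sketch of the staged evaluation of (5) for a single key guess is given below, checked against the direct sum. It assumes that the tables $S_\lambda$ are $S^{-1}$ followed by multiplication by the coefficients of the first row of $M^{-1}$ (14, 11, 13, 9); this choice, the random four-byte ciphertext columns and all names are assumptions made for illustration. Only parities are tracked at each stage, following the odd-frequency argument of Section 4.1.

```python
import random


def gf_mul(a, b):
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def make_inv_sbox():
    inv_box = [0] * 256
    for x in range(256):
        inv = 1
        for _ in range(254):
            inv = gf_mul(inv, x)
        y = inv
        for shift in range(1, 5):
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        inv_box[y ^ 0x63] = x
    return inv_box


INV_SBOX = make_inv_sbox()
COEFFS = (14, 11, 13, 9)   # assumed: first row of the inverse MixColumns matrix
S = [[gf_mul(coeff, INV_SBOX[x]) for x in range(256)] for coeff in COEFFS]


def toggle(parity_set, item):
    """Flip the parity of item: present means an odd count so far."""
    if item in parity_set:
        parity_set.remove(item)
    else:
        parity_set.add(item)


def naive_sum(ciphertexts, key):
    k0, k1, k2, k3, k4 = key
    acc = 0
    for c0, c1, c2, c3 in ciphertexts:
        x3 = S[0][c0 ^ k0] ^ S[1][c1 ^ k1] ^ S[2][c2 ^ k2] ^ S[3][c3 ^ k3]
        acc ^= INV_SBOX[x3 ^ k4]
    return acc


def partial_sum(ciphertexts, key):
    k0, k1, k2, k3, k4 = key
    stage1 = set()                       # odd-parity (x1, c2, c3)
    for c0, c1, c2, c3 in ciphertexts:
        toggle(stage1, (S[0][c0 ^ k0] ^ S[1][c1 ^ k1], c2, c3))
    stage2 = set()                       # odd-parity (x2, c3)
    for x1, c2, c3 in stage1:
        toggle(stage2, (x1 ^ S[2][c2 ^ k2], c3))
    stage3 = set()                       # odd-parity x3
    for x2, c3 in stage2:
        toggle(stage3, x2 ^ S[3][c3 ^ k3])
    acc = 0
    for x3 in stage3:
        acc ^= INV_SBOX[x3 ^ k4]
    return acc


rng = random.Random(5)
ciphertexts = [tuple(rng.randrange(256) for _ in range(4)) for _ in range(4096)]
key = tuple(rng.randrange(256) for _ in range(5))
print(naive_sum(ciphertexts, key) == partial_sum(ciphertexts, key))
```

For one key guess the staged version does no less work than the direct sum; the saving described in the text comes from reusing each stage's parity table across all guesses of the later key bytes, which this sketch does not attempt.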
The above analyses assume that one only considers one byte of the subkey that is equivalent to the penultimate subkey. However, if we consider more bytes of the penultimate subkey then fewer δ-sets are required. An attacker guesses 32 bits of the last subkey which allows an attacker to validate hypotheses on four bytes of a subkey that is equivalent to the penultimate subkey. For one guess of 32 bits of the final subkey one would expect a hypothesis for any byte of the subkey that is equivalent to the penultimate subkey to produce an XOR sum equal to zero with a probability equal to $1/2^8$. Given that there are four such bytes the probability that four bytes of this subkey will produce four sequences with XOR sums equal to zero is $1/2^{32}$. Therefore, in order to determine 32 bits of the last subkey and 32 bits of a subkey equivalent to the penultimate subkey one would expect to need two δ-sets. This would reduce the time complexity of the attack detailed by Daemen and Rijmen (Daemen and Rijmen, 1998) to approximately $2^{39}$ five-round AES encryption operations.
The analysis of the hypotheses can be further optimized if the relationships between the last and penultimate subkeys are verified as the attack progresses. If we consider one δ-set, one can analyze two columns of the penultimate subkey by guessing eight bytes of the last subkey (in two sets of four bytes). This will produce two sets of $2^{32}$ hypotheses with a time complexity equivalent to $2^{36}$ five-round AES encryption operations (using the estimations given above). One can then eliminate hypotheses in each set that are inconsistent, given that we have hypotheses on eight bytes of the penultimate subkey and eight bytes of the last subkey. However, we note that extra computation is required to change the hypotheses on the derived values to hypotheses for the penultimate subkey. That is, the four bytes corresponding to the penultimate subkey are multiplied by M as described in Section 2.1. This can be efficiently computed by considering the input vector as a 32-bit word and the matrix multiplication conducted using 32-bit operations. We estimate the complexity of this to be approximately equivalent to (5), which is equivalent to $1/2^6$ five-round AES encryption operations. Operating on these $2^{32}$ values will require the equivalent of $2^{26}$ five-round AES encryption operations, which is negligible compared to the generation of the hypotheses.
From the AES key schedule we can see that any subkey byte can be computed from two specific bytes in either the previous or following subkey. From the two sets of hypotheses generated, as described above, there will be three cases where known relationships between these sets of hypotheses can be verified. Given that the probability that all three bytes of a given hypothesis can be verified will be $1/2^{24}$, one would expect that the two sets of $2^{32}$ hypotheses can be reduced to one set of $2^{40}$ hypotheses. A third set of $2^{32}$ hypotheses can then be generated for one of the remaining columns of the penultimate subkey and four bytes of the last subkey. There will be a further four bytes of the last subkey that are generated by bytes for which there are already hypotheses, and an element from the set of $2^{40}$ hypotheses will validate a hypothesis from the new set of $2^{32}$ hypotheses with a probability of $1/2^{32}$. One would therefore expect to combine these two sets to produce a set of $2^{40}$ hypotheses for 96 bits of the penultimate and 96 bits of the last subkey. A set of $2^{32}$ hypotheses can then be generated for the final column of the penultimate subkey and four bytes of the final subkey. At this point one can verify whether an entire subkey can be generated from the penultimate subkey. For each of the $2^{32}$ hypotheses generated, hypotheses in the set of $2^{40}$ hypotheses for 96 bits of the penultimate and 96 bits of the last subkey will produce valid keys with a probability of $1/(2^8)^9 = 1/2^{72}$ (since there will be nine bytes in the last subkey that will not have been verified previously). One would, therefore, expect to generate two hypotheses from the two sets of hypotheses: one that is correct and one that fulfills the criteria by chance.
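The hypothesis counts in this merging procedure follow from multiplying the sizes of the two sets being combined by the probability that a pair is consistent with the key schedule. As a worked summary of the figures quoted above (not an additional result):

```latex
\begin{align*}
\text{columns 1 and 2:}\quad & 2^{32}\cdot 2^{32}\cdot 2^{-24} = 2^{40},\\
\text{adding column 3:}\quad & 2^{40}\cdot 2^{32}\cdot 2^{-32} = 2^{40},\\
\text{adding column 4:}\quad & 2^{40}\cdot 2^{32}\cdot 2^{-72} = 1 .
\end{align*}
```

The last line is the expected number of incorrect hypotheses that survive by chance; together with the correct key this gives the two remaining hypotheses.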
The time complexity of the entire attack will be $2^{38}$ five-round AES encryption operations and require $2^{40}$ hypotheses to be stored in memory. If a second δ-set is included the time complexity will increase, but the memory requirements will become negligible. The time complexity does not double since an attacker only requires sufficient information to determine which of the two remaining hypotheses is false. This should be possible with work equivalent to $2^{36}$ five-round AES encryptions, i.e. the generation of hypotheses of 32 bits of the final and 32 bits of the penultimate subkeys. These attacks are summarized in Table 1.
Table 1: Summary of the Square Attack on Five-round AES-128.
Number of δ-sets   Memory     Time Complexity   Remaining Hypotheses
1                  $2^{40}$   $2^{38}$          2
2                  1          $2^{38}$          1
4.3 Analyzing an Extra Round
The attack described above applied to a five-round AES can be extended to attack a six-round AES. In order to permit an extra round to be analyzed a set of $2^{32}$ plaintexts is chosen that takes all the possible values at indexes 1, 6, 11, and 16. An attacker will then seek to choose the 256 plaintext-ciphertext pairs that produce intermediate states that differ in only one byte after one round, i.e. the input required to attack a five-round AES, as shown in Figure 2.
The simplest way of achieving this would be to
choose 256 32-bit values for the first column of the
intermediate state that differ in one byte. These 32-bit
values can be deciphered for a given hypothesis for
four bytes of the first subkey (specifically the bytes
at indexes 1,6,11, and 16). This will produce 256
plaintexts that produce a δ-set after one round that can
be analyzed using the attack described in Section 4.2,
each of which will provide one hypothesis for the se-
cret key given the hypotheses for 32 bits of the first
subkey. This would increase the time complexity of an attack by a factor of $2^{32}$, but allow an extra round to be analyzed.
Ferguson et al. observed that all $2^{32}$ plaintexts can be used as a set of acquisitions to conduct the Square attack (Ferguson et al., 2001). That is, a set of $2^{32}$ distinct plaintexts differing at, for example, indexes 1, 6, 11, and 16 described above can be viewed as $2^{24}$ δ-sets. This remains true after the first round but an attacker cannot distinguish individual δ-sets after the first round without knowing four bytes of the first subkey. However, an attacker can treat all $2^{32}$ acquisitions together, i.e. the attack described in the previous section works in the same manner but with a set of $2^{32}$, rather than $2^8$, acquisitions. We refer to a set of $2^{32}$ plaintext-ciphertext pairs that are equivalent to $2^{24}$ δ-sets as a Δ-set.
An attack would proceed in the same manner as described in Section 4.2. Using the same notation the computation of the sets of $(x_1, c_2, c_3)$ will require $2^{48}$ executions of the function S. This would result in $2^{32}$ triples $(x_1, c_2, c_3)$. However, a maximum of $2^{24}$ distinct values is possible. As described in Section 4.1, one only needs to keep one example of the triples that occur with an odd-numbered frequency. Likewise, this will produce at most $2^{16}$ values for $(x_2, c_3)$ per key hypothesis, and at most $2^8$ values for $x_3$ per key hypothesis. The time complexity of the entire analysis requires $2^{50}$ executions of the function S for all four 32-bit sets for the final subkey, which given our estimate given above corresponds to $2^{42}$ AES encryption operations. This is increased to $2^{44}$ where five Δ-sets are required to determine the key.
[Figure 2 panels: Plaintext, AddRoundKey, SubBytes, ShiftRow, MixColumn]
Figure 2: Propagation of a four-byte difference across one round of AES, with non-zero differences marked in each panel.

Given that only two Δ-sets are required to determine the key (see Section 4.2), the complexity using the “partial sums” method can be reduced to $2^{43}$. This is not immediately apparent since evaluating (5) will reduce the number of key hypotheses for $\{k_0, k_1, k_2, k_3, k_4\}$ by a factor of 256. Intuitively, one would expect that the same effort would be required to analyze one Δ-set four times, resulting in an attack with the same time complexity as that proposed by Ferguson et al. We note that evaluating (5) once for a given Δ-set will require $2^{48}$ executions of the function S. A second evaluation of an instance of (5) with $k_4$ replaced with a different key byte from the penultimate subkey will require $2^{40}$ executions of the function S. This is because $S_\lambda(c_{i,\lambda} \oplus k_\lambda)$, for one $\lambda \in \{0, \ldots, 3\}$, will already have been computed for all i (we note that this will involve carefully choosing the order in which the first instance of (5) is evaluated). This same reasoning applies for a third and fourth evaluation of (5) that will result in an 8-tuple consisting of four bytes of the last subkey and four bytes of the penultimate subkey. The overall complexity of evaluating (5) four times for one Δ-set is, therefore, approximately $2^{48}$ executions of the function S. The time complexity of repeating this for all four 32-bit sets for the final subkey requires $2^{50}$ executions of the function S, which given our estimate given above corresponds to $2^{42}$ AES encryption operations. This is increased to $2^{43}$ where two Δ-sets are required to determine the key.
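The complexity figures in this paragraph can be collected into one short calculation, using the estimate that one AES encryption costs $2^8$ executions of S. This only restates the numbers quoted above for a set of $2^{32}$ plaintext-ciphertext pairs:

```latex
\begin{align*}
\text{four evaluations of (5), one set:}\quad & 2^{48} + 3\cdot 2^{40} \approx 2^{48} \ \text{executions of } S,\\
\text{all four 32-bit sets of the final subkey:}\quad & 4\cdot 2^{48} = 2^{50} \ \text{executions of } S,\\
\text{expressed in AES encryptions:}\quad & 2^{50}/2^{8} = 2^{42},\\
\text{two sets of } 2^{32} \text{ pairs:}\quad & 2\cdot 2^{42} = 2^{43}.
\end{align*}
```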
We can reduce the time complexity further if we exploit the relationships between key bytes in adjacent subkeys. That is, one can generate 8-tuples of key byte values using the “partial sums” technique proposed by Ferguson et al. and use the attack described in Section 4.2 to reduce the number of key hypotheses. Given one set of $2^{32}$ plaintext-ciphertext pairs, an attack would have a time complexity equivalent to $2^{42}$ six-round AES encryption operations and require that $2^{40}$ key hypotheses are stored in memory. Again, more acquisitions can be used to reduce the memory requirements at the cost of an increased time complexity. This is summarized in Table 2.
Table 2: Summary of the Square Attack on Six-round AES-128.
Number of δ-sets   Memory     Time Complexity   Remaining Hypotheses
1                  $2^{40}$   $2^{42}$          2
2                  1          $2^{42}$          1
5 APPLYING THE SQUARE
ATTACK TO AES-192 AND
AES-256
In this section we describe how the attacks described
in Section 4 are applicable to AES-192 and AES-256.
5.1 Analyzing Five-round AES
A five-round AES-192 can be attacked with a time complexity equivalent to $2^{39}$ AES encryption operations using the attack described in Section 4.2 based on the attack proposed by Ferguson et al. (Ferguson et al., 2001). That is, two δ-sets would be sufficient to determine the last two subkeys of a five-round instance of AES-192 or AES-256.
An attack on an instance of AES-256 cannot be made more efficient
by analyzing two consecutive subkeys since there are no direct
relationships between them. However, one can exploit the
relationships between two consecutive subkeys when attacking
AES-192. This follows a similar technique to that presented in
Section 4.2. If one acquires a δ-set, one can analyze two columns
of the penultimate subkey by guessing eight bytes of the last
subkey (in two sets of four bytes). This will produce two sets of
$2^{32}$ hypotheses with a time complexity equivalent to $2^{36}$
five-round AES encryption operations. One can then eliminate
hypotheses in each set that are inconsistent, since we would have
hypotheses on eight bytes of the penultimate subkey and eight bytes
of the last subkey. Given the key schedule, one would expect that
the two sets of $2^{32}$ hypotheses can be reduced to one set of
$2^{48}$ hypotheses. A third set of $2^{32}$ hypotheses can then be
generated for one of the remaining columns of the penultimate
subkey and four bytes of the last subkey. One would expect that
this would produce $2^{64}$ hypotheses with the relevant subkeys.
This is because, in each case, there are two key bytes in the
penultimate subkey that can each be derived from bytes of the last
subkey.
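The expected counts can be checked with a short calculation. It assumes, as an idealization, that each penultimate-subkey byte derivable from the last subkey acts as an independent 8-bit filter, so that the two related bytes remove a factor of $2^{16}$ each time a set is merged:

```latex
\begin{align*}
2^{32} \cdot 2^{32} \cdot 2^{-16} &= 2^{48} && \text{(merging the first two sets)}\\
2^{48} \cdot 2^{32} \cdot 2^{-16} &= 2^{64} && \text{(merging the third set)}
\end{align*}
```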
As previously, the memory requirements of the above attack can be
reduced by acquiring more δ-sets at the cost of an increase in the
time complexity. However, one would only need to analyze half the
information present in a second δ-set to reduce the number of
hypotheses to a trivial amount. The details of the Square attack
applied to five-round AES-192 and AES-256 are shown in Tables 3
and 4 respectively.
Table 3: Summary of the Square Attack on Five-round AES-192.
Number of δ-sets   Memory     Time Complexity   Remaining Hypotheses
1                  $2^{64}$   $2^{38}$          $2^{64}$
2                  1          $2^{38.5}$        $2^{16}$
Table 4: Summary of the Square Attack on Five-round AES-256.
Number of δ-sets   Memory      Time Complexity   Remaining Hypotheses
1                  $2^{128}$   $2^{38}$          $2^{128}$
2                  1           $2^{39}$          1
5.2 Analyzing Six-round AES
The attacks on five rounds of AES-192 and AES-256 can be extended
by assuming that an attacker knows the last subkey. The time
complexity of the five-round attack increases by a factor of
$2^{128}$ when it is applied to a six-round variant of AES-256,
since every hypothesis for the last subkey must be tested. However,
Lucks observed that knowledge of the last subkey would allow an
attacker to directly derive bytes in previous subkeys when
attacking AES-192 (Lucks, 2000). That is, 64 bits of the
penultimate subkey and 32 bits of the antepenultimate subkey, as
shown in Figure 3.
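The derivation of earlier subkey bytes from the last subkey can be illustrated with a small sketch of the AES-192 key schedule. It assumes the word indexing of FIPS PUB 197 (a six-round instance uses words w[0..27], so the 7-th subkey is w[24..27]) and the standard recurrence w[i] = w[i-6] ⊕ w[i-1] for indices that are not multiples of 6. The helper names (`expand_key_192`, `derive_from_last_subkey`) are hypothetical and chosen for illustration only.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def rotl8(x, n):
    """Rotate a byte left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def make_sbox():
    """Build the AES S-box: field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0 if x == 0 else next(y for y in range(1, 256) if gf_mul(x, y) == 1)
        sbox.append(inv ^ rotl8(inv, 1) ^ rotl8(inv, 2)
                    ^ rotl8(inv, 3) ^ rotl8(inv, 4) ^ 0x63)
    return sbox


SBOX = make_sbox()


def expand_key_192(key, rounds):
    """Return the 4 * (rounds + 1) key schedule words of AES-192 as 4-byte tuples."""
    w = [tuple(key[4 * i:4 * i + 4]) for i in range(6)]
    rcon = 1
    for i in range(6, 4 * (rounds + 1)):
        t = w[i - 1]
        if i % 6 == 0:
            t = tuple(SBOX[b] for b in t[1:] + t[:1])  # SubWord(RotWord(t))
            t = (t[0] ^ rcon,) + t[1:]
            rcon = gf_mul(rcon, 2)
        w.append(tuple(a ^ b for a, b in zip(w[i - 6], t)))
    return w


def derive_from_last_subkey(last):
    """Words of earlier subkeys implied by the last subkey of six-round AES-192.

    `last` holds w[24..27]. Since 25, 26 and 27 are not multiples of 6,
    w[i-6] = w[i] ^ w[i-1] for those indices. w[18] is not recoverable this
    way because it also depends on the unknown word w[23].
    """
    derived = {}
    for j in (1, 2, 3):
        derived[24 + j - 6] = tuple(a ^ b for a, b in zip(last[j], last[j - 1]))
    return derived


# AES-192 example key from FIPS PUB 197, Appendix A.2.
key = bytes.fromhex("8e73b0f7da0e6452c810f32b809079e562f8ead2522c6b7b")
w = expand_key_192(key, rounds=6)
derived = derive_from_last_subkey(w[24:28])
print(sorted(derived))  # indices of the recovered words
```

In this sketch the recovered words are w[20] and w[21] (the first two columns of the 6-th subkey, i.e. 64 bits) and w[19] (the last column of the 5-th subkey, i.e. 32 bits), which agrees with the counts given above.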
Following the reasoning in Section 4.2, a computational effort
equivalent to $2^{22}$ six-round AES-192 encryption operations
(i.e. a factor of $2^{16}$ less than the standard attack) is
required to treat each δ-set. As previously, two δ-sets would be
sufficient to determine the penultimate and antepenultimate
subkeys.
One can reduce this to one δ-set if one uses the relationships
between two consecutive subkeys as described in Section 5.1. Since
much of the subkeys is already known, the memory requirements are
much reduced. The details of the Square attack applied to six-round
AES-192 and AES-256 are shown in Tables 5 and 6 respectively.
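As a rough check of how the per-δ-set costs relate to the totals reported for the six-round attacks, assume that the analysis is repeated for each of the $2^{128}$ hypotheses on the last subkey. The per-hypothesis costs are then $2^{22}$ per δ-set for AES-192, and $2^{38}$ or $2^{39}$ for AES-256 as in the five-round case:

```latex
\begin{align*}
\text{AES-192, one } \delta\text{-set:}  \quad & 2^{128} \cdot 2^{22} = 2^{150} \\
\text{AES-192, two } \delta\text{-sets:} \quad & 2^{128} \cdot 2 \cdot 2^{22} = 2^{151} \\
\text{AES-256:} \quad & 2^{128} \cdot 2^{38} = 2^{166}, \qquad 2^{128} \cdot 2^{39} = 2^{167}
\end{align*}
```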
Table 5: Summary of the Square Attack on Six-round AES-192 using δ-sets.
Number of δ-sets   Memory     Time Complexity   Remaining Hypotheses
1                  $2^{16}$   $2^{150}$         $2^{8}$
2                  1          $2^{151}$         1
Table 6: Summary of the Square Attack on Six-round AES-256 using δ-sets.
Number of δ-sets   Memory      Time Complexity   Remaining Hypotheses
1                  $2^{128}$   $2^{166}$         $2^{128}$
2                  1           $2^{167}$         1
We can also note that introducing an extra round
before, and acquiring -sets will lead to a more effi-
cient attack. This involves conducting the five-round
attack described in Section 5.1 on a six-round AES
using -sets rather than δ-sets. The details of the
square attack applied to six-round AES-192 and AES-
256 using -sets are shown in Table 7 and 8 respec-
tively.
Table 7: Summary of the Square Attack on Six-round AES-
192 using -sets.
Number of δ-sets   Memory     Time Complexity   Remaining Hypotheses
1                  $2^{64}$   $2^{42}$          $2^{64}$
2                  1          $2^{42.5}$        $2^{16}$
Table 8: Summary of the Square Attack on Six-round
AES-256 using -sets.
Number of δ-sets   Memory      Time Complexity   Remaining Hypotheses
1                  $2^{128}$   $2^{42}$          $2^{128}$
2                  1           $2^{43}$          1
Table 9: Summary of the Square Attack on Seven-round AES-192.
Number of δ-sets   Memory     Time Complexity   Remaining Hypotheses
1                  $2^{16}$   $2^{153}$         $2^{8}$
2                  1          $2^{154}$         1
Table 10: Summary of the Square Attack on Seven-round AES-256.
Number of δ-sets   Memory      Time Complexity   Remaining Hypotheses
1                  $2^{128}$   $2^{170}$         $2^{128}$
2                  1           $2^{171}$         1
SECRYPT2012-InternationalConferenceonSecurityandCryptography
32
7-th Subkey
• • • •
• • • •
• • • •
• • • •
6-th Subkey
◦ ◦ 0 0
◦ ◦ 0 0
◦ ◦ 0 0
◦ ◦ 0 0
5-th Subkey
0 0 0 ◦
0 0 0 ◦
0 0 0 ◦
0 0 0 ◦
Figure 3: The known information from the last subkey, where • represents a known value and ◦ represents a derived value.
Table 11: Summary of attacks presented in this paper.
Source                                 Rounds   Key Length   Memory     Acquisitions        Complexity
(Daemen and Rijmen, 1998)              4        128          -          $2^{9}$             $2^{6}$
(Biham and Keller, 1999) (corrected)   4        128          -          $2^{9}$             $2^{5}$
(Daemen and Rijmen, 1998)              5        generic      -          $2^{11}$            $2^{40}$
This paper                             5        128          -          $2^{8}$             $2^{38}$
This paper                             5        192          -          $2 \cdot 2^{8}$     $2^{38.5}$
This paper                             5        256          -          $2 \cdot 2^{8}$     $2^{39}$
(Daemen and Rijmen, 1998)              6        generic      -          $5 \cdot 2^{32}$    $2^{72}$
(Ferguson et al., 2001)                6        generic      $2^{32}$   $6 \cdot 2^{32}$    $2^{44}$
This paper                             6        128          $2^{40}$   $2^{32}$            $2^{42}$
This paper                             6        192          -          $2 \cdot 2^{32}$    $2^{42.5}$
This paper                             6        256          -          $2 \cdot 2^{32}$    $2^{43}$
(Lucks, 2000)                          7        192          $2^{32}$   $2^{32}$            $2^{176}$
(Ferguson et al., 2001)                7        192          $2^{32}$   $19 \cdot 2^{32}$   $2^{155}$
This paper                             7        192          -          $2 \cdot 2^{32}$    $2^{154}$
(Lucks, 2000)                          7        256          $2^{32}$   $2^{32}$            $2^{192}$
(Ferguson et al., 2001)                7        256          $2^{32}$   $21 \cdot 2^{32}$   $2^{172}$
This paper                             7        256          -          $2 \cdot 2^{32}$    $2^{171}$
5.3 Analyzing an Extra Round
As described in Section 4.3, an extra round can be
added where an attacker uses -sets (Ferguson et al.,
2001). Given that a large amount of the penultimate
subkey is known the “partial sums” method of con-
ducting the Square attack can be further optimized.
We recall that the attack requires that (5) is evaluated
as described in Section 4.2. Note that the attack does
not proceed exactly as the six-round attack given in
Section 5.2 as the relationships between the bytes of
the subkeys will be different.
If, for example, an attacker wishes to evaluate (5) and knows
$(k_0, k_1)$, these values can be evaluated before the unknown
$(k_2, k_3)$. Generating $2^{24}$ values for $(x_1, c_2, c_3)$ will
require $2^{33}$ executions of the function $S$. Then $2^{16}$
values for $(x_2, c_3)$ per hypothesis for $k_2$ can be generated
with $2^{32}$ executions of the function $S$, and the attack
continues as described in Section 4.3. The time complexity of the
entire analysis will be $2^{34}$ evaluations of the function $S$.
That is, $2^{36}$ executions of the function $S$ for all four
32-bit sets for the final subkey, which corresponds to $2^{28}$ AES
encryption operations.
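The staged evaluation can be sketched in code. The sketch assumes that (5) has the form used by Ferguson et al., i.e. an XOR-sum over the ciphertexts of $S^{-1}[S_0[c_0 \oplus k_0] \oplus S_1[c_1 \oplus k_1] \oplus S_2[c_2 \oplus k_2] \oplus S_3[c_3 \oplus k_3] \oplus k_4]$, with $x_1 = S_0[c_0 \oplus k_0] \oplus S_1[c_1 \oplus k_1]$ and $x_2 = x_1 \oplus S_2[c_2 \oplus k_2]$. The tables `S` and `S_inv` are random stand-ins rather than the real AES tables, the data is a small random sample rather than a full set of $2^{32}$ ciphertexts, and all function names are hypothetical. The sketch only shows that the work depending on the known $(k_0, k_1)$ is done once and that each stage keeps just the parity of every intermediate value.

```python
import random


def toggle(parity, item):
    """Keep only items seen an odd number of times (an XOR-sum ignores even counts)."""
    if item in parity:
        parity.remove(item)
    else:
        parity.add(item)


def stage1(cts, S, k0, k1):
    """Collapse (c0, c1, c2, c3) to (x1, c2, c3), x1 = S0[c0^k0] ^ S1[c1^k1]."""
    out = set()
    for c0, c1, c2, c3 in cts:
        toggle(out, (S[0][c0 ^ k0] ^ S[1][c1 ^ k1], c2, c3))
    return out


def stage2(t1, S, k2):
    """Collapse (x1, c2, c3) to (x2, c3), x2 = x1 ^ S2[c2^k2]."""
    out = set()
    for x1, c2, c3 in t1:
        toggle(out, (x1 ^ S[2][c2 ^ k2], c3))
    return out


def stage3(t2, S, k3):
    """Collapse (x2, c3) to x3 = x2 ^ S3[c3^k3]."""
    out = set()
    for x2, c3 in t2:
        toggle(out, x2 ^ S[3][c3 ^ k3])
    return out


def final_sum(t3, S_inv, k4):
    """XOR of S_inv[x3 ^ k4] over the values x3 that occur an odd number of times."""
    acc = 0
    for x3 in t3:
        acc ^= S_inv[x3 ^ k4]
    return acc


def naive_sum(cts, S, S_inv, k):
    """Direct evaluation of the same sum, used here only as a cross-check."""
    acc = 0
    for c0, c1, c2, c3 in cts:
        acc ^= S_inv[S[0][c0 ^ k[0]] ^ S[1][c1 ^ k[1]]
                     ^ S[2][c2 ^ k[2]] ^ S[3][c3 ^ k[3]] ^ k[4]]
    return acc


rng = random.Random(1)
S = [rng.sample(range(256), 256) for _ in range(4)]  # stand-in byte permutations
S_inv = rng.sample(range(256), 256)                  # stand-in for the final lookup
cts = [tuple(rng.randrange(256) for _ in range(4)) for _ in range(2000)]

k0, k1 = 0x3A, 0xC5          # the known key bytes
t1 = stage1(cts, S, k0, k1)  # computed once, shared by every (k2, k3, k4)
checked = 0
for k2 in range(0, 256, 64):
    t2 = stage2(t1, S, k2)
    for k3 in range(0, 256, 64):
        t3 = stage3(t2, S, k3)
        for k4 in range(0, 256, 64):
            expected = naive_sum(cts, S, S_inv, (k0, k1, k2, k3, k4))
            assert final_sum(t3, S_inv, k4) == expected
            checked += 1
print(checked)
```

Only a few values of $k_2$, $k_3$ and $k_4$ are tried here to keep the cross-check fast. An attack would iterate over all 256 values at each level and retain the hypotheses for which the sum is zero.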
6 CONCLUSIONS
In this paper we have demonstrated that the “partial
sums” method of conducting the Square attack can be
improved by analyzing more information per δ-set (or
-set). This allows the time complexity of attacks to
be halved. For AES-128 the time complexity can be
halved again, and reduced by a smaller amount for
AES-192, by eliminating key hypotheses on-the-fly.
In Table 11 we present a summary of the attacks presented in this
paper alongside similar attacks in the literature. The memory
requirements are the number of state matrices, or equivalent, that
need to be stored in memory. The number of acquisitions is the
number of chosen plaintexts enciphered with an unknown key, where
the number is given in multiples of $2^{8}$ or $2^{32}$ so that it
is clear how many δ or -sets are required to conduct the attack.
The time complexity is expressed as the number of AES encryption
operations that would require an equivalent computational effort.
In this paper we have focused on attacks that require relatively
few chosen plaintext-ciphertext pairs. Ferguson et al. also
proposed a Square attack applicable to seven rounds of AES-128 and
eight rounds of AES-192 and AES-256. However, this attack requires
approximately $2^{128}$ chosen plaintext-ciphertext pairs and is
beyond the scope of this paper, although one would expect the
proposed strategy to aid in the analysis.
ACKNOWLEDGEMENTS
The author would like to thank Martijn Stam for helpful
discussions. The work described in this paper has been supported
in part by the European Commission through the ICT Programme under
Contract ICT-2007-216676 ECRYPT II and the EPSRC via grant
EP/I005226/1.
REFERENCES
Bahrak, B. and Aref, M. R. (2008). Impossible differential attack on seven-round AES-128. IET Information Security Journal, 2(2):28–32.
Biham, E. and Keller, N. (1999). Cryptanalysis of reduced variants of Rijndael. Unpublished. http://www.madchat.fr/crypto/codebreakers/35-ebiham.pdf.
Bogdanov, A., Khovratovich, D., and Rechberger, C. (2011). Biclique cryptanalysis of the full AES. In Lee, D. H. and Wang, X., editors, ASIACRYPT 2011, volume 7073 of LNCS, pages 344–371. Springer.
Daemen, J., Knudsen, L., and Rijmen, V. (1997). The block cipher Square. In Biham, E., editor, FSE '97, volume 1267 of LNCS, pages 149–165. Springer.
Daemen, J. and Rijmen, V. (1998). AES proposal: Rijndael. In AES Round 1 Technical Evaluation CD-1: Documentation. NIST. http://www.nist.gov/aes.
Dunkelman, O. and Keller, N. (2010). The effects of the omission of last round’s MixColumns on AES. Information Processing Letters, 110(8–9):304–308.
Ferguson, N., Kelsey, J., Lucks, S., Schneier, B., Stay, M., Wagner, D., and Whiting, D. (2001). Improved cryptanalysis of Rijndael. In Schneier, B., editor, FSE 2000, volume 1978 of LNCS, pages 213–230. Springer.
FIPS PUB 197 (2001). Advanced encryption standard (AES). Federal Information Processing Standards Publication 197, National Institute of Standards and Technology (NIST), Gaithersburg, MD, USA.
Lu, J., Dunkelman, O., Keller, N., and Kim, J. (2008). New impossible differential attacks on AES. In Chowdhury, D. R., Rijmen, V., and Das, A., editors, INDOCRYPT 2008, volume 5365 of LNCS, pages 279–293. Springer.
Lucks, S. (2000). Attacking seven rounds of Rijndael under 192-bit and 256-bit keys. In AES Candidate Conference 2000. http://csrc.nist.gov/encryption/aes/round2/conf3/aes3conf.htm.
Mangard, S., Oswald, E., and Popp, T. (2007). Power Analysis Attacks: Revealing the Secrets of Smart Cards. Springer-Verlag.
Zhang, W., Wu, W., and Feng, D. (2007). New results on impossible differential cryptanalysis of reduced AES. In Nam, K.-H. and Rhee, G., editors, ICISC 2007, volume 4817 of LNCS, pages 239–250. Springer.
SECRYPT2012-InternationalConferenceonSecurityandCryptography
34