tion 4. Experimental results and comparison with the
state of the art are reported in Section 5. Section 6
concludes the paper with our final remarks.
2 RELATED WORK
The first implementations of the Rijndael algorithm
appeared during the AES standardization process.
Since then, a large number of implementations
have been published, each targeting
different needs.
Previous research covers both software and hardware
implementations, developed to meet different
constraints in terms of silicon area, algorithm
flexibility, time performance and power consumption.
Most of the hardware-oriented research targets
FPGA technology, whose requirements and constraints
differ substantially from those of ASIC technology.
For this reason, a comparison of our work with
FPGA implementations is not possible.
An 8-bit implementation of the AES algorithm
supporting both encryption and decryption is
described by Feldhofer et al. in (Feldhofer et al., 2005).
This implementation, optimized for low resource
requirements, targets the RFID application domain.
The standard-cell implementation requires roughly 3400
equivalent gates, while the maximum clock frequency
of 80 MHz allows a data throughput of 9.9 Mbit/s.
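The figures above can be cross-checked with a simple throughput model. The sketch below assumes an iterative core that processes one 128-bit block at a time with no I/O overhead, so that throughput = block size × clock frequency / cycles per block. This model and the function name `implied_cycles_per_block` are illustrative assumptions, not taken from the cited paper.

```python
BLOCK_BITS = 128  # AES block size in bits


def implied_cycles_per_block(clock_hz, throughput_bps, block_bits=BLOCK_BITS):
    """Clock cycles per block implied by a clock rate and a throughput.

    Assumes one block in flight at a time and no I/O overhead
    (a simplification made for this sketch).
    """
    return block_bits * clock_hz / throughput_bps


# Figures quoted above: 80 MHz clock, 9.9 Mbit/s throughput.
cycles = implied_cycles_per_block(80e6, 9.9e6)
print(f"about {cycles:.0f} clock cycles per 128-bit block")
```

Under this model the quoted figures correspond to roughly a thousand clock cycles per block, which is consistent with a heavily serialized 8-bit datapath.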
In (Satoh et al., 2000) a compact and high-speed
architecture for AES with a 128-bit key is presented.
Separate functional blocks are developed for encryption,
decryption and key scheduling. The SubBytes
transformation is performed by four S-Box modules
designed for composite field arithmetic, which are
shared between the encryption round and the key
scheduling. Using a 0.11 µm CMOS VLSI technology, this
module reaches a throughput of 311 Mbit/s, with
a hardware complexity of 5,400 equivalent gates.
The architecture uses a 32-bit datapath.
Paper (Mangard et al., 2003) presents a highly
regular and scalable 32-bit AES hardware architecture
supporting encryption, decryption, various
key sizes and the CBC mode. The architecture is a
matrix of 16 cells, each operating on 8 bits, which
compute the MixColumns transformation and all the other
transformations except SubBytes. The number of S-Boxes is
customizable; the authors report a highest
throughput of 241 Mbit/s, while
the implementation requires 15 K equivalent gates.
Paper (Chodowiec and Gaj, 2003) presents a compact
FPGA architecture for the AES algorithm with
a 128-bit key. Encryption, decryption and key
schedule are all implemented using limited resources.
This implementation can encrypt and decrypt data
streams at 150 Mbit/s. The architecture exploits
specific features of the target FPGA, and the
implementation of the MixColumns and InvMixColumns
transformations allows part of the circuit to be
shared between the two operations.
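One well-known way to share logic between MixColumns and InvMixColumns relies on the identity InvMixColumns = MixColumns composed with a small correction matrix, the circulant with first row (05, 00, 04, 00). The sketch below checks this identity on single columns. It illustrates the general idea only and is not necessarily the decomposition used in the cited paper; all function names are chosen for this example.

```python
def xtime(a):
    """Multiply a byte by 0x02 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    if a & 0x100:
        a ^= 0x11B
    return a & 0xFF


def gmul(a, b):
    """Multiply two bytes in GF(2^8) using shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def circulant_mul(first_row, col):
    """Multiply a 4-byte column by the circulant matrix with the given first row."""
    out = []
    for r in range(4):
        acc = 0
        for k in range(4):
            acc ^= gmul(first_row[k], col[(r + k) % 4])
        out.append(acc)
    return out


def mix_column(col):
    """MixColumns on one column: circulant matrix (02, 03, 01, 01)."""
    return circulant_mul((0x02, 0x03, 0x01, 0x01), col)


def inv_mix_column_direct(col):
    """InvMixColumns on one column: circulant matrix (0E, 0B, 0D, 09)."""
    return circulant_mul((0x0E, 0x0B, 0x0D, 0x09), col)


def inv_mix_column_shared(col):
    """InvMixColumns built from a cheap pre-correction plus the MixColumns logic."""
    corrected = circulant_mul((0x05, 0x00, 0x04, 0x00), col)
    return mix_column(corrected)


column = [0xDB, 0x13, 0x53, 0x45]
mixed = mix_column(column)
assert inv_mix_column_direct(mixed) == column
assert inv_mix_column_shared(mixed) == column
```

In hardware terms, the correction step needs only multiplications by 04 and 05, which are cheaper than the 09, 0B, 0D and 0E multipliers of a standalone InvMixColumns circuit, so the MixColumns logic can serve both directions.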
In (Hodjat and Verbauwhede, 2006) a high-performance
AES processor is presented. With loop
unrolling and outer-round pipelining techniques, a
throughput of 30 Gbit/s to 70 Gbit/s is achievable
in a 0.18 µm CMOS VLSI technology. The
architecture proposed in the paper uses an inner-round
pipelining scheme for the composite field implementation
of the S-Box, and uses off-line key scheduling.
Paper (Kuo and Verbauwhede, 2001) discusses the
architectural optimization of an AES processor.
Parallelism and distributed memory are exploited in
order to reach a throughput of 1.82 Gbit/s for data
encryption. The required silicon area is 173,000
equivalent gates.
Paper (Hsiao et al., 2006) reviews traditional
hardware design methods and introduces a technique
for area optimization. The presented
Common-Subexpression-Elimination (CSE) algorithm is
applied to the subfunctions that realize the various
transformations in AES encryption and decryption.
The paper claims that a cell-based implementation of
the proposed AES design can achieve an area reduction
of about 20% with respect to using the
well-known Synopsys VLSI design tools.
A comprehensive survey can be found in
(Feldhofer et al., ), where different hardware
implementations that target various applications are presented.
3 THE ALGORITHM
In this section we give an overview of the Rijndael
algorithm, which officially became AES after the
publication of (Institute of Standards and Technology
(NIST), 2001), on the 26th of November 2001. As
requested by NIST, the algorithm implements a block
cipher for symmetric key cryptography and supports
key sizes of 128, 192 and 256 bits, while the block
size is restricted to 128 bits. Every block is
represented using four 32-bit words. The algorithm works
on a two-dimensional representation of the input block
called the state, which is initialized with the input data
block, holds the intermediate result during the encryption
and decryption process, and ultimately holds the final
SECRYPT 2008 - International Conference on Security and Cryptography