implementation. Similarly, the 10-round experiment that took 15 days when run
on CPU would take less than 10 minutes with our proposed GPU optimizations.
Moreover, the full 12-round attack, which requires $2^{57.07}$ encryptions,
would take $2^{21.67}$ seconds, which is less than 39 days. Note that it would
take more than 300 years to verify this attack on the CPU setup and the C
implementation of (Lallemand and Naya-Plasencia, 2014).
The attack of (Lallemand and Naya-Plasencia, 2014) was improved in
(Rasoolzadeh et al., 2017) and now requires $2^{54.9}$ encryptions. Performing
$2^{54.9}$ encryptions would take less than 9 days with our CUDA codes on an
RTX 4090.
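The day counts quoted for the two attacks can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a sustained rate of $2^{35.40}$ KLEIN-64 encryptions per second on one RTX 4090, which is the rate implied by performing $2^{57.07}$ encryptions in $2^{21.67}$ seconds, and the function name `attack_days` is hypothetical.

```python
# Assumption: one RTX 4090 sustains 2^35.40 KLEIN-64 encryptions per second,
# the rate implied by 2^57.07 encryptions taking 2^21.67 seconds.
LOG2_ENC_PER_SECOND = 35.40
SECONDS_PER_DAY = 86_400


def attack_days(log2_encryptions, log2_rate=LOG2_ENC_PER_SECOND):
    """Days needed for 2^log2_encryptions encryptions at 2^log2_rate per second."""
    seconds = 2.0 ** (log2_encryptions - log2_rate)
    return seconds / SECONDS_PER_DAY


days_2014 = attack_days(57.07)  # attack of Lallemand and Naya-Plasencia (2014)
days_2017 = attack_days(54.9)   # improved attack of Rasoolzadeh et al. (2017)
print(f"{days_2014:.2f} days, {days_2017:.2f} days")
```

Under this assumed rate, both values fall below the bounds stated in the text (39 days and 9 days, respectively).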
Our optimization results show that we can try $2^{34.74+24.91} = 2^{59.65}$
KLEIN-80 keys in a year. This means that it would take $2^{20.35}$ years for a
single RTX 4090 to recover a KLEIN-80 key, or it would require
$2^{20.35} \approx 1.34$ million RTX 4090 GPUs to recover the key in a year.
A biclique attack in (Abed et al., 2012) has a time complexity of $2^{79}$
encryptions, which is two times faster than the exhaustive search attack.
However, this attack requires $2^{60}$ memory, and implementing it using our
GPU optimizations might result in an attack that is slower than the exhaustive
search, because storing and processing $2^{60}$ data would introduce a
significant overhead.
Our optimization results show that we can try $2^{34.39+24.91} = 2^{59.3}$
KLEIN-96 keys in a year. This means that it would take $2^{36.7}$ years for a
single RTX 4090 to recover a KLEIN-96 key, or it would require
$2^{36.7} \approx 111$ billion RTX 4090 GPUs to recover the key in a year.
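The year-scale estimates for KLEIN-80 and KLEIN-96 follow from the per-second key rates ($2^{34.74}$ and $2^{34.39}$) and the number of seconds in a year (about $2^{24.91}$). A minimal sketch of that arithmetic is given below. The helper name `exhaustive_search_log2_years` is hypothetical, and the calculation counts the full key space, i.e. the worst case.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600  # roughly 2^24.91 seconds


def exhaustive_search_log2_years(key_bits, log2_keys_per_second):
    """log2 of the GPU-years needed to try all 2^key_bits keys on one GPU."""
    log2_keys_per_year = log2_keys_per_second + math.log2(SECONDS_PER_YEAR)
    return key_bits - log2_keys_per_year


klein80 = exhaustive_search_log2_years(80, 34.74)
klein96 = exhaustive_search_log2_years(96, 34.39)

# The same exponent is the number of GPUs needed to finish within one year.
print(f"KLEIN-80: 2^{klein80:.2f} GPU-years, about {2 ** klein80:.3g} GPUs for one year")
print(f"KLEIN-96: 2^{klein96:.2f} GPU-years, about {2 ** klein96:.3g} GPUs for one year")
```

Because the search parallelizes trivially across keys, the exponent can be read either as years on one GPU or as the number of GPUs needed for a one-year search.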
A biclique attack in (Abed et al., 2012) has a time complexity of $2^{95.18}$
encryptions, which is $2^{0.82}$ times faster than the exhaustive search
attack. However, this attack requires $2^{60}$ memory, and implementing it
using our GPU optimizations might result in an attack that is slower than the
exhaustive search, because storing and processing $2^{60}$ data would
introduce a significant overhead.
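To give a rough sense of scale for the biclique trade-off, the sketch below computes the speedup factors over exhaustive search and an illustrative size for $2^{60}$ stored entries. The unit of the $2^{60}$ memory figure is not stated here, so the 8 bytes per entry (one 64-bit block) is purely an assumption, as is taking the device memory of an RTX 4090 to be 24 GiB. All names are hypothetical.

```python
def biclique_speedup(key_bits, log2_time):
    """Factor by which an attack of 2^log2_time beats a 2^key_bits exhaustive search."""
    return 2.0 ** (key_bits - log2_time)


speedup_80 = biclique_speedup(80, 79)      # KLEIN-80 biclique attack
speedup_96 = biclique_speedup(96, 95.18)   # KLEIN-96 biclique attack

# Assumption: each of the 2^60 entries is one 64-bit block (8 bytes).
BYTES_PER_ENTRY = 8
# Assumption: 24 GiB of device memory on a single RTX 4090.
GPU_MEMORY_BYTES = 24 * 2**30

table_bytes = 2**60 * BYTES_PER_ENTRY
gpus_to_hold_table = table_bytes / GPU_MEMORY_BYTES

print(f"speedups: {speedup_80:.2f}x and {speedup_96:.2f}x")
print(f"table size: 2^{table_bytes.bit_length() - 1} bytes, "
      f"about {gpus_to_hold_table:.3g} GPUs' worth of memory")
```

Under these assumptions the table would not fit in the memory of any single device, while the time gain is at most a factor of two, which is consistent with the concern that the memory overhead could outweigh the speedup.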
Although an exhaustive key search attack on GPUs does not look realistic with
these numbers, it should be noted that this attack can become practical in the
future, since new GPU generations tend to be built with more cores and faster
clock speeds. Moreover, GPUs are general-purpose devices, and if an attack on
96-bit KLEIN becomes profitable, one can build ASICs on which this attack
becomes practical and requires less electricity than on GPUs.
5 CONCLUSIONS
In this work we provided a CUDA-optimized table-based implementation of the
KLEIN family of block ciphers which does not contain shared memory bank
conflicts. Our best optimization reaches $2^{35.40} \approx 45$ billion
KLEIN-64 key trials per second on an RTX 4090. Our results show that the KLEIN
block cipher, which supports 64-bit, 80-bit, and 96-bit secret keys, is
susceptible to brute force attacks via GPUs. Thus, lightweight designs should
not support short keys.
ACKNOWLEDGEMENTS
This work has been supported by The Scientific and Technological Research
Council of Türkiye (TÜBİTAK) and German Academic Exchange Service (DAAD)
Bilateral Research Cooperation Project (TÜBİTAK 2531 Project) under the grant
number 123N546 and titled "Cryptanalysis of Symmetric Key Encryption
Algorithms: Theory vs. Practice".
This project has also been supported by Middle East Technical University
Scientific Research Projects Coordination Unit under grant number
AGEP-704-2023-11294.
REFERENCES
Abed, F., Forler, C., List, E., Lucks, S., and Wenzel, J.
(2012). Biclique cryptanalysis of PRESENT, LED, and
KLEIN. Cryptology ePrint Archive, Paper 2012/591.
https://eprint.iacr.org/2012/591.
Ahmadian, Z., Salmasizadeh, M., and Aref, M. R. (2015).
Biclique cryptanalysis of the full-round KLEIN block
cipher. IET Inf. Secur., 9(5):294–301.
Barker, E. and Roginsky, A. (2019). Transitioning the use
of cryptographic algorithms and key lengths. NIST SP
800-131A Rev. 2.
Belorgey, M. G., Dandjee, S., Gama, N., Jetchev, D., and
Mikushin, D. (2023). Falkor: Federated learning secure
aggregation powered by AES-CTR GPU implementation.
In Brenner, M., Costache, A., and Rohloff, K., editors,
Proceedings of the 11th Workshop on Encrypted Computing &
Applied Homomorphic Cryptography, Copenhagen, Denmark,
26 November 2023, pages 11–22. ACM.
Daemen, J. and Rijmen, V. (2002). The Design of Rijndael:
AES - The Advanced Encryption Standard. Information
Security and Cryptography. Springer.
Gong, Z., Nikova, S., and Law, Y. W. (2011). KLEIN: A
new family of lightweight block ciphers. In Juels,
A. and Paar, C., editors, RFID. Security and Privacy -
7th International Workshop, RFIDSec 2011,
Amherst, USA, June 26-28, 2011, Revised Selected
ICISSP 2024 - 10th International Conference on Information Systems Security and Privacy