Higher Order Leakage Assessment and Neural Network-based Attack on
CRYSTALS-Kyber
Buvana Ganesh, Mosabbah Mushir Ahmed and Alieeldin Mady
Qualcomm Inc., Cork, Ireland
Keywords:
Side Channel Attacks, CRYSTALS-Kyber, Leakage Assessment, Deep Learning, Higher Order Masking.
Abstract:
Since the selection of CRYSTALS-Kyber as the National Institute of Standards and Technology (NIST) post-quantum cryptography (PQC) standard for key encapsulation mechanisms (KEM), several attacks have emerged against both the algorithm and its implementations, and these must be understood to deploy it securely. In this work, a thorough higher order test vector leakage assessment is performed on open source implementations of CRYSTALS-Kyber. Using traces obtained with the ChipWhisperer framework, the leakage is located and a template side-channel attack (SCA) is performed with deep learning to recover the secret key from the first-order masked implementation of CRYSTALS-Kyber. Overall, this work performs a comprehensive leakage assessment and neural network-based SCAs on the masked implementation of CRYSTALS-Kyber.
1 INTRODUCTION
Key Encapsulation Mechanisms are essential for gen-
erating a shared secret key between parties, to estab-
lish secure peer-to-peer transactions. As the final-
ized candidate for KEMs in the PQC standardization
(PQC, 2023), CRYSTALS-Kyber (Bos et al., 2018)
has strong mathematical security, while being com-
pact in size. There are open-source implementations
(Heinz et al., 2022; Kannwischer et al., 2020) avail-
able for CRYSTALS-Kyber that are embedded device
compatible. These implementations are built for spe-
cific requisites concerning both security and perfor-
mance, with a notable emphasis on resilience against
side-channel attacks (SCA).
SCAs (Kocher et al., 1999) exploit physically measured information, such as power consumption, electromagnetic radiation, sound emissions, and execution time, during cryptographic operations on hardware. In CRYSTALS-Kyber, a successful SCA (Ravi et al., 2022) typically recovers the shared key, which is the message, through power or electromagnetic analysis; the long-term secret key can subsequently be derived from the recovered messages.
Before performing a full-scale SCA, a leakage assessment is conducted to detect potential side channel leakage from the device running the algorithm, as it requires fewer traces and less complexity than an attack. The assessment points out weaknesses in the implementation that can serve as targets for an attack if they involve critical security parameters such as the secret key. While there are works demonstrating leakage assessment for the unmasked implementation, no significant study has been conducted to determine the leakage in the masked implementation mkm4 (Heinz et al., 2022). To address this, our work demonstrates a leakage assessment with higher order Test Vector Leakage Assessment (TVLA) (Schneider and Moradi, 2015), as it detects leakage under masking better than basic t-tests.
In the unmasked implementation, the poly_tomsg component has been found to be one of the viable sources of leakage, as it converts the shared key from the polynomial domain to the message domain as part of decapsulation in pqm4 (Kannwischer et al., 2020). Several attacks (Chang et al., 2022; Mujdei et al., 2022; Ravi et al., 2020b) have been performed on this component for message and secret key recovery. Masking is one of the most used countermeasures to protect the attacked components from SCA, but there are now also attacks against the masked components (Backlund et al., 2022; Ngo et al., 2021). In our work, we perform the leakage assessment and then use template attacks enhanced by Deep Learning (DL) (Picek et al., 2023) to attack the message decoding component.
The motivation for this work is to understand the implementation of CRYSTALS-Kyber, develop a comprehensive methodology for leakage assessment and deploy a set of experiments that can exploit the leakage. The use of Recurrent Neural Networks (RNN) for SCAs is explored, as RNNs are sensitive to the order of the data supplied. This allows them to uncover dependencies distributed across different points in time, making them potent against implementations of higher-order masking schemes with multivariate leakages.
Ganesh, B., Ahmed, M. and Mady, A.
Higher Order Leakage Assessment and Neural Network-based Attack on CRYSTALS-Kyber.
DOI: 10.5220/0012715700003767
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 21st International Conference on Security and Cryptography (SECRYPT 2024), pages 373-380
ISBN: 978-989-758-709-2; ISSN: 2184-7711
Proceedings Copyright © 2024 by SCITEPRESS Science and Technology Publications, Lda.
1.1 Contribution
Our work demonstrates that the masked message de-
coding function is leaky under second order t-tests for
fixed vs random keys, and is exploitable even with the
first order masking. The rationale in this work is to
find a source of leakage before launching an attack on
the masked implementation. The leaky part of the masked message decoding is then used to launch a side-channel attack. The main contributions of the paper are as follows:
- Performed Welch's t-tests on multiple components of the unmasked and masked implementations of CRYSTALS-Kyber.
- Performed second order t-tests on the leaky components of the first order masked implementation (Heinz et al., 2022), which has not been done before in the literature.
- Performed a template SCA using NNs on the masked implementation of Kyber, specifically focused on the masked_poly_tomsg step in the decryption, using multi-layer perceptrons and recurrent NNs.
After finding an exploitable vulnerability, an NN-based side-channel attack is executed, specifically targeting the masked_poly_tomsg step in the decryption process of the Kyber algorithm. The multi-layer perceptron (MLP) model retrieves 99% of the secret key from just the attack phase. The model is evaluated based on validation accuracy and key retrieval. This performance was achieved through bespoke pre-processing and the error correcting code. Our MLP model is similar to the model in (Backlund et al., 2022), where the attack was performed on masked and shuffled Kyber, which is not open source. We target mkm4, which needed additional implementation work, such as finding specific triggers according to the leakage from the TVLA. Compared to the attack in (Backlund et al., 2022), our method has less pre-processing, as it removes their cut-and-join technique.
The rest of the paper is organized as follows. Sec. 2 discusses the preliminaries required for the rest of the paper, including the CRYSTALS-Kyber algorithm, masking and t-tests. The device set-up and the results of the TVLA experiments are provided in Sec. 3, and the attacks are performed in Sec. 4. Finally, Sec. 5 concludes and provides some possible future works.
1.2 Related Work
The initial phase preceding any attack involves identifying potential vulnerabilities through statistical analysis, and Welch's t-test serves as a valuable tool for this purpose. While there exists a considerable number of studies on attacks targeting CRYSTALS-Kyber, TVLA on Kyber has, until recently, been sparingly explored. The works of (Rajendran et al., 2023; Ravi et al., 2020b; Sim et al., 2022) are among the few that have performed TVLA specifically on the poly_tomsg operation of Kyber. Two widely adopted implementations that have become standard targets for attacks are pqm4 (Kannwischer et al., 2020) and mkm4 (Heinz et al., 2022).
Numerous attacks on CRYSTALS-Kyber have
been conducted using the official pqm4 implementa-
tion (Ravi et al., 2022; Ravi et al., 2020b). Following
this, there were several attacks focusing on different components of the implementation, such as the comparison operation through the plaintext-checking (PC) oracle (Rajendran et al., 2023), but mostly surrounding the message encoding (Sim et al., 2020), the number theoretic transform (Bock et al., 2024; Primas et al., 2017; Yang et al., 2023) and the poly_tomsg components (Chang et al., 2022). Tab. 1 lists some relevant attacks on Kyber, especially focusing on the components of decapsulation.
The landscape of side-channel attacks has evolved notably with the integration of NNs, as surveyed in (Picek et al., 2023). While there exist pre-trained models for AES and RSA, few are available for other cryptographic schemes. After the release of the masked implementation mkm4 (Heinz et al., 2022), many attacks on it leveraged various NN architectures, including multi-layer perceptrons (MLP), convolutional neural networks (CNN), and RNNs.
Ngo et al. introduced profiled attacks with NNs in
their work on Saber KEM (Ngo et al., 2021) and ex-
tended this approach to generic KEMs in (Ngo et al.,
2022; Dubrova et al., 2023). Tab. 2 covers relevant at-
tacks on first-order masking and some of these attacks
use NNs to accomplish the attack.
Table 1: Side Channel Attacks on CRYSTALS-Kyber.
Paper | Target | Method used
(Ravi et al., 2020b) | poly_tomsg | Chosen ciphertext - 2560 traces
(Sim et al., 2020) | Barrett reduction | Chosen ciphertexts - Clustering
(Chang et al., 2022) | poly_tomsg | Template matching - 900 traces
(Mujdei et al., 2022) | Polynomial multiplication | Hamming weight - $3329^2$ guesses
(Primas et al., 2017) | poly_frommsg | Hamming weight - 500 traces
(Rajendran et al., 2023) | re-encrypt - PC oracle | Template matching - 5520 & 72 traces
(Yang et al., 2023) | KeyGen multiplication | Template matching - 900 traces
Table 2: Attacks on first order masking and use of AI.
Paper | Target | Method used
(Bock et al., 2024) | Polynomial multiplication | Template attack - no DL
(Backlund et al., 2022) | masked_poly_tomsg | MLP - Hamming distance
(Ngo et al., 2021) | Saber - poly A2A | MLP & ECC - Hamming weight
(Ueno et al., 2022) | masked AES | NN classification
2 PRELIMINARIES
2.1 CRYSTALS-Kyber
Define the ring $R = \mathbb{Z}$ and $R_q = R/qR = \mathbb{Z}_q$ and $K_R$ with the ring-embedding $\sigma : K \to K_R$, with discrete Gaussian and normal distribution used for sampling the vectors. Let $\{0, 1\}$ be $I$. Consider the functions used in message encoding and decoding and to reduce the key and ciphertext sizes in $R_q$, $\mathrm{Compress}(\cdot, d) : \mathbb{Z}_q \to \{0, \ldots, 2^d - 1\}$ and $\mathrm{Decompress}(\cdot, d) : \{0, \ldots, 2^d - 1\} \to \mathbb{Z}_q$. These functions constitute the poly_frommsg and poly_tomsg in the implementations, and are inverse to each other up to a negligible error.
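The two maps can be sketched in a few lines. This is a minimal illustration rather than the code of either library: it assumes the usual rounding definitions, Compress(x, d) = round((2^d / q) · x) mod 2^d and Decompress(y, d) = round((q / 2^d) · y), with q = 3329, and the function names are chosen here for readability.

```python
Q = 3329  # Kyber modulus


def compress(x: int, d: int) -> int:
    """Map x in Z_q to {0, ..., 2^d - 1} via round((2^d / q) * x) mod 2^d."""
    # Adding Q // 2 before the floor division rounds to the nearest integer.
    return ((((x % Q) << d) + Q // 2) // Q) % (1 << d)


def decompress(y: int, d: int) -> int:
    """Map y in {0, ..., 2^d - 1} back to Z_q via round((q / 2^d) * y)."""
    return (y * Q + (1 << (d - 1))) >> d


if __name__ == "__main__":
    # With d = 1 the two maps act as message bit encoding and decoding.
    print(decompress(1, 1), compress(decompress(1, 1), 1))
```

With d = 1, Decompress maps a message bit to 0 or roughly q/2, and Compress decides which of the two a noisy coefficient is closer to. This is the decision that the message decoding step has to make for every coefficient.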
Algorithm 1: Kyber PKE - Key Generation.
Input: seeds $\rho, \sigma \in I^{256}$
1: Sample $A \in R_q^{k \times k}$ using seed $\rho$
2: Sample $(s, e) \in I^k \times I^k$, using seed $\sigma$
3: $t_{comp} := \mathrm{Compress}(As + e, d_t)$
4: return $pk := (t_{comp}, \rho)$, $sk := s$
The Fujisaki-Okamoto (FO) transform is used
to upgrade the security of the underlying public
key encryption (PKE) in CRYSTALS-Kyber (Bos
et al., 2018) from the weaker indistinguishability un-
der Chosen Plaintext Attacks, to the stronger adap-
tive Chosen Ciphertext Attacks for KEM, by remov-
ing ciphertext malleability in the PKE. The modified
FO transform re-encrypts the decrypted message and
compares the resulting ciphertext against the received
one.
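The re-encrypt-and-compare step can be illustrated with a toy stand-in for the PKE. The sketch below is not Kyber and is not secure: `toy_encrypt` and `toy_decrypt` are hypothetical placeholders in which the public and secret key are the same byte string, and the hash labels are arbitrary. It only shows the control flow of a decapsulation that derives the encryption randomness from the decrypted message, re-encrypts, and falls back to a rejection value `z` when the ciphertexts differ.

```python
import hashlib
import hmac
import os


def H(*parts: bytes) -> bytes:
    """Hash helper used for coins, pads and session keys (toy domain separation)."""
    return hashlib.sha3_256(b"".join(parts)).digest()


def toy_encrypt(pk: bytes, m: bytes, coins: bytes) -> bytes:
    # Deterministic given (pk, m, coins), which is what the FO check relies on.
    nonce = H(b"nonce", coins)
    pad = H(b"pad", pk, nonce)
    return nonce + bytes(a ^ b for a, b in zip(m, pad))


def toy_decrypt(sk: bytes, c: bytes) -> bytes:
    nonce, body = c[:32], c[32:]
    pad = H(b"pad", sk, nonce)
    return bytes(a ^ b for a, b in zip(body, pad))


def encaps(pk: bytes) -> tuple[bytes, bytes]:
    m = os.urandom(32)
    coins = H(b"coins", m, pk)           # randomness derived from the message
    c = toy_encrypt(pk, m, coins)
    return c, H(b"key", m, c)


def decaps(sk: bytes, pk: bytes, z: bytes, c: bytes) -> bytes:
    m = toy_decrypt(sk, c)
    c_check = toy_encrypt(pk, m, H(b"coins", m, pk))  # re-encryption
    if hmac.compare_digest(c, c_check):               # ciphertext comparison
        return H(b"key", m, c)
    return H(b"key", z, c)                            # rejection branch
```

The decryption, re-encryption and comparison inside `decaps` all process values that depend on the secret key and the decrypted message, which is why the decapsulation is the part of the KEM that side-channel attacks concentrate on.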
Algorithm 2: Kyber PKE - Encryption.
Input: $pk$, $m \in I^{256}$, $\tau \in I^{256}$
1: $t := \mathrm{Decompress}(t_{comp}, d_t)$
2: Sample $A \in R_q^{k \times k}$, for seed $\rho$
3: Sample $(r, e_1, e_2) \in I^k \times I^k \times I$, for seed $\tau$
4: $u_{comp} := \mathrm{Compress}(A^T r + e_1, d_u)$
5: $v_{comp} := \mathrm{Compress}(t^T r + e_2 + \lceil q/2 \rfloor \cdot m, d_v)$
6: return $c := (u_{comp}, v_{comp})$
Algorithm 3: Kyber PKE - Decryption.
Input: $c = (u_{comp}, v_{comp})$, $s$
1: $u := \mathrm{Decompress}(u_{comp}, d_u)$
2: $v := \mathrm{Decompress}(v_{comp}, d_v)$
3: return $m := \mathrm{Compress}(v - s^T u, 1)$
The pqm4 library serves as a benchmarking and
testing framework that targets the ARM Cortex-M4
family of microcontrollers and supports all versions
of Kyber. The masking countermeasure, introduced in (Chari et al., 1999), conceals side channel leakage by dividing data into multiple shares that can be processed independently and that, when combined, represent the final processed data, so that no individual share reveals any information about the masked secret. Previous research has demonstrated
that, through DL, individual secret shares can be ex-
tracted and subsequently combined (Backlund et al.,
2022; Ueno et al., 2022). The mkm4 library is built
upon the M4 implementation of pqm4. The mask-
ing is available for the functions that were attacked in
pqm4 like message encode and decode, polynomial
multiplication and polynomial comparison. mkm4 only supports Kyber768, which we use from both libraries for our attacks. Alg. 3 presents only the Kyber PKE, as given in (Bos et al., 2018), as it covers the necessary background for our attack.
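The share-splitting idea behind masking can be sketched for a single coefficient as follows. This is a minimal illustration of first-order arithmetic masking modulo q = 3329 and is not taken from mkm4; the function names are hypothetical.

```python
import secrets

Q = 3329  # Kyber modulus


def mask(x: int) -> tuple[int, int]:
    """Split x into two arithmetic shares (x - r, r) mod q with a fresh random r."""
    r = secrets.randbelow(Q)
    return ((x - r) % Q, r)


def unmask(shares: tuple[int, int]) -> int:
    """Recombine the shares; only the sum reveals the masked value."""
    return sum(shares) % Q


def masked_add(a: tuple[int, int], b: tuple[int, int]) -> tuple[int, int]:
    """Addition is linear, so it can be computed on each share independently."""
    return ((a[0] + b[0]) % Q, (a[1] + b[1]) % Q)


def masked_scale(a: tuple[int, int], c: int) -> tuple[int, int]:
    """Multiplication by a public constant is also share-wise."""
    return ((c * a[0]) % Q, (c * a[1]) % Q)
```

Each share on its own is uniformly distributed, so a first-order statistic of a single intermediate value should not depend on the secret. Non-linear steps such as message decoding cannot be computed share by share in this way and need dedicated masked routines.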
In the decryption, and eventually the decapsulation function, one of the important components is the decoding function, which converts the polynomial back into the message. Since the message is the shared secret, this function is of interest to attack.
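The decoding step can be sketched for the unmasked case as below. This is an illustrative model only: it applies Compress(x, 1) from Sec. 2.1 to each of the 256 coefficients, and the bit order within a byte and the helper `poly_frommsg` are assumptions made here so that the example is self-contained.

```python
Q = 3329  # Kyber modulus
N = 256   # number of coefficients, one per message bit


def poly_frommsg(msg: bytes) -> list[int]:
    """Encode each message bit as 0 or round(q / 2)."""
    return [((msg[i // 8] >> (i % 8)) & 1) * ((Q + 1) // 2) for i in range(N)]


def poly_tomsg(coeffs: list[int]) -> bytes:
    """Decode each coefficient to one bit by rounding to the nearest of 0 and q/2."""
    msg = bytearray(N // 8)
    for i in range(N // 8):
        for j in range(8):
            x = coeffs[8 * i + j] % Q
            bit = (((x << 1) + Q // 2) // Q) & 1  # Compress(x, 1)
            msg[i] |= bit << j
    return bytes(msg)
```

Because every coefficient is processed in turn and yields exactly one message bit, a power trace of this loop contains one short segment per bit, which is what makes the function attractive as an attack target.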
2.2 Welch’s t-test and TVLA
Welch's t-test (Welch, 1947), which is useful when dealing with unequal sample sizes or variances, enables the comparison of means between two groups. For SCAs, this helps point out any leakage between the two sets that can be considered for exploitation. Consider having $n_A$ samples from set A and $n_B$ samples from set B. For TVLA, set A typically contains fixed traces, i.e., traces obtained for a function where the keys are fixed and the inputs are random, and set B contains random traces, where keys and messages are random. For each group $j \in \{A, B\}$, let $\mu_j$ represent the sample mean in group $j$, and $s_j^2$ denote the sample variance. The test statistic is:
$$|t| = \frac{|\mu_A - \mu_B|}{\sqrt{\dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B}}} \qquad (1)$$
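Eq. (1) is applied independently at every sample point of the traces. The following sketch uses only the Python standard library and synthetic data; the threshold of 4.5 is the value commonly used in TVLA practice and is an assumption here, since the text does not state the threshold used in the experiments.

```python
import math
import random
from statistics import fmean, variance


def welch_t(a: list[float], b: list[float]) -> float:
    """|t| of Eq. (1) for two groups of measurements at one sample point."""
    return abs(fmean(a) - fmean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


def leaky_points(fixed, rand, threshold: float = 4.5) -> list[int]:
    """Indices of sample points where the fixed and random sets differ."""
    n_points = len(fixed[0])
    return [
        p for p in range(n_points)
        if welch_t([tr[p] for tr in fixed], [tr[p] for tr in rand]) > threshold
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic traces with 4 points; only point 2 has a mean shift in set A.
    fixed = [[rng.gauss(3.0 if p == 2 else 0.0, 1.0) for p in range(4)]
             for _ in range(1000)]
    rand = [[rng.gauss(0.0, 1.0) for p in range(4)] for _ in range(1000)]
    print(leaky_points(fixed, rand))
```

For real captures the same computation would normally be vectorised over all sample points, but the statistic per point is the one shown.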
For masked implementations of any order, detecting leakage requires higher-order TVLA, as described in (Schneider and Moradi, 2015). Various techniques can be employed for this purpose, including the t-test with higher-order central moments and the $\chi^2$ test, among others. The essential adaptation is to replace the mean and variance in the t-test with moments whose order matches the order of the mask, which may include measures such as skewness and kurtosis. Such tests are needed to reveal vulnerabilities in masked implementations that a first-order test would miss.
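For reference, the central moments used by such tests can be written out explicitly. The block below states the standard definition of the $d$-th order central moment of $n$ samples $x_1, \ldots, x_n$ with mean $\mu$ (written $CM_d$ here), together with the identity that gives the variance of the squared, mean-free samples; these are textbook relations and not specific to this work.

```latex
CM_d = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)^d ,
\qquad
\mathrm{Var}\big[(X - \mu)^2\big]
  = \mathbb{E}\big[(X - \mu)^4\big] - \Big(\mathbb{E}\big[(X - \mu)^2\big]\Big)^2
  = CM_4 - CM_2^2 .
```

A second-order t-test therefore compares the means of the squared, mean-free samples of the two sets, that is their $CM_2$ values, and uses $CM_4 - CM_2^2$ in place of the sample variance.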
3 LEAKAGE ASSESSMENT
First, the leakage assessment is performed with stan-
dard t-test and the second order t-test on both the
implementations of CRYSTALS-Kyber (Heinz et al.,
2022; Kannwischer et al., 2020) to observe the be-
haviour of the components. Here, key generation or
encapsulation procedures are not considered because
based on the history of attacks (Ravi et al., 2022),
most attacks tend to focus on the decapsulation al-
gorithm to retrieve the secret key, either directly or
through the shared secret, i.e., the message. It is worth
noting that higher order TVLA has not been employed
to assess leakage in masked implementations for PQC
so far and our work rectifies this.
3.1 Device Set-up
All experiments were conducted on a Ryzen-7 lap-
top equipped with 16GB of RAM and approximately
50GB of virtual RAM. The power consumption-based
attacks were facilitated using ChipWhisperer Lite, a
well-established tool in the field (O’Flynn and Chen,
2014). This tool captures traces of power consump-
tion by monitoring clock cycles. The experiments tar-
get an ARM Cortex-M4 processor, specifically utiliz-
ing either the STM32F303 or STM32F415, which is
commonly employed for trace collection. The F4 se-
ries, notable for its inclusion of a True Random Num-
ber Generator, is also utilized.
3.2 TVLA
The critical security parameters for Kyber include the secret key and the shared secret, representing the sensitive core of the cryptographic system. The construction of PQC algorithms can also vary across different implementations, urging the adaptation of standards to accommodate these nuances. For each of the experiments, 1000 traces were considered per operation to perform TVLA. Although establishing significant leakage would typically require more traces, a noticeable leakage is observed even with such a low number of traces. The public key distinguisher method proposed in (Saarinen, 2022) aims at assessing whether the public key leaks more data than expected during encryption. Applying it to encryption in pqm4 revealed no discernible leakage for the t-test calculated as in Sec. 2.2. This signifies the resilience of the implementation against unintended information disclosure through the public key during the encryption process.
Random vs Mismatch Traces: To assess the security of the plaintext-checking oracle in the decapsulation of pqm4, random vs mismatch traces were used to understand how the oracle reacts to improper decapsulations. Following the setup in Sec. 2.2, set A contains valid ciphertexts, while set B comprises ciphertexts encapsulated with a deliberately mismatched public key, $pk' \neq pk$. The subtraction and the comparison operations of decapsulation exhibit leakage when analyzing such random vs mismatched ciphertexts with TVLA, but it is not significant enough. It should be noted that the masking of the comparison operation has made it harder to attack this part of the algorithm.
Fixed vs Random Traces: For evaluating decryption, traces within set A are generated using a fixed keypair, whereas set B employs a random and unique keypair for each trace. The ciphertexts are created from random messages paired with the matching keypairs, targeting different components of the decryption process. The leakage found in this setting is not evident for all operations in decapsulation, but it is present for the poly_tomsg component in pqm4. Other operations were tested as well: polynomial multiplication showed less than 5% leakage and the rest of the functions did not show any immediate leakage.
As shown in Fig. 2, the leakage is not enough for performing attacks on the function, as it indicates leakage of a single bit uniformly, which may not be exploitable. Given the apparent leakage in the unmasked implementation, the test is replicated with the masked implementation to gauge the area of leakage. The standard fixed vs random TVLA is performed on the masked mkm4 library, and leakage is evident at the beginning of the traces for the masked counterpart of the decoding function, masked_poly_tomsg.
3.3 Higher Order TVLA
In the literature on leakage assessments of Kyber (Rajendran et al., 2023; Ravi et al., 2020b; Sim et al., 2020; Sim et al., 2022), first-order t-tests have revealed leakage in the poly_tomsg component, corresponding to the message decoding process. It is necessary to confirm whether the problem persists within the masked_poly_tomsg operation in mkm4.
Though there are different methods, such as the $\chi^2$ test, the F-test or bivariate analysis, our approach uses central moments (Schneider and Moradi, 2015). For the higher order TVLA, the mean and variance in Eq. (1) of Sec. 2.2 are replaced with higher order central moments. For second order TVLA, the mean $\mu$ is substituted with $CM_2$ and the variance $s^2$ with $CM_4 - CM_2^2$, where $CM_2$ is the second order central moment and $CM_4$ is the fourth order central moment.
$$|t| = \frac{|CM_{2,A} - CM_{2,B}|}{\sqrt{\dfrac{CM_{4,A} - CM_{2,A}^2}{n_A} + \dfrac{CM_{4,B} - CM_{2,B}^2}{n_B}}} \qquad (2)$$
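Eq. (2) can be computed per sample point in the same way as the first-order test. The sketch below is a plain standard-library illustration on synthetic data, not the evaluation code used for Fig. 3; it shows a case where both sets have the same mean, so that a first-order test has nothing to detect, while the second-order statistic separates them through their variances.

```python
import math
import random
from statistics import fmean


def central_moment(xs: list[float], d: int) -> float:
    """d-th order central moment CM_d of the samples."""
    mu = fmean(xs)
    return fmean([(x - mu) ** d for x in xs])


def second_order_t(a: list[float], b: list[float]) -> float:
    """|t| of Eq. (2): CM_2 replaces the mean, CM_4 - CM_2^2 the variance."""
    cm2_a, cm2_b = central_moment(a, 2), central_moment(b, 2)
    var_a = central_moment(a, 4) - cm2_a ** 2
    var_b = central_moment(b, 4) - cm2_b ** 2
    return abs(cm2_a - cm2_b) / math.sqrt(var_a / len(a) + var_b / len(b))


if __name__ == "__main__":
    rng = random.Random(0)
    # Same mean in both sets, different spread: a purely second-order difference.
    set_a = [rng.gauss(0.0, 1.0) for _ in range(1000)]
    set_b = [rng.gauss(0.0, 2.0) for _ in range(1000)]
    print(second_order_t(set_a, set_b))
```

This mirrors the situation in a first-order masked implementation, where the mean of a leaking sample point may be independent of the secret while its variance is not.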
For the fixed vs random TVLA, set A contains traces of the target operation for encapsulations with a fixed key and set B for random keys. The aim is to extend and confirm the findings of the previous first order TVLA, again with 1000 traces for practical considerations. The captured traces were each of length 280,000 for the masked_poly_tomsg operation. A subset of 44,000 points at the beginning of the trace can clearly be identified as leaky in Fig. 3. Restricting the analysis to this subset made the examination more manageable while still revealing the nature and extent of the leakage within the masked_poly_tomsg operation. The results indicate that the masked implementation exhibits a reduced susceptibility to leakage for the specific component under examination.
4 ATTACK
The target of the attack is the masked_poly_tomsg operation in mkm4, as indicated by Fig. 3. For neural networks, the two phases are training and prediction, which require a training dataset and a test dataset. First, the training data is collected and pre-processed. The cleaned data is then segmented and ordered to make the NN faster and more efficient, and fed into the NN of choice to be trained. Once the model is trained, the saved model is used to make predictions on the test data. The model is adjusted and retrained based on the prediction results.
Moving to the attack phase, traces were collected
specifically for chosen ciphertexts. The conceptual-
ization of the approach for the attack originated with
Saber and has been extended to Kyber, as detailed in
(Backlund et al., 2022). Their attack focuses on a
custom masked and shuffled version of CRYSTALS-
Kyber, but only masking is used as countermeasure
in this work, as there is no official code for the shuf-
fled component available open source. The ECT is
strategically employed to extract the secret key from
the message guesses associated with different cipher-
texts.
4.1 Profiling Phase
For the mkm4 implementation, 50000 traces are captured for the masked_poly_tomsg of the decapsulation for detailed analysis, for ciphertexts known to the attacker. The trace length is set to the first 53000 out of 280,000 points to utilize the source of the leakage as seen in Fig. 3.
4.1.1 Pre-Processing
Figure 1: First Order TVLA on poly_tomsg with pqm4.
Figure 2: First Order TVLA on masked_poly_tomsg with mkm4.
First, N traces associated with the masked_poly_tomsg operation are gathered with the ChipWhisperer. The traces are split into five equal sets. The dataset is carefully prepared for training, with the trace shape set at (10000, 53000). Synchronization is crucial in this process, and a segment length of 225 is chosen. Then, the mean of the end segments for all traces is calculated, and correlations are identified for consistency, resulting in a refined trace shape of (10000, 57600). This is done to ensure a consistent shape for all the traces and eliminate outliers. The subsequent steps involve cutting and shaping the traces, so that they can be split bitwise for better learning in the DL architecture.
Then a unified dataset of 50000 traces is created, combining the individual datasets into one to increase the amount of training data. The final trace shape is (256 × 50000, 3 × 225), where the 53000 is split into 256 bit parts (or 8 bytes). It was decided to keep the information of three consecutive bits together, as slicing them would increase the complexity or reduce performance. Standardization and normalization procedures are applied to each trace before the model training phase.
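The bitwise reshaping described above can be sketched as follows. This is one possible reading of the procedure and not the authors' pre-processing code: it assumes that each synchronized trace has 256 × 225 = 57600 points, that the window for bit i consists of the segments of bits i-1, i and i+1, and that the missing neighbours at both ends are zero-padded. The synchronization step itself is not shown.

```python
from statistics import fmean, pstdev

SEG = 225   # samples per message bit (segment length from the text)
BITS = 256  # message bits per trace


def standardize(trace: list[float]) -> list[float]:
    """Zero-mean, unit-variance scaling of a single trace."""
    mu, sd = fmean(trace), pstdev(trace)
    return [(x - mu) / sd if sd else 0.0 for x in trace]


def bitwise_windows(trace, seg: int = SEG, bits: int = BITS) -> list[list[float]]:
    """Cut one trace of bits*seg points into `bits` windows of 3*seg points."""
    assert len(trace) == bits * seg
    # Assumption: pad one empty segment on each side so every bit has two neighbours.
    padded = [0.0] * seg + list(trace) + [0.0] * seg
    return [padded[i * seg:(i + 3) * seg] for i in range(bits)]


def build_dataset(traces) -> list[list[float]]:
    """Turn N traces into N*bits rows of length 3*seg, one row per message bit."""
    rows = []
    for trace in traces:
        rows.extend(bitwise_windows(standardize(trace)))
    return rows
```

With 50000 input traces this produces 256 × 50000 rows of 3 × 225 points, which matches the final shape stated in the text.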
4.1.2 Deep Learning Architecture
We consider an architecture similar to (Backlund et al., 2022), but the algorithm was modified to better fit the mkm4 library. Additionally, the Long Short-Term Memory (LSTM) model, a type of RNN, is used, as it is known for its proficiency in handling temporal data. Recognizing the importance of trace order and message bit sequencing in key recovery, the LSTM is employed in conjunction with batch normalization for standardized traces. The model architecture uses ReLU activation for the middle layers and sigmoid for the output layer. The architecture for the RNN is given in Tab. 3. The network can be adjusted with different activation functions and NN layers. There are many factors and metrics, such as loss and accuracy, that can be calculated with different methods. Exploring different combinations of such factors is called hyperparameter tuning, and it is very useful in improving the accuracy of a model.
Table 3: Neural Network Architecture.
Layer type | Output shape
Batch Normalization 1 | input size
LSTM | 512
Batch Normalization 2 | 32
ReLU | 128
Dense 1 | 16
Batch Normalization 3 | 16
Sigmoid | 16
Dense 2 | 8
The optimization is carried out using the N-Adam optimizer and RMSprop, with a loss function based on binary cross-entropy. The model is trained for 100 epochs with a batch size of 128. The RNN architecture in Tab. 3 is used in combination with Dense layers, and the activation functions are chosen according to the number of labels to be predicted.
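For readers less familiar with the loss, binary cross-entropy over sigmoid outputs can be written in a few lines. This is a generic standard-library illustration of the loss function named above, not the training code; the clipping constant `eps` is an assumption added for numerical safety.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function used at the output layer to produce a bit probability."""
    return 1.0 / (1.0 + math.exp(-z))


def binary_cross_entropy(y_true, y_pred, eps: float = 1e-7) -> float:
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all predicted bits."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)
```

A prediction of 0.5 for every bit gives a loss of ln 2 (about 0.693), which is the value a model that has learned nothing about the message bits would stay at.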
4.2 Attack Phase
Figure 3: Second Order TVLA on masked_poly_tomsg in mkm4.
The attack phase consists of two parts: first, training on the traces and second, using the model to predict the secret key. Constructing the Chosen Ciphertexts (CCT) required for the attack involves the creation of sparse ciphertexts, specifically for a ciphertext (u, v) with a size of 768 bytes as seen in the encryption part of Alg. 3. The computation of $m = v - s^T u$ constructs the message m from the ciphertext and the secret key s. This forms the basis for the construction of CCTs, wherein the chosen values are determined using Hamming distances with a rotational parameter for ciphertext rotations through shifting (Ravi et al., 2020a).
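The principle behind such chosen ciphertexts can be shown on a single coefficient. The sketch below is a simplified illustration and does not reproduce the ciphertext values of the attack: it ignores ciphertext compression and rotation, considers one secret coefficient s in {-2, ..., 2} (the range used by Kyber768), and the constants `KU` and `KVS` are picked here only so that every value of s yields a different pattern of decoded message bits.

```python
Q = 3329  # Kyber modulus


def decode_bit(x: int) -> int:
    """Message bit obtained from one coefficient, i.e. Compress(x, 1)."""
    x %= Q
    return (((x << 1) + Q // 2) // Q) & 1


def codeword(s: int, ku: int, kvs: list[int]) -> tuple[int, ...]:
    """Bits decoded from v - s*u for a sparse u = ku and several choices v = kv."""
    return tuple(decode_bit(kv - ku * s) for kv in kvs)


# Illustrative constants (assumptions of this sketch, not taken from the paper).
KU = 416
KVS = [208, 624, 1040, 1456]

# Lookup table from the observed bit pattern back to the secret coefficient.
TABLE = {codeword(s, KU, KVS): s for s in range(-2, 3)}


def recover_coefficient(observed_bits) -> int:
    """Map the message bits recovered by the side channel to the coefficient."""
    return TABLE[tuple(observed_bits)]
```

In this toy table neighbouring codewords differ in a single bit, so one wrong bit prediction would already give a wrong coefficient. Choosing the ciphertexts so that the codewords are further apart in Hamming distance is what allows prediction errors to be detected and corrected.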
The enumeration of CCTs is executed in collabo-
ration with the ECT. This tool is very helpful in round-
ing coefficients that are close enough to the original
value in the predicted secret key. The initial step
involves generating an error dictionary based on the
code distances observed in the chosen ciphertexts.
Subsequently, the recovered messages undergo three
iterations of error correction through this error dictio-
nary to identify and rectify errors. If no errors are de-
tected after three passes, the corresponding coefficient
is deemed enumerable. In conclusion, these coeffi-
cients are systematically enumerated over q = 3329,
facilitating the recovery of the correct secret key co-
efficient.
4.3 Results
The neural networks were trained with both the
trimmed and untrimmed data, but as one can expect,
the bitwise trimmed data performs better than the
untrimmed data. In the current setup, the attacks have
not yielded success with the RNN architecture, even
though RNNs are built to perform well on time-series
data. The training phase witnessed a decline in loss
rates, indicative of challenges in convergence, and
highlighted the dependency on significantly larger
datasets compared to MLP. Despite these setbacks,
RNN exhibits a notable advantage in terms of reduced
prediction runtime. Moreover, the model exhibits en-
hancements through hyperparameter tuning, indicat-
ing the potential for optimization with further explo-
ration.
Increasing the number of traces or the number of layers makes the RNN attack more cumbersome than the MLP-based approach, which makes the MLP the better choice despite the greater sophistication of the RNN. The validation accuracy converged at approximately 90% for the MLP, but did not exceed 70% for the RNN, even with different permutations of layers. The MLP architecture successfully retrieves 99% of the coefficients in 65 hours. However, it is noteworthy that full key recovery requires combining the results from different models.
5 CONCLUSION
In this study, the security of the CRYSTALS-Kyber
implementations is examined thoroughly. The nov-
elty of the work comes from the execution of higher
order TVLA and using recurrent neural network
(RNN) SCA to attack the masked Kyber implemen-
tation mkm4 using ChipWhisperer. Vulnerabilities in
the mkm4 implementation are successfully identified
with higher order leakage. This proves that methods
other than first order t-tests are needed to attest the se-
curity of cryptographic implementations. Though the
NTT component did not display any leakage through
t-tests, some works (Mujdei et al., 2022) have ex-
ploited the vulnerability to perform attacks, due to in-
herent mathematical properties. The inadequacy of t-
tests in directly correlating with successful key recov-
ery is hinted at by related works (Ravi et al., 2022),
underscoring the need for a more nuanced approach.
In navigating the landscape of modern cryptographic
challenges, it becomes imperative to explore modern
solutions, such as the integration of DL to introduce
noise into the system instead of measures like mask-
ing. This line of inquiry addresses the intricacies of
cryptographic systems and underscores the evolving
nature of cyber security solutions.
REFERENCES
PQC (2023). NIST post-quantum cryptography: Post-quantum
cryptography standardization.
Backlund, L., Ngo, K., Gärtner, J., and Dubrova, E. (2022).
Secret key recovery attacks on masked and shuffled
implementations of crystals-kyber and saber. IACR
Cryptol. ePrint Arch.
Bock, E. A., Banegas, G., Brzuska, C., Chmielewski, L.,
Puniamurthy, K., and Sorf, M. (2024). Breaking dpa-
protected kyber via the pair-pointwise multiplication.
In Applied Cryptography and Network Security - 22nd
ACNS 2024, Proceedings, Part II, Lecture Notes in
Computer Science.
Bos, J., Ducas, L., Kiltz, E., Lepoint, T., Lyubashevsky, V.,
Schanck, J. M., Schwabe, P., Seiler, G., and Stehle,
D. (2018). CRYSTALS - Kyber: A CCA-Secure
Module-Lattice-Based KEM. In 2018 IEEE European
Symposium on Security and Privacy.
Chang, Y., Yan, Y., Zhu, C., and Guo, P. (2022). Template
attack of lwe/lwr-based schemes with cyclic message
rotation. Entropy.
Chari, S., Jutla, C. S., Rao, J. R., and Rohatgi, P. (1999). To-
wards sound approaches to counteract power-analysis
attacks. In Advances in Cryptology - CRYPTO ’99,
Proceedings, Lecture Notes in Computer Science.
Springer.
Dubrova, E., Ngo, K., Gärtner, J., and Wang, R. (2023).
Breaking a fifth-order masked implementation of
crystals-kyber by copy-paste. In Proceedings of the
10th ACM Asia Public-Key Cryptography Workshop,
APKC 2023, Melbourne, VIC, Australia, July 10-14,
2023. ACM.
Heinz, D., Kannwischer, M. J., Land, G., Pöppelmann, T.,
Schwabe, P., and Sprenkels, A. (2022). First-order
masked kyber on ARM cortex-m4. IACR Cryptol.
ePrint Arch.
Kannwischer, M. J., Petri, R., Rijneveld, J., Schwabe,
P., and Stoffelen, K. (2020). PQM4: Post-quantum
crypto library for the ARM Cortex-M4.
Kocher, P. C., Jaffe, J., and Jun, B. (1999). Differential
power analysis. In Advances in Cryptology - CRYPTO
’99, Proceedings, Lecture Notes in Computer Science.
Springer.
Mujdei, C., Wouters, L., Karmakar, A., Beckers, A., Mera,
J. M. B., and Verbauwhede, I. (2022). Side-channel
analysis of lattice-based post-quantum cryptography:
Exploiting polynomial multiplication. ACM Trans.
Embed. Comput. Syst.
Ngo, K., Dubrova, E., and Johansson, T. (2021). Break-
ing masked and shuffled CCA secure saber KEM by
power analysis. In ASHES@CCS: 5th Workshop on
Attacks and Solutions in Hardware Security. ACM.
Ngo, K., Wang, R., Dubrova, E., and Paulsrud, N. (2022).
Side-channel attacks on lattice-based kems are not
prevented by higher-order masking. IACR Cryptol.
ePrint Arch.
O’Flynn, C. and Chen, Z. D. (2014). Chipwhisperer: An
open-source platform for hardware embedded security
research. In Constructive Side-Channel Analysis and
Secure Design - 5th International Workshop, Lecture
Notes in Computer Science. Springer.
Picek, S., Perin, G., Mariot, L., Wu, L., and Batina, L.
(2023). Sok: Deep learning-based physical side-
channel analysis. ACM Comput. Surv., (11).
Primas, R., Pessl, P., and Mangard, S. (2017). Single-trace
side-channel attacks on masked lattice-based encryp-
tion. In Cryptographic Hardware and Embedded Sys-
tems - CHES Proceedings, Lecture Notes in Computer
Science. Springer.
Rajendran, G., Ravi, P., D’Anvers, J., Bhasin, S., and Chat-
topadhyay, A. (2023). Pushing the limits of generic
side-channel attacks on lwe-based kems - parallel PC
oracle attacks on kyber KEM and beyond. IACR
Trans. Cryptogr. Hardw. Embed. Syst., (2).
Ravi, P., Bhasin, S., Roy, S. S., and Chattopadhyay, A.
(2020a). On exploiting message leakage in (few)
NIST PQC candidates for practical message recovery
and key recovery attacks. IACR Cryptol. ePrint Arch.
Ravi, P., Chattopadhyay, A., and Baksi, A. (2022). Side-
channel and fault-injection attacks over lattice-based
post-quantum schemes (kyber, dilithium): Survey and
new results. IACR Cryptol. ePrint Arch.
Ravi, P., Roy, S. S., Chattopadhyay, A., and Bhasin, S.
(2020b). Generic side-channel attacks on cca-secure
lattice-based PKE and kems. IACR Trans. Cryptogr.
Hardw. Embed. Syst., (3).
Saarinen, M. O. (2022). Wip: Applicability of ISO stan-
dard side-channel leakage tests to NIST post-quantum
cryptography. In IEEE International Symposium on
Hardware Oriented Security and Trust, HOST. IEEE.
Schneider, T. and Moradi, A. (2015). Leakage assessment
methodology - A clear roadmap for side-channel eval-
uations. In Cryptographic Hardware and Embedded
Systems - CHES Proceedings, Lecture Notes in Com-
puter Science. Springer.
Sim, B., Kwon, J., Lee, J., Kim, I., Lee, T., Han, J., Yoon,
H. J., Cho, J., and Han, D. (2020). Single-trace attacks
on message encoding in lattice-based kems. IEEE Ac-
cess.
Sim, B., Park, A., and Han, D. (2022). Chosen-ciphertext
clustering attack on CRYSTALS-KYBER using the
side-channel leakage of barrett reduction. IEEE In-
ternet Things J., (21).
Ueno, R., Xagawa, K., Tanaka, Y., Ito, A., Takahashi,
J., and Homma, N. (2022). Curse of re-encryption:
A generic power/em analysis on post-quantum kems.
IACR Trans. Cryptogr. Hardw. Embed. Syst., (1).
Welch, B. L. (1947). The generalization of ‘student’s’ prob-
lem when several different population variances are
involved. Biometrika, (1-2).
Yang, B., Ravi, P., Zhang, F., Shen, A., and Bhasin, S.
(2023). Stamp-single trace attack on M-LWE point-
wise multiplication in kyber. IACR Cryptol. ePrint
Arch.