On the Security of Partially Masked Software Implementations

Alessandro Barenghi and Gerardo Pelosi

Department of Electronics, Information and Bioengineering – DEIB

Politecnico di Milano, Via G. Ponzio 34/5, I-20133 Milano, Italy

Keywords:

Applied Cryptography, Side-channel Attacks.

Abstract:

Providing sound countermeasures against passive side-channel attacks has received considerable interest in the open
literature. The scheme proposed in (Ishai et al., 2003) secures a computation against a d-probing adversary
by splitting it into d+1 shares, albeit with a significant performance overhead (5× to 20×). We maintain that it

is possible to apply such countermeasures only to a portion of the cipher implementation, retaining the same

computational security, backing a widespread intuition present among practitioners. We provide the sketch

of a computationally bound attacker model, adapted as an extension of the one in (Ishai et al., 2003), and

detail the resistance metric employed to estimate the computational effort of such an attacker, under sensible

assumptions on the characteristics of the device leakage (which, to the current state of the art, still lacks a

complete formalization).

1 INTRODUCTION

In a world where embedded computing devices be-

come more and more pervasive, cyber-physical sys-

tems are increasingly employed to process and store

both sensitive and security-critical data. The primary
means of providing proper security and privacy guarantees
is represented by cryptographic primitives,
which are more and more often embedded in
digital devices as either hardware accelerators or software
libraries. In such a scenario an attacker can
have physical access to the target device and may exploit
either cipher implementation weaknesses or side-channel
information (e.g., power consumption, execution
timing or electromagnetic emanations) to infer
the value of secret parameters intended to be stored in
an inaccessible way. Tackling these attacks requires
a combined effort: choosing cryptographic
primitives with sound guarantees from the theoretical
standpoint, and carefully considering their implementation
so that the large class of so-called
implementation attacks is warded off. The choice of
sound primitives can be effectively performed among
the well-scrutinized ones, which have been recognized
as standards by international and national entities
such as the ISO/IEC committee or the US National
Institute of Standards and Technology (NIST).

By contrast, warding off implementation attacks is

still a complex issue. Indeed, they have been a sig-

niﬁcant threat in recent times, allowing the breach of

many systems ranging from e-tickets (Garcia et al.,

2009) to IP-protection on large scale reconﬁgurable

devices (Moradi et al., 2011).

The largest class of implementation attacks is rep-

resented by the so-called side-channel attacks, where

the attacker exploits the information leakage happen-

ing on an unintended channel, i.e., not on means of

communicating with the target device intended by

the designer (such as input/output ports). More con-

cretely, the attacker exploits the fact that some envi-

ronmental parameters of a regular functioning of a

digital device are dependent on the input data being

processed by it. In particular, the energy required to

perform the computation, the Electro-Magnetic (EM)

emanations of the device or the time taken to complete

the execution have been reported to be effective side-

channels and are exploited in (Mangard et al., 2007;

Barenghi et al., 2013). Designing efﬁcient and effec-

tive countermeasures against side-channel attacks is

a topic which has received warm attention by the re-

search community (Agosta et al., 2012; Agosta et al.,

2013b; Agosta et al., 2014). Typically the countermeasures
involve either modifying the cipher at
the algorithmic or implementation level, or changing
the underlying hardware architecture so as to suppress
the side-channel leakage. Although the efforts to design
such countermeasures have been significant, only
very few schemes have been proven secure against

Barenghi A. and Pelosi G..

On the Security of Partially Masked Software Implementations.

DOI: 10.5220/0005120504920499

In Proceedings of the 11th International Conference on Security and Cryptography (SECRYPT-2014), pages 492-499

ISBN: 978-989-758-045-1

 2014 SCITEPRESS (Science and Technology Publications, Lda.)

a precise attacker model (Ishai et al., 2003; Coron,

2014), and it has not been uncommon for non-proven
countermeasures to be broken after their proposal. On

the other hand, provably secure countermeasure so-

lutions have very signiﬁcant overheads in terms of

performance: losses in the range of 5× to 10× are

not infrequent, thus limiting the range of applicabil-

ity of such solutions (Agosta et al., 2013a). Such an

overhead comes from the inherently redundant com-

putation mandated by the countermeasures being per-

formed on the whole cipher execution.

In this work, we will show which part of a ci-

pher implementation can be left unprotected by the

countermeasures, preserving the computational secu-

rity margin of the implementation. We will deal with

symmetric block ciphers, which (by their own na-

ture) provide a computational security margin, as their

key size is finite. We will analyze the possibility of
restricting the countermeasures to the parts of the cipher
where the computational complexity of mounting
a side-channel attack is lower than that of performing
an exhaustive search over the whole keyspace, leaving
unprotected the parts where it is the same or higher. In
particular, we will provide a practical sketch of the
attacker model against software implementations of
block ciphers protected by adapting the scheme
in (Ishai et al., 2003), and we will extend the definition
of instruction resistance introduced in (Agosta
et al., 2013a), detailing its properties in the context
of power-based side-channel leakages of block cipher
software implementations.

The paper is organized as follows. In Section 2
we recap the workflow of side-channel attacks and the
main strategies put into effect to counter them, and describe
the perfectly secure countermeasure proposed
in (Ishai et al., 2003). In Section 3 we illustrate the
data flow analysis employed to precisely pinpoint the
vulnerable instructions of a software implementation
and point out the properties of the bit-level resistance
metric. Subsequently, in Section 4 we describe
the side-channel attacker model for software implementations
and validate our main statement. Section 5
draws our conclusions.

2 PRELIMINARIES

The classic workﬂow for power-based SCAs aims at

recovering the value of the secret parameter of a ci-

pher (i.e., the secret key) one portion at a time. This

is possible since, during a cryptographic computa-

tion, the algorithm combines the secret key bits with

the intermediate values involving a limited quantity

of them at a time. An analogous strategy can be applied
employing the EM emissions of the device as the
side channel leaking information. The first step of

the attack consists in measuring the power consump-

tion of the target device with different input messages

for a large number of computations. Subsequently, an

intermediate operation employing a small portion of

the secret key is selected, and its results are guessed

for all the possible values of the key portion. From

these hypotheses on the results, a series of predic-

tions of the power consumption are made, one for

each possible value of the secret key portion. Finally,

the predicted consumption values are compared with

the actual measured ones by statistical
means (e.g., linear correlation index or difference-of-means
test) to find out which prediction fits best.
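The workflow above can be illustrated with a minimal simulated sketch. It rests on several assumptions that are not taken from the paper: the leakage is modeled as the Hamming weight of the intermediate value plus Gaussian noise, the targeted operation is a 4-bit S-box lookup (an illustrative nonlinear function), and all function names and parameters are hypothetical.

```python
import random
from statistics import mean

# Illustrative 4-bit S-box: an assumption, chosen only as a nonlinear target operation.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def hamming_weight(x):
    """Number of bits set in x (assumed power model)."""
    return bin(x).count("1")


def pearson(xs, ys):
    """Linear correlation index between two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / (vx * vy) ** 0.5


def simulate_traces(key_nibble, n_traces, noise_sigma, rng):
    """Step 1: collect one (simulated) measurement per random input."""
    inputs = [rng.randrange(16) for _ in range(n_traces)]
    traces = [hamming_weight(SBOX[p ^ key_nibble]) + rng.gauss(0.0, noise_sigma)
              for p in inputs]
    return inputs, traces


def recover_key_nibble(inputs, traces):
    """Steps 2-4: predict the leakage for every key guess, keep the best fit."""
    scores = {}
    for guess in range(16):
        predictions = [hamming_weight(SBOX[p ^ guess]) for p in inputs]
        scores[guess] = abs(pearson(predictions, traces))
    return max(scores, key=scores.get)


rng = random.Random(2014)
inputs, traces = simulate_traces(key_nibble=0xA, n_traces=2000,
                                 noise_sigma=0.5, rng=rng)
recovered = recover_key_nibble(inputs, traces)
```

Only 16 hypotheses are needed here because the targeted intermediate value depends on 4 key bits; this is the quantity that the resistance metric of Section 3 captures.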

Power-based SCAs affect both hardware and soft-

ware implementations of cryptographic primitives.

Many techniques have been designed to counter this
attack at either the logic-style or the architectural level, since
energy consumption variations are strictly related to
both of them (Mangard et al., 2007). An example
of a low-level hardware countermeasure consists in employing
a decoupling capacitor placed as close as possible
to the supply voltage pins of the target device,
so as to significantly hinder the collection
of informative measurements by flattening voltage
variations. However, this does not provide protection
against EM-based attacks. Indeed, as shown
in (O’Flynn and Chen, 2012), it is possible to wrap
the capacitor in a thin magnetic wire so as to
exploit the measurement of the high-frequency components
flowing through it. Usually, countering many
different attack vectors through hardware solutions
results in an unfavorable trade-off between cost, performance
and security guarantees. The importance of

protected software implementations comes into play

as a way to obtain more convenient tradeoffs among

these ﬁgures of merit. Indeed, more and more often

software cryptographic libraries are deployed either

as an alternative solution to hardware ones, or as a

fallback should they be breached.

Masking and Hiding. Countermeasures to protect

implementations of ciphers against power-based SCA

aim at concealing the relation between the power consumption
and the operations performed by the target
device to compute sensitive intermediate values. They
are split into two categories: hiding and masking.

For software implementations, the hiding strategies
hinder the matching between the actual power
measurements and the consumption modeled for each
key-portion guess by rescheduling some instructions,
permuting the sequence of accesses to lookup
tables, or inserting random delays built out of dummy
operations (Mangard et al., 2007; Tillich and Herbst,
2008; Coron and Kizhvatov, 2010). In hardware implementations
the common hiding strategies involve
feeding the chip with a drifting clock so as to change the
instant in time when the sensitive operations are performed,
and adding extra hardware with the intent to
reduce the Signal to Noise Ratio (SNR) of the physical
measurements (Mangard et al., 2007).

Table 1: Complexity of bitwise masked operations as a
function of the masking order d and lookup table size l.

Op.s          Complexity                           Ref.
xor           3(d+1) xor                           (Rivain and Prouff, 2010), (Ishai et al., 2003)
not           1 not                                (Ishai et al., 2003)
and           2d(d+1) xor + (d+1)^2 and            (Ishai et al., 2003)
or            2d(d+1) xor + (d+1)^2 and + 3 not    a ∨ b = ¬((¬a) ∧ (¬b))
table lookup  2ld xor + ld store + (ld + 1) load   (Schramm and Paar, 2006)

Masking schemes (Ishai et al., 2003; Mangard

et al., 2007) invalidate the correlation between the val-

ues employed to predict the power consumption and

the actual values processed by the device. The prin-

ciple is to add one or more random values (masks) to

every sensitive intermediate variable occurring during

the computation. In a masked implementation, each

sensitive intermediate value is split into
a number of shares (containing both the randomized
sensitive value and the masks employed), which are
then separately processed. To this end, the target algorithm
is modified to process each share and recombine
them at the end of the computation. This technique
effectively hinders the attacker from formulating
a correct power consumption model, as the instantaneous
power consumption is independent of
the original (non-masked) value. Typically, masking

techniques are categorized by the number of masks, d,

employed for each sensitive value, which is known as

the order of the masking. A d-th-order masking can

always be theoretically broken by a (d+1)-th-order

attack, i.e., an attack exploiting the combination of

d+1 measurements in different time instants, during

an execution, to build a mask-independent power con-

sumption model (Mangard et al., 2007; Schramm and

Paar, 2006; Rivain and Prouff, 2010).
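As a minimal sketch of the share-splitting principle described above, the following code splits a value into d+1 Boolean shares, processes the shares separately for the linear operations xor and not, and recombines them. The function names and the 8-bit default width are assumptions made for illustration. The sketch omits any refreshing of the masks, so its operation count is not meant to reproduce the figures of a complete scheme.

```python
import secrets
from functools import reduce
from operator import xor


def split_into_shares(value, d, width=8):
    """Return d+1 shares whose XOR equals `value` (d fresh random masks)."""
    masks = [secrets.randbits(width) for _ in range(d)]
    masked_value = reduce(xor, masks, value)  # value ^ m_1 ^ ... ^ m_d
    return [masked_value] + masks


def recombine(shares):
    """XOR all the shares together to recover the original value."""
    return reduce(xor, shares, 0)


def masked_xor(a_shares, b_shares):
    """xor is linear: it can be computed share by share."""
    return [a ^ b for a, b in zip(a_shares, b_shares)]


def masked_not(shares, width=8):
    """Complementing a single share complements the recombined value."""
    all_ones = (1 << width) - 1
    return [shares[0] ^ all_ones] + shares[1:]


a_shares = split_into_shares(0x3C, d=2)
b_shares = split_into_shares(0xA5, d=2)
result = recombine(masked_xor(a_shares, b_shares))  # equals 0x3C ^ 0xA5
```

No single share, and no set of at most d shares, determines the original value, which is why an attack has to combine d+1 measurements.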

Hiding provides an increase in the computational

security margin, as more samples must be collected to

recover the secret key, resulting in both higher storage

and computation requirements. By contrast, mask-

ing techniques are able to provide perfect security

(i.e., security against a computationally unbounded

attacker), provided that the number of measurements
performed during the computation of an intermediate
value does not exceed the order of the masking.

Provably Secure Countermeasures. A theoretical
framework for assessing the security of masking-based
countermeasures is provided by Ishai et
al. (Ishai et al., 2003). Each sensitive operation is

modeled as a Boolean circuit and a perfectly secure

masking scheme (usually referred to as ISW), with

order d, is deﬁned in terms of a transformation op-

erating on the circuit, outputting a protected version

of it, which is functionally equivalent to the unpro-

tected one. The protected circuit employs both stan-

dard logic gates and a “randomness” gate, which out-

puts one fresh randomly chosen bit per clock cycle.

The threat model assumes an adversary able to acquire
at most d simultaneous bit-level values
per clock cycle of the computation (Rivain and Prouff,

2010; Ishai et al., 2003). The scheme is proven to pro-

vide the indistinguishability of the d values obtained

by the attacker from d randomly extracted values, thus

providing perfect security of the computation against

probing. From a constructive point of view, the un-

protected computation is substituted by three phases:

1) an initial share-splitting, where every original bit-

value is split up into d+1 randomized values over dif-

ferent wires, 2) a transformation of the original com-

putation into one processing all the d+1 shares, and

3) a ﬁnal recombination, which must yield the same

result as the unprotected computation provided that

the composition of the masked values is properly handled,
as detailed in (Prouff and Rivain, 2013). Assuming
an unprotected circuit with depth h and size
of O(n) gates, the transformed circuit exhibits a depth
of O(h log d) and a size of O(n d^2) gates.

Table 1 shows the computational costs to mask

bitwise operations as a function of the scheme order

d. For multi-bit arithmetic operations it is possible

to perform conversions between Boolean masked values
and arithmetic masked ones and vice versa (Debraize,
2012). In case the Boolean function is available
in the form of a lookup table, the masking of

the looked-up values is safe up to the 2nd order ac-

cording to (Coron et al., 2007). The key idea is that,

whenever two share-split operands are combined to-

gether, fresh random values should be inserted in the

computation of the resulting output shares. As the

masking countermeasure is particularly computationally
demanding (see Table 1), applying it as sparingly
as possible without lowering the security margin of
the cipher is highly desirable.
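The insertion of fresh randomness when two share-split operands are combined can be sketched with the masked and gadget in the style of (Ishai et al., 2003), operating on single-bit shares. The function names are hypothetical and the code is an illustrative sketch: ordinary Python offers no control over the order in which intermediate values are actually computed and leaked, so this is not a hardened implementation.

```python
import secrets


def share_bit(bit, d):
    """Split one bit into d+1 shares whose XOR equals `bit`."""
    masks = [secrets.randbits(1) for _ in range(d)]
    first = bit
    for m in masks:
        first ^= m
    return [first] + masks


def unshare_bit(shares):
    """Recombine single-bit shares."""
    out = 0
    for s in shares:
        out ^= s
    return out


def isw_and(a_shares, b_shares):
    """Masked AND of two bits, each given as d+1 shares."""
    n = len(a_shares)
    r = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # One fresh random bit for every pair of share indices.
            r[i][j] = secrets.randbits(1)
            # Bracketing matters: the random bit is added before the
            # second cross product, so the two are never summed unmasked.
            r[j][i] = (r[i][j] ^ (a_shares[i] & b_shares[j])) ^ (a_shares[j] & b_shares[i])
    out = []
    for i in range(n):
        c = a_shares[i] & b_shares[i]
        for j in range(n):
            if j != i:
                c ^= r[i][j]
        out.append(c)
    return out


c_shares = isw_and(share_bit(1, d=2), share_bit(1, d=2))
```

For d+1 shares the gadget performs (d+1)^2 and operations and 2d(d+1) xor operations, and consumes d(d+1)/2 fresh random bits, which illustrates why the cost of masking nonlinear operations grows quadratically with the order.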

3 SIDE CHANNEL RESISTANCE

To better understand the actual computational
effort required to perform a passive side-channel
attack, it is necessary to understand which
intermediate values of a software computation are eligible
as targets of the attack. In particular, the computational
effort needed to exploit the leakage of a
particular intermediate value depends (see Section 2)
on the number of hypotheses (about a secret-key portion)
that an attacker needs to formulate to correlate his
predictions with the actual measurements. Therefore,
understanding which and how many bits of the cipher-key
value contribute to the output value of every bit of
each intermediate instruction of the cipher is crucial
to assess the SCA vulnerability of the cipher.

A useful instrument to be employed is the

DataFlow Analysis (DFA), which is commonly used

by compilers to manipulate the data dependencies

among the variables involved in the computation of

an implementation. The effectiveness of employing

dataﬂow analysis techniques to analyze the interme-

diate values of a cipher computation has been vali-

dated in (Agosta et al., 2013a), through characteriz-

ing a software AES implementation. In particular, the

aforementioned analysis provides a conservative mar-

gin on the resistance against passive side channels of

an intermediate value (i.e., at worst the analysis underestimates
its resistance). Dataflow analysis represents

a program in terms of its Control Flow Graph, which

is deﬁned as follows.

Definition 3.1 (Control-Flow Graph). A Control
Flow Graph (CFG) is a directed graph G(B, E) where
each node i ∈ B represents a statement of the program
(stat_i). The graph is augmented with two additional
nodes i_in, i_out. An edge (i, i') ∈ E is added if the statement
stat_{i'} is executed immediately after the statement
stat_i, and each node has at most two immediate
successors. For the first statement (stat_{i_0}) there
is an edge (i_in, i_0), while an edge (j, i_out) is added
for each node j bound to a statement (stat_j) preceding
an exit point of the program.

It is common practice, when performing dataflow
analyses, to translate the program into a normal form,

where every intermediate variable is deﬁned (i.e.,

generated) in a single point, and only used after-

wards (i.e., its content is never re-assigned to a new

value). This form, known as Static Single Assign-

ment (SSA), allows the analysis to map each of the

intermediate variables of the program to the node of

the control ﬂow graph where it is computed for the

ﬁrst time. All the industry grade open source com-

pilers (e.g., LLVM, GCC, OPEN64) make extensive

use of the SSA form in their intermediate representa-

tion (IR) languages. A variable is said to be deﬁned

as a node outcome and used in any statement (node)

which computes a value depending on it. Transforming
a program into SSA form requires dealing with
the case of a variable which is defined in more than
one statement of the original program. In the basic
case, a straight sequence of statements can be easily
transformed into SSA form by simply adding extra
variables, one for each definition of the multiply
defined one. In case the multiple definitions of the

same variable lie in two regions separated by a con-

trol ﬂow divergence (e.g., in different branches of a

selection-construct, or one inside and one outside of

a loop body), the problem of knowing which version

will be employed by the statements depending on it,

can only be resolved at runtime. To overcome this is-

sue, the SSA form employs the φ-function construct

as a placeholder construct. The φ-function takes as

arguments all the variables among which the runtime

selected one will be picked to perform the computa-

tion. Note that no code is emitted as a direct transla-

tion of the φ-function: the compiler simply employs it

as a constraint in the register allocation phase. In par-

ticular, all the arguments of the φ-function are stored

in the same register before the statement using the re-

sult of the φ-function is processed.
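The basic case described above, a straight sequence of statements without control flow divergence, can be sketched as a simple renaming pass. The statement encoding (a target name plus the list of names it uses) and the `_n` suffix convention are assumptions made for illustration; φ-functions are deliberately not handled, and the sketch assumes that no original name already ends in such a suffix.

```python
def to_ssa(statements):
    """Rename a straight-line sequence of (target, operands) into SSA form.

    Every definition of a variable gets a fresh versioned name, and every
    use is rewritten to the most recent version. Names that are never
    defined in the sequence (inputs, constants) are left untouched.
    """
    version = {}   # variable -> number of definitions seen so far
    current = {}   # variable -> SSA name of its latest definition
    ssa = []
    for target, operands in statements:
        renamed = [current.get(op, op) for op in operands]
        version[target] = version.get(target, 0) + 1
        new_name = f"{target}_{version[target]}"
        current[target] = new_name
        ssa.append((new_name, renamed))
    return ssa


# `t` is defined twice in the original program.
program = [("t", ["p", "k"]),
           ("t", ["t", "c"]),
           ("u", ["t"])]
ssa_program = to_ssa(program)
```

After the transformation each name is defined exactly once, so a name identifies the statement that computes it.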

Since the SSA form of the program defines each variable exactly once,
it is possible to identify each node of the CFG with
the actual new variable being defined by it. From
this point on we will thus use the two concepts
interchangeably without the risk of generating ambiguity.
To perform a dataflow analysis, the nodes
of the CFG are augmented with the actual dataflow
information, which is computed via a fixed-point algorithm
whose behavior depends on the information
the dataflow analysis should compute. To the end

of determining the inﬂuence of the key-material (i.e.,

the set of values including the input cipher-key and

the derived round-keys) on the intermediate values

of the cipher algorithm, we will employ the Secu-

rity DataFlow Analysis (SDFA) framework, as pro-

posed in (Agosta et al., 2013a). In this framework, the

dataflow information attached to each node is a two-dimensional
vector of Boolean values, which are employed
to indicate which and how many bits of the

key-material inﬂuence the computation of each bit of

the intermediate variable associated to the node.

For the sake of clarity, we will divide the nodes of

the CFG in three different categories, as follows:

Deﬁnition 3.2 (Key-material Node). A key-material

node is deﬁned recursively as either i) a node where a

memory load operation of a cipher-key portion is per-

formed, or ii) a node which uses only values produced

by other key-material nodes.

The deﬁnition of key-material node captures the

practical notion of the KEYSCHEDULE computa-

tion: in fact, the set of all the key-material nodes

corresponds to all the statements computing the

KEYSCHEDULE of the block cipher under exam, in-

cluding the initial load operations of the cipher-key.

OntheSecurityofPartiallyMaskedSoftwareImplementations

495

Deﬁnition 3.3 (Known-value Manipulation Node).

A known-value manipulation node is either a node

where a memory load operation of a known value

is performed (e.g., the plaintext in an encryption algorithm,
or the ciphertext in a decryption one), or

a node which uses only values produced by other

known-value manipulation nodes.

The set of known-value manipulation nodes de-

scribes all the actions which a block cipher may per-

form on known values alone, without the inﬂuence of

the key (e.g., the initial permutation of DES encryption).
This set of nodes is relevant for two reasons: i)
it produces output values which are only dependent on
known values, thus no side-channel attack may be mounted
on them (as their computation lacks any dependency
on the key), and ii) all the nodes of the CFG whose
computation does not depend directly or indirectly on
them cannot be targeted in a differential side-channel
attack, as the computed data does not change when
the input is changed.

Definition 3.4 (Cipher-computation Node). A cipher-computation
node is either a node using both a
known-value manipulation node and a key-material
node to compute its output, or a node taking as input at least
one other cipher-computation node.

Cipher-computation nodes are the ones representing
the statements of the program which actually
compute the block cipher, either by directly mixing its input
with the key-material or by carrying its computation
to completion. As such, they are the nodes on which
SCAs focus to derive the values of the key bits.
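The three node categories can be sketched as a classification pass over a program in SSA form. The node encoding below (name, load source, used names, listed in definition order) and the toy program are assumptions made for illustration; nodes that use no other node and are not loads are outside the scope of the sketch.

```python
from enum import Enum


class Kind(Enum):
    KEY_MATERIAL = "key-material"
    KNOWN_VALUE = "known-value manipulation"
    CIPHER_COMPUTATION = "cipher-computation"


def classify(nodes):
    """Classify SSA nodes following Definitions 3.2, 3.3 and 3.4.

    `nodes` is a list of (name, source, uses) in definition order, where
    `source` is "key" for a load of cipher-key material, "known" for a
    load of a known value, and None for any other statement.
    """
    kinds = {}
    for name, source, uses in nodes:
        if source == "key":
            kinds[name] = Kind.KEY_MATERIAL
        elif source == "known":
            kinds[name] = Kind.KNOWN_VALUE
        else:
            if not uses:
                raise ValueError(f"node {name!r} uses no other node")
            used = {kinds[u] for u in uses}
            if used == {Kind.KEY_MATERIAL}:
                kinds[name] = Kind.KEY_MATERIAL
            elif used == {Kind.KNOWN_VALUE}:
                kinds[name] = Kind.KNOWN_VALUE
            else:
                # Mixes key-material with known values, or depends on a
                # node that already does.
                kinds[name] = Kind.CIPHER_COMPUTATION
    return kinds


toy_cipher = [
    ("k0", "key", []),          # load of the cipher key
    ("rk1", None, ["k0"]),      # key schedule step
    ("p", "known", []),         # load of the plaintext
    ("ip", None, ["p"]),        # permutation of the plaintext
    ("x", None, ["ip", "k0"]),  # key addition
    ("y", None, ["x"]),         # substitution
    ("z", None, ["y", "rk1"]),  # second key addition
]
kinds = classify(toy_cipher)
```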

The SDFA acts on the CFG to evaluate
a metric of the computational effort required to
mount an SCA against each single bit of the instruction
outcome represented by each node. Before stating a
formal definition of the side-channel resistance of an
instruction, it is worth remembering that a core block
cipher design guideline requires that the combination
of the key-material with the intermediate
values of the cipher should never result in the removal
of the effect of previously added key-material.

For instance, a key-bit should never be combined via

an xor-addition twice to the same value, as the sec-

ond addition would cancel the ﬁrst. We note that all

sound block ciphers are designed striving to achieve

this property. By contrast, an internal cancellation
of the key-material contributions would imply
that increasing the number of rounds of the block cipher
under examination reduces its security margin, which is
a clear design flaw.

Deﬁnition 3.5 (Resistance). The resistance of any

bit computed by either an intermediate cipher-

computation node or a key-material node, is deﬁned

as the minimum number of key-material bits required

to derive its value. The resistance of any bit computed

by a known-value manipulation node is defined to be

inﬁnite, as its value does not depend on any unknown.

We note that the resistance notion takes into account
the fact that the attacker may choose to retrieve any

of the key-material bits, instead of the cipher-key bits

required to compute them. This captures the com-

mon practice of attacking the intermediate value pro-

duced during the last rounds of a block cipher. In this

case, the attacker makes a hypothesis on the value

of the last key-material bits used, instead of making

his guesses on the cipher-key bits. This strategy al-

lows him to recover a portion of the key-material em-

ployed in a certain round (despite the fact this may

depend on the whole cipher-key), and subsequently,

to exploit the algebraic relations among the bits of

the key-material to invert the KEYSCHEDULE and to

obtain the cipher-key bits. Computing the resistance

metric taking into account only the original cipher-key
bits would thus lead to a significant overestimation

of the security margin of the cipher.

To efficiently compute the resistance value of all
the bits of the cipher-computation nodes it is possible
to apply two dataflow equations defining how the key-material
bit contributions are propagated when the operation
corresponding to the statement represented by

the node is computed. The application of the equa-

tions is repeated on all the CFG nodes until a ﬁxed

point is reached. The equations deﬁne how the op-

eration under exam propagates the data dependencies

from the key-material of its uses into the value it de-

ﬁnes. For further details on the form of the dataﬂow

equations we refer the reader to (Agosta et al., 2013a).

We note that each application of the dataﬂow equa-

tions can only raise the number of key-material de-

pendencies of the bits of a node, thus the computation

always terminates within a ﬁnite time.

The worst-case time complexity of such an analy-

sis is given by the case where a single change to the

properties of a node triggers the need to re-apply the

dataﬂow equations to all the remaining ones. In ad-

dition, the maximum number of modiﬁcations to the

resistance values of the bits of a node is bounded by

the product of the number of key-material bits and
the number of bits contained in the node. The

worst-case time complexity of the analysis is thus

O(|B|(|B| − 1) · k · w), where k is the number of

key-material bits and w the largest number of bits en-

coding the value of the variable deﬁned by a node.
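The fixed-point computation can be illustrated with a simplified sketch. It does not reproduce the dataflow equations of (Agosta et al., 2013a): it assumes only two kinds of operation (a bitwise xor, where each output bit depends on the corresponding input bits, and a table lookup, where each output bit depends on every input bit), it models every round key as its own "key" node whose bits are independent unknowns, and it takes the resistance of a bit to be the number of key-material bits reaching it. All names are hypothetical.

```python
def propagate_dependencies(nodes, width):
    """Iterate to a fixed point the key-material bits reaching each bit.

    `nodes` maps a name to (op, uses), with op in
    {"key", "known", "xor", "lookup"}. The result maps each name to a
    list with one frozenset of (key_node, bit_index) labels per bit.
    """
    deps = {name: [frozenset()] * width for name in nodes}
    for name, (op, _) in nodes.items():
        if op == "key":
            deps[name] = [frozenset({(name, b)}) for b in range(width)]
    changed = True
    while changed:  # dependency sets can only grow, so this terminates
        changed = False
        for name, (op, uses) in nodes.items():
            if op in ("key", "known"):
                continue
            if op == "xor":
                new = [frozenset().union(*(deps[u][b] for u in uses))
                       for b in range(width)]
            elif op == "lookup":
                every = frozenset().union(
                    *(deps[u][b] for u in uses for b in range(width)))
                new = [every] * width
            else:
                raise ValueError(f"unknown operation {op!r}")
            if new != deps[name]:
                deps[name] = new
                changed = True
    return deps


def resistance(bit_deps):
    """Infinite for bits depending on no unknown, else the dependency count."""
    return float("inf") if not bit_deps else len(bit_deps)


# Toy two-round structure on 4-bit values (illustrative only).
toy = {
    "k0": ("key", []), "k1": ("key", []), "p": ("known", []),
    "x": ("xor", ["p", "k0"]), "y": ("lookup", ["x"]),
    "z": ("xor", ["y", "k1"]), "w": ("lookup", ["z"]),
}
deps = propagate_dependencies(toy, width=4)
res = {name: [resistance(d) for d in bits] for name, bits in deps.items()}
```

In the toy program the resistance grows from 1 (after the first key addition) to 4, 5 and finally 8 as the computation moves away from the known input, while the bits of the key nodes keep a resistance of 1.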

The proposed resistance metric enjoys the following:

Property 3.1 (Key-material nodes resistance). All the

bits of the key material nodes have a resistance value

equal to 1.

SECRYPT2014-InternationalConferenceonSecurityandCryptography

496

This is a direct consequence of the fact that an attacker
may directly hypothesize their values, to the end of performing
a simple power analysis (SPA) attack.

The minimum number of key-material bits in-

volved in the computation of an output bit can be ob-

tained, after the ﬁxed point of the dataﬂow equations

application on the CFG has been reached, by considering
the relations binding them. Thanks to the non-cancelling
property of the key contributions, the resistance
values of the cipher-computation nodes also enjoy
the following:

Property 3.2 (Resistance increases with distance

from known-values nodes). The resistance value of

the bits of the cipher-computation nodes increases

monotonically on the shortest path between the node

which generates them and the closest known-value
manipulation node.

This property captures the well known notion among

practitioners that the central rounds of a block ci-

pher tend to be intrinsically more robust against side-

channel attacks than the ﬁrst and last ones.

4 SOUNDNESS OF PARTIAL MASKING

In this section we will state our claim on the resistance of

a selectively masked algorithm against power-based

SCAs, reducing the capabilities of the side-channel

attacker to the ones of an attacker able to perform only

exhaustive key searches.

We start by stating the side channel attacker

model against which we substantiate the resistance

of a selectively masked algorithm, starting from an

adaptation of the ISW attacker model to an attack

against software implementations. The first observation
in this context concerns the fact that the protected
Boolean operations in the form of combinatorial
logic, computed by the hardware (i.e., xor, and,
not), are transformed into linear sequences of instructions
(referred to as “macroinstructions” from now on)
in order to be executed in software. Consequently,
the single clock cycle restriction on the measurements
stemming from the presented ISW scheme is
translated into the attacker being able to collect
at most d measurements within the computation

of a single macroinstruction. In addition, the mea-

surements being collected are limited by the impos-

sibility of collecting more than d measures related

to the shares of the same value, regardless of when

a computation involving them happens. Moreover,

we assume that the d measurements performed by the

side channel attacker are not direct acquisitions of the

computed values, but instead the result of a leakage

function L applied to them.

We thus deﬁne our attacker model as follows:

Deﬁnition 4.1 (d-th order Software Attacker Model).

The d-th order attacker model is deﬁned as an at-

tacker able to sample any d values of the leakage

function L of the underlying platform, during the

computation of each single macroinstruction of a d+1

shares protected computation. Once a value has

been measured, it counts as measured in all the other

macroinstructions. In addition to the measured val-

ues, the software attacker is entitled to know the in-

puts and outputs of the algorithm, except for the val-

ues of the secret key bits. From the computational

standpoint, the computational power of the attacker is polynomially
bounded in the size of the cipher key.

We note that this attacker model includes the com-

mon notion of d-th order attacker employed in prac-

tice (Rivain and Prouff, 2010; Mangard et al., 2007),

i.e., the one where d samples coming from the mea-

surement of an algorithm execution are employed to

lead the attack. Constraining the attacker to the use

of d samples from the measurement of the algorithm

execution allows the constraint of not collecting more

than d measurements of the shares of the same value

to be implicitly satisﬁed. The described d-th order

Software Attacker model thus provides a more pow-

erful attacker than the one usually employed to lead

a high-order SCA (Rivain and Prouff, 2010; Mangard

et al., 2007). In particular, if up to d measurements per

macroinstruction are employed to lead a d-order SCA,

it is possible to obtain more than one d-wide subset of

them leading to a successful attack, as each of the d

shares of the same value may be measured in more

than one instruction computation. These instructions

are the ones computing the macroinstruction deﬁning

the target value and the ones using it.

Deﬁnition 4.2 (Ideal Attacker). Given a software ci-

pher implementation running on a target platform,

the ideal attacker aims at recovering the value of the

cipher-key. He has the capability to choose arbi-

trary plaintexts/ciphertexts to be encrypted/decrypted

and obtain the corresponding outcomes, a number of

times polynomially bounded in the cipher key size.

Also, he may resort to an amount of computational

power which is polynomially bounded in the size of the

cipher-key.

Proposition 4.1 (Security of a partially masked implementation). Let
$A_{sw,d}$ be a d-th order software attacker, and $A_{ideal}$ an ideal one
(i.e., the one able to perform only an exhaustive search of the whole key).
Let C be a software cipher implementation with a
k-bit key running on a target platform with leakage

OntheSecurityofPartiallyMaskedSoftwareImplementations

497

function L, and $\overline{C}$ the result of applying a (d+1)-share ISW
transformation to all the operations in C for which at least one output bit
has a resistance value r lower than the cipher key length, k. The side
channel attacker $A_{sw,d}$ cannot perform any attack on $\overline{C}$ more
efficient than the ones its counterpart $A_{ideal}$ can lead on C.
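The selection rule in the proposition can be sketched as a simple filter over the operations of the cipher. This is a minimal sketch, not the authors' tooling: the `Operation` record, the operation names and the resistance figures are all made up for illustration, and the per-bit resistance values are taken as given inputs rather than computed.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Operation:
    name: str                  # hypothetical instruction label
    bit_resistance: List[int]  # resistance value r of each output bit (given)

def needs_masking(op: Operation, key_bits: int) -> bool:
    # Mask the operation if at least one of its output bits has r < k.
    return min(op.bit_resistance) < key_bits

def partial_masking_plan(ops: List[Operation], key_bits: int) -> List[str]:
    # Names of the operations that should receive the ISW transformation.
    return [op.name for op in ops if needs_masking(op, key_bits)]

# Illustrative figures only, for a cipher with a k = 128 bit key.
ops = [
    Operation("add_round_key_0", [8, 8, 8, 8]),
    Operation("sbox_round_0", [8, 8, 8, 8]),
    Operation("mix_round_3", [128, 128, 128, 128]),
    Operation("sbox_round_4", [128, 128, 96, 128]),
]
plan = partial_masking_plan(ops, key_bits=128)
print(plan)
```

Only the operation whose output bits all reach resistance k is left unmasked; a single output bit below k is enough to require the transformation.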

Proof. To validate the equivalence between the attacks of $A_{sw,d}$ on the
partially masked implementation $\overline{C}$ and those of $A_{ideal}$ on C,
we analyze how $A_{sw,d}$ can exploit the information derived from the
measurements to obtain a more efficient attack. Measuring the values computed
by known-values manipulation nodes does not yield any advantage to $A_{sw,d}$
with respect to the ideal attacker, as he either already possesses that
information or can compute it in polynomial time. Any d measurements
performed on operations of $\overline{C}$ concerning the masked values in a
macroinstruction yield information equivalent to randomly generated values,
as proven in (Ishai et al., 2003). Consequently, the side channel attacker
$A_{sw,d}$ does not gain any advantage from these measurements either (note
that no information would be gained even by an attacker with unbounded
computational capabilities). If $A_{sw,d}$ chooses to perform one or more of
his measurements on any sensitive intermediate instruction of $\overline{C}$
where the ISW masking has not been applied, he can obtain up to d samples of
the L function for values which depend on at least r = k key material bits.
We assume that the attacker has no way of directly deriving the input value
of L from a single output, a widely accepted assumption in Differential Power
Analysis (we note that defining the precise form of a generic L is still an
open problem (Whitnall et al., 2014)). In particular, the attacker is only
able to obtain meaningful information by comparing the acquired measurements
against the outputs of a key-value parametric distinguisher D(k), typically
by means of a statistical test. The distinguisher may be computed in two
ways: 1) through an a-priori synthetic calculation on the known values and a
key hypothesis (f.i., as in classical DPA or CPA attacks), or 2) by modeling
L through recording the behavior of an identical computing device (f.i., as
in a template attack). Analyzing the efficiency of the former, we note that
the attacker must evaluate the distinguisher function D(k) on all the $2^k$
possible key values, thus resulting in a total complexity of
$\Theta(2^k \cdot T_{D(k)})$, where $T_{D(k)}$ is the time complexity of the
distinguisher evaluation. Consequently, the complexity of this attack
strategy is lower bounded by the one of the plain sub-key enumeration,
$\Omega(2^k)$. As the only instructions to which the ISW masking is not
applied are the ones involving all the key bits, this results in the same
computational effort as the one of the ideal attacker. Analyzing the
computational effort of the second attack strategy, the attacker needs to
collect enough measurements to be able to distinguish the actual measured
behavior from the one generated by a different key. To this end, he will need
to collect at least one measurement for every value taken by the key bits
involved in the computation of the values under attack. As the computation of
these values is influenced by at least r = k key bits, the attacker will thus
need to store at least $2^k$ measurements, fully profiling the behavior of
the target device. As the space complexity of a computation always provides a
lower bound for its time complexity, we can state that the time complexity of
this approach is also $\Omega(2^k)$. Thus, in both cases, the advantage
provided by measuring the unprotected part only allows $A_{sw,d}$ to reduce
his computational effort down to $\Omega(2^k)$, which is the same as the one
of the ideal attacker.
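The cost of the first strategy can be made concrete with a toy example. The sketch below is only an illustration under strong simplifying assumptions that are not part of the paper: a k = 8 bit key, a noise-free Hamming-weight leakage, an unmasked value equal to the plaintext XORed with the whole key, and Pearson correlation as the statistical test. The point is the loop over all $2^k$ key hypotheses, one distinguisher evaluation each.

```python
K = 8  # toy key length in bits (assumption; real keys are far longer)

def hamming_weight(x: int) -> int:
    return bin(x).count("1")

def pearson(xs, ys):
    # Plain Pearson correlation coefficient between two equal-length lists.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def cpa_enumerate(plaintexts, traces):
    """Evaluate the distinguisher once for each of the 2^K key hypotheses."""
    evaluations = 0
    best_key, best_score = None, float("-inf")
    for key_guess in range(2 ** K):
        # Synthetic leakage prediction for this key hypothesis.
        model = [hamming_weight(p ^ key_guess) for p in plaintexts]
        score = pearson(model, traces)
        evaluations += 1
        if score > best_score:
            best_key, best_score = key_guess, score
    return best_key, evaluations

secret_key = 0xA7                      # hypothetical secret
plaintexts = list(range(2 ** K))       # known inputs
traces = [hamming_weight(p ^ secret_key) for p in plaintexts]  # simulated L
best_key, evaluations = cpa_enumerate(plaintexts, traces)
print(best_key == secret_key, evaluations)  # prints: True 256
```

Since the attacked value depends on all K key bits, no hypothesis can be discarded without being evaluated, so the number of distinguisher evaluations equals the size of the key space.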

To assess the relevance of the hypotheses made in the side-channel attacker
model, we now highlight the effect of lifting either the bound on the number
of measurements, or the assumption that the attacker is not able to derive
the input of the leakage function L. If the bound of d measurements is
lifted, attacking ISW protected cipher portions becomes feasible;
nonetheless, the side-channel attacker is still unable to gain any
information from the portion of the cipher with a resistance equal to the
number of key bits, as extracting it still requires a computational effort
exponential in the size of the cipher key. Assuming instead that the attacker
can obtain the input of the leakage function L from a single output implies
that he can attack even intermediate values of the cipher having a resistance
equal to the number of key bits. This can be done by measuring two values
separated only by a key material addition, and deriving the added key
material by difference. This relaxation of the hypothesis captures the
expensive and invasive microprobing attacks, where the attacker directly taps
the on-die lines via a small metallic probe. If such attacks fit into the
attacker model under consideration, we note that a partial masking of
the implementation would not hinder them.
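Both relaxations can be illustrated with a few lines of code. The sketch below assumes Boolean (XOR) sharing and a key addition implemented as a bitwise XOR, as is common in block ciphers; the function names and the sample values are hypothetical. It shows that an attacker who sees all d+1 shares recombines the masked value, and that an attacker who reads two values in the clear on either side of a key addition recovers the key material by difference.

```python
import secrets
from functools import reduce
from operator import xor

def share(value: int, d: int, bits: int = 8):
    """Split value into d+1 Boolean shares whose XOR equals value."""
    masks = [secrets.randbits(bits) for _ in range(d)]
    last = reduce(xor, masks, value)  # value XOR all the random masks
    return masks + [last]

def recombine(shares):
    """What an attacker observing all d+1 shares can compute."""
    return reduce(xor, shares, 0)

def key_by_difference(before: int, after: int) -> int:
    """Recover the added key material from the values around a key addition."""
    return before ^ after

# Lifting the bound of d measurements: d + 1 = 3 probes reveal the value.
shares = share(0x3C, d=2)
print(recombine(shares) == 0x3C)

# Inverting L: two clear values separated only by a key addition.
state, round_key = 0x5A, 0xC3  # hypothetical intermediate value and key byte
print(key_by_difference(state, state ^ round_key) == round_key)
```

The second case needs no enumeration of key hypotheses at all, which is why masking only part of the implementation offers no protection against an attacker of this kind.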

5 CONCLUDING REMARKS

In this work we substantiate that applying a provably secure d-th order
masking only to a portion of a block cipher yields the same security as a
full application, against a computationally bounded attacker. We
base our claim on the reasonable hypothesis that the

SECRYPT2014-InternationalConferenceonSecurityandCryptography

498

attacker is not able to derive directly from a single

measurement of the side channel the actual interme-

diate value being computed. Providing a description

of the leakage function, both formally analyzable and

modeling the actual experimental evidence, is still a

subject for open debate (Galea et al., 2014). More-

over, an interesting research direction is to provide

tighter lower bounds for the attacker effort, once such

a leakage function has been speciﬁed.

REFERENCES

Agosta, G., Barenghi, A., Maggi, M., and Pelosi, G. (2013a). Compiler-based Side Channel Vulnerability Analysis and Optimized Countermeasures Application. In DAC, page 81. ACM.

Agosta, G., Barenghi, A., and Pelosi, G. (2012). A Code Morphing Methodology to Automate Power Analysis Countermeasures. In Groeneveld, P., Sciuto, D., and Hassoun, S., editors, DAC, pages 77–82. ACM.

Agosta, G., Barenghi, A., Pelosi, G., and Scandale, M. (2013b). Enhancing Passive Side-Channel Attack Resilience through Schedulability Analysis of Data-Dependency Graphs. In Lopez, J., Huang, X., and Sandhu, R., editors, NSS, volume 7873 of Lecture Notes in Computer Science, pages 692–698. Springer.

Agosta, G., Barenghi, A., Pelosi, G., and Scandale, M. (2014). A Multiple Equivalent Execution Trace Approach to Secure Cryptographic Embedded Software. In DAC, pages 1–6. ACM.

Barenghi, A., Pelosi, G., and Terraneo, F. (2013). Secure and Efficient Design of Software Block Cipher Implementations on Microcontrollers. IJGUC, 4(2/3):110–118.

Coron, J.-S. (2014). Higher order masking of look-up tables. In Nguyen, P. Q. and Oswald, E., editors, EUROCRYPT, volume 8441 of LNCS, pages 441–458. Springer.

Coron, J.-S. and Kizhvatov, I. (2010). Analysis and Improvement of the Random Delay Countermeasure of CHES 2009. In Cryptographic Hardware and Embedded Systems, pages 95–109.

Coron, J.-S., Prouff, E., and Rivain, M. (2007). Side Channel Cryptanalysis of a Higher Order Masking Scheme. In Paillier, P. and Verbauwhede, I., editors, CHES, volume 4727 of LNCS, pages 28–44. Springer.

Debraize, B. (2012). Efficient and provably secure methods for switching from arithmetic to boolean masking. In Prouff, E. and Schaumont, P., editors, CHES, volume 7428 of LNCS, pages 107–121. Springer.

Galea, J. L., Martin, D., Oswald, E., Page, D., and Stam, M. (2014). Making and breaking leakage simulators. Cryptology ePrint Archive, Report 2014/357. http://eprint.iacr.org/.

Garcia, F. D., van Rossum, P., Verdult, R., and Schreur, R. W. (2009). Wirelessly Pickpocketing a Mifare Classic Card. In IEEE Symposium on Security and Privacy, pages 3–15. IEEE CS.

Ishai, Y., Sahai, A., and Wagner, D. (2003). Private Circuits: Securing Hardware against Probing Attacks. In Boneh, D., editor, CRYPTO, volume 2729 of LNCS, pages 463–481. Springer.

Mangard, S., Oswald, E., and Popp, T. (2007). Power Analysis Attacks - Revealing the Secrets of Smart Cards. Springer.

Moradi, A., Barenghi, A., Kasper, T., and Paar, C. (2011). On the Vulnerability of FPGA Bitstream Encryption against Power Analysis Attacks: Extracting Keys from Xilinx Virtex-II FPGAs. In Chen, Y., Danezis, G., and Shmatikov, V., editors, ACM CCS, pages 111–124. ACM.

O’Flynn, C. and Chen, Z. (2012). A Case Study of Side-Channel Analysis Using Decoupling Capacitor Power Measurement with the OpenADC. In García-Alfaro, J. et al., editors, FPS, volume 7743 of LNCS, pages 341–356. Springer.

Prouff, E. and Rivain, M. (2013). Masking against Side-Channel Attacks: A Formal Security Proof. In Johansson, T. and Nguyen, P. Q., editors, EUROCRYPT, volume 7881 of LNCS, pages 142–159. Springer.

Rivain, M. and Prouff, E. (2010). Provably Secure Higher-Order Masking of AES. In Cryptographic Hardware and Embedded Systems, CHES, pages 413–427.

Schramm, K. and Paar, C. (2006). Higher Order Masking of the AES. In Pointcheval, D., editor, CT-RSA, volume 3860 of LNCS, pages 208–225. Springer.

Tillich, S. and Herbst, C. (2008). Attacking State-of-the-Art Software Countermeasures - A Case Study for AES. In Oswald, E. and Rohatgi, P., editors, CHES, volume 5154 of LNCS, pages 228–243. Springer.

Whitnall, C., Oswald, E., and Standaert, F.-X. (2014). The Myth of Generic DPA...and the Magic of Learning. In Benaloh, J., editor, CT-RSA, volume 8366 of LNCS, pages 183–205. Springer.

OntheSecurityofPartiallyMaskedSoftwareImplementations

499