On Obfuscating Compilation for Encrypted Computing
Peter T. Breuer 1, Jonathan P. Bowen 2, Esther Palomar 3 and Zhiming Liu 4
1 Hecusys LLC, Atlanta, GA, U.S.A.
2 London South Bank University, London, U.K.
3 Birmingham City University, Birmingham, U.K.
4 RISE, Southwest University, Chongqing, China
Correspondence: Z. Liu, RISE, Southwest University, Tiansheng Rd, Beibei, Chongqing, China
Keywords: Obfuscation, Compilation, Privacy, Encrypted Computing.
Abstract:
This paper sets out conditions for privacy and security of data against the privileged operator on processors
that ‘work encrypted’. A compliant machine code architecture plus an ‘obfuscating’ compiler turn out to be
both necessary and sufficient to achieve that, the combination mathematically assuring the privacy of user data
in arbitrary computations in an encrypted computing context.
1 INTRODUCTION
A well-known argument (van Dijk and Juels, 2010)
equates privacy of user data on a computing platform
with cryptographic obfuscation (Hada, 2000) against
the privileged operator as adversary. Inputs, outputs,
and any intermediate data that may be accessible are
maintained by the processor in the user’s personal en-
cryption, or the data would intrinsically be readable
by any observer. Then privacy equates formally with
obfuscation (the argument will be set out at the start of
Section 2). Several prototype processors already sup-
port that encrypted mode of working (they will be de-
scribed in Section 3), allowing operators and operat-
ing system alike to see and manipulate user data while
keeping it in encrypted form. On such platforms, the
operators can single-step the machine, examine data
and program instructions in registers and memory,
and change anything and everything to which they
have access, copying and repeating as required, but
the unencrypted form of data is unavailable to them.
In that context, the fundamental question is whet-
her there exists a bona fide computational process that
the operator can leverage to produce encryptions of
some known numerical values with a degree of cer-
tainty. That would enable a ‘known plaintext attack’
(KPA) against the encryption. A KPA may eventually
break the encryption and make accessible the user’s
data for reading and/or subversion. On the face of it,
there is at least one such ‘computational process’ that
does that, because the operator ought to be able to
issue an instruction that causes the processor to sub-
tract an encrypted number from itself, yielding an en-
crypted zero with absolute certainty.
This paper examines the conditions for that to
(not) be possible. ‘Non-functional’ avenues of attack
via statistics of cache hits, power consumption,
etc. are preventable by engineering means and are not
considered here, but in principle nothing could be
done about an attack that leveraged the processor's
computing function itself, if such an approach could
succeed. It is argued here that a compliant machine
code architecture plus an ‘obfuscating’ compiler are
necessary and sufficient to prevent that, and our con-
tribution here is in setting out abstractly how hard-
ware, instruction set and compiler must play together
to mathematically assure the privacy and security of
data in encrypted computing.
The layout of the paper is as follows. Section 2
analyses the conditions for privacy and Section 3 dis-
cusses extant prototype processors that support en-
crypted running and satisfy the conditions of Sec-
tion 2 to varying degrees. Section 4 discusses in-
struction sets satisfying the conditions, and Sections 5
and 6 introduce obfuscating compilers. The latter sec-
tions contain the major mathematical results.
2 PRIVACY CONDITIONS
Van Dijk and Juels’ ‘well-known argument’ referred
to at the start of Section 1 asks what happens when the
encrypted input data for a user’s problem and the en-
crypted code for treating it are both delivered for ex-
ecution to a virtual machine (VM) running on a plat-
form for encrypted computing. The VM should de-
crypt the data and the code, with probable hardware
assistance via instructions that physically do encryp-
tion and decryption as required, and run the code in
a private area of the processor with the data as input,
probably with interrupts disabled to prevent tamper-
ing. The results should be encrypted for return. The
question to put is: is this situation more helpful to
the hypothetical adversary than running the code on
the data on a real machine locked inside a metal safe,
with no access for operator and users alike, other than
to the (encrypted) inputs and outputs? If the answer is
‘no’, then (a) that is as good as privacy for computa-
tion can ever get, and (b) that is precisely Hada’s def-
inition for data to have been effectively cryptograph-
ically obfuscated on the processor (this definition of cryptographic obfuscation follows Hada, but restricts what is of interest to data, not code). That establishes the equivalence of obfuscation with the goal: if the data cannot be attacked any more effectively than on a black box implementation of the program, then the privacy that has been achieved cannot be bettered. (An argument (Barak et al., 2001) that cryptographic obfuscation as defined by Hada is impossible in the general case does not apply here, because the inputs and outputs are encrypted and hardware assistance is permitted.)
So effective obfuscation of data is what to aim for.
Reasoning about the problem begins by noting
that there are processors that already do ‘obfuscation
by design’. An example is the Ascend co-processor
(Fletcher et al., 2012), which executes code in Fort
Knox-style physical isolation, with no operator access
to it while running. The external observables, such as
power consumption, timing of I/O (including memory
accesses), cache hit/miss ratio, etc., that might leak
information via side-channels (Wang and Lee, 2006;
Zhang et al., 2013), obey statistics that are configured
beforehand and have no correlation with the running
program, apart from the run time. (A side-channel consisting of signalling via repeat accesses to the same memory location is also closed in Ascend by the use of oblivious RAM (Ostrovsky and Goldreich, 1992), which remaps the logical to physical address relation dynamically, maintaining aliases, so access patterns are statistically impossible to spot; it also masks programmed accesses among independently generated random accesses.)
For good mea-
sure, the executable is also input in encrypted form.
Since the operator’s access to the running program is
restricted by a physical black box, the program and
data are obfuscated, by the definition given. But is ef-
fective obfuscation still possible with a less extreme
solution, one that permits debugging, interrupt rou-
tines and system calls, for example? Here we require:
Operators and users have conventional access.
That means access to registers and memory, and the
user’s code may be single stepped, repeated, retried,
altered, etc. by the operator.
For obfuscation to go through in those circum-
stances constrains the processor hardware as follows:
(1) Each machine code instruction is a black box.
That is, individual instructions are treated as Ascend
treats a whole program at a time. Only the inputs and
outputs from each instruction are visible, in order that
they may effectively be a ‘black box’. Instructions
are executed in a processor in many stages along a
pipeline, and those intermediate stages must not be
visible. That is functionally so in any standard pro-
cessor, where interrupts are only enabled at the point
where an instruction finishes, but there must also be
no clue such as cache hit statistics or power consump-
tion data that may reveal the action of the instruction.
It is usual practice in mathematics to phrase results
in relative form. It is assumed here:
The encryption used is secure in its own right.
That is, the reasoning in this paper will suppose en-
crypted values are unreadable by an adversary. Logi-
cally they also cannot be created to order (the writer,
knowing them, could ‘read’ them too). A spy’s game
is to guess their meaning from observing or interfer-
ing with a computation; encryption is not on the line,
computation is. If the spy sees x+x and also x·x with
identical result, the spy will deduce that x is an encryption of
0 or 2, and the danger is of computation leaking in-
formation on encrypted data, or subverting it.
The abstraction is realistic for one-to-many en-
cryptions of block size such that the number of in-
structions that may execute is small relative to the ex-
pected interval between cipherspace collisions.
The inputs to and outputs from each instruction
must also be in encrypted form, as in Ascend, or they
would be readable by the operator:
(2) Each machine code instruction must be observed
to read and write data in encrypted form.
Otherwise, the operator, able to single-step the ma-
chine and access registers, can read every action.
This does not mean that inputs and outputs must
be in encrypted form all the time: it suffices for the
hardware to encrypt when an instruction in privileged
mode views a processor register with user data in, for
example, or a user mode instruction writes to memory.
Beyond the above, software toolchain support is
also required, otherwise the code itself might obviate
all protections. A human author will write code in-
volving small numbers that can be guessed in a dictio-
nary attack. A kind of obfuscating compiler is needed
that ensures that the numbers used (encrypted) in code
and also the numbers circulating (encrypted) at run-
time have no predictable bias that can be leveraged
into an attack on the encryption.
The compiler at each recompilation of the same
program must produce an executable that in the code
and at any point in the runtime trace (the sequence of
instructions and register/memory states) has arbitrar-
ily and uniformly distributed differences in the data
values under the encryption with respect to any other
compilation. How is that compatible with producing
the same answer from the same inputs, no matter how
the source code is compiled? It is not, but so long
as the compiler agrees with or simply makes known
to the owner of the code an offset A on the input and
an offset B on the output of the intended functionality
f (x+A)+B under the encryption, then that is permis-
sible. The owner of the code must incorporate the
extra offset A before encryption of the data for execu-
tion, and remove the extra offset B after decryption of
the result. So long as A and B are independently ran-
domly generated and uniformly distributed, it follows
that the (decrypted) inputs x+A and outputs y+B are
too, no matter what biases select x and y, as required. (An information-theoretic argument: A has maximal entropy, and x+A (mod 2^32, say) cannot have less entropy than A (Shannon) when x is independent of A, so x+A has maximal entropy too, which means it takes randomly and uniformly distributed values across the whole range.)
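By way of illustration, the protocol can be sketched in Haskell (a minimal sketch with hypothetical names; plaintext Word32 values stand in for their encryptions, and Word32 arithmetic wraps mod 2^32 exactly as the argument requires):

    import Data.Word (Word32)

    -- The compiled program: it expects its input already offset by a and
    -- delivers its output offset by b, both offsets fixed at compile time.
    compiled :: (Word32 -> Word32) -> Word32 -> Word32 -> Word32 -> Word32
    compiled f a b u = f (u - a) + b

    -- The owner adds a before encryption and strips b after decryption,
    -- recovering the intended functionality f exactly:
    asOwner :: (Word32 -> Word32) -> Word32 -> Word32 -> Word32 -> Word32
    asOwner f a b x = compiled f a b (x + a) - b   -- == f x, for all x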
Considering a single instruction with functionality y ← f(x) under the encryption, it must be possible for the compiler to vary the functionality to y ← f(x+A)+B without an observer telling which A and B have been selected, so their distributions remain unbiased with respect to the observer. That is, each instruction must be malleable in the following sense:
(3) Each instruction supports arbitrary additions A,
B to inputs and outputs via adjustments of (en-
crypted) parameters in the instruction.
There is just one permissible exception: an instruction
that copies a value from one location to another may
do so faithfully. An observer will still not be able to
rely on any statistical bias in any particular runtime
data value, but will know that whatever it is, it is the
same as in the datum it has been copied from, if any.
We also wish to consider the situation without the
complication of imagining that the operator, acting
as adversary, can experiment by taking an encrypted
data value from a user program trace and substitut-
ing it into instructions of their own devising in or-
der to test it or further modify it using the processor.
Nor should the hardware allow the encrypted con-
stants that appear in some instructions to work cor-
rectly when used as inputs for arithmetic. That makes
it impossible for the spy in the computer room to pass
the encrypted constants seen in programs through the
processor arithmetic, patching the results back into
new code snippets. So the hardware should ensure:
(4) There are no collisions between (i) encrypted con-
stants that appear in instructions and (ii) runtime
encrypted data values in registers or memory.
That has to be actively enforced in the processor. It
is not a logical consequence of assuming secure en-
cryption. It may be achieved in an implementing pro-
cessor by different padding or blinding factors for the
two domains (i-ii), checked in the processor pipeline.
The rest of this paper will use conditions (1-4) to
construct a means for obfuscation of user data.
3 SUPPORTING PLATFORMS
As remarked in Section 1, there are prototype plat-
forms that satisfy at least the conditions (1-2). Apart
from Ascend, which satisfies them trivially, there is
HEROIC (Tsoutsos and Maniatakos, 2015), a proto-
type 16-bit machine running with a deterministic Pail-
lier (2048 bit) encryption (Paillier, 1999). Its core ex-
ecutes an encrypted addition in 4000 cycles on its fun-
damentally 200MHz hardware, roughly equivalent to
a 25KHz Pentium’s speed. Instructions work on en-
crypted data, producing encrypted data.
The KPU design (Breuer and Bowen, 2014;
Breuer and Bowen, 2016; Breuer et al., 2016) gener-
alises HEROIC’s approach, achieving encrypted run-
ning by modifying the arithmetic logic unit (ALU)
for encrypted working in a roughly conventional
32/64/128-bit RISC (Patterson, 1985) processor lay-
out. The modification to the arithmetic causes data
to circulate and be processed in encrypted form. A
KPU design may embed any encryption the block size
of which matches the word size and the hardware for
which fits into reasonably many stages of a processor
pipeline. Running the US Advanced Encryption Stan-
dard (AES) 128 bit encryption (Daemen and Rijmen,
2002) in 10 pipeline stages and using a 1GHz clock it
runs at the speed of a 300-500MHz Pentium, broadly
comparable with current PCs (Dhrystones v2.1 rating of 104-140MIPS; a 1GHz 686 family Pentium M is rated at 420MIPS according to the table at www.roylongbottom.org.uk/dhrystone results.htm, and a 200MHz 586 family classic Pentium at 48.1MIPS). Those tests are with
contemporary 13.5ns latency memory and 3ns latency
cache. The slowdown over unencrypted running is
10-50%. Ascend, running with similar technology,
instruction set, and AES-128, slows by 12-13.5×.
The HEROIC design also satisfies condition (3). The ‘OI’ in HEROIC stands for ‘one instruction’, meaning that the machine code architecture has only one kind of instruction, a combined ‘add, compare, branch’ (that is classically computationally complete: together with recursion and at least one nonzero constant, the instruction comprises the mathematician J.H. Conway's Fractran programming language (Conway, 1987), often used for theoretical studies in computability and complexity). Its instructions may be considered compounds of addition of a constant y ← x+k and branches that compare with a constant x < K. Addition of a constant satisfies condition (3) because an instruction that has the effect of y ← (x+A)+k+B, encrypted, can be obtained by running y ← x+k′, encrypted, where k′ is k+A+B mod 2^32. Compare with a constant also satisfies the condition, because an instruction that compares x+A < K can be obtained by running an instruction that compares x < K′ where K′ is K−A mod 2^32 (to agree, the reader needs to know that the conventional 2s complement comparison u < v is invariant under translations).
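In Haskell terms (a sketch; plaintext Word32 values standing in for encrypted ones), the two absorptions just described are:

    import Data.Word (Word32)

    -- HEROIC-style addition of a constant; offsets A on the input and B
    -- on the output are absorbed into the constant itself:
    addk :: Word32 -> Word32 -> Word32
    addk k x = x + k
    -- addk (k+a+b) x == addk k (x+a) + b, both being x+k+a+b mod 2^32

    -- comparison against a constant; an input offset A is absorbed as
    -- K-A, granted the translation invariance of the comparison noted
    -- in the text above:
    cmpk :: Word32 -> Word32 -> Bool
    cmpk bigK x = x < bigK
    -- cmpk (bigK - a) x plays the role of cmpk bigK (x + a)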
The KPU runs the standard OpenRISC v1.1 instruction set (see https://opencores.org/or1k/Architecture Specification), which does not satisfy (3), but it may be modified to do so.
Neither HEROIC nor the KPU supports condition
(4) at present (that encrypted data should not work
as encrypted program constants and vice versa), but
it can be arranged in the case of the KPU since the
processor has internal access to the decrypted form of
data, including the padding under the encryption.
4 COMPLIANT INSTRUCTIONS
Instruction sets are conventionally made up of instruc-
tions that (a) perform a relatively simple arithmetic
operation, such as ‘addition’ on data in registers or
in memory and return a result to registers or mem-
ory, instructions that (b) perform a comparison oper-
ation such as ‘less than’ between values in registers
or memory, branching to a different point in the pro-
gram if it is satisfied, plus (c) unconditional control
instructions such as jump to and return from a sub-
routine that always alter which instruction is executed
next. In the following, we will omit ubiquitous ‘under
the encryption’ qualifiers in the instruction semantics.
Many standard instructions of type (a) such as addition z ← x+y of two runtime values do not satisfy condition (3). There are no configurable parameters for the compiler to modify its semantics to z ← (x+A1)+(y+A2)+B, as condition (3) stipulates. At least one inline constant is required. For addition, a nonstandard instruction with semantics z ← x+y+k satisfies condition (3), with two operands filled at runtime and one provided by the compiler. The semantics z ← (x+A1)+(y+A2)+k+B may be obtained by z ← x+y+k′, adjusting k′ = k+A1+A2+B.
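The absorption for this instruction is a one-liner (a sketch, as before, with Word32 stand-ins for encrypted operands):

    import Data.Word (Word32)

    -- Nonstandard add with an inline constant: z <- x + y + k.
    addik :: Word32 -> Word32 -> Word32 -> Word32
    addik k x y = x + y + k

    -- Retuning k to k' = k+a1+a2+b absorbs input offsets a1, a2 and an
    -- output offset b: addik (k+a1+a2+b) x y == addik k (x+a1) (y+a2) + b.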
For multiplication, z ← x·y does not satisfy condition (3) and neither does y ← x·k, the two standard instructions. However, AMD and Intel in 2011-2013 introduced so-called fused instructions that combine two arithmetic operations. While addition takes one cycle to complete in a processor, multiplication takes much longer (about ten cycles), and the repeating subunit that forms the long multiplication logic multiplies two short integers and adds in two short incoming ‘carry’ integers from subunits ‘right’ and ‘below’ in a 2-dimensional array. The column and row of subunits at extreme ‘right’ and ‘bottom’ respectively may feed two full integer addends into the calculation at no extra cost, and such a ‘fused multiply and add’ (FMA) instruction was introduced in AMD and Intel's FMA3 and FMA4 instruction sets for reasons of efficiency. A fused multiply and add instruction satisfying (3) is, for example, x3 ← (x1+k1)(x2+k2)+k3. In general, if instruction semantics is x2 ← f(x1), then condition (3) says that changing x1 by A and x2 by B is achieved by modifying constant parameters, so the full instruction semantics must be x2 ← f(x1+k1)+k2.
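The same check for the fused form (a sketch, as before):

    import Data.Word (Word32)

    -- Fused multiply-add with three inline constants, as in the text:
    -- x3 <- (x1+k1)(x2+k2) + k3.
    fma :: Word32 -> Word32 -> Word32 -> Word32 -> Word32 -> Word32
    fma k1 k2 k3 x1 x2 = (x1 + k1) * (x2 + k2) + k3

    -- Input offsets a1, a2 and an output offset b are absorbed by retuning:
    -- fma (k1-a1) (k2-a2) (k3+b) (x1+a1) (x2+a2) == fma k1 k2 k3 x1 x2 + b.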
A branch instruction (b) compares two operands x1 < x2. Condition (3) requires there be parameters in the instruction for the compiler to generate the semantics of (x1+k1) < (x2+k2) and (x1+k1) ≮ (x2+k2) from it. That may be done by emitting different instructions with tests of the form x1 < x2+k′ and x1 ≥ x2+k′ with k′ = k2−k1. In conclusion, a complete set of machine code instructions compliant to (3) is very feasible.
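To summarise, such a compliant instruction set might be sketched as a datatype (hypothetical constructors, not any of the prototypes' actual instruction sets):

    import Data.Word (Word32)

    newtype Enc = Enc Word32        -- stand-in for an encrypted constant
    type Reg    = Int
    type Label  = Int

    -- Every arithmetic and branch instruction carries inline encrypted
    -- constants that the compiler can retune, which is condition (3):
    data Instr
      = Addik Reg Reg Reg Enc          -- r0 <- r1 + r2 + k
      | Fma   Reg Reg Reg Enc Enc Enc  -- r0 <- (r1+k1)(r2+k2) + k3
      | Blt   Reg Reg Enc Label        -- branch if r1 <  r2 + k
      | Bge   Reg Reg Enc Label        -- branch if r1 >= r2 + k
      | Jmp   Label                    -- unconditional jump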
Then consider non-probabilistic attacks on a pro-
gram C running in a processor working encrypted and
satisfying (1-4), with instructions satisfying (3). Re-
call the encryption is assumed secure in its own right:
Theorem 1. There is no deterministic method by
which the privileged operator can read the encrypted
data read or written by program C, nor alter C to
write an intended encrypted value.
Proof. (Sketch) Imagine that every runtime datum
circulating in the processor under the encryption has
magically increased by 7. We will adapt the program
C by changing some of its embedded encrypted con-
stants to accept that change while still generating a
trace T that looks the same as before, up to encryp-
tion. The supposed method sees the new program and
trace as the same as before and reports the old answer,
whereas it should report the old answer plus 7. Done.
(Detail) Suppose for contradiction that the operator has a method f(T,C) = y of finding that the output [y]_E of C encrypts y, having observed the trace T, a sequence of states s_i of the register/memory with instruction p_i at address a_i transforming s_i −(a_i:p_i)→ s_{i+1}. Define a transform s ↦ s′ by adding 7 to the number in each location r in state s, giving the state s′ with s′r = [[sr]_D + 7]_E, where [·]_D, [·]_E represent de/encryption.
A program C′ is obtained by (i) replacing every instruction p at address a in C that is of the form r0 ← [([r1]_D − k1) Θ ([r2]_D − k2) + k0]_E with p′ of the same form r0 ← [([r1]_D − k1′) Θ ([r2]_D − k2′) + k0′]_E where ki′ = ki + 7, i = 0,1,2; (ii) if the instruction is a branch with test ([r1]_D − k1) R ([r2]_D − k2), then it is likewise changed to ([r1]_D − k1′) R ([r2]_D − k2′) with ki′ = ki + 7, i = 1,2; (iii) unconditional jump instructions p at address a are left just as they are, with p′ = p. The replacements (i) are designed so s′ −(a:p′)→ t′ where s −(a:p)→ t. The next instruction address is a+1. Branches (ii) with a test ([r1]_D − k1′) R ([r2]_D − k2′) take the same jump at state s′ as ([r1]_D − k1) R ([r2]_D − k2) does at state s, getting the same result from the test. The unconditional jumps (iii) also take the same jump at state s′ as at state s.
Thus if T with s_i −(a_i:p_i)→ s_{i+1} is a trace of C, then T′ with s_i′ −(a_i:p_i′)→ s_{i+1}′ is at least a feasible trace of C′, hence is the unique trace of C′ in the deterministic processor. The program C′ ‘looks the same’, C′ ≈ C, differing only by the encrypted numbers [k′]_E in place of [k]_E in it. The new program trace T′ ‘looks the same’ too, T′ ≈ T, in the same sense, that is, up to the encrypted numbers in it, because the branches after comparisons go the same way as in T and make the same jump, giving rise to the same sequence of instruction addresses. The difference is states s_i′ instead of s_i, and instruction p_i′ instead of p_i at address a_i, and the differences between all those are different encrypted numbers.
If the method f were sensitive to differences in encrypted values then it would vary while the value underneath the encryption stayed constant (formally it should be supposed that any feature of encrypted values that the attacker's probes f may be sensitive to, such as counting the number of 7s, is triggered by some encryption [y]_E of every value y, so f(...,[y]_E,...) must be constant with respect to that parameter if f really detects y), as it cannot read the encryption by hypothesis. So it must be constant with respect to changes in encrypted values, with C ≈ C′ and T ≈ T′ implying f(T,C) = f(T′,C′). So f(T′,C′) = f(T,C) = y. Yet the output of C′ is not [y]_E but [y+7]_E, so the method does not work.
For the second half of the result, suppose the operator builds a new program C′ = f(T,C) that returns outputs [y]_E where y is known to and decided by the operator. Then the constants [k]_E of C′ are in C, because the operator's technique f has no way of arithmetically combining them (condition (4) means they cannot be combined arithmetically in-processor nor taken from the trace, and the operator does not have the encryption key). The first half of the result now says the operator cannot read the outputs [y]_E of C′, a contradiction.
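The transform of the proof is entirely mechanical; a sketch over an idealised instruction type (plaintext Word32 stand-ins for the encrypted constants):

    import Data.Word (Word32)

    type Reg = Int
    data Instr
      = Op  Reg Reg Reg Word32 Word32 Word32  -- r0 <- (r1-k1) op (r2-k2) + k0
      | Bra Reg Reg Word32 Word32 Int         -- branch on (r1-k1) rel (r2-k2)
      | Jmp Int                               -- unconditional jump

    -- C' is C with every inline constant increased by 7, so that a run
    -- whose data is everywhere 7 higher under the encryption retraces C.
    bump7 :: Instr -> Instr
    bump7 (Op r1 r2 r0 k1 k2 k0) = Op r1 r2 r0 (k1 + 7) (k2 + 7) (k0 + 7)
    bump7 (Bra r1 r2 k1 k2 l)    = Bra r1 r2 (k1 + 7) (k2 + 7) l
    bump7 (Jmp l)                = Jmp l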
Remark 1. The proof inspected in detail shows that
runtime data through instructions satisfying condition
(3) (case (i) in the proof) in the trace can be altered
independently via the constants in the instructions.
So intuitively, knowing the instructions and/or trace
ought not to enable any bit of the runtime data under
the encryption to be guessed with any degree of statis-
tical accuracy because the program might have been
modified to have the data offset by any amount from
what might naturally have been guessed. But is any
possible offset from nominal, at any point in the pro-
gram, for any particular register or memory location,
equally as probable as another? That cannot be said
without the probabilistic setting of Section 6.
In particular that does not take into account that
human beings only write certain programs, so one
can bet, for example, on finding an encrypted 1 in
nearly any program trace. Leveraging the intuition
above into a practical tool for obfuscation requires a
compiler strategy, described in the next sections, that
varies compiled code so runtime data varies randomly
and uniformly from nominal values, else all is bluff.
5 OBFUSCATING COMPILATION
To make compilation to a compliant instruction set as described in Section 4 use the possibilities for obfuscation in order to frustrate a dictionary attack against the runtime data values, the compiler should set an arbitrary offset ∆x_l for x_l, where the value [x_l]_E is in the register or memory location l, at different points in the program. This is manipulated by the compiler. The offset represents by how much the decrypted data value x_l in the location is to vary from nominal (without obfuscation) at runtime. Each instruction that writes at location l offers an opportunity for the compiler to reset the offset ∆x_l used there.
For example, in compiling a boolean-valued com-
putational conjunction expression A && B in the pro-
gram source code, the possible additive offset is 0 or
1 mod 2. An offset of 0 means the result is returned as
is, ‘telling the truth’. An offset of 1 means the result
is inverted: ‘lying’. The compiler chooses between:
(a) whether A is compiled telling the truth or lying
(b) whether B is compiled telling the truth or lying
(c) whether it will lie or tell the truth for C = A&&B.
The (a) corresponds to whether ∆A = 0 (truth) or ∆A = 1 (liar) is added mod 2 to the result for A. Similarly (b) corresponds to whether ∆B = 0 (truth) or ∆B = 1 (liar) is added in to the result for B. Let a be true when (a) is to tell truth (∆A = 0), and false when (a) is to lie (∆A = 1). Similarly for b with respect to (b) and c with respect to (c). What is to be computed at runtime is:
    c ↔ ((a ↔ A) && (b ↔ B))

where the arrow in x ↔ y stands for the boolean biconditional operator. That is:

    if a, b, c then A && B
    if ¬a, b, c then ¬A && B
    if a, ¬b, c then A && ¬B
    if ¬a, ¬b, c then ¬A && ¬B
    if a, b, ¬c then ¬(A && B)
    if ¬a, b, ¬c then ¬(¬A && B)
    if a, ¬b, ¬c then ¬(A && ¬B)
    if ¬a, ¬b, ¬c then ¬(¬A && ¬B)

where ¬ denotes boolean negation. The
compiler knows a and b and chooses c with 50/50
probability, deciding which of A&&B, ¬A&&B, etc.,
it will generate machine code for. All the generated
codes will look alike, modulo the encrypted constants,
unreadable by the operator. If [A] is the compiled code
for A and [B] is the compiled code for B, producing
values 1/0 for true/false respectively in register t0,
then the compiler emits a machine code sequence [C]:
    [A]; i_a; beqi t0 [0]_E l; [B]; i_b; l: i_c
where if a is true (‘truth teller’) then i_a is the machine code sequence that maintains the value set by A in register t0, which the beqi instruction tests against the zero supplied as an encrypted constant, jumping to the point l if equal and ‘short-circuiting’ the calculation. It suffices to emit nothing, but it is required that the sequence look the same for all possible cases, and ‘nothing’ would be a give-away. If a is false (‘liar’) then i_a is a machine code sequence of the same length that flips the value set by A, the compilation of A being such that it deliberately gives the ‘wrong’ result. The final i_c sequence can move to both branches:
    [A]; i_a; i_c; beqi t0 [c]_E l; [B]; i_b; i_c; l:
The sequence i_a; i_c either flips the value no times, once, or twice, the last being the same as no times, so it does the same as i_ac and may be replaced by it:
    [A]; i_ac; beqi t0 [c]_E l; [B]; i_bc; l:
To avoid exposing via [c]_E an encryption of 0 or 1, the i_ac code may produce a result that is offset by a compiler-decided random constant k from nominal, and then the branch tests against c+k instead of c:
    [A]; i_ac^k; beqi t0 [c+k]_E l; [B]; i_bc; l:
The codes for A and B always have the same length and form, differing only in inline encrypted constants, so that is true of the whole too. There is no possibility of an observer spotting when the value in t0 is left unchanged and when it is changed by i_ac^k, because the encrypted value is always changed, even when the decrypted value is not.
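A sketch of the semantics being arranged (a model of the computed value only, not of the emitted code): given lie bits a, b, c and the delivered sub-results va = a ↔ A and vb = b ↔ B, the code must deliver c ↔ (A && B):

    -- va is the value delivered for A (A itself if a, else its negation)
    -- and similarly vb; the result is A && B inverted exactly when c says
    -- 'lie'. On Bool, (==) is the biconditional <->.
    liedAnd :: Bool -> Bool -> Bool -> Bool -> Bool -> Bool
    liedAnd a b c va vb = c == ((a == va) && (b == vb))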
Let the instruction xori x2 x1 k k2 k1 have semantics x2 ← ((x1 − k1) ⊕ k) + k2, computing bitwise exclusive or (‘xor’). The code for the i_ac can be one of:

    xori t0 t0 [0]_E [0]_E [0]_E    # keep bool value
    xori t0 t0 [1]_E [0]_E [0]_E    # flip bool value
since x XOR 1 is the complement of the boolean value x. The code for i_ac^k is one of:

    xori t0 t0 [0]_E [k]_E [0]_E    # keep bool value
    xori t0 t0 [1]_E [k]_E [0]_E    # flip bool value
The [0]_E and [1]_E will also be offset by values arbitrarily chosen by the compiler via a further application of the ‘shift by k’ technique during the compilation of A and B, as set out in the next section.
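For reference, the xori semantics in executable form (a sketch; plaintext Word32 stand-ins for the encrypted operands):

    import Data.Bits (xor)
    import Data.Word (Word32)

    -- xori x2 x1 k k2 k1:  x2 <- ((x1 - k1) `xor` k) + k2.  With k1 = 0
    -- and k = 0 or 1 it keeps or flips a boolean while re-offsetting the
    -- result by k2.
    xori :: Word32 -> Word32 -> Word32 -> Word32 -> Word32
    xori k k2 k1 x1 = ((x1 - k1) `xor` k) + k2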
6 COMPILING STATEMENTS
The compiler works with a database D : Loc → Int containing the (32-bit) integer (type Int) offsets for data, indexed per register or memory location (type Loc). The offset represents by how much the runtime data underneath the encryption is to vary from nominal at that point in the program.
The compiler also maintains a database L : Var → Loc of the location for each source code variable's (type Var) placement in registers or memory. Let DB abbreviate the type of database D; then the compiler has type signature:

    C_L[· : ·] : DB × source code → DB × machine code

As syntactic sugar, a pair in the cross product is written D : s, or D : m, and details of the entirely conventional management of database L are omitted here.
Sequence: The compiler works left-to-right
through a source code sequence:
    C_L[D0 : s1; s2] = D2 : m1; m2
        where D1 : m1 = C_L[D0 : s1]
              D2 : m2 = C_L[D1 : s2]
The database D1 that results from compiling the left sequent s1 in the source code, emitting machine code m1, is passed to the compilation of the right sequent s2, emitting machine code m2 following on from m1.
Assignment: An opportunity for new obfuscation arises at any assignment to a source code variable x. An offset ∆x = D1 Lx for the data in the target register or memory location Lx is generated randomly, replacing the old offset D0 Lx that previously held for the data at that location. The compiler emits code m1 for the expression e, which puts the result in a designated temporary location t0 with offset ∆e = D1 t0. It is transferred from there to the location Lx by a following add instruction. Let the machine code instruction ‘addi r2 r1 [i]_E’ have semantics x2 ← x1 + i, where the content of register r2 is [x2]_E and the content of register r1 is [x1]_E. Then the emitted code is

    C_L[D0 : x=e] = D1 : m1; addi Lx t0 [i]_E
        where i = ∆x − ∆e
              D1 : m1 = C_L,t0[D0 : e]

The t0 subscript for the expression compiler tells it to aim at location t0 for the result of expression e. That is one of the registers reserved for temporary values.
Return: The compiler at a ‘return e’ from function f selects a final offset ∆f_ret (functions f are subtyped by offsets ∆f_par0, ∆f_par1, etc. on their formal parameters and ∆f_ret on their return value) and emits an add instruction with target the standard function return value register v0 prior to the conventional function trailer. The add instruction in the trailer adjusts to the offset ∆f_ret from the offset ∆e = D1 t0 with which the result from e in t0 is computed by the code m1:

    C_L[D0 : return e] = D1 : m1; addi v0 t0 [i]_E
                                  ...        # restore stack
                                  jr ra      # jump return
        where i = ∆f_ret − ∆e
              D1 : m1 = C_L,t0[D0 : e]

The offset for v0 is updated in D1 to D1 v0 = ∆f_ret.
Other source code control constructs are treated
like return in the way they adjust the final offset to
meet constraints. For an if statement, final offsets in
each branch are adjusted to match at the join. A while
statement is an if statement that joins back to its own
start, so the final offset in the loop must equal the ini-
tial offset. Each function definition is compiled sepa-
rately, the databases being flushed before each.
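The assignment rule condenses to a few lines of Haskell (a sketch with assumed simplified types, far simpler than the prototype compiler mentioned in the conclusion):

    import Data.Word (Word32)
    import qualified Data.Map as M

    type Loc   = Int                  -- register or memory location
    type DB    = M.Map Loc Word32     -- offsets from nominal, per location
    data Instr = Addi Loc Loc Word32  -- addi dst src [i]E, i under encryption

    t0 :: Loc
    t0 = 0                            -- the designated temporary register

    -- Assignment rule: m1 computes e into t0 carrying offset de; a
    -- trailing addi re-offsets by i = dx - de into Lx, and the database
    -- records the freshly chosen random offset dx for Lx.
    compileAssign :: Word32 -> (DB, Word32, [Instr]) -> Loc -> (DB, [Instr])
    compileAssign dx (db, de, m1) lx =
      (M.insert lx dx db, m1 ++ [Addi lx t0 (dx - de)])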
Remark 2. The addi instructions in the cases (b-c) above contain an embedded value i that is contributed to by a freely chosen constant k = ∆x in (b) and k = ∆f_ret in (c), which will be referred to generically as k in the proof of the theorem below. It is chosen by the compiler from a distribution designed to make uniform the distribution of values written by the instruction.
Theorem 2. The probability across different compilations that any particular 32-bit value x has its encryption [x]_E in location l at any given point in the program at runtime is uniformly 1/2^32.
Proof. Consider the arithmetic instruction I in the program. Suppose that by fiddling with the embedded constants in the other instructions in the program it is already possible, for all locations l′ other than that written by I and at all other points in the program, to vary the value x_l′ = x + ∆x with [x_l′]_E in l′ randomly and uniformly across compilations, taking advantage of the possibilities in the instruction set, as exhibited in the compiler specification. Let I write value [y]_E in location l. By condition (3) (cf. the remark above) I has a parameter k that may be tweaked to offset y from the nominal result f(x + ∆x) with respect to its input x + ∆x by an amount ∆y. The compiler chooses k with a distribution such that ∆y is uniformly distributed across the possible range. The instructions in the program that receive y from I may be adjusted to compensate for the ∆y change by changes in their controlling parameters. Then p(y=Y) = p(f(x+∆x)+∆y = Y), and the latter probability is p(y=Y) = Σ_Y′ p(f(x+∆x) = Y′, ∆y = Y−Y′). The probabilities are independent (because ∆y is newly introduced), so that sum is p(y=Y) = Σ_Y′ p(f(x+∆x) = Y′) p(∆y = Y−Y′). That is p(y=Y) = (1/2^32) Σ_Y′ p(f(x+∆x) = Y′). Since the sum is over all possible Y′, the total of the summed probabilities is 1, and p(y=Y) = 1/2^32. The distribution of x_l′ = x + ∆x in other locations l′ is unchanged. Done by induction on the machine code graph structure.
A helpful intuition is that ∆y has maximal entropy, so adding it in swamps all biases in the distribution of y.
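That intuition admits an exhaustive check in miniature (a sketch at 8 bits; the 32-bit case is identical in form):

    import Data.Word (Word8)
    import Data.List (group, sort)

    -- Adding every possible offset to a biased sample flattens its
    -- histogram completely (modular Word8 arithmetic).
    hist :: [Word8] -> [(Word8, Int)]
    hist = map (\g -> (head g, length g)) . group . sort

    flattened :: [Word8] -> [Word8]
    flattened xs = [ x + d | x <- xs, d <- [minBound .. maxBound] ]

    -- hist (flattened [0, 1, 1, 7]) assigns every Word8 value the count 4.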
The result provides the probabilistic setting for semantic security. Recall that encryption is assumed secure, so collisions may be assumed to be avoided. Consider a hypothetical probabilistic method F that guesses for a particular runtime value ‘the top bit is 1, not 0’, as applied to a compiled program C and its trace T, with probability p > 0.5 over many trials of the method. In fact, results 1 and 0 are equally likely across all possible compilations according to Theorem 2, and the probability (see below) that F is right is

    0.5(1−p) + 0.5p = 0.5     (*)

That is because the method F cannot tell which of the compilations C it is looking at, as all the compiled
codes and their traces T are exactly the same modulo the encrypted values in them. There are no collisions, certainly not between program constants and runtime data, as condition (4) maintains, so each compiled code C and trace T consists of different values never repeated internally or between different pairs C, T. All codes C are the same length and form and all traces T are the same length and form (they all branch the same way at the same points). The method F applied to different C and T has nothing to cause it to give different answers except incidental features of the encrypted values (such as the total number of 7s in the decimal representations, perhaps) and its own internal spins of a coin that result in it saying 1 a proportion p of the time, and 0 a proportion 1−p of the time. Both those are at least statistically independent of whether the bit is truly 1 or 0, as the encryption is secure in the first case and because of causal independence in the second case, which justifies the calculation (*).
That is semantic security at runtime for object code from an ‘obfuscating compiler’ (Haskell source code for a prototype obfuscating C compiler following our design may be downloaded from nbd.it.uc3m.es/ptb/obfusc comp-0 9a.hs; it produces generic ‘fused operate and add’ instructions), following Theorem 2, modulo the assumption that encryption is secure and conditions (1-4) hold. Has data obfuscation as defined in Section 1 been obtained? Yes. The flat distribution of possible data values under the encryption means no information can be gained from traces.
CONCLUSION
This paper has considered privacy and security of data
on platforms for encrypted computing against the op-
erator or operating system as an adversary, assuming
the encryption is secure in its own right.
Conditions on the processor and machine code ar-
chitecture have been defined such that a compiler may
obfuscate the runtime data under the encryption, pro-
ducing uniformly distributed variations across differ-
ent compilations, at every point in the program. That
eliminates attacks based on the use by a human author
of small numbers in program or data. No unencrypted
data value can then be statistically inferred from code
and trace, making a known plaintext attack on the en-
cryption impossible. That also amounts to semantic
security of an integrated system for encrypted com-
puting consisting of a processor with an instruction
set satisfying the conditions set out, plus an ‘obfuscat-
ing compiler’, modulo the security of the encryption.
ACKNOWLEDGEMENTS
Zhiming Liu wishes to thank the Chinese NSF for
support from research grant 61672435, and South-
west University for research grant SWU116007. Peter
Breuer wishes to thank Hecusys LLC (hecusys.com)
for continued support in KPU development.
REFERENCES
Barak, B., Goldreich, O., Impagliazzo, R., Rudich, S.,
Sahai, A., Vadhan, S., and Yang, K. (2001). On
the (im)possibility of obfuscating programs. In Kil-
ian, J., editor, Proc. 21st Annu. Int. Cryptol. Conf.
(CRYPTO’01), Adv. Cryptol., pages 1–18. Springer.
Breuer, P. T. and Bowen, J. P. (2014). Towards a work-
ing fully homomorphic crypto-processor: Practice and
the secret computer. In Jürjens, J., Piessens, F., and
Bielova, N., editors, Proc. Int. Symp. Eng. Sec. Softw.
Syst. (ESSoS’14), volume 8364 of LNCS, pages 131–
140, Berlin/Heidelberg. Springer.
Breuer, P. T. and Bowen, J. P. (2016). A fully encrypted
microprocessor: The secret computer is nearly here.
Procedia Comp. Sci., 83:1282–1287.
Breuer, P. T., Bowen, J. P., Palomar, E., and Liu, Z. (2016).
A practical encrypted microprocessor. In Callegari,
C., van Sinderen, M., Sarigiannidis, P., Samarati, P.,
Cabello, E., Lorenz, P., and Obaidat, M. S., editors,
Proc. 13th Int. Conf. Sec. Cryptog. (SECRYPT’16),
volume 4, pages 239–250, Portugal. SCITEPRESS.
Conway, J. H. (1987). Fractran: A simple universal pro-
gramming language for arithmetic. In Open Problems
in Commun. & Comput., pages 4–26. Springer.
Daemen, J. and Rijmen, V. (2002). The Design of Rijndael:
AES The Advanced Encryption Standard. Springer.
Fletcher, C. W., van Dijk, M., and Devadas, S. (2012). A
secure processor architecture for encrypted computa-
tion on untrusted programs. In Proc. 7th Scal. Trust.
Comput. Workshop (STC’12), pages 3–8, NY. ACM.
Hada, S. (2000). Zero-knowledge and code obfuscation. In
Okamoto, T., editor, Proc. 6th Int. Conf. Theor. Appli-
cat. Cryptol. Inform. Sec. (ASIACRYPT’00), number
1976 in LNCS, pages 443–457. Springer.
Ostrovsky, R. and Goldreich, O. (1992). Comprehensive
software protection system. US Pat. 5,123,045.
Paillier, P. (1999). Public-key cryptosystems based on com-
posite degree residuosity classes. In Proc. EURO-
CRYPT’99, Adv. Cryptol., pages 223–238. Springer.
Patterson, D. (1985). Reduced instruction set computers.
Commun. ACM, 28(1):8–21.
Tsoutsos, N. and Maniatakos, M. (2015). The HEROIC
framework: Encrypted computation without shared
keys. IEEE Trans. CAD IC Syst., 34(6):875–888.
van Dijk, M. and Juels, A. (2010). On the impossibility of
cryptography alone for privacy-preserving cloud com-
puting. HotSec, 10:1–8.
Wang, Z. and Lee, R. B. (2006). Covert and side chan-
nels due to processor architecture. In Proc. 2nd Annu.
Comp. Sec. Applic. Conf. (ACSAC’06), pages 473–
482. IEEE.
Zhang, C., Wei, T., Chen, Z., Duan, L., Szekeres, L., McCa-
mant, S., Song, D., and Zou, W. (2013). Practical con-
trol flow integrity and randomization for binary exe-
cutables. In Symp. Sec. Priv., pages 559–573. IEEE.