A New Way to Think About Secure Computation:

Language-based Secure Computation

Florian Kerschbaum

SAP Research, Karlsruhe, Germany

Abstract. Assume two parties, Alice and Bob, want to compute a joint function, but they want to keep their inputs private. This problem setting and its solutions are known as secure computation. General solutions to secure computation require the construction of a binary circuit for the function to be computed. This paper proposes the concept of language-based secure computation. Instead of constructing a binary circuit, program code is directly translated into a secure computation protocol. This concept is compared to the approaches for language-based information-flow security, and many connections between the two approaches are identified. The major challenge in this translation is the secure translation of the program's control-flow without leaking private information via a timing channel. The paper presents a method for translating an if statement with a secret branching condition that may not be known to any party. Furthermore, that protocol can be optimized using trusted computing, such that the overall performance of a program executed as a secure computation protocol can be greatly improved.

1 Introduction

Assume two companies each have a sales database and they are interested in identifying

common patterns using data mining techniques, but are afraid to reveal their database,

since it contains information that provides them with a competitive advantage. In another scenario, assume that a multitude of companies gathers at a provider and would like to benchmark their performance, but they are afraid to reveal their performance indicators, since doing so could reveal their processes' weaknesses and open points of attack to competitors. Both of these scenarios are of high interest in the business world and both of them can be solved with the same technique. Secure computation allows two or more parties to compute a common function such that all parties receive the result, but keep their inputs private (except what can be inferred from the result).

In the ﬁrst scenario the inputs are the databases and the common function is the data

mining technique, and in the second scenario the inputs are the performance indicators

and the function is some statistical function, e.g. average, computed over them.

The basis for secure computation is to express the common function to be com-

puted as a binary circuit. Then each gate of the binary circuit can be executed securely

and privately with a secure gate protocol. One can prove by induction that a binary circuit, even one consisting only of exclusive-or and logical-and gates, exists for any binary

input-/output-behavior. This generality leads to the generality of the solution, since any

function can now be computed with a binary circuit.
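To illustrate this generality claim, the following minimal Python sketch builds NOT, OR and a one-bit "greater-than" from exclusive-or and logical-and gates only. The function names are chosen here for illustration and do not come from the paper; the constant 1 is assumed to be available as a public input.

```python
from itertools import product

def xor_gate(a: int, b: int) -> int:
    """Exclusive-or of two bits."""
    return a ^ b

def and_gate(a: int, b: int) -> int:
    """Logical and of two bits."""
    return a & b

def not_gate(a: int) -> int:
    # NOT a = a XOR 1, where 1 is a public constant.
    return xor_gate(a, 1)

def or_gate(a: int, b: int) -> int:
    # a OR b = (a XOR b) XOR (a AND b)
    return xor_gate(xor_gate(a, b), and_gate(a, b))

def greater_than_1bit(a: int, b: int) -> int:
    # For single bits, a > b holds exactly when a = 1 and b = 0.
    return and_gate(a, not_gate(b))

if __name__ == "__main__":
    # Print the truth tables of the derived gates.
    for a, b in product((0, 1), repeat=2):
        print(a, b, or_gate(a, b), greater_than_1bit(a, b))
```

Larger functions, such as a multi-bit comparison, can be composed from these gates in the same way.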

Kerschbaum F. (2007).

A New Way to Think About Secure Computation: Language-based Secure Computation.

In Proceedings of the 5th International Workshop on Security in Information Systems, pages 33-42

DOI: 10.5220/0002423300330042

 SciTePress

This paper suggests a new method of constructing secure computation protocols:

language-based secure computation. The basic idea is to compile a secure computation

protocol directly from the programming language and in doing so exploit the techniques

used in the manual construction of specialized protocols by automating them. This ap-

proach carries the potential for greater speed of the secure computation protocols, as

well as increased ﬂexibility in specifying (programming) them. E.g. one could allow

loops with a public loop condition, use a variable number of inputs from the parties (as

long as this number is known to both parties) and use specialized protocols improving

the performance, e.g. for strings.

The contributions of this paper are, besides the introduction of the concept of language-based secure computation, the analysis of the main challenges that need to be overcome in realizing the approach and the investigation of one major problem identified. This paper presents a protocol for securely and privately computing an if statement where the result of the condition expression may not be known to either party, and then presents an improvement in running time using trusted computing.

The structure of the paper is as follows: The next section will review related work in

more detail giving the necessary references. Section 3 describes the concept of language-

based secure computation. Section 4 elaborates on the problem of translating the control-

ﬂow of a program and section 5 introduces the problem of an if statement with secret

condition, as well as its solutions, the regular one and the optimized one using trusted

computing. The conclusions are presented in section 6.

2 Related Work

2.1 Secure Computation

Secure computation was introduced in [23], which posed the problem of computing a joint, public function f(a, b) between two parties, Alice and Bob, where Alice privately holds a and Bob privately holds b. The problem was solved for general functions for two parties, but exemplified with the famous Yao's millionaires' problem, in which two millionaires want to compare their wealth but do not want to reveal the exact amount to the other party. The protocol therefore computes the "greater-than" function on private inputs. That paper also introduced the general mechanism of circuit construction for the function f(a, b).

Since then the functions to evaluate each gate in the circuit have been generalized and optimized. Generalizations to the multi-party setting with a malicious attacker have been found in the information-theoretic setting [4] and the cryptographic setting [10]. Many more results have been found in subsequent research on optimizing the protocols for different settings (cryptographic or information-theoretic) and different attackers (semi-honest and malicious); they are not listed here.

The need for more efficient protocols was identified long ago [9], and several such protocols have been developed and published in the literature. The problems considered cover a wide variety, but the data mining community has been particularly active. A landmark paper here was [14], which introduced a privacy-preserving protocol for the ID3 decision-tree algorithm. It uses circuit construction protocols only as sub-protocols and optimizes the overall performance and communication.

2.2 Secure Computation Compilation

The problem of translating a program specifying the function f(a, b) into a secure computation protocol has been addressed in [15]. The paper introduces the FairPlay compiler and programming system, which compiles a programming language into a register-free, feed-forward-only binary circuit that is guaranteed to be oblivious. In addition to arithmetic expressions, it provides the programming language constructs if statements, for loops, functions, arrays and variable assignments. There are some restrictions on these: Any expression can be used in the if condition, but both branches are always evaluated. The limit of a for loop must be static; it cannot even depend on public data. Functions are completely inlined and therefore no recursion is allowed. Basic data types are boolean and integer (ignoring enumerated).

2.3 Language-Based Security

From the vast research on language-based security, including safe programming languages [7, 12], proof-carrying code [17] etc., this paper is most interested in research related to confidentiality policies, especially preventing information flows [20]. An information flow occurs when the contents of one variable are influenced by the contents of another variable [6]. Usually variables are arranged into multiple security levels (which form a lattice) and the objective is to prevent information flow from higher levels to lower levels. Information flows can be direct (e.g. by assignment), implicit (i.e. following from the control structure) or via covert channels (e.g. timing channels). In [21] a type system to prevent direct and implicit flows is introduced which can be statically enforced, i.e. in such a typed language no one can write a program that violates the security policy by direct or implicit flows. JFlow is an extension to the Java language that implements exactly such a type system [16]. It also provides features such as type variables, run-time type checking and type inference to make writing programs easier. Type systems have been extended to also prevent information flows via covert channels. In particular, timing channels and network messages, which are of interest to secure computation, have been addressed in [19]. [1] introduced the idea of cross-copying the branches of an if statement with secret condition.

Secure program partitioning addressed the problem of trusting a host to do the com-

putation [24]. Every party deﬁnes a set of other parties which it trusts to compute with

its data. Then the program is divided, such that only hosts trusted with the data do the

computation. This ensures that data may only be compromised by these hosts and that

untrusted hosts do not get access to the data. Nevertheless, unlike secure computation, it requires a trusted (third) party to exist to do the computation; otherwise the program cannot be compiled in accordance with the policy. Secure computation actually intends to replace that third party and perform the computation between mutually distrusting parties.

3 Concept

3.1 Data Classiﬁcation

Similarly to the classiﬁcation in information ﬂow, we assign each variable in the com-

mon program source a label. This label is the set of parties that know or may know

the contents of the variable. E.g. in two-party protocols (i.e. Alice and Bob) the labels

are: $\langle Alice, Bob \rangle$, $\langle Alice \rangle$, $\langle Bob \rangle$, $\langle \rangle$. This naturally extends to multi-party computation and forms a lattice (as required for many analysis methods). In the remainder of

the paper we will refer to variables and data with three adjectives:

1. public: known to all parties, e.g. with label $\langle Alice, Bob \rangle$ in the two-party case.

2. private: known to one party, e.g. with label $\langle Alice \rangle$ or $\langle Bob \rangle$.

3. secret: known to no party, i.e. with label $\langle \rangle$.
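The classification can be sketched in a few lines of Python for the two-party case. Labels are modelled as frozen sets of party names, so the lattice order is plain set inclusion. The names `PARTIES`, `classify` and `result_label` are hypothetical, and the rule that a result is known only to parties knowing both operands is an assumption made for this illustration, not a rule stated in the paper.

```python
from typing import FrozenSet

# The two parties of the protocol (two-party case only).
PARTIES: FrozenSet[str] = frozenset({"Alice", "Bob"})

def classify(label: FrozenSet[str]) -> str:
    """Map a label (set of parties that may know a variable) to an adjective."""
    if not label <= PARTIES:
        raise ValueError("label contains an unknown party")
    if label == PARTIES:
        return "public"
    if not label:
        return "secret"
    return "private"

def result_label(x: FrozenSet[str], y: FrozenSet[str]) -> FrozenSet[str]:
    # Assumption for this sketch: only a party that knows both operands
    # may know the result, so the labels are intersected.
    return x & y

if __name__ == "__main__":
    alice = frozenset({"Alice"})
    bob = frozenset({"Bob"})
    print(classify(PARTIES), classify(alice), classify(result_label(alice, bob)))
```

Under this assumed rule, combining a value private to Alice with a value private to Bob yields a secret value, which is the case the if protocol of Section 5.1 has to handle.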

3.2 Compilation Process

The compilation process takes a common program source and translates it into two (or

more for multi-party settings) protocol programs. The protocol programs then execute a

secure computation protocol. As described in previous sections such a tool exists [15],

but due to its complicated translation process, it is inﬂexible and the result often lacks

performance. The proposed approach to translation adopts ideas from compiler writing

and programming language research. Its model of computation is based on common

programming languages which leads to higher ﬂexibility and performance.

Each party in the computation has a set of variables in which it stores intermediate

computation results. These can be additional variables introduced by the compilation

process or correspond to variables in the common program source. Compared to regular

compilation we can ignore register allocation and spilling if we compile the protocol

programs in another high-level source language.

Then the common program source is translated into building block protocols. These

building block protocols correspond to the machine instructions of regular compilation.

The translation can be done via an intermediate language, if that eases translation. There

should be building block protocols for all operations and statements, such as +, −, etc.,

and for each possible assignment of data classifications. I.e. an operation on $x_{\langle Alice \rangle}$ alone is translated into a different building block protocol than $x_{\langle Alice \rangle} + y_{\langle Bob \rangle}$. This paper will present the building block protocol for if on a secret condition in Section 5.1.

4 Control-ﬂow

This section will highlight a major challenge in translating programs into secure com-

putation protocols. A program as written in a programming language has a control-ﬂow.

The instantiation of the control-ﬂow, the ﬂow of a particular program run, may depend

on the input data of the program. The problem is that if the control-ﬂow supports ex-

ecuting a particular basic block a variable number of times (depending on input data),

then the number of executions of that basic block “leaks” information about the input.

In order to obtain that information, the attacker needs to observe the control-ﬂow of the

program. He can do that in two ways:

– Locally, by inspecting the program counter (debugging or emulating the program).

– Remotely (and locally), by timing the program.
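The leak can be made concrete with a small Python sketch. The function below is a hypothetical example, not taken from the paper: it computes the parity of a secret integer with a loop whose iteration count equals the bit length of the secret. The returned counter stands in for what an attacker learns by inspecting the program counter or by timing the run.

```python
from typing import Tuple

def parity_with_trace(secret: int) -> Tuple[int, int]:
    """Return (parity of the set bits, number of loop iterations)."""
    parity = 0
    iterations = 0
    while secret > 0:
        # This basic block runs once per bit of the secret, so the number
        # of executions depends on the input data.
        parity ^= secret & 1
        secret >>= 1
        iterations += 1
    return parity, iterations

if __name__ == "__main__":
    # The second component reveals the magnitude of the secret even though
    # the intended output is only a single parity bit.
    print(parity_with_trace(5))
    print(parity_with_trace(1024))
```

An observer who sees only the iteration count can already bound the secret to a range of values, which is more than the parity bit alone reveals.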

A key insight is that the control-ﬂow of the resulting secure computation proto-

col programs is a transform of the common program. This transform is public and

reversible, i.e. if an attacker is able to inspect the program counter of the protocol

program, he can infer a virtual program counter in the common program. One could

imagine that the control-ﬂow could be split between the two parties, such that only one

party has this equivalence, but this approach only works for a limited set of programs

where that one party may indeed know the control-flow. Moreover, for interactive protocol programs, the control-flow of both protocol programs has to correspond, since each message sent must be received by the corresponding receive operation. This enables both parties in an interactive protocol to track the control-flow of the common program.

Control-ﬂow obfuscation [22] intends to make the transform hard to analyze. Ex-

cellent results can be obtained against a static adversary, but a dynamic adversary that

is able to execute the program in a debugger or emulator is much more powerful and no

theoretically founded security results exist.

5 if on Secret Data

An important control-flow problem is a simple if statement on secret data, i.e. neither Alice nor Bob may know the result of the branching condition, because they could infer information about the inputs that is not inferable from the result. It is anticipated that in a reasonably complex program such if statements are the rule, not the exception.

5.1 if Protocol

The ideas from above can be composed into a language-based protocol for if. We

consider the problem of executing the if statement on secret data, i.e. neither Alice

nor Bob may infer anything about the result. We assume that the problem of computing

operators using language-based protocols has been solved, i.e. such protocols exist.

The ﬁrst observation is that since the condition is secret, due to the control-ﬂow

dependency all assigned variables in the branches are secret as well. Therefore we

will outline how secret variables are to be stored. We use any 2-out-of-2 secret shar-

ing scheme, i.e. both parties need to cooperate to reveal the secret. Operator protocols

are then deﬁned on the shares and will likely need to result in shares again. The actual

choice of secret sharing scheme may depend on the operators used and be optimized to

increase performance, but some candidates are:

– Exclusive-Or

– Modular addition

E.g. Alice may have the value 6 and Bob may have the value 3. The combined 3-bit secret would be $1 = 6 + 3 \bmod 8$. Let $x_A$ be Alice's and $x_B$ be Bob's share of a variable $x$. Then, note that the conditional probability $\Pr[x = c \mid x_A]$ that $x$ has a certain value $c \in D$ given share $x_A$ is equal to the a priori probability $\Pr[x = c]$, i.e. no party gains any additional information from its share:

$$\Pr[x = c \mid x_A] = \Pr[x = c \mid x_B] = \Pr[x = c]$$

We then assume that the condition for the if statement can be evaluated such that the result is shared between Alice and Bob. E.g. let $c$ be a boolean condition (i.e. 0 = false, −1 = true), then from the evaluation of $c$ Alice obtains $c_A$ and Bob obtains $c_B$, such that $c = c_A \oplus c_B$. The protocol for the if statement then reduces to an Oblivious Transfer where one party may switch the inputs and the other retrieves according to his share. This is the same protocol used for the evaluation of a logical and gate in [8]; the difference is in the message being retrieved.

if (c) $b_1$ else $b_0$

Fig. 1. if statement.

The translation of the if statement in figure 1 proceeds as follows:

1. Gather all variables being assigned in the "then" branch $b_1$. Let $V_1$ be this set.

2. Gather all variables being assigned in the "else" branch $b_0$. Let $V_0$ be this set.

3. Compute the union $V = V_1 \cup V_0$ of the two sets.

4. Append to the "then" branch $b_1$ assignments of the form $v = v$ for all $v \in V \setminus V_1$.

5. Append to the "else" branch $b_0$ assignments of the form $v = v$ for all $v \in V \setminus V_0$.

6. In the "then" branch rename any assigned variable $v \in V$ to $v_1$. I.e. the assignment of the form $v = exp$ becomes $v_1 = exp$ and every subsequent use of that variable $v$ in the branch is renamed to $v_1$ as well. Denote the set of renamed variables $v_1$ by $R_1$.

7. In the "else" branch rename any assigned variable $v \in V$ to $v_0$. Similarly rename subsequent uses of that variable and denote the resulting set by $R_0$.

8. To the current translated protocol programs append code for evaluating the branching condition, such that the result is shared. Let Alice obtain $c_A$ and Bob obtain $c_B$ such that the branching condition $c = c_A \oplus c_B$. Ensure that $c_A$ and $c_B$ are fresh variables, i.e. not used anywhere else in the program.

9. Recursively apply the translation to the two branches: first $b_1$, then $b_0$. Append the result of the translation to the end of the current translations. The branches are non-interfering, since they assign to differently renamed variables, i.e. they can be safely executed sequentially. This also implies that both branches will be executed in the final translated protocol.

10. Append code to Alice's protocol program that generates two messages $m_0$ and $m_1$. Let $m_0$ be the concatenation of all assigned variables from $b_0$: $m_0 = v_{0,1}, \ldots, v_{0,n}$ for all $v_{0,i} \in R_0$. Similarly let $m_1$ be the concatenation of all assigned variables from $b_1$: $m_1 = v_{1,1}, \ldots, v_{1,n}$ for all $v_{1,i} \in R_1$. Such a concatenation is called a variable store, since it allows later decomposition.

11. Append code to Alice's protocol program that generates a random number $r_A$ of length $|m_0| = |m_1|$ (the same length as the two messages). Compute two messages $m'_0 = m_0 \oplus r_A$ and $m'_1 = m_1 \oplus r_A$.

12. Append code to Alice's protocol program that generates two messages: $o_0 = m'_{0 \oplus c_A}$ and $o_1 = m'_{1 \oplus c_A}$.

13. Append code in Alice's and Bob's programs that does an Oblivious Transfer of one out of $(o_0, o_1)$. Bob will obtain the message $o_{c_B} = m'_{c_B \oplus c_A} = m_c \oplus r_A$.

14. Repeat steps 10 to 13, but with the roles of Alice and Bob interchanged. Let Alice obtain $o'_{c_A}$ and Bob choose $r_B$. The variable store for Alice's protocol program is then $o'_{c_A} \oplus r_A$ and for Bob's protocol program it is $o_{c_B} \oplus r_B$.

15. The translation continues with the next statement.
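Steps 10 to 14 can be simulated locally with the following Python sketch. It rests on several assumptions that are not part of the paper: both parties run in one process, the Oblivious Transfer is idealized as a plain function call (a real protocol would hide the choice from the sender), condition shares are single bits with 1 meaning true, variable stores are 32-bit integers, and index 1 denotes the "then" branch. All function names are hypothetical.

```python
import secrets
from typing import Tuple

BITS = 32  # assumed width of a variable store

def xor_share(value: int) -> Tuple[int, int]:
    """Split a value into Alice's and Bob's XOR shares."""
    a = secrets.randbits(BITS)
    return a, value ^ a

def oblivious_transfer(messages: Tuple[int, int], choice: int) -> int:
    # Idealized 1-out-of-2 OT: the receiver obtains messages[choice] only.
    return messages[choice]

def sender_messages(m0: int, m1: int, c_share: int, r: int) -> Tuple[int, int]:
    # Step 11: mask both variable stores with the same random value r.
    masked = (m0 ^ r, m1 ^ r)
    # Step 12: switch the two messages according to the sender's share.
    return masked[0 ^ c_share], masked[1 ^ c_share]

def secure_if(c_a: int, c_b: int,
              store_a: Tuple[int, int],
              store_b: Tuple[int, int]) -> Tuple[int, int]:
    """store_x = (share of else store m_0, share of then store m_1)."""
    r_a = secrets.randbits(BITS)
    r_b = secrets.randbits(BITS)
    # Step 13: Alice sends, Bob retrieves according to his share c_b.
    o = oblivious_transfer(sender_messages(store_a[0], store_a[1], c_a, r_a), c_b)
    # Step 14: roles interchanged, Alice retrieves according to c_a.
    o_prime = oblivious_transfer(sender_messages(store_b[0], store_b[1], c_b, r_b), c_a)
    # New shares of the selected variable store for Alice and Bob.
    return o_prime ^ r_a, o ^ r_b

if __name__ == "__main__":
    else_a, else_b = xor_share(111)
    then_a, then_b = xor_share(222)
    new_a, new_b = secure_if(1, 0, (else_a, then_a), (else_b, then_b))
    print(new_a ^ new_b)  # reconstructs the store of the selected branch
```

The two random masks cancel when the resulting shares are combined, so the parties end up with fresh shares of the selected variable store while neither learns which branch was selected.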

5.2 Trusted Computing Solution

The problem with the previous solution is running time. For every if statement both branches are always executed. Let $O(b_0)$ and $O(b_1)$ be the running times of branches $b_0$ and $b_1$, respectively. Then the running time of the if statement is the sum $O(b_0) + O(b_1) + \delta$ ($\delta$ is the time to execute the common code).

The motivation for executing both branches in secure computation derives from the

attacker’s ability to inspect the program at run-time, i.e. debugging or emulating. This

enables him to trace through the program and determine the branch taken and, since

the compilation process is public, he can determine the branch in the source program

and the result of the condition. This contradicts the secrecy requirement of the condi-

tion. Now, if we remove the ability to inspect the program from the attacker, can we do

better? If the program is executed in a trusted computing processor, it can no longer be

inspected by anyone. A trusted computing processor is capable of receiving encrypted

code and executing it privately, such that no one can inspect it. For this purpose it pub-

lishes a public key, such that software providers can create encrypted versions of their

programs and securely deliver them to the clients. Nevertheless the trusted computing

processor does not remove all side-channels an attacker can observe. Most notably, the

program’s timing is still observable and may reveal information about the input. In the

only proposed solution for secure computation using trusted computing [3] this has been

recognized, but not solved. The authors assume that the computation is “oblivious” for

which the current solution is to use binary circuits which also execute both branches. In

this section we will outline an algorithm that can achieve better performance.

First, recall the observations that can be made about a program to infer private in-

puts:

– messages

– timing

Two trusted computing processors can communicate conﬁdentially, if they know the

public key of the other processor or share a common trusted certiﬁcate authority. In this

case, they can use any session key establishment protocol to generate a private session

key they can use to hide the content of the messages. The content is then not observable

to an attacker. The length of the message can be padded to a common maximum length,

such that it does not reveal any information either. The only information that an attacker

can gain from a message is the fact that it is sent and when, i.e. its timing. We assume

that all building block protocols employ these techniques (padding and encryption).

The approach to secure the timing of the computation can be similar to padding

the length. In the simplest case, one computes the maximum time a computation can

take, then measures the time it actually takes and idles the rest before returning the re-

sult. This can be a very difﬁcult approach, since even elementary operations, such as

multiplication, may not take a uniform time to execute [13], and cache timing can de-

pend on the access pattern of private data [2]. Our algorithm assumes that each building

block protocol used in compiling the source program is oblivious, i.e. it is secure and

of constant time.
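The simplest padding approach described above can be sketched in Python as follows. This is only an illustration under the assumption that a reliable upper bound on the running time is known; `run_padded` is a hypothetical helper, and, as noted above, real implementations face further difficulties such as non-uniform instruction and cache timing.

```python
import time

def run_padded(fn, max_seconds: float, *args):
    """Run fn(*args) and idle until max_seconds have elapsed in total."""
    start = time.monotonic()
    result = fn(*args)
    elapsed = time.monotonic() - start
    if elapsed > max_seconds:
        # The assumed bound was wrong; returning now would leak timing.
        raise RuntimeError("computation exceeded its assumed time bound")
    # Idle for the remainder so that the total time is input-independent.
    time.sleep(max_seconds - elapsed)
    return result

if __name__ == "__main__":
    begin = time.monotonic()
    value = run_padded(pow, 0.05, 3, 200, 1009)
    print(value, time.monotonic() - begin >= 0.045)
```

The caller observes roughly the same delay regardless of the arguments, at the price of always paying the worst-case time.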

For each building block protocol $P$ construct a corresponding dummy protocol $\bar{P}$ that has the same observable behaviour (i.e. timing and messages), but no effect on the computation result. This can e.g. be achieved by the above variable renaming (as in the if protocol), but then not using the result values. Assign each building block protocol a unique element $P_1, \ldots, P_k$. Several building block protocols may have the same element as long as they have the same observable behaviour, e.g. protocols for computing the product or integer division of two secret inputs. The key is that for any protocol in $P_i$ neither Alice nor Bob can differentiate which it is, nor differentiate it from $\bar{P}_i$. We write $P \equiv \bar{P}$ to denote that $P$ and $\bar{P}$ are indistinguishable (which includes encryption and padding as mentioned above).

The translation procedure for the if statement on secret condition is then:

1. Translate each branch $b_i \in \{b_0, b_1\}$.

2. For each branch $b_i \in \{b_0, b_1\}$ of the secure if statement, compute a sequence $S_i = s_{i,0}, \ldots, s_{i,n}$ of the building block protocols $P \in \{P_1, \ldots, P_k\}$ used. Then compute the supersequence $S = s_0, \ldots, s_l$ of $S_0$ and $S_1$. This can be done in time $O(nm)$, the product of the lengths of the two sequences.

3. Fill each branch with dummy protocols $\bar{P} \in \{\bar{P}_1, \ldots, \bar{P}_k\}$, such that they match the supersequence. I.e. for each symbol $s_j$ of the supersequence have a matching symbol $t_j$ with $t_j \equiv s_j$. Note that the result of the computation remains unaffected, but both branches have an observable behaviour identical to $S$.

4. Create code for a protocol $C$ that is a regular if statement with the two branches $b_i$ padded to $S$. Obviously the condition needs to be evaluated securely. Let $C_A$ be the code for Alice's side of the protocol and $C_B$ for Bob's.

5. Encrypt $C_A$ with the public key of Alice's trusted computing processor ($E_A(C_A)$) and $C_B$ with Bob's ($E_B(C_B)$).

6. Insert code in Alice's protocol program to execute $E_A(C_A)$ in Alice's trusted computing processor and similarly in Bob's protocol program for $E_B(C_B)$.

The computation complexity of this protocol is linear in the length $|S|$ of the supersequence, which is bounded by $\max(|S_0|, |S_1|) \le |S| \le |S_0| + |S_1|$. We can therefore expect some speed-up from this protocol, and only in the worst case will it deteriorate to the performance of the protocol without trusted computing.
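The supersequence computation and the dummy filling can be sketched in Python as follows. The sketch assumes that each building block protocol is identified by a string naming its class of observable behaviour and that a shortest common supersequence is wanted; the function names and the sample protocol names are hypothetical.

```python
from typing import List, Sequence, Tuple

def shortest_common_supersequence(s0: Sequence[str], s1: Sequence[str]) -> List[str]:
    """Dynamic program in O(n*m) time over the two branch sequences."""
    n, m = len(s0), len(s1)
    # length[i][j] = length of a shortest supersequence of s0[i:] and s1[j:]
    length = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n:
                length[i][j] = m - j
            elif j == m:
                length[i][j] = n - i
            elif s0[i] == s1[j]:
                length[i][j] = 1 + length[i + 1][j + 1]
            else:
                length[i][j] = 1 + min(length[i + 1][j], length[i][j + 1])
    # Walk the table to emit one shortest supersequence.
    out: List[str] = []
    i = j = 0
    while i < n and j < m:
        if s0[i] == s1[j]:
            out.append(s0[i]); i += 1; j += 1
        elif length[i + 1][j] <= length[i][j + 1]:
            out.append(s0[i]); i += 1
        else:
            out.append(s1[j]); j += 1
    out.extend(s0[i:])
    out.extend(s1[j:])
    return out

def pad_branch(branch: Sequence[str], superseq: Sequence[str]) -> List[Tuple[str, bool]]:
    """Return (protocol, is_dummy) pairs so the branch matches superseq."""
    padded: List[Tuple[str, bool]] = []
    k = 0
    for symbol in superseq:
        if k < len(branch) and branch[k] == symbol:
            padded.append((symbol, False))  # real building block protocol
            k += 1
        else:
            padded.append((symbol, True))   # dummy with the same behaviour
    if k != len(branch):
        raise ValueError("superseq is not a supersequence of branch")
    return padded

if __name__ == "__main__":
    then_branch = ["add", "mul", "cmp"]
    else_branch = ["mul", "add", "cmp"]
    s = shortest_common_supersequence(then_branch, else_branch)
    print(s, pad_branch(then_branch, s), pad_branch(else_branch, s))
```

Both padded branches expose the same sequence of observable protocols, and the length of that sequence lies between the length of the longer branch and the sum of both lengths.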

6 Conclusion

The concept of language-based secure computation was introduced. The major challenge of securely translating the control-flow was exemplified with the secure if statement protocol, and the advantages of language-based secure computation were shown by an optimization of that protocol that requires the if statement to be translated directly. Many of the outlined challenges, e.g. comprehensive proofs and other control-flow problems, such as for loops with secret bounds, remain to be solved and are the subject of future research.

References

1. J. Agat. Transforming out timing leaks. Proceedings of the ACM Symposium on Principles

of programming languages, 2000.

2. J. Agat, and D. Sands. On Conﬁdentiality and Algorithms. Proceedings of the IEEE Sympo-

sium on Security and Privacy, 2001.

3. Z. Benenson, F. Gärtner, and D. Kesdogan. Secure Multi-Party Computation with Security Modules. Proceedings of SICHERHEIT, 2005.

4. M. Ben-Or, and A. Wigderson. Completeness theorems for non-cryptographic fault-tolerant

distributed computation. Proceedings of the 20th ACM symposium on theory of computing,

1988.

5. D. Brumley, and D. Boneh. Remote Timing Attacks Are Practical. Proceedings of the

USENIX security symposium, 2003.

6. D. Denning. A lattice model of secure information ﬂow. Communications of the ACM 19(5),

1976.

7. C. Fournet, and A. Gordon. Stack Inspection: Theory and Variants. Proceedings of the 29th

ACM symposium on principles of programming languages, 2002.

8. O. Goldreich. Secure Multi-party Computation. Available at

www.wisdom.weizmann.ac.il/˜oded/pp.html, 2002.

9. S. Goldwasser. Multi party computations: past and present. Proceedings of the 16th ACM

symposium on principles of distributed computing, 1997.

10. O. Goldreich, S. Micali, and A. Wigderson. How to play any mental game. Proceedings of

the 19th ACM conference on theory of computing, 1987.

11. O. Goldreich, and R. Ostrovsky. Software protection and simulation on oblivious RAMs.

Journal of the ACM, 1996.

12. J. Gosling, B. Joy, and G. Steele. The Java Language Speciﬁcation. Addison-Wesley, 1996.

13. P. Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. Proceedings of CRYPTO, 1996.

14. Y. Lindell, and B. Pinkas. Privacy Preserving Data Mining. Proceedings of CRYPTO, 2000.

15. D. Malkhi, N. Nisan, B. Pinkas, and Y. Sella. Fairplay - A Secure Two-party Computation

System. Proceedings of the USENIX security symposium, 2004.

16. A. Myers. JFlow: Practical Mostly-Static Information Flow Control. Proceedings of the ACM

Symposium on Principles of Programming Languages, 1999.

17. G. Necula, and P. Lee. Safe Kernel Extensions Without Run-Time Checking. Proceedings of

USENIX Symposium on Operating Systems Design and Implementation, 1996.

18. M. O. Rabin. How to exchange secrets by oblivious transfer. Technical Memo TR–81, Aiken

Computation Laboratory, 1981.

19. A. Sabelfeld, and H. Mantel. Static conﬁdentiality enforcement for distributed programs.

Proceedings of the Symposium on Static Analysis, 2002.

20. A. Sabelfeld, and A. Myers. Language-Based Information-Flow Security. IEEE Journal on

selected areas in communications 21(1), 2003.

21. D. Volpano, G. Smith, and C. Irvine. A sound type system for secure ﬂow analysis. Journal

of Computer Security 4(3), 1996.

22. C. Wang, J. Davidson, J. Hill, and J. Knight. Protection of Software-based Survivability

Mechanisms. Proceedings of the international conference of dependable systems and net-

works, 2001.

23. A. Yao. Protocols for Secure Computations. Proceedings of the IEEE Symposium on foun-

dations of computer science 23, 1982.

24. S. Zdancewic, L. Zheng, N. Nystrom, and A. Myers. Secure program partitioning. ACM

Transactions on Computer Systems 20(3), 2002.