UPCARE: User Privacy-Preserving Cancer Research Platform

Georg Bramm (1, a), Melek Önen (2, b), Martin Schanzenbach (1, c), Ilya Komarov, Frank Morgner, Christian Tiebel and Juan Cadavid (5, d)

1 Fraunhofer AISEC, Lichtenbergstraße 11, 85748 Garching bei München, Germany
2 EURECOM, Campus Sophia Tech, 450 Route des Chappes, 06410 Biot, France
3 Bundesdruckerei GmbH, Kommandantenstraße 18, 10969 Berlin, Germany
4 IDG Institut für Digitale Gesundheitsdaten RLP, Große Bleiche 46, 55116 Mainz, Germany
5 Softeam Docaposte, 110 Esplanade du Général de Gaulle 92931 Paris, France

Keywords:

Distributed Analytics, Medical Data Management, Privacy-Preserving Computation, Cryptography,

Homomorphic Cryptography, Attribute Based Cryptography.

Abstract:

Cancer research has entered a new era with the advent of big data and advanced computational analytics.

However, the utilization of such medical data poses signiﬁcant privacy and security challenges. This paper

presents a comprehensive examination of User Privacy-preserving Cancer Research Platform (UPCARE), a

research platform that enables the secure and ethical sharing of sensitive medical cancer data for collaborative

researchers, while safeguarding patient privacy. To our knowledge, only a few approaches have been pursued

so far in building a uniform cancer research platform protected by modern cryptography. We try to provide such a
uniform platform for research. UPCARE leverages cutting-edge cryptographic access methods, like attribute-based
encryption, as well as data anonymization techniques, like multiparty homomorphic encryption, to allow

secure data sharing with researchers, ensuring compliance with stringent regulatory requirements. This paper

discusses the architecture, methodologies, and applications of UPCARE, highlighting its potential to improve

cancer research and accelerate advancements in precision medicine while preserving user privacy.

1 INTRODUCTION

Cancer research has made remarkable strides in recent

years, driven by advancements in genomics,

computational biology, and data analytics. However,

these exciting developments come with a unique set

of challenges, particularly concerning the privacy and

security of sensitive medical data, especially in the

case of cancer data. Up until now, the data silos in
the different cancer registries have been very heterogeneous
in nearly all aspects, such as the software they use,
and are therefore nearly impossible to combine.
The critical need to balance data sharing for

collaborative research and patient privacy protection

has never been more evident.

The conventional methods of data sharing and collaboration
in the medical and research community are
often fraught with privacy risks, compliance issues,
and ethical dilemmas. Striking the right balance between
unlocking the full potential of data-driven cancer
research and safeguarding the privacy and rights
of individuals is a complex and pressing challenge.

a https://orcid.org/0000-0002-9020-5856
b https://orcid.org/0000-0003-0269-9495
c https://orcid.org/0000-0001-6153-504X
d https://orcid.org/0009-0002-3062-401X

Furthermore, the failure to address privacy and

security concerns can lead to data breaches, reputa-

tional damage, and legal consequences for research

institutions. In an interconnected world, where data

breaches are a constant threat, safeguarding sensitive

health data is paramount.

1.1 Regulatory Context

The signiﬁcance of privacy in cancer research extends

far beyond legal and ethical obligations. While strict

regulatory frameworks such as the Health Insurance

Portability and Accountability Act (HIPAA) and the

General Data Protection Regulation (GDPR) enforce

the protection of patients’ privacy, the ethical and societal
implications are equally profound. Additionally,
use cases need to respect national regulations,
which specify what kind of information needs to be
collected and how it should be made available to the public
to allow cross-sectoral impact of the cancer
research. These regulations also safeguard the division of work and
responsibilities between the different cancer registries as
well as the entities communicating with the registries.

Bramm, G., Önen, M., Schanzenbach, M., Komarov, I., Morgner, F., Tiebel, C. and Cadavid, J.
UPCARE: User Privacy-Preserving Cancer Research Platform.
DOI: 10.5220/0012718100003767
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 21st International Conference on Security and Cryptography (SECRYPT 2024), pages 52-63
ISBN: 978-989-758-709-2; ISSN: 2184-7711

1.2 Research Objectives

The primary objective of this paper is to present

a holistic examination of UPCARE, which was de-

signed to address the privacy and security challenges

associated with cancer research. This platform aims

to facilitate the secure and ethical sharing of sensi-

tive cancer data among researchers while ensuring

compliance with stringent regulatory requirements.

Speciﬁcally, the research objectives include:

• Exploring the challenges and privacy risks in can-

cer research data sharing and computing.

• Assessing the potential use cases and applications

of the platform.

• Describing the architecture and functionalities of

UPCARE.

• Evaluating the cryptographic security and privacy
measures taken to ensure compliance with
stringent regulatory requirements.

• Outlining some future directions and challenges

in the ﬁeld of privacy-preserving computation in

the area of cancer research.

The goal of UPCARE is to break down the restric-

tive data silos within the cancer registry landscape

and enable the combined usage of existing data sets

while preserving patient privacy. A user-centered ac-

cess decision and consent process should also allow

patients to make purpose-speciﬁc decisions and avoid

giving blanket data access. Instead of reinventing the

concept of a cancer registry from the ground up or

making unrealistic assumptions during the restructur-

ing of current processes and organizations, UPCARE

adopts a pragmatic perspective and respects the ex-

isting standards and processes. The focus is on re-

searching, developing, and demonstrating novel cryp-

tographic technologies that make personal data usable

while simultaneously protecting individuals’ privacy.

We support two different research scenarios, which
are described in detail in Section 3.

2 RELATED WORK

Projects related to the efforts undertaken in UPCARE
include the national Network Genomic Medicine
(nNGM) Lung Cancer, a collaborative research alliance
of the German Oncology Centers, and the
German Biobank Node (BBMRI). Such medical research
platforms usually face the same problem as
UPCARE: heterogeneous data silos must be integrated
and a consolidated data pool must be provided
to researchers who want to perform any kind of data
analysis as part of their investigations. All platforms
fall under one or another data protection regulation
whenever they are actually operated in the real
world. While tool boxes exist to support
data protection, such efforts primarily address regulatory
compliance and consent management. What
makes UPCARE unique is its use of state-of-the-art
cryptographic mechanisms to ensure data privacy for
patients, which also minimizes the attack surface.

One of the foundational works that introduced the

concept of privacy-preserving data mining and dis-

cussed methods for protecting individual privacy in

the context of data analysis was written by Agrawal

et al. (Agrawal and Srikant, 2000). Academic ap-

proaches in the literature are often based on the use

of a blockchain, like the general architecture that was

introduced by Yue et al. (Yue et al., 2016). It uti-

lizes a blockchain to enable patients to own, control

and share their data easily and securely without vio-

lating privacy, which provides a new potential way to

improve the intelligence of healthcare systems while

keeping patient data private. Some approaches try to
combine blockchain usage with attribute-based encryption
(Pournaghi et al., 2020). Other approaches
seem to be based solely on attribute-based encryption
(Barua et al., 2011; Narayan et al., 2010).

Geva et al. (Geva et al., 2023) tried to access

cancer registry data in a privacy preserving man-

ner through the use of Homomorphic Encryption

(HE). The authors present a tool set for collaborative

privacy-preserving analysis of oncological data using

multiparty fully homomorphic encryption (FHE), al-

though most of the available solutions in literature de-

scribe very detailed and speciﬁc use cases (Son et al.,

2021). We designed UPCARE from the ground up in
such a way that the utilized ABE scheme can be
easily swapped, in order to test different candidates
regarding performance and efficiency, such as the schemes of
Bethencourt et al. (Bethencourt et al., 2007), Sahai et al.
(Sahai and Waters, 2008), Lewko et al. (Lewko and
Waters, 2011), Müller et al. (Müller et al., 2009) and
Bramm et al. (Bramm et al., 2018). Because of its
stable performance and specific design decisions, like
its open world attribute setup, FAME in the KP-ABE
version by Agrawal et al. (Agrawal and Chase, 2017)
was chosen to be integrated into the platform. Thanks
to the foundational work of Gentry (Gentry,
2009) and others (Van Dijk et al., 2010), schemes like
CKKS could be developed. This project makes extensive
use of Lattigo for HE computations, which implements
a multi-party variant of the CKKS scheme by
Cheon et al. (Cheon et al., 2017).

Footnotes for the projects and tool boxes mentioned above:
nNGM: https://nngm.de/en/
BBMRI: https://www.bbmri.de/
Data protection tool box: https://www.tmf-ev.de/unsere-arbeit/projekte/magic

Compared to the approaches mentioned
above, UPCARE tries to unify and simplify two
privacy-preserving cryptographic techniques, namely
Attribute-Based Encryption (ABE) and Homomorphic
Encryption (HE), in one overall architecture. This
further improves privacy-preserving data access to
cancer registries for eligible researchers.

3 USE CASES

The following two use cases exemplify UPCARE’s
commitment to cutting-edge privacy-preserving technologies
and show how privacy-friendly access to cancer
registry data is possible. In the first scenario, UPCARE
employs multiparty Homomorphic Encryption
(HE) to enable secure collaboration among multiple
cancer research institutions. This approach allows
researchers to analyze encrypted data without compromising
individual patient details, fostering collective
insights while preserving confidentiality. In the
second use case, Attribute-Based Encryption (ABE),
coupled with a distinct “patient disagreement” attribute,
empowers patients to exercise granular control
over the sharing of their personal cancer data.
This user-centric approach allows patients to make
purpose-specific decisions by attaching conditions to
their data. This ensures that the access that was
initially granted by implicit patient consent
can be withdrawn by any patient on request. Together,
these use cases demonstrate UPCARE’s pragmatic
yet innovative approach to unlocking the potential
of cancer registry data without compromising
individual privacy.

3.1 Use-Case 1 (UC-1): Secure

Multi-Institutional Data

Collaboration

This use case ensures that sensitive patient informa-

tion remains conﬁdential throughout the collaborative

research process, allowing institutions to glean valu-

able insights collectively without compromising indi-

vidual privacy.

Data Collaboration Setup. Each participating

institution encrypts its cancer registry data using

a customized public key. The encrypted data

is shared with a processing entity responsible for

performing aggregated analyses, the query platform.

Secure Data Analysis. Thanks to HE, the query platform can
perform computations directly on the encrypted data,
eliminating the need for decryption and safeguarding
individual patient details. This approach enables
aggregated statistical analyses and patterns to be
derived without ever exposing sensitive information.
Consequently, the platform remains completely
data-blind, upholding both patients’ rights as data
subjects and cancer registry managers’ confidentiality
obligations.

Privacy-Preserving Results. The ﬁnal results are

sent back to the requesting researchers in an en-

crypted form. Only authorized researchers with the

necessary decryption keys can access the detailed re-

sults and decrypt them.

3.2 Use-Case 2 (UC-2): Patient-Driven

Data Sharing Control

This use case empowers patients to actively partic-

ipate in and control the sharing of their health data

for research purposes, contributing to a more trans-

parent and patient-centric approach within the cancer

research ecosystem.

Attribute-Based Encryption Setup. Patient

data is encrypted using attribute-based encryption.

Attributes include medical information, research categories,
and a specific “patient disagreement” attribute.

User-Centric Data Sharing Decision. Patients are

presented with requests for data access, specifying

the intended purpose (e.g., research study, clinical

trial). Patients can attach conditions to their data,
such as time limitations or restricted access to a specific
set of attributes.

Selective Data Disclosure. The encrypted data
may only be decrypted by researchers who meet the
specified policy-based conditions, i.e., where the
attributes match the policy. A patient disagreement
attribute acts as an additional layer of patient access
control, ensuring that data is accessed only under implicit
patient consent and never against an explicit patient
opt-out.

SECRYPT 2024 - 21st International Conference on Security and Cryptography

[Figure 1 diagram. Layers: 1. User Agent / Browser; 2. Authorization layer; 3. Query Platform; 4. Privacy Proxies in Cancer Registries. Labeled elements include the Key Server, an Identity Provider (OpenID Connect), the researcher portal (web front-end), the ABE encryption module, the PP-Agg aggregation module, the PP-Agg reveal module, the ABE decryption client, a FHIR data storage module, and the existing cancer register data management system.]

Figure 1: The UPCARE platform components, composed of an authorization layer with a KS, a central QP and one or more
PPs attached to existing cancer registries.

Audit Trail and Transparency. A transparent au-

dit trail is maintained, detailing when and how patient

data is accessed. Patients have visibility into who ac-

cessed their data and for what purpose.

4 UPCARE PLATFORM

The success of the UPCARE platform hinges on the
interplay of multiple pivotal components: the Query
Platform (QP), the Key Server (KS), and the Privacy Proxies
(PPs). The QP serves as the central point for researchers
and institutions, providing a secure interface
for querying and analyzing cancer registry data
without compromising individual privacy. Complementing
this, the KS acts as the guardian of Attribute-Based
Encryption (ABE) key material, ensuring secure
and authorized access to encrypted data. Meanwhile,
the PPs are the owners of the data requested by
the researchers. In specific applications like those explored
in UPCARE, these PPs would be deployed by cancer
registries, at either national or regional level, within their
data infrastructure. Functioning as vigilant intermediaries,
they act as safe gates to the registries’ data
management systems, as illustrated in Figure 1. Together,
these components form a robust and interconnected
framework, embodying UPCARE’s commitment
to fostering collaborative research while safeguarding
the sensitive health information that propels
advancements in cancer research.

4.1 Components

The UPCARE platform is composed of three main

components, namely a Query Platform (QP), a Key

Server (KS) and two or more Privacy Proxies (PP).

4.1.1 Query Platform (QP)

The central QP is responsible for accepting UC-1 or

UC-2 query requests by authorized researchers via

a web interface. It is also responsible for presenting

the query result to the researcher. Its web interface

needs to be able to run the cryptographic
algorithms and protocols required for each use

case. UC-1 requires the web interface to be able to

decrypt ABE ciphertexts with a given secret key. In

comparison, UC-2 requires the web interface to be

able to generate HE key material, as well as decrypt

HE ciphertexts.

UC-1 Queries. UC-1 queries are accompanied by ABE policies.
An ABE policy is dynamically generated and fetched
from the KS, in the form of a secret key, on each reload
of or interaction with the web interface. The secret
key is used to decrypt the results of the UC-1 query
request. The request itself is based on plaintext JSON
and is defined by a selection, a projection, an upper
limit and a choice of PPs. On the left side of the
UI, the selection, projection and choice of PPs can be
configured for each query. At the bottom of the UI,
the temporarily generated key material is seamlessly
integrated into the web interface.

UC-2 Queries. UC-2 queries are based on Multiparty Homomorphic
Encryption (MHE) (Mouchet et al., 2021). A
temporary researcher key pair, composed of a public
and a private key, is first derived
from a common random polynomial, given as a reference
string. The common random polynomial is provided
by the QP. The public key is used to re-encrypt
(through so-called key switching) the final result of
the UC-2 query request to the researcher, so that they
can later decrypt the data with the corresponding
private key. The request itself is based on plaintext
JSON and is defined by a selection, a projection, a
computation method, and an upper limit of results. As
we want to protect the privacy of the patients, the researcher
has no information about, and no choice of, which
PPs are going to be used in a UC-2 query. Again,
at the bottom of the UI the temporarily generated key
material is seamlessly integrated into the web interface.
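As a rough illustration of the two request shapes described above, the following Python sketch builds and validates such JSON requests. The field names (`use_case`, `selection`, `projection`, `limit`, `privacy_proxies`, `mode`), the selection syntax and the proxy identifiers are assumptions made for this example; they are not taken from the UPCARE implementation.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Optional


@dataclass
class QueryRequest:
    """Hypothetical plaintext JSON request, following the textual description."""
    use_case: str                                 # "UC-1" or "UC-2"
    selection: str                                # filter over patient records
    projection: List[str]                         # fields to return or aggregate
    limit: int                                    # upper limit of results
    privacy_proxies: Optional[List[str]] = None   # UC-1 only: chosen PPs
    mode: Optional[str] = None                    # UC-2 only: sum / avg / cnt

    def validate(self) -> None:
        if self.limit <= 0:
            raise ValueError("limit must be positive")
        if self.use_case == "UC-1":
            if not self.privacy_proxies:
                raise ValueError("UC-1 requests name the PPs to query")
            if self.mode is not None:
                raise ValueError("UC-1 requests carry no computation method")
        elif self.use_case == "UC-2":
            if self.mode not in ("sum", "avg", "cnt"):
                raise ValueError("UC-2 requests need mode sum, avg or cnt")
            if self.privacy_proxies is not None:
                # The researcher has no choice of PPs in a UC-2 query.
                raise ValueError("UC-2 requests must not select PPs")
        else:
            raise ValueError("unknown use case")

    def to_json(self) -> str:
        self.validate()
        body = {k: v for k, v in asdict(self).items() if v is not None}
        return json.dumps(body, sort_keys=True)


uc1 = QueryRequest("UC-1", "age > 50", ["blood-press", "heart-rate"], 100,
                   privacy_proxies=["pp-1", "pp-2"])
uc2 = QueryRequest("UC-2", "age > 50", ["blood-press"], 100, mode="sum")
```

Rejecting a UC-2 request that names PPs mirrors the statement that the researcher has no influence on which proxies take part in a statistical query.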

4.1.2 Key Server (KS)

The central KS is responsible for creating, editing

and holding access policies, as well as deriving

ABE secret keys from those stored policies. Since

we regenerate the key material before each query

dynamically, it is in our interest to create the policy

at the time the query is formulated on the QP. This

approach also allows short-term changes to patient

consent to be included in the key material. Therefore

the decision was made to integrate a Key Policy At-

tributed Based Encryption (KP-ABE) scheme in our

platform. The corresponding derived secret keys give

the researchers the right to decrypt those ciphertexts,

where the attributes exactly match the policy. The

component is part of the authorization layer, which

allows internal or external Identity providers to be

attached to the system, via OpenID Connect. This

allows researchers to identify themselves via foreign

institutions.

ABE Attribute Universe. The attribute universe
is defined by all PPs in the network during the
initialization phase. In this phase all $i$ PPs query their
external cancer registry database for all available field
descriptors $d_j$. The field descriptors are collected on
the QP, where their union $U = \bigcup_{j=0}^{i} d_j$ is computed.
The set of all descriptors $U$, together with a special
additional patient-centric disagreement attribute $d_{dis}$,
forms the attribute universe $\mathcal{U} = U \cup \{d_{dis}\}$.
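The construction of the attribute universe amounts to a set union. A minimal Python sketch, assuming that each PP reports its field descriptors as a list of strings and using the hypothetical label `"d_dis"` for the disagreement attribute:

```python
from typing import Iterable, Set

DISAGREEMENT = "d_dis"  # hypothetical label for the disagreement attribute


def attribute_universe(descriptors_per_pp: Iterable[Iterable[str]]) -> Set[str]:
    """Union of all field descriptors reported by the PPs, plus d_dis."""
    universe: Set[str] = set()
    for descriptors in descriptors_per_pp:
        universe |= set(descriptors)
    universe.add(DISAGREEMENT)
    return universe


# Example descriptor lists; the field names are made up for illustration.
reported = [
    ["age", "diagnosis", "blood-press"],
    ["age", "heart-rate"],
]
universe = attribute_universe(reported)
```

Descriptors that several registries share (here `age`) appear only once in the resulting universe.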

Patient Disagreement. Based on legal decisions
made in Germany, each German cancer patient gives
implicit agreement to the usage of the available medical
data. A patient can afterwards opt out by giving
explicit disagreement. This workflow is integrated
into the UPCARE platform by storing the attribute
$d_{dis}$ in the context of a registered patient when
that patient disagrees. The central KS allows a patient to
store $d_{dis}$ for a specific research project or researcher
in an additional web front-end. The complete workflow
can be seen in Figure 2.

4.1.3 Privacy Proxies (PP)

Two or more distributed PPs are responsible for giving
access to the underlying cancer registry DBMS in a
privacy-preserving manner, according to the given
use cases UC-1 and UC-2. Each PP is composed of
three internal modules: an ABE module, a Privacy
Preserving Aggregation (PPAG) module and a data
module. The data module is responsible for importing
and converting all external data sources, provided
by the cancer registries, into a uniform JSON data
format. The data is not imported in bulk, but rather
accessed dynamically, on request by the ABE
module or the PPAG module.

The ABE Module. The ABE module is responsible
for encrypting the plaintext input data from the data
module using ABE. It does so by first parsing the
key-value data sets for a specific patient, defined by
the selection criteria of the query. After parsing,
every value is encrypted with its key as the
corresponding attribute. To this end, the ABE module
uses the master key (downloaded from the key server
on initialization) to calculate ciphertext values for all
given keys. The attribute set is extended by $d_{dis}$ if
and only if the patient has expressed disagreement
on the key server interface by opting out. More
details on the flow of data in the overall architecture
are given in Section 4.4.1.
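The attribute handling described for the ABE module can be sketched in Python as follows. The sketch only assembles the attribute set for each value and deliberately omits the ABE encryption call itself, since no particular ABE library interface is assumed; the function name `label_record` and the label `"d_dis"` are hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple


def label_record(record: Dict[str, str], opted_out: bool,
                 disagreement: str = "d_dis") -> List[Tuple[FrozenSet[str], str]]:
    """Pair every value with the attribute set it would be encrypted under.

    Each value is labelled with its own key as attribute; the disagreement
    attribute is added if and only if the patient has opted out.
    """
    labelled = []
    for key, value in record.items():
        attributes = {key}
        if opted_out:
            attributes.add(disagreement)
        labelled.append((frozenset(attributes), value))
    return labelled


patient = {"age": "63", "diagnosis": "C34.1"}   # made-up example record
consenting = label_record(patient, opted_out=False)
dissenting = label_record(patient, opted_out=True)
```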

The PPAG Module. The PPAG module is responsible for encrypting
the plaintext input data from the data module using
a particular HE scheme named multi-party homomorphic
encryption (MHE) (Mouchet et al., 2021).
MHE allows the secure computation of the sum of the
encrypted inputs and a collaborative decryption of this
result, which is encrypted with a public key generated
from multiple secret keys.
Thanks to the use of MHE, neither the researcher nor
any single privacy proxy can access the individual encrypted
inputs before their aggregation. The aggregation
is performed collaboratively through the query
platform. Hence, the module allows a researcher to
dynamically generate a key pair in the browser and to
use this key pair to generate the overall encryption key
and further aggregate encrypted data from the PPs in
a privacy-preserving manner. More details on the flow
of data in the overall architecture are given in Section
4.4.2.

4.2 Threat Model

The main threat in the whole query process
is an honest-but-curious (HBC) QP that follows the
protocol but carries out an insider attack on the data
flowing through it. A successful insider attack on
the QP would compromise the data of all connected
cancer registries, and would thus be the worst case.
PPs are responsible for ensuring the privacy protection
of the cancer patients whose data they own. Because
of this, they have no reason to deviate from the
protocol or exchange information with other actors.
Consequently, PPs are also considered HBC. The researchers
use the platform to perform research
over the cancer data. We assume that researchers always
act in their own best interest. This
means that researchers may deviate from the protocol
if this gives them access to more information.
For this reason, the researcher is considered malicious.
In addition, we assume that
PPs cannot communicate privately with each other,
and that the query platform can collude with the researcher.

[Figure 2 diagram: message flow between the UI Query Interface, the Keyserver, the Query Platform, Privacy Proxies #1 to #n and the Cancer DBMS #1. Labeled steps include getDatatypes, Keygen(key1, ..., keyn), Download SK, forward Query, [CTi] = Encrypt(PK, [key], [value]), Combine all [CTi] and Results ← Decrypt(SK, [CTi]). Sample query: mode: sum/avg/cnt; selection: (age > 50 ∧ BMI < 2.3) ∨ height > 1.75; projection: ["blood-press", "heart-rate"].]

Figure 2: UC-1 in the Query phase. White boxes are coded in WebAssembly. Blue boxes are coded in Rust. Purple boxes are
coded in Go.

4.3 Implementation

Our envisioned architecture ensures a secure, efﬁ-

cient, and privacy-preserving environment for cancer

research, aligning with the goals of UPCARE. The

architecture leverages Rust’s performance, memory

safety, and concurrency features while incorporating

state of the art cryptographic techniques to protect

sensitive patient data. Its main features are:

Front-End. The user interface where researchers
and authorized patients interact with the system.
The UI was built using SvelteKit, HTML, CSS, and
JavaScript, with additional Rust-generated (ABE) and
Go-generated (HE) browser-capable cryptographic
code (each compiled to WebAssembly).

Rocket Servers. The core back-end APIs (of
KS, QP and PP) are implemented in Rust using
the Rocket web framework. Each back-end API
manages incoming HTTP requests and orchestrates
communication from the front-end through to the PPs.
Every API handles a different, specific action, from
authentication and query processing to response
generation.

Query Platform. A module within the Rocket
server responsible for processing and executing
queries on the cancer registry data. It allows either
personal ABE or statistical HE queries. The QP utilizes
secure multiparty homomorphic encryption to
perform computations on encrypted data, or attribute-based
encryption to allow encrypted data access.
The PPAG module implements privacy-preserving
algorithms for data analysis; currently sum, average
and count are available as options.

Key Server. The KS manages ABE keys used for
the decryption of sensitive, i.e., personal, patient data.
It ensures secure key distribution and thus enables
access control. The KS implements OAuth 2.0 for
secure authorization and data access.

Privacy Proxies. The PPs are intermediate components
between the cancer registry data sources and
the QP. They implement cryptographic techniques to
preserve individual patient privacy during data analysis
or when a researcher requests access to personal data.
The ABE module was implemented in
Rust, while the HE module was implemented in Go.
Both act as guardians of sensitive information,
ensuring that only aggregated, HE-encrypted
statistical results or attribute-based
encrypted patient record data sets are exposed
to the query platform.

Logging and Auditing. The platform implements

logging and auditing mechanisms to track researcher

activities and data access on each of its components.

This facilitates transparency and compliance with

privacy regulations.

[Figure 3 diagram: message flow between the UI Query Interface, the Query Platform, Privacy Proxies #1 to #n and the Cancer DBMS #1. Labeled steps include PrivK/PubK ← gen_keypair(), store temporary PubK, [CTi] = Encrypt([n]), k-anon Agg. of [CTi], KeySwitch to QP's tmp key, j = Decrypt(QPK, [CTi]+...), the check "j ≥ k .. continue, j < k .. halt", [CTs] = Encrypt([sum]), k-anon Agg. of [CTs], KeySwitch to PubK and Results ← Decrypt(PrivK, CTs). Sample query: mode: sum/avg/cnt; selection: (age > 50 ∧ BMI < 2.3) ∨ height > 1.75; projection: ["blood-press", "heart-rate"].]

Figure 3: UC-2 in the Query phase. White boxes are coded in WebAssembly. Blue boxes are coded in Rust. Purple boxes are
coded in Go.

External Interfaces. Each component provides an

API for integration with external systems or research

applications. This procedure adheres to industry

standards and interoperability protocols.

Adaptability. The data import module allows data
to be imported from various sources such as NoSQL, SQL
and CSV. The system can therefore be adapted
to various cancer databases, as long as a mapping
between the input data format and the standard format,
based on FHIR, can be built. More details on FHIR
usage can be found in Section 5.1.

4.4 Integration

Both privacy-preserving components, the ABE and

PPAG modules, are seamlessly integrated into both

the individual PPs and the overall system architec-

ture through a standardized query protocol based

on JSON. Before responding to user queries, both

modules require an initialization phase upon system

startup, followed by a dedicated query phase where

they actively process queries.

4.4.1 ABE Flow

Initialization Phase. A master secret key and a
public key are generated on the KS based on all FHIR
data types available on the connected cancer registry
databases, which are queried via the corresponding
PPs. In addition to the FHIR attributes, an attribute
“disagreement” is added to the set of attributes in
order to represent patient disagreement. The public
key is distributed via a secure channel
to the PPs,
where it is used to encrypt patient data from the
connected cancer registry database. A local database
is flushed and initialized on the KS, where user
disagreement can be stored permanently.

Query Phase. Once the key generation is finished,
the researcher sends a query to the query platform,
which forwards it to each of the PPs. Each proxy $pp_i$
queries the corresponding data from the connected
cancer registry DBMS and then encrypts each resulting value
using the public key PK obtained from the KS, with the
key of the current key-value pair as the current attribute. The
encrypted results are collected and combined on the
query platform before they are handed back to the UI
query interface, where an eligible researcher is able to
decrypt the results directly inside the browser, with
the help of the secret key SK and the decryption algorithm
compiled to Wasm. This exact flow can be seen
in Figure 2.

4.4.2 PPAG Flow

In this section, we describe the PPAG ﬂow which

makes use of a dedicated multi-party homomorphic

encryption (MHE) scheme (Mouchet et al., 2021)

inspired by the work in (Bozdemir et al., 2023).

Initialization. This phase consists of the generation
of the common public key used for encryption by all
privacy proxies participating in the actual aggregation
operation. To collaboratively generate the public
key derived from MHE (Mouchet et al., 2021),
each party generates a public key from their secret
key (by encrypting a zero-message) and an initially
distributed/agreed common reference string. The
query platform aggregates these
public keys (a simple sum) in order to obtain the
common public key, which is then distributed to
the privacy proxies.

Footnote (secure channel, Section 4.4.1): Protected by the standard HTTPS protocol.
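The "simple sum" of public keys works because the key shares are linear in the secret keys. The Python sketch below illustrates this with toy ring arithmetic in $\mathbb{Z}_Q[X]/(X^N+1)$: every party computes a share $p_i = -s_i \cdot a + e_i$ from the common reference polynomial $a$, and the sum of the shares is a public key for the sum of the secrets. The parameters `Q` and `N`, the ternary noise distribution and all function names are assumptions chosen for readability; they are far too small to be secure and do not reflect the parameters or API of Lattigo.

```python
import random

Q = 12289   # toy modulus (assumption, insecure)
N = 8       # toy ring degree: polynomials live in Z_Q[X]/(X^N + 1)


def poly_add(a, b):
    return [(x + y) % Q for x, y in zip(a, b)]


def poly_neg(a):
    return [(-x) % Q for x in a]


def poly_mul(a, b):
    """Negacyclic convolution, i.e. multiplication modulo X^N + 1."""
    out = [0] * N
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            k = i + j
            if k < N:
                out[k] = (out[k] + x * y) % Q
            else:                       # X^N = -1 wraps with a sign flip
                out[k - N] = (out[k - N] - x * y) % Q
    return out


def small_poly(rng):
    """Ternary polynomial, used for both secrets and noise in this toy."""
    return [rng.choice((-1, 0, 1)) % Q for _ in range(N)]


def public_key_share(secret, crs, rng):
    """Share p_i = -(s_i * a) + e_i: an encryption of zero under s_i."""
    error = small_poly(rng)
    return poly_add(poly_neg(poly_mul(secret, crs)), error), error


rng = random.Random(0)
crs = [rng.randrange(Q) for _ in range(N)]          # common reference string a
secret_keys = [small_poly(rng) for _ in range(3)]   # one secret per party

collective_p = [0] * N    # what the query platform computes (sum of shares)
collective_s = [0] * N    # never materialised in practice, only for checking
total_error = [0] * N
for s in secret_keys:
    share, err = public_key_share(s, crs, rng)
    collective_p = poly_add(collective_p, share)
    collective_s = poly_add(collective_s, s)
    total_error = poly_add(total_error, err)

# Linearity check: sum(p_i) + a * sum(s_i) equals the accumulated noise.
residual = poly_add(collective_p, poly_mul(collective_s, crs))
```

In the sketch the combined secret is reconstructed only to verify the identity; in a multiparty setting no single party would hold it.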

Query Phase. Once the key generation is finished, the researcher sends the query to the query platform, which forwards it to the privacy proxies. Each proxy pp_j first determines how many instances of its database it will respond with. This number c_j is encrypted with the common public key and sent to the query platform. The query platform aggregates these numbers, and the obtained total is decrypted and compared with a threshold value k. Only if this threshold is exceeded can the actual aggregation phase start. Each proxy then encrypts its data with the common public key and sends it to the query platform, which aggregates these data in the same way as in the previous aggregation operation. This sum is then sent back to the proxies for collaborative key switching, so that the result is encrypted with the researcher's public key at the end. This flow is shown in Figure 3.
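The count check described above can be sketched with a toy, scalar, LWE-style additive scheme. This is not the MHE implementation used in UPCARE: the parameters `Q` and `T`, the class `Proxy`, and the function `count_exceeds_threshold` are assumptions made for illustration, the scheme is far too small to be secure, and the final key switching to the researcher's public key is omitted.

```python
import secrets

# Toy parameters (assumptions for illustration; far too small to be secure).
Q = 2**40          # ciphertext modulus
T = 2**16          # plaintext modulus
DELTA = Q // T     # scaling factor that lifts a message above the noise


def small(bound=4):
    """Sample a small integer from [-bound, bound] (stand-in for an error term)."""
    return secrets.randbelow(2 * bound + 1) - bound


class Proxy:
    """A privacy proxy holding one additive share of the collective secret key."""

    def __init__(self, crs, count):
        self.crs = crs
        self.count = count      # number of matching records of this proxy
        self.sk = small(1)      # secret key share in {-1, 0, 1}

    def public_key_share(self):
        # An "encryption of zero" under the proxy's own secret key share.
        return (-self.crs * self.sk + small()) % Q

    def encrypt(self, message, cpk):
        b, a = cpk
        u = small(1)
        c0 = (b * u + small() + DELTA * message) % Q
        c1 = (a * u + small()) % Q
        return (c0, c1)

    def decryption_share(self, ciphertext):
        _, c1 = ciphertext
        return (c1 * self.sk + small()) % Q


def add(x, y):
    """Homomorphic addition: component-wise sum of two ciphertexts."""
    return ((x[0] + y[0]) % Q, (x[1] + y[1]) % Q)


def collective_decrypt(ciphertext, proxies):
    """Combine the decryption shares of all proxies and round away the noise."""
    shares = sum(p.decryption_share(ciphertext) for p in proxies)
    total = (ciphertext[0] + shares) % Q
    return ((total + DELTA // 2) // DELTA) % T


def count_exceeds_threshold(proxies, crs, k):
    """Aggregate the encrypted counts and decrypt only their total."""
    # The query platform sums the public key shares into the common public key.
    cpk = (sum(p.public_key_share() for p in proxies) % Q, crs)
    encrypted = [p.encrypt(p.count, cpk) for p in proxies]
    aggregate = encrypted[0]
    for ciphertext in encrypted[1:]:
        aggregate = add(aggregate, ciphertext)
    total = collective_decrypt(aggregate, proxies)
    return total, total > k


if __name__ == "__main__":
    crs = secrets.randbelow(Q)
    proxies = [Proxy(crs, count) for count in (4, 7, 2)]
    print(count_exceeds_threshold(proxies, crs, k=10))
```

In this sketch the query platform only ever handles ciphertexts and the decrypted total; the individual counts stay hidden as long as the proxies do not pool their key shares.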

4.4.3 Common Query Protocol

The queries of both use cases are covered by a common, JSON-based query language. First, a request is generated on the UI, which is then added to either a personal or a statistical query. Each request is transformed into a Job on the QP and is stored locally in a database for further processing by the PPs. As soon as there is a new entry in the local database, the PPs start to work on the request. In a UC-1 query, the PPs encrypt the corresponding query results using the ABE key material. In a UC-2 query, the PPs encrypt the corresponding query results using the HE key material. After finishing their work, they write the result back into the Job database. As soon as the job is finished, the results are streamed back to the researcher.
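The job lifecycle described above might be modelled roughly as follows. This is a minimal sketch: the JSON field names (`type`, `aggregate`), the proxy identifiers, and the `Job` class are hypothetical and are not taken from the UPCARE query language.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Job:
    """A query job as it might be stored in the QP's local job database."""
    job_id: int
    kind: str                     # "personal" (UC-1) or "statistical" (UC-2)
    request: dict
    expected_proxies: frozenset
    results: dict = field(default_factory=dict)

    def key_material(self):
        # UC-1 results are encrypted with ABE, UC-2 results with HE key material.
        return {"personal": "ABE", "statistical": "HE"}[self.kind]

    def write_back(self, proxy_id, encrypted_result):
        """Called by a PP once it has encrypted its part of the result."""
        if proxy_id not in self.expected_proxies:
            raise ValueError(f"unknown proxy: {proxy_id}")
        self.results[proxy_id] = encrypted_result

    def finished(self):
        # The job is finished once every expected PP has written its result.
        return set(self.results) == set(self.expected_proxies)


def job_from_json(job_id, payload, proxies):
    """Turn a JSON request from the UI into a Job (field names are hypothetical)."""
    request = json.loads(payload)
    kind = request["type"]
    if kind not in ("personal", "statistical"):
        raise ValueError(f"unsupported query type: {kind}")
    return Job(job_id, kind, request, frozenset(proxies))


if __name__ == "__main__":
    payload = '{"type": "statistical", "aggregate": "count"}'
    job = job_from_json(1, payload, ["pp-1", "pp-2"])
    job.write_back("pp-1", b"ciphertext-1")
    job.write_back("pp-2", b"ciphertext-2")
    print(job.key_material(), job.finished())
```

Keeping the set of expected proxies on the job makes the "finished" condition explicit, which is what would trigger streaming the results back to the researcher.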

5 EVALUATION

The evaluation of the User Privacy-Preserving Cancer Research Platform (UPCARE) is needed to ascertain its effectiveness in addressing the challenges inherent in collaborative medical research while safeguarding patient privacy. In this section, we present an evaluation of UPCARE, focusing on the dataset used, its query performance, the system scalability, a security analysis and compliance with regulatory frameworks. By examining the platform's capabilities and limitations, we aim to provide insights into its real-world applicability, its resilience to potential threats, and its ability to meet the evolving needs of cancer research stakeholders. Through testing and analysis, we aim to validate UPCARE as a robust and privacy-conscious solution for cancer research that maintains high standards of data privacy and security.

5.1 Dataset Description

To ensure future compatibility, the Fast Healthcare Interoperability Resources (FHIR) format (Ayaz et al., 2021) of the Health Level Seven (HL7) standards organization was used to store and request the patient data. The format is natively supported by the FHIR Data Storage module based on the FIDES system (Bundesdruckerei, 2018), so no additional conversion is required for data processing. Patient data are highly sensitive. For testing and as a proof of concept, a patient dataset generated with Synthea (Walonoski et al., 2018) was therefore used in the project. The dataset contains one million synthetic patient medical records encoded in HL7 FHIR, including demographic information, medical history, medications, and lab results.

5.2 Security Analysis

Our proposed UPCARE solution solves the problem of an honest-but-curious (HBC) QP by encrypting all plaintext data on each PP before it arrives at the QP. Even when calculations are done on the QP, as in UC-2, the QP is not able to leak the underlying plaintext data. Both personal and statistical data are protected from leaking at a central platform by means of encryption. Indeed, in UC-2, thanks to MHE, neither the QP, nor the researcher, nor any single PP can decrypt the encrypted data before aggregation. The aggregation is performed collaboratively between the PPs through the QP. At the end of the protocol, a collaborative key switching is performed on the aggregated data to enable its decryption by the actual researcher only. Regarding malicious insider attacks on the local PP at the cancer registries, our proposed solution unfortunately does not protect against data leakage, as the medical plaintext data must be readable by the PP for encryption. As previously mentioned, the PPs are considered HBC only.

5.3 Performance

In this section, we evaluate the performance of both query types: UC-1, which involves secure data retrieval based on specific patient attributes, and UC-2, which involves aggregated data analysis and statistical computations on encrypted data.

5.3.1 Methodology

The following methodology was used to measure performance. First, we executed multiple UC-1 queries across a range of test scenarios, varying the number of attributes and the complexity of the search criteria. We then measured the query response time from the initiation of the query to the retrieval of the results, capturing the time taken for data processing, encryption/decryption, and transmission. Next, we executed multiple UC-2 queries using sample datasets of varying sizes, encompassing diverse research categories and demographic attributes. Finally, we measured the query response time for the integrated statistical operations.
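A measurement loop of this kind could look roughly like the sketch below. It is not the harness used for the reported numbers: `measure_response_times`, the `run_query` callable, and the scenario dictionary are hypothetical names, and the stand-in query merely sleeps.

```python
import statistics
import time


def measure_response_times(run_query, scenarios, repetitions=5):
    """Return the mean wall-clock response time in seconds for each scenario.

    `run_query` is any callable that takes a scenario and blocks until the
    results have been retrieved, so the measurement covers the whole round
    trip (processing, encryption/decryption, and transmission).
    """
    report = {}
    for name, scenario in scenarios.items():
        samples = []
        for _ in range(repetitions):
            start = time.perf_counter()
            run_query(scenario)
            samples.append(time.perf_counter() - start)
        report[name] = statistics.mean(samples)
    return report


if __name__ == "__main__":
    def fake_query(scenario):
        # Stand-in for a real UC-1 or UC-2 query against the platform.
        time.sleep(0.001 * scenario["attributes"])

    scenarios = {
        "uc1-2-attributes": {"attributes": 2},
        "uc1-8-attributes": {"attributes": 8},
    }
    print(measure_response_times(fake_query, scenarios))
```

Using `time.perf_counter` gives a monotonic, high-resolution clock, which is preferable to wall-clock timestamps when comparing short response times.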

5.3.2 Results

The following results were gathered for UC-1 queries. The response time of a UC-1 query correlates positively with the number of attributes and the complexity of the search criteria. On average, queries with fewer attributes and simpler search criteria respond faster, typically within 5 to 25 seconds. This behaviour can be seen in Figure 4. However, as the number of attributes and the search complexity increase, the query response time may exceed 25 seconds, particularly in scenarios involving large datasets or complex attribute-based access policies. The response time can be reduced by an additional ABE ciphertext cache on the QP.

Regarding UC-2 queries, the following results were collected. This query type demonstrates competitive performance in conducting aggregated data analysis, with response times typically ranging from 2 to 15 seconds (2000 to 15000 milliseconds). The query response time scales with dataset size: larger datasets incur marginally longer processing times due to increased computational complexity. Complex statistical computations, such as correlation analysis across multiple attributes, may result in slightly longer response times, particularly in datasets with extensive attribute diversity.

Factors such as encryption/decryption overhead

and data transmission latency contribute to ﬂuctu-

ations in query response time across different sce-

narios. Optimization strategies, such as indexing,

caching, and parallel processing, may be explored to

mitigate latency and enhance the efﬁciency of secure

data retrieval operations.

5.3.3 Conclusion

The performance of UC-2 queries underscores the

need for efﬁciency in homomorphic encryption opera-

tions. Despite the computational overhead associated

with encrypted data processing, UC-2 queries main-

tain satisfactory response times, facilitating timely in-

sights and decision-making in cancer research. Fur-

ther optimization opportunities, such as algorithmic

reﬁnement and distributed computing strategies, may

be explored to enhance the scalability and efﬁciency

of aggregated data analysis operations. The evalua-

tion of query performance in UPCARE highlights the

platform’s capability to deliver secure and efﬁcient

data retrieval and analysis functionalities. By sys-

tematically analyzing the response times of UC-1 and

UC-2 queries across diverse scenarios, we gain valu-

able insights into the platform’s performance charac-

teristics and identify opportunities for optimization.

Moving forward, continued reﬁnement and optimiza-

tion efforts will be essential to ensure that UPCARE

maintains high-performance standards while fulﬁlling

the evolving demands of privacy-preserving cancer

research.

6 USE CASE EVALUATION

Cancer registration is an essential component of the

health care system and provides valuable insights into

the prevalence, treatment and outcomes of oncolog-

ical diseases. The comprehensive understanding of

cancer epidemiology and the evaluation of oncologi-

cal care heavily rely on the systematic collection and

analysis of data. In Germany, the data on oncolog-

ical care is reported to the German state cancer reg-

istries by the medical facilities involved in the treat-

ment according to the standardized oncological basic

data set (Tumorzentren eV, 2015). Various essential

pieces of information are recorded, including patient

details, diagnosis, surgical interventions, radiother-

apy, systemic therapy, and the course of the disease

leading to death (oBDS2021, 2021). Consequently,

the cancer registry is a systematic collection of infor-

mation in the form of a database on tumor diseases.

The evaluation of the use cases for the platform is es-

sentially based on the expertise of the Cancer Registry

Rhineland-Palatinate and the Rhineland-Palatinate le-

gal framework for the provision of personal patient

data from the cancer registry for research purposes.

Figure 4: Performance, measured as query response time in seconds. (Plot: UC-1 patient query and UC-2 statistical query; panels for 10, 20, 30, 40 and 50 patients; y-axis: time in seconds.)

6.1 Current Situation

Cancer registries offer epidemiological and clinical

evaluations for all tumor entities, facility-related eval-

uations as well as individual evaluations for spe-

ciﬁc inquiries. For research purposes, aggregated,

pseudonymized or personal data can be provided for

precisely deﬁned scientiﬁc oncological questions re-

lating to cancer prevention, cancer care and research

into the causes of cancer (KRRLP, 2012). Requesting

and providing data for research purposes is a man-

ual process, as there is currently no standardized, se-

cure request platform for cancer registry data for re-

searchers. The application is subject to an internal

formal and content-related review. In the case of a re-

quest for personal data, additional authorisations from

the Ministry of Science and Health, an ethics vote

and, if necessary, a hearing of the State Commissioner

for Data Protection and Information Security are re-

quired (LKRG215, 2015).

The possibility of requesting personal patient data for

research purposes has rarely been utilized to date. The

requested data is provided in a secure way. If personal

patient data is requested from several German can-

cer registries, each registry enables data access via its

own channel. A separate application must be submit-

ted to the respective registry for each cancer registry

from which data is requested. In such cases, several

ethics committees are involved, each overseeing the

ethical considerations for the registry in question.

As can be seen above, obtaining personal patient data for a research project is a time-consuming process for the applicant, especially when data is requested from several cancer registries.

6.2 Data Query via UPCARE

The use cases were evaluated from the perspective of

the cancer registry with regard to the following as-

pects:

Data Privacy and Security. Even with aggregated

data, there is a certain risk of drawing conclusions

about individual patients. Numerous factors con-

tribute to the creation of a high re-identiﬁcation

potential. These factors may relate to data character-

istics, such as the uniqueness of feature expressions

and their combinations, as well as temporal and

spatial information. However, the likelihood of

successful re-identiﬁcation depends on external

knowledge, whether publicly or non-publicly avail-

able, along with other inﬂuencing factors (Drechsler

and Pauly, 2024). By using HE, it is possible to

securely aggregate data without exposing individual

patient details. The encrypted aggregated data can

be analyzed to derive general patterns or statistics,

reducing the risk of re-identiﬁcation of speciﬁc cases

based on unique characteristics. From a cancer registry perspective, this is a big step towards patient privacy. ABE gives the cancer registry a standardized data protection and data security concept for the provision of personal patient data for research purposes.

Accessibility. UPCARE provides researchers with

easy and secure access to a diverse range of cancer

data. The platform can be accessed at any time via a

web interface and enables secure data queries from

the connected cancer registries at a central hub.

Feasibility. The platform makes it possible to query

whether there is enough patient data for a study. The

applicant formulates the characteristics of the enquiry

and receives an aggregated result, for example the

number of patients, their gender and age. If the result

is sufﬁcient for a cohort, the applicant can request the

patient data from the selected cancer registries via

the platform. In addition, by combining the cancer

registries via the platform, there is a good chance of

collecting sufﬁcient data on rare tumors.

Interoperability. HL7 FHIR (https://www.hl7.org/fhir/) is the chosen standard for communication between the platform and the cancer registries. The FHIR implementation is currently based on FHIR resources generated by Synthea (Walonoski et al., 2018), a synthetic patient generator that produces realistic but completely fictitious health data.
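To make the feasibility query concrete, the sketch below summarizes the Patient resources of a FHIR Bundle by number, gender and age. It works on plaintext at a single registry, whereas in UPCARE such aggregates are computed under HE across registries. The function name `cohort_summary` is hypothetical, and the code assumes full `YYYY-MM-DD` birth dates, although FHIR also allows partial dates.

```python
from collections import Counter
from datetime import date


def cohort_summary(bundle, today):
    """Summarize the Patient resources of a FHIR Bundle by gender and age."""
    patients = [
        entry["resource"]
        for entry in bundle.get("entry", [])
        if entry.get("resource", {}).get("resourceType") == "Patient"
    ]
    genders = Counter(p.get("gender", "unknown") for p in patients)
    ages = []
    for p in patients:
        if "birthDate" in p:
            born = date.fromisoformat(p["birthDate"])
            # Subtract one year if this year's birthday has not happened yet.
            before_birthday = (today.month, today.day) < (born.month, born.day)
            ages.append(today.year - born.year - before_birthday)
    return {
        "patients": len(patients),
        "by_gender": dict(genders),
        "ages": sorted(ages),
    }


if __name__ == "__main__":
    bundle = {
        "resourceType": "Bundle",
        "entry": [
            {"resource": {"resourceType": "Patient", "gender": "female",
                          "birthDate": "1960-05-17"}},
            {"resource": {"resourceType": "Patient", "gender": "male",
                          "birthDate": "1975-12-01"}},
        ],
    }
    print(cohort_summary(bundle, date(2024, 6, 1)))
```

A researcher would compare the returned patient count with the cohort size needed for the study before requesting the personal data.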

7 CONCLUSIONS

In conclusion, the User Privacy-Preserving Cancer Research

Platform (UPCARE) represents a significant

advancement in the ﬁeld of cancer research, empha-

sizing the paramount importance of preserving pa-

tient privacy while fostering collaborative data anal-

ysis. Through the integration of advanced crypto-

graphic techniques such as ABE and HE, UPCARE

ensures that sensitive health data remains conﬁdential

and secure throughout the research process. The ﬁne-

grained access control facilitated by ABE empowers

patients to maintain control over their data, while HE

enables secure and private computations on encrypted

data, safeguarding individual privacy. Moreover, the

automation of data processing and the use of cryptog-

raphy not only enhance privacy but also expedite the

speed of data requests and analysis, facilitating faster

insights and discoveries in cancer research. In this

way, data privacy-friendly analysis methods are de-

veloped and integrated into the architecture of cancer

registries, which, on the one hand, maintain a high

level of data protection and, on the other hand, pro-

vide meaningful ﬁndings for cancer research.

7.1 Future Work

Identifying avenues for future work is crucial for advancing the User Privacy-Preserving Cancer Research Platform (UPCARE) and addressing emerging challenges in the field. We see three potential directions. First, advanced privacy-preserving techniques, such as differential privacy or secure multi-party computation, could be explored to further strengthen the data protection and confidentiality mechanisms within UPCARE, together with ways to integrate them seamlessly into the existing architecture without compromising performance. Second, emerging technologies such as blockchain and federated learning could be adopted to improve data security, transparency, and collaboration within the UPCARE cancer research ecosystem, in particular to enhance data provenance, traceability, and verifiability while preserving patient privacy. Third, UPCARE's scalability and performance should be continuously optimized to accommodate the growing volume and complexity of cancer registry data, for example by streamlining data processing, improving query response times, and reducing computational overhead while maintaining privacy guarantees.

By embarking on these future work initiatives,

UPCARE can evolve into a dynamic and adaptive

platform that not only advances cancer research but

also serves as a model for privacy-preserving data

sharing and collaborative innovation in healthcare.

ACKNOWLEDGEMENTS

This work was supported by the MESRI-BMBF

French-German joint project UPCARE (ANR-20-

CYAL-0003-01, 16KIS1383K).

REFERENCES

Agrawal, R. and Srikant, R. (2000). Privacy-preserving data

mining. In Proceedings of the 2000 ACM SIGMOD in-

ternational conference on Management of data, pages

439–450.

Agrawal, S. and Chase, M. (2017). Fame: fast attribute-

based message encryption. In Proceedings of the 2017

ACM SIGSAC Conference on Computer and Commu-

nications Security, pages 665–682.

Ayaz, M., Pasha, M. F., Alzahrani, M. Y., Budiarto, R., and

Stiawan, D. (2021). The fast health interoperability

resources (fhir) standard: systematic literature review

of implementations, applications, challenges and op-

portunities. JMIR medical informatics, 9(7):e21929.

Barua, M., Liang, X., Lu, R., and Shen, X. (2011). Espac:

Enabling security and patient-centric access control

for ehealth in cloud computing. International Jour-

nal of Security and Networks, 6(2-3):67–76.

Bethencourt, J., Sahai, A., and Waters, B. (2007).

Ciphertext-policy attribute-based encryption. In 2007

IEEE symposium on security and privacy (SP’07),

pages 321–334. IEEE.

Bozdemir, B., Özdemir, B. A., and Önen, M. (2023).

PRIDA: PRIvacy-preserving Data Aggregation with

multiple data customers. IACR cryptology e-print

archive, Paper 2024/074.

Bramm, G., Gall, M., and Schütte, J. (2018). Blockchain-based distributed attribute based encryption. In Proceedings of the 15th International Joint Conference on e-Business and Telecommunications (ICETE 2018), volume 2, pages 99–110.

Bundesdruckerei (2018). From the almighty

administrator to the self-determined user.

https://www.bundesdruckerei.de/en/whitepaper/

download/2835/Whitepaper-Fides.pdf.

Cheon, J. H., Kim, A., Kim, M., and Song, Y. (2017). Ho-

momorphic encryption for arithmetic of approximate

numbers. In Advances in Cryptology–ASIACRYPT

2017: 23rd International Conference on the Theory

and Applications of Cryptology and Information Se-

curity, Hong Kong, China, December 3-7, 2017, Pro-

ceedings, Part I 23, pages 409–437. Springer.

Drechsler, J. and Pauly, H. (2024). Das reidentiﬁka-

tionspotenzial von strukturierten gesundheitsdaten.

Bundesgesundheitsblatt - Gesundheitsforschung -

Gesundheitsschutz, 67(2):164–170.

Gentry, C. (2009). Fully homomorphic encryption using

ideal lattices. In Proceedings of the forty-ﬁrst annual

ACM symposium on Theory of computing, pages 169–

178.

Geva, R., Gusev, A., Polyakov, Y., Liram, L., Roso-

lio, O., Alexandru, A., Genise, N., Blatt, M.,

Duchin, Z., Waissengrin, B., et al. (2023). Col-

laborative privacy-preserving analysis of oncologi-

cal data using multiparty homomorphic encryption.

Proceedings of the National Academy of Sciences,

120(33):e2304415120.

KRRLP (2012). Bericht des krebsregisters rheinland-

pfalz 2022/23. https://www.krebsregisterrlp.de/

ﬁleadmin/user upload/dokumente/04 Ver%C3%

B6ffentlichungen/2023/KRB2022 Webversion 01.

pdf.

Lewko, A. and Waters, B. (2011). Decentralizing attribute-

based encryption. In Annual international conference

on the theory and applications of cryptographic tech-

niques, pages 568–588. Springer.

LKRG215 (2015). Landeskrebsregistergesetz - § 12 - abgleichung und Übermittlung personenidentifizierender daten. https://www.landesrecht.rlp.de/bsrp/document/jlr-KrebsRegGRP2015pP12.

Mouchet, C., Troncoso-Pastoriza, J., Bossuat, J., and

Hubaux, J. (2021). Multiparty Homomorphic Encryp-

tion from Ring-Learning-with-Errors. In Privacy En-

hancing Technologies Symposium (PETS).

Müller, S., Katzenbeisser, S., and Eckert, C. (2009). Distributed attribute-based encryption. In Information Security and Cryptology–ICISC 2008: 11th International Conference, Seoul, Korea, December 3-5, 2008, Revised Selected Papers 11, pages 20–36. Springer.

Narayan, S., Gagné, M., and Safavi-Naini, R. (2010). Privacy preserving ehr system using attribute-based infrastructure. In Proceedings of the 2010 ACM workshop on Cloud computing security workshop, pages 47–52.

oBDS2021 (2021). Einheitlicher onkologischer basisdaten-

satz 2021. https://basisdatensatz.de/basisdatensatz.

Pournaghi, S. M., Bayat, M., and Farjami, Y. (2020).

Medsba: a novel and secure scheme to share medi-

cal data based on blockchain technology and attribute-

based encryption. Journal of Ambient Intelligence and

Humanized Computing, 11:4613–4641.

Sahai, A. and Waters, B. (2008). Revocation systems with

very small private keys. Eprint2008.

Son, Y., Han, K., Lee, Y. S., Yu, J., Im, Y.-H., and Shin,

S.-Y. (2021). Privacy-preserving breast cancer re-

currence prediction based on homomorphic encryp-

tion and secure two party computation. Plos one,

16(12):e0260681.

Tumorzentren eV, A. D. (2015). Gesellschaft der epidemiologischen krebsregister in deutschland ev einheitlicher onkologischer basisdatensatz. Verfügbar: http://www.tumorzentren.de/onkol-basisdatensatz.html [Stand: 19.06.2015].

Van Dijk, M., Gentry, C., Halevi, S., and Vaikuntanathan, V.

(2010). Fully homomorphic encryption over the inte-

gers. In Advances in Cryptology–EUROCRYPT 2010:

29th Annual International Conference on the The-

ory and Applications of Cryptographic Techniques,

French Riviera, May 30–June 3, 2010. Proceedings

29, pages 24–43. Springer.

Walonoski, J., Kramer, M., Nichols, J., Quina, A., Moe-

sel, C., Hall, D., Duffett, C., Dube, K., Gallagher, T.,

and McLachlan, S. (2018). Synthea: An approach,

method, and software mechanism for generating syn-

thetic patients and the synthetic electronic health care

record. Journal of the American Medical Informatics

Association, 25(3):230–238.

Yue, X., Wang, H., Jin, D., Li, M., and Jiang, W. (2016).

Healthcare data gateways: found healthcare intelli-

gence on blockchain with novel privacy risk control.

Journal of medical systems, 40:1–8.
