
 
parties (caregivers, researchers, etc.). Specific 
legislation, regulations and ethical guidelines with 
respect to (patient) privacy have therefore been put 
in place at different levels (European, national and 
regional).  
In this context, the capability to address varying 
ethical concerns and ensure compliance with data 
protection legislation and regulations is fundamental 
to the long-term viability of any 
solution aiming to integrate health information on a 
large scale. 
Our approach to this matter comprises the design 
of a comprehensive Data Protection Framework 
(DPF) which outlines the boundaries within which 
services (and organisations) are required to operate. 
The DPF brings “compliance by design” by 
combining a governance framework (policies 
and procedures) with a set of technical 
implementations aimed at enforcing those policies. It 
implements the rules set by the relevant national 
and EU legislation and by sector best-practice policies 
(ethics). The framework not only manages and 
enforces rules defining “Who has access to what 
data for which purpose, and under what conditions”, 
but also integrates solutions that enable access to 
otherwise unavailable data (e.g. de-identification 
supported by a Trusted Third Party).  
Introducing a uniform layer (technical solutions 
integrated in a single governance framework) upon 
which applications can (and must) build has 
already proven successful (Claerhout, 2008) in 
dealing efficiently with the regulatory issues of 
large-scale transnational sharing of medical and 
biological data in the clinical trial context. Among 
other things, that governance and security framework 
introduced a novel practical solution (the concept of 
“de-facto anonymous data”) to the inherent issues 
tied to de-identification of individual person 
records (Li, 2007). That work will serve as a basis 
for our DPF, which needs to deal with the broader 
scope of bi-directional cross-domain interaction 
between the care and research domains. 
Technically, the DPF will rely on (centralised) 
policy-based authorisation services to translate the 
legal rule sets into authorisation decisions for 
“access to” or “processing of” highly sensitive data 
over distributed resources. This approach ensures 
flexibility towards changing legislation and policies 
(and regional variations thereof). 
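As a minimal sketch of this idea, and not the DPF implementation itself, the following Python fragment models a central decision service that holds one rule set per region and answers permit/deny requests. The names `Rule`, `Request`, `PolicyDecisionPoint` and `replace_rules`, the region codes, and the deny-overrides/default-deny behaviour are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    role: str       # who
    data_type: str  # what data
    purpose: str    # for which purpose
    effect: str     # "permit" or "deny"


@dataclass(frozen=True)
class Request:
    region: str
    role: str
    data_type: str
    purpose: str


class PolicyDecisionPoint:
    """Hypothetical central decision service with one rule set per region."""

    def __init__(self, rules_by_region):
        self._rules = {region: list(rules)
                       for region, rules in rules_by_region.items()}

    def replace_rules(self, region, rules):
        # Legislation changed: swap the rule set; applications stay untouched.
        self._rules[region] = list(rules)

    def decide(self, request):
        effects = {
            rule.effect
            for rule in self._rules.get(request.region, [])
            if (rule.role, rule.data_type, rule.purpose)
            == (request.role, request.data_type, request.purpose)
        }
        if "deny" in effects:
            return "deny"    # assumed: an explicit deny overrides any permit
        if "permit" in effects:
            return "permit"
        return "deny"        # assumed: nothing applicable means deny


# Illustrative rule sets; region codes and values are made up.
pdp = PolicyDecisionPoint({
    "BE": [Rule("researcher", "de-identified-ehr", "research", "permit")],
    "NL": [],
})
request = Request("BE", "researcher", "de-identified-ehr", "research")
print(pdp.decide(request))
pdp.replace_rules(
    "BE", [Rule("researcher", "de-identified-ehr", "research", "deny")])
print(pdp.decide(request))
```

Because applications only call `decide`, a change in legislation is handled by replacing a rule set rather than by modifying each application.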
To meet the specific requirements of the DPF, 
the authorisation system (both decision and 
enforcement parts) needs to support concepts such as 
“purpose of use” and “conditions on use” (e.g. by 
introducing sticky policies (Chadwick, 2008) 
associated with datasets, or other types of 
privacy-metadata) and to work at least at the 
granularity of “a logical dataset”. Meeting these 
requirements in a generic (loosely coupled) way and 
with sufficient performance is challenging. 
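The sticky-policy notion can be illustrated with a short, hypothetical Python sketch: the policy is stored with the logical dataset, is checked against the declared purpose of use on release, and travels with any derived dataset. The types `StickyPolicy` and `LogicalDataset`, the functions `release` and `derive`, and the example purposes and conditions are assumptions, not part of the cited work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StickyPolicy:
    allowed_purposes: frozenset   # "purpose of use"
    conditions: tuple = ()        # "conditions on use" the recipient must honour


@dataclass(frozen=True)
class LogicalDataset:
    dataset_id: str
    records: tuple
    policy: StickyPolicy          # the policy "sticks" to the data


def release(dataset, purpose):
    """Enforcement step: hand out data only for an allowed purpose."""
    if purpose not in dataset.policy.allowed_purposes:
        raise PermissionError(
            f"purpose {purpose!r} not allowed for {dataset.dataset_id}")
    # The conditions are returned together with the data.
    return dataset.records, dataset.policy.conditions


def derive(dataset, new_id, transform):
    """A derived dataset inherits the policy of its source."""
    records = tuple(transform(record) for record in dataset.records)
    return LogicalDataset(new_id, records, dataset.policy)


# Illustrative values only.
policy = StickyPolicy(frozenset({"care", "eligibility-scan"}),
                      ("no-onward-transfer",))
ehr = LogicalDataset("ehr-extract-1", ("record-a", "record-b"), policy)
records, conditions = release(ehr, "eligibility-scan")
upper = derive(ehr, "ehr-extract-1-upper", str.upper)
```

Working at the level of a whole logical dataset keeps the check cheap: one policy lookup per dataset rather than one per record.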
Patient consent is another important aspect that 
is inextricably linked to data protection, for 
example with respect to re-use of personal data 
beyond its originally intended use (e.g. use of EHR 
data for automated eligibility scanning, for export 
for research purposes, etc.). Technically, “Consent 
Management Services” fit into the framework as 
specialised authorisation services (consent rules 
form a policy). Such services need to ensure the 
integrity of consent directives and combine them 
correctly so that conflicting preferences are resolved. 
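The two requirements on a consent service, integrity and conflict-free combination, can be sketched as follows. This is an illustrative assumption rather than a description of an existing service: integrity is approximated with an HMAC tag over each directive, and conflicts are resolved with a deny-overrides rule; `ConsentDirective`, `seal`, `is_intact`, `combine` and the demo key are hypothetical.

```python
import hashlib
import hmac
from dataclasses import dataclass

# Hypothetical demo key; a real service would use managed key material.
SECRET_KEY = b"demo-key-not-for-production"


@dataclass(frozen=True)
class ConsentDirective:
    patient_id: str
    purpose: str
    decision: str   # "permit" or "deny"

    def payload(self):
        return f"{self.patient_id}|{self.purpose}|{self.decision}".encode("utf-8")


def seal(directive):
    """Compute an integrity tag over the directive."""
    return hmac.new(SECRET_KEY, directive.payload(), hashlib.sha256).hexdigest()


def is_intact(directive, tag):
    return hmac.compare_digest(seal(directive), tag)


def combine(sealed_directives, patient_id, purpose):
    """Combine all directives for one patient and purpose (deny-overrides)."""
    decisions = set()
    for directive, tag in sealed_directives:
        if not is_intact(directive, tag):
            raise ValueError("consent directive failed integrity check")
        if (directive.patient_id, directive.purpose) == (patient_id, purpose):
            decisions.add(directive.decision)
    if "deny" in decisions:
        return "deny"        # assumed: a refusal always wins over a permission
    if "permit" in decisions:
        return "permit"
    return "deny"            # assumed: no directive on record means no consent


# Illustrative directives: the patient first agreed, then withdrew.
agreed = ConsentDirective("patient-1", "research-export", "permit")
withdrawn = ConsentDirective("patient-1", "research-export", "deny")
store = [(agreed, seal(agreed)), (withdrawn, seal(withdrawn))]
print(combine(store, "patient-1", "research-export"))
```

Deny-overrides is only one possible combining rule; a service could equally let the most recent directive win, which would require a timestamp on each directive.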
Complementary to preventive security measures, 
the framework requires audit mechanisms allowing 
detection of security breaches and data leakage (and 
tools for subsequent incident handling).  
Currently, the majority of auditing mechanisms 
log individual events per application or computer 
system. In order to reconstruct a logical chain of 
events for proper audit in large distributed networks, 
these different logs would need to be combined. Few 
standards and solutions are available that provide 
manageable, uniform audit trails in distributed 
systems.  
Furthermore, to be useful for checking 
compliance of a (large) system with data protection 
legislation, audit trails need to include extended 
contextual information, which they rarely do (e.g. 
type of data accessed, identity of the person listed in 
the medical record accessed, etc.). Moreover, logs 
need to be readily accessible in a user-centric and 
data-centric way (e.g. to give an overview of the 
activity of a single user throughout the network or 
of the actions performed on a specific logical dataset). 
Reconstruction of such user-centric or data-centric 
audit trails from standard logs is typically not 
feasible in practice: the audit trail data is too large 
to query efficiently, the identity of data subjects is 
not recorded or cannot be linked across applications, etc. 
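One way to picture the requirement is an audit record that carries the contextual fields mentioned above and is indexed at write time by user and by dataset, so that neither trail has to be reconstructed from per-application logs afterwards. The sketch below is a hypothetical in-memory illustration; `AuditEvent`, `AuditIndex` and all field names are assumptions, and timestamps are assumed to be ISO 8601 strings in a single time zone so that they sort chronologically.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditEvent:
    timestamp: str         # ISO 8601, same time zone for all systems (assumed)
    system: str            # application or node that produced the event
    user_id: str
    action: str
    dataset_id: str        # logical dataset that was touched
    data_type: str         # contextual information: type of data accessed
    data_subject_id: str   # contextual information: whose record it is


class AuditIndex:
    """Hypothetical uniform audit store, indexed by user and by dataset."""

    def __init__(self):
        self._by_user = defaultdict(list)
        self._by_dataset = defaultdict(list)

    def record(self, event):
        self._by_user[event.user_id].append(event)
        self._by_dataset[event.dataset_id].append(event)

    def user_trail(self, user_id):
        """User-centric view: everything one user did, across systems."""
        return sorted(self._by_user.get(user_id, []),
                      key=lambda event: event.timestamp)

    def dataset_trail(self, dataset_id):
        """Data-centric view: everything done to one logical dataset."""
        return sorted(self._by_dataset.get(dataset_id, []),
                      key=lambda event: event.timestamp)


# Illustrative events from two different systems.
index = AuditIndex()
index.record(AuditEvent("2009-03-02T10:15:00Z", "research-portal", "user-7",
                        "export", "dataset-42", "lab-results", "patient-1"))
index.record(AuditEvent("2009-03-01T09:00:00Z", "ehr-system", "user-7",
                        "read", "dataset-42", "lab-results", "patient-1"))
for event in index.user_trail("user-7"):
    print(event.timestamp, event.system, event.action)
```

The design choice is to pay the indexing cost when an event is written, which is what makes the later user-centric or data-centric query cheap.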
In order to assess conclusively the compliance of 
data flows with regulations, the provenance of 
received information and stored data must be 
recorded. Knowing the provenance of a data set can, 
for example, inform a user or system about the 
applicable data privacy policies (cf. consent). But 
provenance goes beyond security: for one, it plays 
a very important role in data quality management 
(who is the original source, how was the data recorded, 
cleansed, transformed, etc.). 
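As a rough illustration of what recording provenance could look like, the sketch below keeps one record per dataset version, with a link to the version it was derived from, and walks that chain back to the original source. `ProvenanceRecord`, `lineage` and the example activities are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    dataset_id: str
    activity: str                       # e.g. "recorded", "cleansed", "transformed"
    agent: str                          # person or system that performed it
    derived_from: Optional[str] = None  # parent dataset; None for the original source


def lineage(records, dataset_id):
    """Return the chain of records from dataset_id back to its original source."""
    by_id = {record.dataset_id: record for record in records}
    chain = []
    seen = set()
    current = dataset_id
    while current is not None:
        if current in seen:
            raise ValueError("cycle in provenance records")
        seen.add(current)
        record = by_id[current]   # KeyError if a link in the chain is missing
        chain.append(record)
        current = record.derived_from
    return chain


# Illustrative history of one research dataset.
history = [
    ProvenanceRecord("ehr-raw", "recorded", "hospital-ehr"),
    ProvenanceRecord("ehr-clean", "cleansed", "data-manager", "ehr-raw"),
    ProvenanceRecord("trial-set", "transformed", "export-service", "ehr-clean"),
]
for step in lineage(history, "trial-set"):
    print(step.dataset_id, step.activity, step.agent)
```

The last record in the returned chain identifies the original source, which is where the applicable privacy policy (for example the consent under which the data was collected) would be looked up.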
BRIDGING THE GAP BETWEEN CLINICAL RESEARCH AND CARE - Approaches to Semantic Interoperability,
Security & Privacy