2011), leak that the last decimals of the value are,
for instance, 593, but give no clue about whether it
is a short person of 1.63593 meter or a tall person of
1.93593 meter. In this paper we address the question
whether such leakage is serious. For instance, if we
know from the helper data of a cyclist that his heart
rate is equal to an unknown integer plus some known
fraction, how much does that tell us about the
likelihood of an elevated EPO concentration in his
blood?
Another form of key or privacy leakage (de Groot
and Linnartz, 2011) can occur when the attacker has
a priori knowledge about the prover, or about any
person in the database, for instance that the cyclist
is a 28-year-old female.
This paper was motivated by an implementation
project that records data from epileptic patients
using body sensor networks, with biometric
configuration of the radio links. Here we encountered
the question of how severe such issues are for
practical biometrics.
We perform a security analysis for three important
scenarios. (i) The case of a mismatch between the true
distribution of the features x and the distribution used
for creating helper data w. The attacker is assumed to
know the true distribution. (ii) An attacker who has
partial information about enrolled users, e.g. a medi-
cal indication or gender, and tries to learn something
about the stored secret. (iii) An attacker who tries to
learn something about the enrolled user’s characteris-
tics by exploiting the public helper data and some a
priori partial information about the user.
These scenarios lead to a mismatch between the
distribution as seen by the attacker and the distribu-
tion used to make w. The question is how much the
ZSL helper data w leaks under these circumstances,
in addition to the already existing leakage. We prove
an upper bound on this additional leakage.
2 ZERO SECRECY LEAKAGE SCHEME
We consider a commonly accepted verification scheme
which consists of an enrollment phase and a
verification phase. In the enrollment phase the prover
provides his biometric data $x = (x_0, \ldots, x_{M-1})$.
From this data, the system extracts a secret s = Q(x),
which the system stores safely in hashed form as
$(h(s \| z), w)$, where w = g(x) is the helper data and
z is the salt. The salt is a system- and/or
user-specific random string that prevents
cross-matching between different databases. In the
verification phase the prover provides his correlated
biometric data $y = (y_0, \ldots, y_{M-1})$ to prove his
identity. All variables, except for the salt z, are
length-M vectors extracted by some means of
preprocessing, to ensure that the components are
(nearly) independent, but not necessarily identically
distributed. Independence can be obtained by, for
example, applying a principal component analysis (PCA)
to the raw data.
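The storage step described above can be sketched as follows. This is a minimal illustration, not the scheme's prescribed implementation: the choice of SHA-256 for h, the 16-byte salt length, storing the salt next to the hash, and the function names `enroll` and `verify` are all assumptions made here. A single integer stands in for the secret.

```python
import hashlib
import hmac
import secrets


def enroll(secret: int, helper: float, salt_bytes: int = 16) -> dict:
    """Build the stored record (h(s || z), w), kept together with the salt z."""
    # z: random string; its length is an assumption of this sketch.
    salt = secrets.token_bytes(salt_bytes)
    # h(s || z), with SHA-256 standing in for the unspecified hash h.
    digest = hashlib.sha256(str(secret).encode() + salt).digest()
    return {"hash": digest, "salt": salt, "helper": helper}


def verify(candidate_secret: int, record: dict) -> bool:
    """Hash the secret estimated during verification and compare with the record."""
    digest = hashlib.sha256(
        str(candidate_secret).encode() + record["salt"]
    ).digest()
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(digest, record["hash"])
```

Only the hash, the salt and the helper data are kept, so the secret itself is never stored in the clear. How the verifier estimates the candidate secret from y and w is outside this sketch.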
The analysis is carried out per dimension, since we
have assumed the features to be independent. In this
case the total leakage of a verification scheme is the
sum of the leakage per dimension. For clarity, we omit
the subscript i from the biometric feature x, the
secret s and the helper data w.
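The additivity can be written out as follows. This assumes that leakage is measured as the mutual information between secret and helper data, which this section does not restate, and that the pairs $(S_i, W_i)$ are mutually independent across dimensions:

```latex
I(S; W) = \sum_{i=0}^{M-1} I(S_i; W_i),
\qquad S = (S_0, \ldots, S_{M-1}), \quad W = (W_0, \ldots, W_{M-1}).
```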
Leakage elimination was first studied (Verbitskiy
et al., 2010) for secret values that are equiprobable
(Fuzzy Extractor). Each interval belonging to a secret
is subdivided into equiprobable intervals to define
the helper data. The helper data intervals are
repeated for each interval of the secrets. This
construction yields helper data whose probability is
independent of the enrolled secret.
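One way to read this discrete construction is sketched below. The Gaussian feature model and the names `discrete_enroll`, `n_secrets` and `n_helper` are assumptions of this sketch, not taken from the cited work. The feature axis is cut into `n_secrets * n_helper` cells of equal probability; the cell index is then split into a secret s and a discrete helper value w.

```python
import math
from statistics import NormalDist


def discrete_enroll(x: float, dist: NormalDist, n_secrets: int, n_helper: int):
    """Map feature x to (s, w) using equiprobable cells on the CDF axis."""
    cells = n_secrets * n_helper
    u = dist.cdf(x)  # F_X(x), uniform on [0, 1] when x follows dist
    # Index of the equiprobable cell containing x (guard against u == 1.0).
    cell = min(math.floor(u * cells), cells - 1)
    # Helper intervals repeat inside every secret interval.
    s, w = divmod(cell, n_helper)
    return s, w
```

Every (s, w) pair corresponds to one cell of probability 1/(n_secrets * n_helper), so under the assumed distribution the probability of w does not depend on s.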
Meanwhile, it has been argued that verification
performance depends strongly on effective quantization
of the analog (continuous-valued) biometrics and on
continuous-valued helper data within the quantization
intervals (Linnartz and Tuyls, 2003; Chen et al.,
2007). Also in this domain, leakage is a concern
(de Groot and Linnartz, 2011; de Groot and Linnartz,
2012). Instead of demanding equiprobable discrete
values as helper data, the helper data w is defined as
a continuous variable that indicates the relative
position of the enrollment feature x within the
quantization interval belonging to a secret s. To
achieve ZSL the scheme has to take into account the
probability density of the features. ZSL is achieved
in this case by
$$s = Q(x) = \lfloor N \cdot F_X(x) \rfloor, \tag{1}$$
$$w = g(x) = N \cdot F_X(x) - s, \tag{2}$$
in which N is the number of quantization intervals
and $F_X$ is the cumulative distribution function (CDF)
of feature x. The number of quantization intervals N
does not necessarily have to be a power of 2.
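As a minimal sketch of Eqs. (1) and (2), the snippet below computes s and w for one feature value. The standard normal feature distribution, the value N = 3 and the name `zsl_enroll` are illustrative assumptions, not part of the scheme's definition.

```python
import math
from statistics import NormalDist


def zsl_enroll(x: float, dist: NormalDist, n_intervals: int) -> tuple[int, float]:
    """Return (s, w) following Eqs. (1) and (2)."""
    u = n_intervals * dist.cdf(x)  # N * F_X(x), lies in [0, N]
    # Eq. (1); the min() guards the edge case where the CDF rounds to 1.0.
    s = min(math.floor(u), n_intervals - 1)
    w = u - s  # Eq. (2): relative position inside interval s
    return s, w


feature_dist = NormalDist(mu=0.0, sigma=1.0)  # assumed feature distribution
s, w = zsl_enroll(0.3, feature_dist, n_intervals=3)  # N need not be a power of 2
```

Because $F_X(x)$ is uniformly distributed when x follows $F_X$, the fractional part w is uniform on [0, 1) for every s, which is why w carries no information about s under the assumed distribution.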
The above construction yields continuous helper data w
that reveals no information about the enrolled secret
s. Given w, one can only reconstruct N possible values
of x, each in a different quantization interval. This
reconstruction is given by
$$x_s(w) = F_X^{-1}\left(\frac{s + w}{N}\right). \tag{3}$$
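The reconstruction in Eq. (3) can be sketched in a few lines, and it also makes the mismatch question of this paper concrete. The function names and the Gaussian model are assumptions of this sketch. The posterior formula is not quoted from the paper; it follows from a change of variables. If w was generated with an assumed density but x actually follows a different true density, then P(s | w) is proportional to the ratio of true to assumed density at the candidate $x_s(w)$.

```python
from statistics import NormalDist


def candidates(w: float, dist: NormalDist, n_intervals: int) -> list[float]:
    """Eq. (3): one candidate x_s(w) per quantization interval s.

    Requires 0 < w < 1 so that every argument of inv_cdf lies in (0, 1).
    """
    return [dist.inv_cdf((s + w) / n_intervals) for s in range(n_intervals)]


def secret_posterior(
    w: float, assumed: NormalDist, true: NormalDist, n_intervals: int
) -> list[float]:
    """P(s | w) when w was built with `assumed` but x follows `true`."""
    xs = candidates(w, assumed, n_intervals)
    # Density ratio at each candidate; it is constant when true == assumed.
    weights = [true.pdf(x) / assumed.pdf(x) for x in xs]
    total = sum(weights)
    return [wt / total for wt in weights]
```

When `true` equals `assumed`, every secret gets posterior probability 1/N, which is the zero-leakage case. Any deviation from 1/N under a mismatched `true` distribution illustrates the additional leakage studied in this paper.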
In this work we will limit ourselves to a leakage
analysis on the continuous scheme only, since the dis-
crete scheme can be considered a special case of the
continuous version.