ON THE (NON-)REUSABILITY OF FUZZY SKETCHES AND EXTRACTORS AND SECURITY IN THE COMPUTATIONAL SETTING ∗

Marina Blanton and Mehrdad Aliasgari

Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, U.S.A.

Keywords:

Biometrics, Fuzzy sketches and extractors, Reusability, Computational setting.

Abstract:

Secure sketches and fuzzy extractors enable the use of biometric data in cryptographic applications by correcting errors in noisy biometric readings and producing cryptographic materials suitable for many applications. Such constructions work by producing a public sketch, which is later used to reproduce the original biometric and all derived information exactly from a noisy biometric reading. It has been previously shown that release of multiple sketches associated with a single biometric presents security problems for certain constructions. Through novel analysis we demonstrate that all other constructions in the literature are also prone to similar problems, which hinders their adoption in practice. To mitigate the problem, we propose for each user to store one short secret string for all possible uses of her biometric, and show that simple constructions in the computational setting have numerous security and usability advantages under standard hardness assumptions. Our constructions are generic in that they can be used with any existing secure sketch as a black box.

1 INTRODUCTION

The motivation for this work comes from practical use

of biometric-derived data and its suitability for adop-

tion. Secure sketches and fuzzy extractors (Dodis

et al., 2004) were introduced as mechanisms of deriv-

ing cryptographic material from noisy biometric data,

which can be used for authentication, encryption, and

other purposes. Such constructions produce a helper

string (secure sketch) – which is viewed as public

– from a biometric and later re-produce the crypto-

graphic string from a close noisy biometric reading

using the helper string. Only minimal information

about the biometric should be leaked due to the re-

lease of the helper string.

While this powerful concept enables new applications and can be attractive to users who no longer need to maintain secrets to participate in cryptographic protocols, it has been shown that leakage of information associated with the biometric in such constructions is unavoidable (Smith, 2004; Dodis and Smith, 2005). Furthermore, this concept has been studied most heavily in the setting where the construction is applied to a biometric only once. Subsequent publications (Boyen, 2004; Simoens et al., 2009) explored the security guarantees of such schemes in terms of their reusability, when a single biometric or its noisy version is used to produce multiple secure sketches using the same or different algorithms. Information leakage prevents such constructions from meeting the standard security requirements expected of them in cryptographic applications, such as indistinguishability (inability to link two records to the same biometric) and irreversibility (inability to reverse the construction and directly recover information about the biometric). Some of the more popular constructions have been shown to have serious security weaknesses in the presence of even very weak adversaries (Simoens et al., 2009). In this work, we analyze other schemes from the literature and show that they also cannot be safely reused. In particular, our novel analysis shows that the remaining constructions fail to satisfy standard security expectations with respect to reusability and therefore cannot be used in security applications.

∗ This work was partially supported by grant FA9550-09-1-0223 from the Air Force Office of Scientific Research.

In such schemes, information leakage is quantified as the entropy loss associated with the release of the helper string, providing a rough upper bound. For the current error rates and typical sets of parameters in biometric data, the information-theoretic analysis provides bounds that result in leakage of most or even all entropy contained in a biometric (see (Blanton and Hudelson, 2009) for a sample iris code analysis). Because this information leakage is unavoidable, it presents problems even in the case of weak adversaries.

Blanton M. and Aliasgari M.
ON THE (NON-)REUSABILITY OF FUZZY SKETCHES AND EXTRACTORS AND SECURITY IN THE COMPUTATIONAL SETTING.
DOI: 10.5220/0003454900680077
In Proceedings of the International Conference on Security and Cryptography (SECRYPT-2011), pages 68-77
ISBN: 978-989-8425-71-3
© 2011 SCITEPRESS (Science and Technology Publications, Lda.)

To overcome the issues of information leakage

and unsafe reuse of biometrics, we propose to use the

computational setting, where a user stores a single key

and the adversary is computationally bounded. The

key is introduced for the purpose of avoiding informa-

tion leakage and improving security of the schemes

and does not change the functionality. We believe

that keeping a single short key for all possible uses of

biometric-based material in different security applica-

tions is a small price to pay for achieving signiﬁcant

security improvements (which otherwise are not pos-

sible) and the ability to safely use such constructions

in a variety of applications. We show that the use of

one key and standard computational assumptions (ex-

istence of pseudo-random and hash functions) is suf-

ﬁcient for achieving very attractive properties using

simple schemes. Our constructions are generic in that

they can use any existing secure sketch scheme as a

black box (for any type of distance metric).

We would like to note that the use of the secret in

our schemes should not be confused with multi-factor

authentication or the use of shared secrets, as in our

schemes the secret never leaves the user and is not

shared and a single secret is sufﬁcient for all possi-

ble uses including multiple biometric types, multiple

applications, and multiple servers.

The security beneﬁts of our schemes are:

• We achieve provably no information leakage.

• Previously, only certain restricted types of error-

correcting codes could be used to ensure security

of fuzzy sketches and extractors (Boyen, 2004).

Our solution lifts such restrictions and can be used

with any type of error-correcting code.

• Prior (Simoens et al., 2009) and our analysis of

secure sketch constructions shows that they all

fail to achieve standard security requirements for

cryptographic applications, while our solution is

secure in a much stronger adversarial model.

• Previously, exposure of a biometric-derived key

was shown to reveal no information about the bio-

metric for a speciﬁc construction in the random

oracle model (Boyen, 2004). Our construction, on

the other hand, achieves this result in the standard

model using any existing secure sketch.

In our analysis of existing constructions, we use

a very weak adversary. The security of our own

schemes, on the other hand, is shown using a very

strong adversary (the strongest in the literature).

To summarize, our contributions are two-fold: (i)

new analysis of fuzzy sketch schemes that shows that

even a weak adversary has a signiﬁcant advantage in

compromising security of existing constructions, and

(ii) simple schemes that use a single secret to achieve

strong security under standard assumptions.

2 MODEL AND DEFINITIONS

2.1 Fuzzy Sketches and Extractors

Secure (or fuzzy) sketches, introduced by (Dodis et al.,

2004), correct errors in noisy secrets by releasing a

helper string S. Let W denote a random variable and

w its value.

Definition 1. An $(\mathcal{M}, m, m', t)$-secure sketch is a pair of randomized algorithms:

• SS is a function that, on input $w$ from metric space $\mathcal{M}$ with distance function dist, outputs a sketch $S$.

• Rec is a function that, on input $w' \in \mathcal{M}$ and $S = \mathrm{SS}(w)$, recovers and outputs the original $w$ if $\mathrm{dist}(w, w') \le t$.

Secure sketches have been constructed for different metric spaces $\mathcal{M}$, for which $\mathrm{dist}(a, b)$ is defined for all $a, b \in \mathcal{M}$. Security of a secure sketch is evaluated in terms of the entropy of $W$ before ($m$) and after ($m'$) releasing the string $S$, i.e., the entropy loss $m - m'$ associated with making $S$ public. Precise definitions can be found in (Dodis et al., 2008).

Fuzzy extractors allow one to extract randomness from $w$ (to use it as cryptographic material) and later reproduce it using $w'$ close to the original $w$.

Definition 2. An $(\mathcal{M}, m, m', t, \varepsilon)$-fuzzy extractor is a pair of algorithms:

• Gen is a function that, on input $w \in \mathcal{M}$, outputs an extracted random string $R$ and a helper string $P$.

• Rep is a function that, on input $w'$ and $P$, reproduces and outputs the string $R$ that was generated using $\mathrm{Gen}(w)$ if $\mathrm{dist}(w, w') \le t$.

The security requirement is that, for any W of min-

entropy m, the statistical distance between the distri-

bution of R and the uniform distribution of strings of

the same length is no greater than ε, even after ob-

serving P. A fuzzy extractor can be built from a se-

cure sketch using a generic construction from (Dodis

et al., 2004):

Gen($w$):

1. Execute $S \leftarrow \mathrm{SS}(w; r_1)$, where $r_1$ denotes random coins used by SS (if any).

2. Use a strong extractor Ext to extract a random string $R$ from $w$, i.e., $R \leftarrow \mathrm{Ext}(w; r_2)$, where $r_2$ denotes random coins used by Ext.

3. Output public $P = (S, r_2)$ and secret $R$.


Rep($w'$, $P = (S, r_2)$):

1. Execute $w \leftarrow \mathrm{Rec}(w', S)$. If Rec fails (i.e., when $\mathrm{dist}(w, w') > t$ for the $w$ such that $S = \mathrm{SS}(w)$), stop.

2. Extract $R$ from $w$ using $r_2$ as $R \leftarrow \mathrm{Ext}(w, r_2)$ and output $R$.
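The generic Gen/Rep construction above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the secure sketch is a toy block-wise 3-repetition code in syndrome form (it tolerates one flipped bit per 3-bit block), and HMAC-SHA256 keyed with a random seed stands in for the strong extractor Ext. Neither choice comes from the paper, and HMAC is not an information-theoretic extractor.

```python
import hashlib
import hmac
import secrets

BLOCK = 3  # toy assumption: each 3-bit block tolerates one flipped bit


def ss(w):
    """Toy secure sketch: per block (b0, b1, b2) publish (b0^b1, b0^b2)."""
    return [bit for i in range(0, len(w), BLOCK)
            for bit in (w[i] ^ w[i + 1], w[i] ^ w[i + 2])]


def rec(w_noisy, sketch):
    """Recover w from w_noisy when at most one bit per block differs."""
    w = []
    for j, i in enumerate(range(0, len(w_noisy), BLOCK)):
        d1, d2 = sketch[2 * j], sketch[2 * j + 1]
        b0, b1, b2 = w_noisy[i:i + BLOCK]
        # Three independent votes for the original first bit of the block.
        votes = b0 + (b1 ^ d1) + (b2 ^ d2)
        first = 1 if votes >= 2 else 0
        w.extend([first, first ^ d1, first ^ d2])
    return w


def ext(w, seed):
    """Stand-in for a strong extractor (assumption): HMAC-SHA256 under seed."""
    return hmac.new(seed, bytes(w), hashlib.sha256).digest()


def gen(w):
    sketch = ss(w)                    # step 1 (this toy SS uses no coins r1)
    seed = secrets.token_bytes(16)    # r2, the extractor's random coins
    return ext(w, seed), (sketch, seed)   # secret R, public P = (S, r2)


def rep(w_noisy, helper):
    sketch, seed = helper
    return ext(rec(w_noisy, sketch), seed)
```

Unlike Rec in the definition, this toy `rec` never reports failure: with too many errors it silently returns a different string, so `rep` outputs a different key.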

Strong extractors (Nisan and Ta-Shma, 1999) can extract at most $m - 2\log(1/\varepsilon) + O(1)$ nearly random bits ($m$ and $\varepsilon$ are as defined above). The entropy loss of $2\log(1/\varepsilon) + O(1)$ is in addition to the loss due to the release of sketch $S$, unless the extractor is modeled as a random oracle.

Many constructions utilize error-correcting codes. A code $C$ is a subset of $K$ elements $\{w_0, \ldots, w_{K-1}\}$ of $\mathcal{M}$. The minimum distance of $C$ is the largest $d$ such that $\mathrm{dist}(w_i, w_j) \ge d$ for all $i \ne j$, which implies that the code can detect up to $d - 1$ errors; the error-correcting distance is $t = \lfloor (d-1)/2 \rfloor$. A linear error-correcting code $C$ over field $\mathbb{F}_q$ is a $k$-dimensional linear subspace of the vector space $\mathbb{F}_q^n$ which uses Hamming distance as the metric. For any linear code $C$, an $(n-k) \times n$ parity-check matrix $H$ projects any vector $v \in \mathbb{F}_q^n$ to the space orthogonal to $C$. This projection is called the syndrome and denoted by $\mathrm{syn}(v) = Hv$. Then $v \in C$ iff $\mathrm{syn}(v) = 0$. The syndrome contains all information necessary for decoding: when codeword $c$ is transmitted and noisy $w = c + e$ is received, $\mathrm{syn}(w) = \mathrm{syn}(c) + \mathrm{syn}(e) = 0 + \mathrm{syn}(e)$, where $\mathrm{syn}(e)$ can be used to determine the error pattern $e$.
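As a concrete illustration of syndrome decoding, the following minimal sketch uses the binary [7,4] Hamming code. The choice of code and the helper names `syn` and `correct` are assumptions made for the example; the text above does not fix a particular code.

```python
# Parity-check matrix H of the binary [7,4] Hamming code: column j
# (1-based) is the binary representation of j, so the syndrome of a
# single-bit error directly names the flipped position.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syn(v):
    """Syndrome H*v over F_2."""
    return [sum(h * x for h, x in zip(row, v)) % 2 for row in H]


def correct(w):
    """Correct up to one bit error in w using only syn(w) = syn(e)."""
    s = syn(w)
    pos = s[0] * 4 + s[1] * 2 + s[2]   # 0 means "no error detected"
    c = list(w)
    if pos:
        c[pos - 1] ^= 1
    return c
```

For a codeword `c` and an error `e` of weight one, `syn(c)` is all zeros and `syn(w)` for `w = c + e` equals `syn(e)`, which is exactly the property used in the paragraph above.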

Metric-speciﬁc secure sketch constructions are

known for the Hamming distance (used for iris

codes), the set difference (used for ﬁngerprints), and

the edit distance (used for DNA comparisons). Also,

the permutation-based construction is available for

any transitive metric (e.g., Hamming distance and

set intersection). Schemes for the Hamming distance

have been most heavily analyzed, and some schemes

are known to have security problems when reused on

related biometrics. In this work we analyze remaining

known constructions and show their insecurity.

2.2 Secure Sketch Constructions

(Simoens et al., 2009) show that two popular secure sketch constructions, namely the code-offset construction with a linear error-correcting code (the syndrome construction) and the construction based on permutation groups, do not satisfy the requirements of indistinguishability and irreversibility, i.e., the adversary can win such experiments with overwhelming probability. The former scheme is for the Hamming distance (and is among the most widely studied schemes) and the latter is for any transitive distance metric. We concentrate on the analysis of other schemes and outline schemes for the set difference and edit distance. In what follows, we use $a \leftarrow A$ to denote that the value $a$ is chosen uniformly at random from the set $A$.

Fuzzy Vault. The fuzzy vault scheme (Juels and Sudan, 2002) can be used as a fuzzy sketch for set difference. A biometric is comprised of unordered elements $w = \{w_1, \ldots, w_s\}$ (e.g., minutiae points in fingerprints), which are disguised by adding a large number of chaff points. The genuine points carry information that allows $w$ to be reconstructed from noisy $w'$. Here $t \in [1, s]$ and $r \in [s+1, n]$ are system-wide parameters, where $n$ is the number of all possible points, i.e., the size of the universe. Work is over the field $\mathbb{F}_n$, where $n$ is a prime power.

To compute SS($w$):

1. Choose a random polynomial $p(\cdot)$ of degree at most $s - t - 1$ over $\mathbb{F}_n$.

2. For each $w_i \in w$, let $x_i = w_i$ and $y_i = p(x_i)$.

3. Choose $r - s$ distinct points $x_{s+1}, \ldots, x_r$ at random from $\mathbb{F}_n \setminus w$ and set $y_i \leftarrow \mathbb{F}_n \setminus \{p(x_i)\}$ for $i = s+1, \ldots, r$.

4. Output $\mathrm{SS}(w) = \{(x_1, y_1), \ldots, (x_r, y_r)\}$ sorted by the value of the $x_i$'s.

To compute Rec($w'$, $S$):

1. Create the set $D$ of pairs $(x_i, y_i) \in S$ such that $x_i \in w'$.

2. Run Reed-Solomon decoding on $D$ to recover the polynomial $p(\cdot)$.

3. Output $s$ points of the form $(x_i, p(x_i))$ from $S$.
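The SS steps above can be sketched as follows. This is a minimal illustration, assuming a prime field $\mathbb{F}_p$ (the scheme allows any prime power) and a hypothetical function name `fuzzy_vault_ss`; Rec is omitted because it needs a Reed-Solomon decoder.

```python
import random


def fuzzy_vault_ss(w, r, t, p, rng=random):
    """Return the r vault points (x, y) over the prime field F_p, sorted by x."""
    s = len(w)
    # Step 1: random polynomial of degree at most s - t - 1 (s - t coefficients).
    coeffs = [rng.randrange(p) for _ in range(s - t)]

    def poly(x):
        y = 0
        for c in reversed(coeffs):   # Horner's rule, highest degree first
            y = (y * x + c) % p
        return y

    # Step 2: genuine points lie on the polynomial.
    points = {x: poly(x) for x in w}
    # Step 3: chaff points use fresh x values and lie off the polynomial.
    candidates = [x for x in range(p) if x not in points]
    for x in rng.sample(candidates, r - s):
        y = rng.randrange(p - 1)
        if y >= poly(x):
            y += 1                   # skip the single forbidden value p(x)
        points[x] = y
    # Step 4: sorting by x hides which points are genuine.
    return sorted(points.items())
```

A call such as `fuzzy_vault_ss([3, 17, 42, 99, 150], r=20, t=2, p=251)` yields 20 points whose x-coordinates include the five genuine ones.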

Privacy of the biometric depends on the number and distribution of points in $S$ (i.e., the difficulty of identifying the original points and the number of spurious polynomials created by the chaff points). The entropy loss due to the release of $S$ is upper bounded by $t \log n + \log\binom{n}{r} - \log\binom{n-s}{r-s} + 2$.

Improved Fuzzy Vault. (Dodis et al., 2008) observed

that the polynomial in the above construction does not

need to be random, which allows for a secure sketch

with signiﬁcantly lower entropy loss, t logn.

To compute SS($w$):

1. Compute the unique monic polynomial $p(x) = \prod_{w_i \in w} (x - w_i)$ of degree $s$.

2. Output the coefficients of $p()$ of degree $s-1$ down to $s-t$, which will form $\mathrm{SS}(w) = (c_{s-1}, \ldots, c_{s-t})$.

To compute Rec($w'$, $S = (c_{s-1}, \ldots, c_{s-t})$):

1. Create a new polynomial $p_{high}$ of degree $s$ that shares the top $t+1$ coefficients with $p()$, i.e., $p_{high}(x) = x^s + \sum_{i=s-t}^{s-1} c_i x^i$.

2. Evaluate $p_{high}$ on points of $w'$ to obtain pairs $(a_1, b_1), \ldots, (a_s, b_s)$.

3. Use Reed-Solomon decoding to find a polynomial $p_{low}$ of degree $s - t - 1$ such that $p_{low}(a_i) = b_i$ for at least $s - t/2$ values of the $a_i$'s. If none are found, output fail.

4. Output the roots of the polynomial $p_{high} - p_{low}$.
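The SS procedure of the improved fuzzy vault is short enough to sketch directly. The example below assumes a prime field $\mathbb{F}_p$ and a hypothetical function name `improved_vault_ss`; it is meant only to make visible that the sketch is a deterministic function of the set $w$.

```python
def improved_vault_ss(w, t, p):
    """Coefficients (c_{s-1}, ..., c_{s-t}) of prod_i (x - w_i) over F_p."""
    coeffs = [1]                      # highest-degree coefficient first
    for wi in w:
        # Multiply the current polynomial by (x - wi).
        nxt = coeffs + [0]
        for k, c in enumerate(coeffs):
            nxt[k + 1] = (nxt[k + 1] - wi * c) % p
        coeffs = nxt
    return tuple(coeffs[1:t + 1])     # drop the leading 1, keep the next t
```

For instance, over $\mathbb{F}_7$ the set $\{1, 2, 3\}$ gives $p(x) = x^3 + x^2 + 4x + 1$, so with $t = 2$ the sketch is `(1, 4)` regardless of the order in which the elements are listed.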


Another construction for set difference, Pinsketch,

is suitable for large universe sizes and variable num-

ber of elements in w. It is syndrome-based, and its

(in)security is not difﬁcult to reduce to the previously

analyzed code-offset scheme. We thus omit its anal-

ysis. For the edit distance, the only known way to

construct a secure sketch is by embedding it into a

transitive metric of larger dimension and applying a

secure sketch construction to the target metric. An

embedding with attractive properties was developed

in (Dodis et al., 2008) using Pinsketch. Once again,

the insecurity of the resulting scheme can be shown

using prior results and is omitted. This covers all se-

cure sketch schemes.

2.3 Security Notions

The original security deﬁnitions of fuzzy sketches and

extractors were formulated for a single instance of a

fuzzy sketch or extractor in isolation (Dodis et al.,

2004). Subsequent literature (Boyen, 2004; Simoens

et al., 2009) considered a stronger (and more re-

alistic) adversarial model where such constructions

can be invoked multiple times and therefore the se-

curity guarantees must hold when the constructions

are reused. Furthermore, the power granted to the

adversary can greatly differ. In this work we use

weak adversaries when analyzing existing constructions (to show that they do not provide sufficient security guarantees even in the presence of weak adversaries)

and strong adversaries when proving security of our

proposed solution. In a nutshell, a weak adversary

is given two fuzzy sketches and tries to determine

whether they were produced using related biometrics

or what the biometric was, while a strong adversary

can adaptively ask for fuzzy sketches and private keys

that fuzzy extractors output.

Let $t$ be the maximum number of errors that the biometric system can tolerate. We define $\Delta_t$ to be the set of all perturbation functions that represent differences in sampling biometric data; we get $\Delta_t = \{\delta : \mathcal{M} \to \mathcal{M} \mid \mathrm{dist}(w, \delta(w)) \le t\}$. We next define a security game for weak adversaries with access to public sketches and then proceed with a security game for strong adversaries. Two security properties for weak adversaries were defined in (Simoens et al., 2009): sketch indistinguishability and irreversibility.

2-Indistinguishability Game (Simoens et al., 2009):

1. The challenger chooses a random variable $W \in \mathcal{M}$ and samples it to obtain $w_1 \in \mathcal{M}$. The challenger computes $S_1 = \mathrm{SS}(w_1)$ and gives $S_1$ to $\mathcal{A}$.

2. The challenger chooses $b \leftarrow \{0, 1\}$. If $b = 1$, the challenger chooses $\delta \leftarrow \Delta_t$ and produces related $w_2 = \delta(w_1)$. Otherwise, the challenger samples $W$ to obtain a different $w_2$. The challenger computes $S_2 = \mathrm{SS}(w_2)$ and gives $S_2$ to $\mathcal{A}$.

3. The adversary $\mathcal{A}$ eventually produces a bit $b'$ and wins if $b' = b$.

$\mathcal{A}$'s advantage in this game is defined as

$$\mathrm{Adv}^{\mathrm{ind}}_{\mathcal{A}} = \big| 2\Pr[b' = b] - 1 \big| = 2\,\big| \Pr[b' \ne b] - \tfrac{1}{2} \big|.$$

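Definition 3. An $(\mathcal{M}, m, m', t)$-secure fuzzy sketch (SS, Rec) is $\varepsilon$-indistinguishable in $\Delta_t$ if for any adversary $\mathcal{A}$ it holds that $\mathrm{Adv}^{\mathrm{ind}}_{\mathcal{A}} \le \varepsilon$. The fuzzy sketch is reusable when $\varepsilon$ is negligible.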
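To make the game concrete, the following simulation plays it against a toy deterministic sketch and estimates the adversary's advantage empirically. Everything specific here is an assumption for illustration: the sketch is the syndrome of a block-wise 3-repetition code on 12-bit strings, related readings differ in at most one bit, and the adversary uses the linearity of syndromes in the spirit of the attack on syndrome-based sketches reported by (Simoens et al., 2009).

```python
import random

BLOCKS, T = 4, 1   # four 3-bit blocks; related readings differ in <= 1 bit


def ss(w):
    """Deterministic toy sketch: syndrome of a block-wise 3-repetition code."""
    return [bit for i in range(0, len(w), 3)
            for bit in (w[i] ^ w[i + 1], w[i] ^ w[i + 2])]


def adversary(s1, s2):
    # S1 xor S2 equals syn(e) when w2 = w1 xor e; if weight(e) <= T,
    # at most T blocks can carry a non-zero syndrome.
    diff = [a ^ b for a, b in zip(s1, s2)]
    nonzero = sum(1 for j in range(BLOCKS) if diff[2 * j] or diff[2 * j + 1])
    return 1 if nonzero <= T else 0


def play(rng):
    """One run of the game; returns True when the adversary guesses b."""
    n = 3 * BLOCKS
    w1 = [rng.randrange(2) for _ in range(n)]
    b = rng.randrange(2)
    if b == 1:
        w2 = list(w1)
        for pos in rng.sample(range(n), rng.randrange(T + 1)):
            w2[pos] ^= 1              # perturbation with dist(w1, w2) <= T
    else:
        # Fresh sample of W (uniform here); a repeat of w1 is possible but rare.
        w2 = [rng.randrange(2) for _ in range(n)]
    return adversary(ss(w1), ss(w2)) == b


def estimate_advantage(trials=2000, seed=0):
    rng = random.Random(seed)
    wins = sum(play(rng) for _ in range(trials))
    return abs(2 * wins / trials - 1)
```

In this toy setting the adversary is always right when $b = 1$ and errs when $b = 0$ only if an unrelated sample happens to produce a syndrome difference confined to one block, so the estimated advantage is far from negligible.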

The irreversibility property of a fuzzy sketch scheme

means that an adversary who obtains access to multi-

ple sketches generated from the same noisy input us-

ing possibly different sketching functions is unable to

recover the original input. In the current version of

this work we do not treat irreversibility, since a fail-

ure to achieve the indistinguishability property alone

points out weaknesses of a fuzzy sketch scheme.

We now proceed with deﬁning security games for

more powerful adversaries using what we term weak

biometric privacy and strong biometric privacy. In

both of them the adversary is allowed to query the

scheme a large number of times, but the difference is

that in the ﬁrst the adversary obtains access only to the

public information, while in the second it also obtains

access to the key output by a fuzzy extractor. Thus,

we use the ﬁrst deﬁnition for secure sketches and the

second one for fuzzy extractors.

The two security games below are roughly equiva-

lent to outsider and insider chosen perturbation secu-

rity in (Boyen, 2004), but are stronger than the respec-

tive deﬁnitions in (Boyen, 2004). In particular, in our

definition of weak biometric privacy we require the

adversary to only distinguish between two sketches,

while the adversary was required to recover the bio-

metric w in (Boyen, 2004). Furthermore, instead of

allowing the adversary to query fuzzy sketches for a

particular biometric w and then challenging the ad-

versary by asking it to distinguish between a sketch

for w and a sketch for a randomly chosen biometric,

we set up two biometrics $w_0$ and $w_1$ and allow the adversary to query sketches for both. Then during the

challenge, the adversary is asked to determine which

biometric was used in producing the challenge sketch.

This can give the adversary advantage over the prior

formulation, especially in the computational setting

where different users will possess different keys.

As our schemes work in the computational setting,

we use κ to denote the security parameter. All algo-

rithms are assumed to be polynomial time in κ. Then

a function ε(κ) is negligible if, for all positive polynomials p(·) and all sufficiently large κ, ε(κ) < 1/p(κ).


Weak Biometric Privacy.

1. (Preparation) $\mathcal{A}$ chooses a random variable $W \in \mathcal{M}$ and sends its specification to the challenger.

2. (Sampling) The challenger randomly samples $W$ to obtain $w_0 \in \mathcal{M}$ and $w_1 \in \mathcal{M}$ and initializes two users $U_0$ and $U_1$, resp., using that information.

3. (Queries) $\mathcal{A}$ makes up to $q$ possibly adaptive sketching queries: to form query $i$, $\mathcal{A}$ chooses $\delta_i \in \Delta_t$ and sends it and a bit $b_i$ to the challenger. The challenger computes $S_i \leftarrow \mathrm{SS}(\delta_i(w_{b_i}); r_i)$ using fresh randomness $r_i$ and returns $S_i$ to $\mathcal{A}$.

4. (Challenge) The challenger chooses a bit $b \leftarrow \{0, 1\}$ and $\delta \leftarrow \Delta_t$, and produces a biometric $w' = \delta(w_b)$. The challenger then computes $S \leftarrow \mathrm{SS}(w'; r)$ using fresh random $r$ and gives $S$ to $\mathcal{A}$.

5. (More queries) $\mathcal{A}$ can run more queries up to the bound $q$ as specified in step 3.

6. (Response) $\mathcal{A}$ eventually produces a bit $b'$ and wins if $b' = b$.

$\mathcal{A}$'s advantage in this game is defined as

$$\mathrm{Adv}^{\mathrm{wbp}}_{\mathcal{A}}(\kappa) = \big| 2\Pr[b' = b] - 1 \big| = 2\,\big| \Pr[b' \ne b] - \tfrac{1}{2} \big|.$$

Definition 4. An $(\mathcal{M}, m, m', t)$-secure fuzzy sketch (SS, Rec) has weak biometric privacy if for any probabilistic polynomial-time (PPT) adversary $\mathcal{A}$ it holds that $\mathrm{Adv}^{\mathrm{wbp}}_{\mathcal{A}}(\kappa) \le \varepsilon(\kappa)$ for a negligibly small $\varepsilon(\kappa)$.

Note that unlike previous deﬁnitions, we explicitly

specify the security parameter κ and deﬁne the ad-

versary’s advantage as a function of it.

The next definition corresponds to the strongest version of the insider chosen perturbation security definition in (Boyen, 2004). The adversary can query the challenger to obtain sketches on both related and unrelated biometrics and private keys corresponding to unrelated biometrics. This time we ask the adversary to distinguish between the secret key output by a fuzzy extractor on a related biometric and a randomly chosen string. Note that we do not ask the adversary to distinguish between biometric-derived keys of two users because the adversary has the choice of the sketch that it can use in the challenge. This means that the adversary will trivially know for which user the secret key will be produced. We note, however, that in order to distinguish secret keys corresponding to two users, the adversary needs to be able to distinguish at least one of them from a random string. Thus, our definition of security implies security in the game with two users. Let $\Delta$ denote all perturbation functions over space $\mathcal{M}$, i.e., $\Delta = \{\delta : \mathcal{M} \to \mathcal{M}\}$, where $\mathrm{dist}(w, \delta(w))$ can be greater than $t$.

Strong Biometric Privacy.

1. (Preparation) $\mathcal{A}$ chooses $W \in \mathcal{M}$ and gives its specification to the challenger.

2. (Sampling) The challenger randomly samples $W$ to obtain $w \in \mathcal{M}$.

3. (Public queries) $\mathcal{A}$ makes up to $q$ possibly adaptive generation queries: to form query $i$, $\mathcal{A}$ chooses $\delta_i \in \Delta$ and sends it to the challenger. The challenger computes $(P_i, R_i) \leftarrow \mathrm{Gen}(\delta_i(w); r_i)$ using fresh random $r_i$ and returns public $P_i$ to $\mathcal{A}$.

4. (Private queries) $\mathcal{A}$ makes up to $q'$ possibly adaptive reproduction queries that can be interspersed with public queries as follows: to form query $i$, $\mathcal{A}$ chooses $\delta'_i \in \Delta$ and public data $P'_i$ and sends them to the challenger. The challenger computes $R'_i \leftarrow \mathrm{Rep}(\delta'_i(w); P'_i)$ and returns $R'_i$ to $\mathcal{A}$.

5. (Challenge) $\mathcal{A}$ chooses a string $P^* \in \{P_1, \ldots, P_q\}$ from the strings returned by the challenger in public queries such that $P^*$ was produced using a public query $\delta_i$ with $\mathrm{dist}(w, \delta_i(w)) \le t$ and in any private query $(\delta'_i, P^*)$ the distance $\mathrm{dist}(w, \delta'_i(w)) > t$. $\mathcal{A}$ sends $P^*$ to the challenger. The challenger chooses a bit $b \leftarrow \{0, 1\}$. If $b = 1$, the challenger computes $R \leftarrow \mathrm{Rep}(w, P^*)$ and gives it to $\mathcal{A}$. Otherwise, if $b = 0$, it chooses a random string of the same length and gives it to $\mathcal{A}$ instead.

6. (More queries) $\mathcal{A}$ can run additional queries as specified in steps 3 and 4 (up to $q$ and $q'$ queries, respectively) with the exception that any query $(\delta, P^*)$ such that $\mathrm{dist}(w, \delta(w)) \le t$ is not allowed.

7. (Response) $\mathcal{A}$ eventually produces a bit $b'$ and wins if $b' = b$.

$\mathcal{A}$'s advantage in this game is defined as

$$\mathrm{Adv}^{\mathrm{sbp}}_{\mathcal{A}}(\kappa) = \big| 2\Pr[b' = b] - 1 \big| = 2\,\big| \Pr[b' \ne b] - \tfrac{1}{2} \big|.$$

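Definition 5. We say that an $(\mathcal{M}, m, m', t, \varepsilon)$-secure fuzzy extractor (Gen, Rep) has strong biometric privacy if for any PPT adversary $\mathcal{A}$ it holds that $\mathrm{Adv}^{\mathrm{sbp}}_{\mathcal{A}}(\kappa) \le \varepsilon(\kappa)$ for a negligibly small $\varepsilon(\kappa)$.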

3 ANALYSIS OF EXISTING SCHEMES

Fuzzy Vault. Before proceeding with the analysis,

we note that the basic idea for the strategy in attacking

the fuzzy vault scheme when two or more sketches are

available – computing the intersection of the points

– is straightforward and is not new. This attack ap-

peared in (Scheirer and Boult, 2007; Kholmatov and

Yanikoglu, 2008; Poon and Miri, 2009). We still an-

alyze the construction here because all previous pub-

lications assume that given sketches are related and

proceed with identifying original points. Our work,

however, assumes a signiﬁcantly weaker (and perhaps

more realistic) adversary that would like to determine

if two given sketches are related or not, which is a


much more difﬁcult task. Therefore, we present a

rigorous new analysis that shows weaknesses of the

scheme even in the presence of the weakest adversary.

The adversary receives two secure sketches $P_1 = \{(x_1, y_1), \ldots, (x_r, y_r)\}$ and $P_2 = \{(x'_1, y'_1), \ldots, (x'_r, y'_r)\}$, and its goal is to determine the coin flip, i.e., whether the biometrics $w_1$ and $w_2$ are related or not. Let $P_1^x$ and $P_2^x$ denote the projections of $P_1$ and $P_2$, resp., on the x-coordinate, i.e., $P_1^x = \{x_1, \ldots, x_r\}$ and $P_2^x = \{x'_1, \ldots, x'_r\}$. The basic attack idea is to compute the intersection of $P_1^x$ and $P_2^x$ and use its size to make a distinction between related and unrelated biometrics. Related sketches will overlap in at least $s - t$ original biometric points, while unrelated sketches will have fewer original biometric points overlap. In addition, a number of chaff points in $P_1$ can collide with chaff points in $P_2$ or points in $w_2 \setminus (w_1 \cap w_2)$ (similarly, points from $w_1 \setminus (w_1 \cap w_2)$ can collide with chaff points in $P_2$). Thus, the size of $P_1^x \cap P_2^x$ follows a certain distribution, but the expected overlap size is larger for related sketches. We first analyze the properties of such a distribution.

Let $\alpha = |w_1 \cap w_2|$ denote the number of biometric points in the intersection, i.e., $\alpha \ge s - t$ for related biometric samples and $\alpha \le s - t - 1$ otherwise. Let $a = r - \alpha$ and $b = n - \alpha$, i.e., $a$ is the number of sketch points that do not correspond to the overlapping biometric points and $b$ is the overall space for such points. As customary in the literature, we assume that the biometric points of $w$ are distributed uniformly in the space; the chaff points are also drawn uniformly at random from the remaining space. Then to determine how many points from $P'_1 = P_1^x \setminus (w_1 \cap w_2)$ will collide with points from $P'_2 = P_2^x \setminus (w_1 \cap w_2)$, where $P_1^x$ and $P_2^x$ are the x-coordinate projections of the two sketches, suppose there are $b = n - \alpha$ bins and points from $P'_1$ occupy $a = r - \alpha$ of them, i.e., there are $a$ random bins with a ball in them. Then we throw another $a$ balls (points from $P'_2$) into the bins without replacement and count the number of bins with two balls in them (i.e., if a bin has two balls, it is removed, so that no bin has more than two balls; this is dictated by the requirement that all $r$ points in a sketch are distinct). The above can be modeled as a hypergeometric experiment. Let $X$ be a random variable that corresponds to the number of collisions in $P_1$ and $P_2$ (i.e., its value is $|(P_1^x \cap P_2^x) \setminus (w_1 \cap w_2)|$). We obtain:

$$\Pr[X = k] = \frac{\binom{a}{k}\binom{b-a}{a-k}}{\binom{b}{a}},$$

where $X$ can range between 0 and $a$. This distribution's mean value is $E[X] = a \cdot (a/b)$.

This analysis leads to the following attack strategy: given sketches $P_1$ and $P_2$, $\mathcal{A}$ computes their x-coordinate projections $P_1^x$ and $P_2^x$, and $c = |P_1^x \cap P_2^x|$. Let $\beta$ denote the value $(r - s + t)^2/(n - s + t)$ rounded to the nearest integer. If $c \ge (s - t + \beta)$, output 1; otherwise, output 0.
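The attack strategy above amounts to a few lines of code. The following sketch assumes each fuzzy vault sketch is given as a list of (x, y) pairs; the function names `attack_threshold` and `distinguish` are hypothetical.

```python
import math


def attack_threshold(n, r, s, t):
    """Decision threshold s - t + beta from the attack strategy."""
    # beta: (r - s + t)^2 / (n - s + t) rounded to the nearest integer.
    beta = math.floor((r - s + t) ** 2 / (n - s + t) + 0.5)
    return s - t + beta


def distinguish(P1, P2, n, r, s, t):
    """Return 1 ("related") or 0 ("unrelated") for two fuzzy vault sketches."""
    x1 = {x for x, _ in P1}           # projection of P1 on the x-coordinate
    x2 = {x for x, _ in P2}           # projection of P2 on the x-coordinate
    c = len(x1 & x2)                  # size of the overlap
    return 1 if c >= attack_threshold(n, r, s, t) else 0
```

The adversary needs nothing beyond the two public sketches and the system-wide parameters, which is what makes this a very weak adversary.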

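Let $\alpha_{auth}$ ($\alpha_{imp}$) denote a random variable corresponding to the distribution of $|w_1 \cap w_2|$ when $w_1$ and $w_2$ are related or authentic (unrelated or impostor, resp.). Adversary $\mathcal{A}$ has the smallest probability of distinguishing between authentic and impostor sketches when the values of $\alpha_{auth}$ and $\alpha_{imp}$ are the closest, i.e., $\alpha_{auth} = s - t$ and $\alpha_{imp} = s - t - 1$. According to the indistinguishability definition, we have $\mathrm{Adv}^{\mathrm{ind}}_{\mathcal{A}} = 2\,|\Pr[b' = b] - \tfrac{1}{2}|$. If we let $X_{auth}$ denote the random variable distributed according to the hypergeometric distribution above with $\alpha_{auth} = s - t$ and $X_{imp}$ denote a similar random variable with $\alpha_{imp} = s - t - 1$, we obtain that $\mathcal{A}$ is successful with probability at least:

$$
\begin{aligned}
\Pr[b' = b] &= \Pr[b' = 1 \mid b = 1]\Pr[b = 1] + \Pr[b' = 0 \mid b = 0]\Pr[b = 0] \\
&\ge \frac{1}{2}\Big(\Pr[X_{auth} \ge c - \alpha_{auth}] + \Pr[X_{imp} < c - \alpha_{imp}]\Big) \\
&= \frac{1}{2}\Big(\Pr[X_{auth} \ge \beta] + \Pr[X_{imp} < \beta + 1]\Big) \\
&= \frac{1}{2}\left(\sum_{i=\beta}^{r-s+t} \frac{\binom{r-s+t}{i}\binom{n-r}{r-s+t-i}}{\binom{n-s+t}{r-s+t}} + \sum_{i=0}^{\beta} \frac{\binom{r-s+t+1}{i}\binom{n-r}{r-s+t+1-i}}{\binom{n-s+t+1}{r-s+t+1}}\right)
\end{aligned}
$$

Here $c$ in the second line stands for the decision threshold $s - t + \beta$ of the attack, which gives $c - \alpha_{auth} = \beta$ and $c - \alpha_{imp} = \beta + 1$.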
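The bound can be evaluated exactly with integer arithmetic. The sketch below is one way to do so using Python's standard library; the function names are hypothetical, and the code simply transcribes the two hypergeometric sums for the worst case $\alpha_{auth} = s - t$, $\alpha_{imp} = s - t - 1$.

```python
from fractions import Fraction
from math import comb, floor


def pmf(k, a, b):
    """Hypergeometric Pr[X = k]: a balls into b bins, a of them occupied."""
    return Fraction(comb(a, k) * comb(b - a, a - k), comb(b, a))


def success_lower_bound(n, r, s, t):
    """Lower bound on Pr[b' = b] for the intersection-size attack."""
    a_auth, b_auth = r - s + t, n - s + t      # alpha_auth = s - t
    a_imp, b_imp = a_auth + 1, b_auth + 1      # alpha_imp = s - t - 1
    # beta: (r - s + t)^2 / (n - s + t) rounded to the nearest integer.
    beta = floor(Fraction(a_auth ** 2, b_auth) + Fraction(1, 2))
    p_auth = sum(pmf(i, a_auth, b_auth) for i in range(beta, a_auth + 1))
    p_imp = sum(pmf(i, a_imp, b_imp) for i in range(0, beta + 1))
    return (p_auth + p_imp) / 2


def advantage(n, r, s, t):
    """Adv = 2 * |Pr[b' = b] - 1/2| evaluated at the lower bound."""
    return 2 * abs(success_lower_bound(n, r, s, t) - Fraction(1, 2))
```

A call such as `float(advantage(251**2, 300, 38, 20))` evaluates the bound for one choice of parameters; sweeping `r` produces a curve as a function of the vault size.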





This probability and $\mathrm{Adv}^{\mathrm{ind}}_{\mathcal{A}}$ can be easily computed for a given set of parameters $n$, $r$, $s$, and $t$. In reality, each parameter above has limitations placed on it by the behavior of the actual biometric data. In particular, (Clancy et al., 2003) study the applicability of the fuzzy vault construction to fingerprint data and determine optimal parameters to achieve adequate resistance of the construction against brute force search (when an adversary is given a sketch and tries to determine sensitive information by searching through polynomials). While the fuzzy vault construction was generalized and not used exactly as a secure sketch in (Clancy et al., 2003), we nevertheless obtain information about the parameters that would be used for fingerprint data. The field $\mathbb{F}_{p^2}$, for prime $p$, is used for representing fingerprint features in 2-D and the value of $p$ is set to 251, giving us $n = 251^2 = 63001$ (this value of $n$ also provides many choices for the decoding algorithm). The number of biometric points in a fingerprint was empirically determined on average to be $s = 38$ (it can vary based on the equipment and quality of data, but generally is in a similar range). For this value of $s$, having 20 points overlap would provide excellent distinguishing capability and a low false acceptance rate (Pankanti et al., 2002). Finally, the value of $r$ is constrained in that the complexity of decoding for legitimate users can grow as $r$ increases (this is caused by spurious polynomials introduced by the chaff points). In particular, at


Figure 1: Adversary advantage $\mathrm{Adv}^{\mathrm{ind}}_{\mathcal{A}}$ with parameters $n = 251^2$, $s = 38$, $t = 20$, and varying $r$.

the decoding time, when a legitimate user computes $w_2 \cap S$, where $S = \mathrm{SS}(w_1)$, the decoding complexity can grow when points from $w_2 \setminus (w_1 \cap w_2)$ coincide with chaff points in $S$. Since $|w_2 \setminus (w_1 \cap w_2)| \le t$ for legitimate users, the experiment now consists of throwing $t$ points in $b = n - s + t$ bins, where $a = r - s + t$ bins are already occupied. We want $r$ to be such that the expected (integer-valued) number of collisions $t(a/b)$ is 0.

Figure 1 plots the adversary's advantage $\mathrm{Adv}^{\mathrm{ind}}_{\mathcal{A}}$ for the above parameters as a function of $r$ near the value $r \approx 300$ suggested in (Clancy et al., 2003). As evident from the figure, the advantage is significant even in the worst case for the adversary, when only one overlapping point separates authentic data from impostor data. The jumps in the plot correspond to the places where the (integer-valued) mean of the distribution, $E[X]$, increases by 1.

Improved Fuzzy Vault. An important observation

in designing an attack strategy for this construction is

that it is deterministic. This immediately implies that

the same biometric will always produce the same se-

cure sketch, giving the adversary the ability to distin-

guish sketches. Thus, as an important special case we

ﬁrst consider the adversary’s ability to win the indis-

tinguishability game when no noise affects multiple

sketches of the same w (this arises in several applica-

tions, where multiple keys are issued using the same

copy of w). Thus, when A obtains the challenge sketch, it outputs 1 if the two sketches it is given are identical and 0 otherwise. This means that

when b = 1, A will always guess the bit correctly, but

when b = 0 it might still sometimes output 1 if the

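two sketches happened to be the same. The probability of the latter, however, is small and can be bounded as follows. Recall that sketch S consists of t coefficients of a polynomial $p(x) = x^s + c_{s-1}x^{s-1} + \dots + c_1 x + c_0$, where for $w = \{w_1, \dots, w_s\}$ the coefficients are (up to sign)

$$c_{s-1} = \sum_{i=1}^{s} w_i, \quad c_{s-2} = \sum_{i \ne j} w_i w_j, \quad \dots, \quad c_{s-t} = \sum_{C \subset [1,s],\, |C| = t} \Big( \prod_{i \in C} w_i \Big).$$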
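The determinism that the attack exploits can be sketched in a few lines. This is an illustration only: it assumes $p(x) = \prod_i (x - w_i)$ (so the coefficient signs follow Vieta's formulas) and works in a prime field $\mathbb{Z}_q$ rather than the field of $n = p^2$ elements used in the text. The names `improved_vault_sketch` and `equality_adversary` are hypothetical.

```python
def improved_vault_sketch(w, t, q):
    """Return (c_{s-1}, ..., c_{s-t}) of prod_i (x - w_i) over Z_q."""
    coeffs = [1]  # coefficients, highest degree first
    for wi in w:
        nxt = coeffs + [0]  # multiply the current polynomial by x
        for j, c in enumerate(coeffs):
            nxt[j + 1] = (nxt[j + 1] - wi * c) % q  # subtract wi * polynomial
        coeffs = nxt
    return tuple(coeffs[1:t + 1])


def equality_adversary(sketch_a, sketch_b):
    """Guess 1 (related) iff the two deterministic sketches coincide."""
    return 1 if sketch_a == sketch_b else 0


if __name__ == "__main__":
    q, t = 251, 2
    s1 = improved_vault_sketch({1, 2, 3}, t, q)
    s2 = improved_vault_sketch({3, 2, 1}, t, q)
    s3 = improved_vault_sketch({4, 5, 6}, t, q)
    print(s1, equality_adversary(s1, s2), equality_adversary(s1, s3))
```

Because no randomness enters the sketch, two sketches of the same set are always identical, so the equality test never errs when the biometrics match.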

First, for an unrelated random biometric $\hat{w}$, the probability that $\sum_i \hat{w}_i = c_{s-1}$ is $1/n$ (i.e., without any restrictions, there are $\prod_{i=0}^{s-1}(n-i)$ choices for s elements without repetitions from the set of n elements, and when the sum of the elements is fixed (in $\mathbb{F}_{p^2}$), the number reduces to $\prod_{i=1}^{s-1}(n-i)$).
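The counting argument for $c_{s-1}$ can be verified by brute force on a toy field. The sketch below assumes a small prime field $\mathbb{Z}_q$ (with s not a multiple of q) in place of the field of $n = p^2$ elements; `count_fixed_sum` is an illustrative name.

```python
from itertools import permutations


def count_fixed_sum(q: int, s: int, c: int) -> int:
    """Number of ordered s-tuples of distinct elements of Z_q that sum to c."""
    return sum(1 for tup in permutations(range(q), s) if sum(tup) % q == c)


if __name__ == "__main__":
    q, s = 7, 3
    total = q * (q - 1) * (q - 2)  # all ordered triples without repetition
    for c in range(q):
        print(c, count_fixed_sum(q, s, c), total // q)
```

Every target sum is hit by the same number of tuples, so a random unrelated biometric matches a given $c_{s-1}$ with probability exactly one over the field size.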

Now consider $c_{s-2}$. We start with a simpler function $x_1 x_2 = b$ in $\mathbb{F}_{p^2}$ for a fixed value of b. Recall that $n = p^2$ for a prime p. We enumerate all possible solutions $x_1$ and $x_2$ for this function such that $x_1 \ne x_2$ (since all points in a biometric are different). When b is zero, there are $n - 1$ unordered pairs $(x_1, x_2)$ with $x_1 \ne x_2$ whose product equals b (one value is zero and the other can take the $n - 1$ remaining values). All elements other than zero form a cyclic multiplicative group, and when $b \ne 0$ there are either $\frac{n-1}{2}$ or $\frac{n-1}{2} - 1$ pairs $(x_1, x_2)$ with distinct $x_1$ and $x_2$, when b is a quadratic non-residue or quadratic residue, respectively. Therefore, the number of pairs $(x_1, x_2)$ satisfying the congruence for any b is at most $n - 1$ from the overall space of $\frac{n(n-1)}{2}$ such pairs, giving us the fraction $(n-1) / \frac{n(n-1)}{2} = \frac{2}{n}$.
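The three cases (b zero, quadratic residue, quadratic non-residue) can be confirmed by enumeration. This is a minimal sketch over a small prime field $\mathbb{Z}_q$ with odd q, used here as a stand-in for the field of $n = p^2$ elements; `pair_counts` is an illustrative name.

```python
from itertools import combinations


def pair_counts(q: int) -> dict:
    """Map each b in Z_q to the number of unordered pairs {x1, x2},
    x1 != x2, with x1 * x2 = b (mod q)."""
    counts = {b: 0 for b in range(q)}
    for x1, x2 in combinations(range(q), 2):
        counts[(x1 * x2) % q] += 1
    return counts


if __name__ == "__main__":
    q = 11
    residues = {(x * x) % q for x in range(1, q)}
    for b, cnt in pair_counts(q).items():
        kind = "zero" if b == 0 else ("QR" if b in residues else "non-QR")
        print(b, kind, cnt)
```

The product 0 is the most frequent value, with q − 1 pairs out of q(q − 1)/2, which reproduces the fraction 2/n from the text.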

Now recall that $c_{s-2}$ is composed of a summation of products $w_i w_j$ for each $i \ne j$. When there is only one product $w_1 w_2$ (i.e., s = 2), we obtain that it is equal to 0 more frequently than to other values. When, however, s > 2 this is no longer the case. Because all $w_i$ have to be unique and each $w_i$ appears in a number of products $w_i w_j$, the value of the sum tends to be distributed more evenly as s increases. This means that the frequency of the most common value of $c_{s-2}$ approaches $\frac{1}{n}$ when s grows. To illustrate this phenomenon, we plot empirical data for small values of $n = p^2$. In particular, for s = 2, 4, and 6 and all possible $w = (w_1, \dots, w_s) \in \mathbb{F}_{p^2}^s$ we find the value of the sum which occurs the highest number of times. Let the number of its occurrences be denoted by $\mathrm{count}_{\max}$ and the fraction of all biometrics w that result in such a value by $f_{\max} = \mathrm{count}_{\max} / \binom{n}{s}$. To evaluate how the value of $f_{\max}$ compares to $\frac{1}{n}$, we plot their ratio $f_{\max} / \frac{1}{n}$ in Figure 2. For s = 2, $f_{\max}$ is constant; for s > 2 it is clear that $f_{\max}$ rapidly approaches $\frac{1}{n}$ from above even for very small values of s. This means that $\frac{2}{n}$ is a generous upper bound on the probability that $c_{s-2}$ of a randomly chosen $\hat{w}$ will coincide with a specific value of that coefficient for an unrelated biometric w.
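The empirical quantity described above can be recomputed on a toy scale. The sketch below is an assumption-laden stand-in for the experiment behind Figure 2: it uses a prime field $\mathbb{Z}_q$ instead of the field of $n = p^2$ elements, and `fmax_ratio` is an illustrative name.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations
from math import comb


def fmax_ratio(q: int, s: int) -> Fraction:
    """Ratio f_max / (1/q) for the pairwise-product sum over s-subsets of Z_q."""
    tally = Counter()
    for w in combinations(range(q), s):
        e2 = sum(x * y for x, y in combinations(w, 2)) % q
        tally[e2] += 1
    count_max = max(tally.values())
    f_max = Fraction(count_max, comb(q, s))
    return f_max * q


if __name__ == "__main__":
    for q in (7, 11, 13):
        print(q, [float(fmax_ratio(q, s)) for s in (2, 4, 6)])
```

For s = 2 the ratio is exactly 2 for every odd prime q, matching the constant behaviour noted in the text; for larger s the printed values show how close the most frequent value gets to the uniform frequency.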

Extending this analysis to $c_{s-3} = \sum w_i w_j w_k$, where i, j, and k are pairwise distinct, we obtain that the most frequently occurring value of $c_{s-3}$ is 0, and it is most frequent when s = 3 (i.e., only one product). In that case, the number of possibilities that result in that product is $\frac{(n-1)(n-2)}{2}$ out of $\frac{n(n-1)(n-2)}{2 \cdot 3}$ total choices (and the number of

SECRYPT 2011 - International Conference on Security and Cryptography

Figure 2: The ratio of the fraction of the most frequent value of the sum $c_{s-2}$ for varying n and s.

possibilities when the product is non-zero is at most

n−3

n−1

). Thus, the fraction of triples that can result in any given product is $\le \frac{3}{n}$. For $c_{s-4}$, the maximum fraction is $\le \frac{4}{n}$; for $c_{s-5}$, it is $\le \frac{5}{n}$, etc. Therefore, the adversarial error is at most $\frac{t!}{n^t}$, and in practice will be close to $\frac{1}{n^t}$ because s > t. Both of these quantities are very low even for small values of t (such as 2), and the probability with which the adversary considers two unrelated biometrics to be related is very small. Its advantage in the 2-indistinguishability game is:

$$\begin{aligned}
\mathrm{Adv}^{\mathrm{ind}} &= 2\left|\Pr[b' = b] - \tfrac{1}{2}\right| \\
&= 2\left|\Pr[b' = 1 \mid b = 1]\Pr[b = 1] + \Pr[b' = 0 \mid b = 0]\Pr[b = 0] - \tfrac{1}{2}\right| \\
&= 2\left|\tfrac{1}{2}\Pr[b' = 1 \mid b = 1] + \tfrac{1}{2}\Pr[b' = 0 \mid b = 0] - \tfrac{1}{2}\right| \\
&= \left|\Pr[b' = 1 \mid b = 1] + 1 - \Pr[b' = 1 \mid b = 0] - 1\right| \\
&= \left|\Pr[b' = 1 \mid b = 1] - \Pr[b' = 1 \mid b = 0]\right| \\
&> 1 - \frac{t!}{n^t}.
\end{aligned}$$

The above analysis addresses an important special

case of w = w

′

. We defer analysis of the more gen-

eral case of related sketches to the full version.

4 OUR CONSTRUCTIONS

In what follows, let (SS

′

,Rec

′

) denote any existing

fuzzy sketch scheme (for any metric). The key k de-

notes the long-term user’s key of size κ, where κ is

the security parameter. This key k is not shared with

any parties. We ﬁrst provide additional deﬁnitions.

Definition 6. Let $F : \{0,1\}^{\kappa} \times \{0,1\}^{\ell_1(\kappa)} \to \{0,1\}^{\ell_2(\kappa)}$ be a family of functions. For $k \in \{0,1\}^{\kappa}$, the function $F_k : \{0,1\}^{\ell_1(\kappa)} \to \{0,1\}^{\ell_2(\kappa)}$ is defined as $F_k(x) = F(k, x)$. F is said to be a family of pseudo-random functions (PRF) if for every PPT adversary A with oracle access to a function $F_k$ and all sufficiently large κ, $|\Pr[A^{F_k}(1^{\kappa}) = 1] - \Pr[A^{f}(1^{\kappa}) = 1]|$ is negligible in κ, where $k \leftarrow \{0,1\}^{\kappa}$ is chosen uniformly at random and f is a function chosen at random from all possible functions mapping $\ell_1(\kappa)$-bit inputs to $\ell_2(\kappa)$-bit outputs.

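Definition 7. A family of functions h, which maps a seed y and an input x to an $\ell_1(\kappa)$-bit output $h_y(x)$, is a pairwise independent universal hash function if for all inputs $x \ne x'$, $\Pr[h_y(x) = h_y(x')] = 1/2^{\ell_1(\kappa)}$, where the probability is over the choice of the seed y.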

In the following secure sketch construction, it is required that $\ell_2(\kappa) \ge |\mathrm{SS}'(w)|$, where |a| is the length of string a. We discuss the choice of parameters later.

To compute SS(w, k):
1. Choose $r_1 \in \{0,1\}^{\ell_1(\kappa)}$ at random.
2. Output $S = (S_1, S_2) = (r_1, F_k(r_1) \oplus \mathrm{SS}'(w))$.

To compute Rec(w′, k, S = (S_1, S_2)):
1. Compute $u \leftarrow F_k(S_1)$.
2. Output what $\mathrm{Rec}'(w', S_2 \oplus u)$ outputs.

Theorem 1. Assuming that F is a family of PRFs, the

above fuzzy sketch scheme achieves weak biometric

privacy.

We omit security proofs due to space constraints.
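The keyed sketch above can be illustrated end to end. The following is a sketch under stated assumptions, not the authors' implementation: F is instantiated with HMAC-SHA256 (commonly modeled as a PRF), the PRF output is truncated to the sketch length, and the underlying (SS′, Rec′) is a toy syndrome sketch for the 3-bit repetition code that corrects one flipped bit per block. All function names are illustrative.

```python
import hashlib
import hmac
import secrets


def F(k: bytes, x: bytes) -> bytes:
    """Assumed PRF instantiation: HMAC-SHA256 keyed with k."""
    return hmac.new(k, x, hashlib.sha256).digest()


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(p ^ q for p, q in zip(a, b))


def ss_prime(w):
    """Toy deterministic sketch: per 3-bit block (a, b, c) store (a^b, a^c)."""
    out = []
    for i in range(0, len(w), 3):
        a, b, c = w[i:i + 3]
        out += [a ^ b, a ^ c]
    return bytes(out)  # one byte (0 or 1) per syndrome bit


def rec_prime(w_noisy, syn):
    """Correct at most one flipped bit per block using the stored syndrome."""
    fix = {(0, 0): None, (1, 1): 0, (1, 0): 1, (0, 1): 2}
    w = list(w_noisy)
    for blk in range(len(w) // 3):
        a, b, c = w[3 * blk:3 * blk + 3]
        err = (a ^ b ^ syn[2 * blk], a ^ c ^ syn[2 * blk + 1])
        pos = fix[err]
        if pos is not None:
            w[3 * blk + pos] ^= 1
    return w


def SS(w, k: bytes):
    r1 = secrets.token_bytes(32)          # step 1: fresh random r1
    s = ss_prime(w)
    if len(s) > 32:
        raise ValueError("sketch longer than one PRF output block")
    return r1, xor(F(k, r1), s)           # step 2: mask SS'(w) with F_k(r1)


def Rec(w_noisy, k: bytes, S):
    S1, S2 = S
    u = F(k, S1)                          # step 1: recompute the mask
    return rec_prime(w_noisy, xor(S2, u))  # step 2: run Rec' on unmasked sketch
```

Because a fresh $r_1$ is drawn on every call, two sketches of the same w differ, unlike the deterministic sketch they wrap.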

Note that in our construction deterministic schemes for the underlying SS′ are preferred because they produce the most concise sketches. So far we assumed that the output length of F, $\ell_2(\kappa)$, is at least as large as the output length of the secure sketch $|\mathrm{SS}'(w)|$. While this will hold for many types of biometrics and a reasonable choice of security parameter κ, in some cases the representation of SS′(w) can be longer. Instead of increasing κ, we suggest modifying the algorithm to use more than one application of F to produce a longer pseudo-random sequence. For instance, if $\ell_2(\kappa) < |\mathrm{SS}'(w)| \le 2\ell_2(\kappa)$, the sketch can be produced as $(r_1, (F_k(r_1) \,\|\, F_k((r_1 + 1) \bmod 2^{\ell_1(\kappa)})) \oplus \mathrm{SS}'(w))$, where || denotes string concatenation. This increases the number of random values on which F is evaluated and thus the probability of their collision. However, as long as $|\mathrm{SS}'(w)|/\ell_2(\kappa)$ is a constant or polynomial in κ, the security guarantees still hold.
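The multi-block extension amounts to evaluating F on consecutive inputs starting at the random value. A minimal sketch follows, assuming HMAC-SHA256 as F and a 128-bit input length (the constant `L1_BYTES` and the name `expand_mask` are illustrative choices, not from the paper).

```python
import hashlib
import hmac

L1_BYTES = 16  # assumed PRF input length in bytes (128 bits)


def F(k: bytes, x: bytes) -> bytes:
    """Assumed PRF instantiation: HMAC-SHA256 keyed with k."""
    return hmac.new(k, x, hashlib.sha256).digest()


def expand_mask(k: bytes, r1: int, nbytes: int) -> bytes:
    """Concatenate F_k(r1), F_k(r1 + 1 mod 2^l1), ... and cut to nbytes."""
    out = b""
    ctr = r1
    while len(out) < nbytes:
        out += F(k, ctr.to_bytes(L1_BYTES, "big"))
        ctr = (ctr + 1) % (1 << (8 * L1_BYTES))  # wrap around modulo 2^l1
    return out[:nbytes]
```

The resulting mask is XORed with SS′(w) exactly as in the single-block case; only the number of PRF evaluations changes.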

In the fuzzy extractor construction below we split the key k into two keys $k_1$ and $k_2$. This is done to simplify the analysis. In practice, the sub-keys $k_1$ and $k_2$ can be computed by applying a PRF keyed with k to two different inputs.

To compute Gen(w, k_1, k_2):
1. Compute $S = \mathrm{SS}(w, k_1)$ using the fuzzy sketch scheme above.
2. Choose a random seed $r_2$ for h and compute $s \leftarrow h_{r_2}(w)$.
3. Output $P = (S, r_2)$ and $R \leftarrow F_{k_2}(s)$.

To compute Rep(w′, k_1, k_2, P = (P_1, P_2)):
1. Run Rec(w′, k_1, P_1) above to recover w. If it fails, output ⊥.
2. Otherwise, reproduce the key R as $F_{k_2}(s')$, where $s' \leftarrow h_{P_2}(w)$, and output R.

When it is desirable that failures during reconstruction are not reported explicitly, Rep can be modified to output a (wrong) private string, e.g., computed as $R = F_{k_2}(h_{P_2}(w'))$.
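The Gen and Rep procedures can be sketched with the keyed sketch treated as a black box. Everything concrete here is an assumption made for illustration: F is HMAC-SHA256, the sub-keys are derived as F_k("sketch") and F_k("extract"), h is a Carter-Wegman style hash $((a \cdot w + b) \bmod P) \bmod 2^{256}$ with the Mersenne prime $P = 2^{521} - 1$ (only approximately universal), the biometric is encoded as an integer below P, and the callables `SS` and `Rec` are supplied by the caller.

```python
import hashlib
import hmac
import secrets

P521 = 2 ** 521 - 1   # Mersenne prime used as the hash modulus (assumption)
M_BITS = 256          # hash output length, also the PRF input length here


def F(k: bytes, x: bytes) -> bytes:
    """Assumed PRF instantiation: HMAC-SHA256 keyed with k."""
    return hmac.new(k, x, hashlib.sha256).digest()


def h(seed, w_int: int) -> int:
    """Seeded hash h_seed(w) with seed = (a, b)."""
    a, b = seed
    return ((a * w_int + b) % P521) % (1 << M_BITS)


def split_key(k: bytes):
    """Derive k1, k2 by applying the PRF keyed with k to two inputs."""
    return F(k, b"sketch"), F(k, b"extract")


def Gen(w_int: int, k: bytes, SS):
    k1, k2 = split_key(k)
    S = SS(w_int, k1)                                   # step 1
    r2 = (1 + secrets.randbelow(P521 - 1), secrets.randbelow(P521))
    s = h(r2, w_int)                                    # step 2
    R = F(k2, s.to_bytes(M_BITS // 8, "big"))           # step 3
    return (S, r2), R


def Rep(w_noisy: int, k: bytes, P, Rec):
    k1, k2 = split_key(k)
    S, r2 = P
    w = Rec(w_noisy, k1, S)                             # step 1
    if w is None:
        return None                                     # plays the role of ⊥
    return F(k2, h(r2, w).to_bytes(M_BITS // 8, "big"))  # step 2
```

To exercise it, pass any pair of functions with the signatures `SS(w, k1)` and `Rec(w_noisy, k1, S)` that returns the recovered w or `None` on failure.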

We would like to explain the design choices made in our construction. Because a PRF is a powerful primitive, it by itself is sufficient to produce the private string R indistinguishable from random. For example, setting $R \leftarrow F_k(w \,\|\, r)$ for random r would satisfy the security game requirements. The reason for including the hash function h in the construction is to compress the biometric w without losing any of its unpredictability. That is, the n-bit representation of a biometric is normally substantially longer than the m bits of entropy it contains. For example, for iris the standard values of these parameters are n = 2048 and m = 256. Because m ∼ κ, we can use a (seeded) hash function h that maps $\{0,1\}^n$ to $\{0,1\}^m$ to reduce the size of w from n to m bits without losing its entropy. In cases when the value of m exceeds the desired length of the input to a PRF, the hash function output length can be further reduced, i.e., in general $\ell_1(\kappa) \le m$.

We note that the generic conversion of a secure

sketch to a fuzzy extractor (in Section 2.1) uses a

strong extractor, which can be built using a univer-

sal hash function alone. The use of the hash function

in a strong extractor is, however, constrained in that

the output length of the extractor must necessarily be

smaller than m to be able to meet the requirement of

the output being close to the uniform distribution. In

particular, at least $2\log(1/\varepsilon) - 2$ bits of entropy are lost,

where the parameter ε determines the statistical dis-

tance between distribution of the output and the uni-

form distribution. In our case, no requirements on the

uniformity of the output must be met, and therefore

no reduction of the output length or entropy loss has

to take place.
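To make the entropy-loss comparison concrete, the short calculation below evaluates the bound for the iris parameter m = 256 quoted earlier. The choice $\varepsilon = 2^{-64}$ is an assumption made only for illustration, and the function name is hypothetical.

```python
def extractor_output_bits(m: int, log2_inv_eps: int) -> int:
    """Largest output length of a strong extractor that loses
    2*log2(1/eps) - 2 bits from m bits of entropy."""
    loss = 2 * log2_inv_eps - 2
    return m - loss


if __name__ == "__main__":
    m = 256
    for log2_inv_eps in (40, 64, 80):
        print(log2_inv_eps, extractor_output_bits(m, log2_inv_eps))
```

With these numbers a statistical extractor yields at most 130 bits at $\varepsilon = 2^{-64}$, whereas the computational construction keeps the full hash output length.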

Theorem 2. Assuming that F is a family of PRFs

and h is a universal hash function, the above fuzzy

extractor scheme achieves strong biometric privacy.

We would like to note that certain constructions of

PRFs are known to produce uniformly distributed se-

quences. For example, (Shparlinski, 2001) shows that

the PRF in (Naor and Reingold, 1997) has this property for almost all parameter values. For us this means

that the adversary does not obtain advantage in distin-

guishing pseudo-random strings from random.

We also note that similar results can be achieved

by using encryption instead of PRF, and such schemes

might be known or used in industry.

5 RELATED WORK

The overall literature on fuzzy sketches and extrac-

tors is extensive, and we therefore highlight the most

fundamental results and analysis related to this work.

(Davida et al., 1998) proposed the ﬁrst off-line bio-

metric identiﬁcation scheme, where error-correcting

codes were used to reconstruct a biometric from its

noisy readings. (Juels and Wattenberg, 1999) devel-

oped a fuzzy commitment scheme, which became the

basis of the code-offset secure sketch for the Ham-

ming distance. (Juels and Sudan, 2002) proposed a

fuzzy vault scheme. (Dodis et al., 2004; Dodis et al.,

2008) formalized the notion of secure sketches and

fuzzy extractors in their seminal work, which gave a

generic conversion from a secure sketch to a fuzzy

extractor and developed a number of other schemes.

(Boyen et al., 2005) introduced robust fuzzy ex-

tractors secure against active adversaries, where the

reconstruction process fails if the sketch has been

tampered with. (Dodis et al., 2006) continue that line

of research and also study the keyed setting in the

bounded storage model. The use of the key in our set-

ting is fundamentally different from that work, where

two parties share a long-term secret key and use it to

generate a session key for data authentication. Our

constructions can potentially be applied to a robust

fuzzy extractor to improve reusability properties.

There are also publications that combine fuzzy

extractors with passwords to improve their security

properties such as (Ballard et al., 2008). This work

offers a simpler and more ﬂexible construction.

Security requirements for adequate use of fuzzy

sketches and extractors in cryptographic applications

have been developing over time. (Boyen, 2004)

showed that a number of original constructions can-

not be safely applied multiple times to the same bio-

metric. That work developed improved constructions

using certain error-correcting codes and permutation

groups that satisfy the reusability requirements. Our

security deﬁnitions for the strong adversary were in-

ﬂuenced by that work. Compared to (Boyen, 2004),

our solution leaks no information about the biomet-

ric data (while leakage is unavoidable in the setting

of (Boyen, 2004)) and works for all distance metrics

and all secure sketch schemes in the standard model

(while Boyen’s scheme is limited to special codes and


a particular metric in the random oracle model).

(Scheirer and Boult, 2007) proposed three classes

of attacks on secure sketches and fuzzy vault in partic-

ular, one of which is equivalent to sketch reusability.

It has been empirically evaluated in (Kholmatov and

Yanikoglu, 2008) on the fuzzy vault scheme using 200

matching pairs of fuzzy vault sketches. The authors

were able to unlock (i.e., reconstruct the polynomial)

118 out of 200 pairs within a short period of time. We

note that this evaluation was performed on a speciﬁc

set of parameters already knowing that two stored

sketches are related. Our analysis, on the other hand,

is more general and can be applied to a wide variety

of parameters. It also does not assume prior knowledge of related sketches, but rather helps to identify

those records. (Poon and Miri, 2009) also describe

collusion attacks on the fuzzy vault scheme assuming

that the sketches are related. Finally, (Simoens et al.,

2009) introduced the notions of indistinguishability

and irreversibility for reusable sketches and showed

weaknesses of code-offset and permutation groups

constructions. We analyze other constructions with

respect to the indistinguishability property. (Kelk-

boom, 2010) also analyzes certain schemes.

6 CONCLUSIONS

This work investigates the reusability properties

of secure sketch and fuzzy extractor constructions.

Through new analysis we show that, in addition to

the schemes that have been previously shown to have

security weaknesses, other existing schemes do not

meet our security expectations. To mitigate the prob-

lem, we propose to use the computational setting.

Maintenance of a single key for all uses of such

schemes results in solutions with remarkable secu-

rity and usability improvements which are not possi-

ble otherwise. In particular, our general construction

works with any existing secure sketch and mitigates

information leakage associated with biometrics in the

standard model under generic hardness assumptions.

REFERENCES

Ballard, L., Kamara, S., Monrose, F., and Reiter, M. (2008).

Towards practical biometric key generation with ran-

domized biometric templates. In ACM CCS.

Blanton, M. and Hudelson, W. (2009). Biometric-based

non-transferable anonymous credentials. In ICICS,

pages 165–180.

Boyen, X. (2004). Reusable cryptographic fuzzy extractors.

In ACM CCS, pages 82–91.

Boyen, X., Dodis, Y., Katz, J., Ostrovsky, R., and Smith, A.

(2005). Secure remote authentication using biometric

data. In EUROCRYPT, pages 147–163.

Clancy, T., Kiyavash, N., and Lin, D. (2003). Secure

smartcard-based ﬁngerprint authentication. In ACM

SIGMM Workshop on Biometrics Methods and Appli-

cations, pages 45–52.

Davida, G., Frankel, Y., and Matt, B. (1998). On enabling

secure applications through off-line biometric identi-

ﬁcation. In IEEE Symposium on Security and Privacy,

pages 148–157.

Dodis, Y., Katz, J., Reyzin, L., and Smith, A. (2006). Ro-

bust fuzzy extractors and authenticated key agreement

from close secrets. In CRYPTO, pages 232–250.

Dodis, Y., Ostrovsky, R., Reyzin, L., and Smith, A. (2008).

Fuzzy extractors: How to generate strong keys from

biometrics and other noisy data. SIAM Journal on Computing, 38(1):97–139.

Dodis, Y., Reyzin, L., and Smith, A. (2004). Fuzzy extrac-

tors: How to generate strong keys from biometrics and

other noisy data. In EUROCRYPT, pages 523–540.

Dodis, Y. and Smith, A. (2005). Correcting errors with-

out leaking partial information. In ACM STOC, pages

654–663.

Juels, A. and Sudan, M. (2002). A fuzzy vault scheme. In

International Symposium on Information Theory.

Juels, A. and Wattenberg, M. (1999). A fuzzy commitment

scheme. In ACM CCS, pages 28–36.

Kelkboom, E. (2010). On the performance of helper data

template protection schemes. PhD thesis, University

of Twente.

Kholmatov, A. and Yanikoglu, B. (2008). Realization of

correlation attack against the fuzzy vault scheme. In

Proceedings of SPIE, volume 6819.

Naor, M. and Reingold, O. (1997). Number-theoretic con-

structions of efﬁcient pseudo-random functions. In

IEEE FOCS, pages 458–467.

Nisan, N. and Ta-Shma, A. (1999). Extracting randomness:

A survey and new constructions. Journal of Computer

and System Sciences, 58:148–173.

Pankanti, S., Prabhakar, S., and Jain, A. (2002). On the in-

dividuality of ﬁngerprints. IEEE Transactions on Pat-

tern Analysis and Machine Intelligence, 24(8):1010–

1025.

Poon, H. and Miri, A. (2009). A collusion attack on the

fuzzy vault scheme. ISC International Journal of In-

formation Security, 1(1):27–34.

Scheirer, W. and Boult, T. (2007). Cracking fuzzy vaults

and biometric encryption. In IEEE Biometrics Sym-

posium, pages 1–6.

Shparlinski, I. (2001). On the uniformity of distribution

of the Naor-Reingold pseudo-random function. Finite

Fields and Their Applications, 7(2):318–326.

Simoens, K., Tuyls, P., and Preneel, B. (2009). Privacy

weaknesses of biometric sketches. In IEEE Sympo-

sium on Security and Privacy, pages 188–203.

Smith, A. (2004). Maintaining secrecy when information

leakage is unavoidable. PhD dissertation, MIT.
