Exploit the Leak: Understanding Risks in Biometric Matchers

Dorine Chagnon, Axel Durbet [1, a], Paul-Marie Grollemund [2, b] and Kevin Thiry-Atighehchi [1, ∗, c]

[1] University Clermont Auvergne, LIMOS (UMR 6158 CNRS), Clermont-Ferrand, France

[2] University Clermont Auvergne, LMBP (UMR 6620 CNRS), Clermont-Ferrand, France

Keywords:

Privacy-Preserving Distance, Hamming Distance, Information Leakage, Biometric Security.

Abstract:

In a biometric authentication or identiﬁcation system, the matcher compares a stored and a fresh template to

determine whether there is a match. This assessment is based on both a similarity score and a predeﬁned

threshold. For better compliance with privacy legislation, the matcher can be built upon a privacy-preserving

distance. Beyond the binary output (‘yes’ or ‘no’), most schemes may perform more precise computations, e.g.,

the value of the distance. Such precise information is prone to leakage even when not returned by the system.

This can occur due to a malware infection or the use of a weakly privacy-preserving distance, exempliﬁed by

side channel attacks or partially obfuscated designs. This paper provides an analysis of information leakage

during distance evaluation. We provide a catalog of information leakage scenarios with their impacts on data

privacy. Each scenario gives rise to unique attacks with impacts quantiﬁed in terms of computational costs,

thereby providing a better understanding of the security level.

1 INTRODUCTION

Biometric authentication protocols involve the com-

parison of a fresh biometric template with the refer-

ence template. This comparison computes the dis-

tance between the newly acquired data and the stored

template. If this distance is below a given thresh-

old, access is granted; otherwise, it is denied. Ham-

ming distance is a widely used metric in biomet-

ric applications e.g., biohashing (Patel et al., 2015;

Bernal-Romero et al., 2023), iriscode (Daugman,

2009; Dehkordi and Abu-Bakar, 2015; Daugman,

2015), face recognition (Yang and Wang, 2007; He

et al., 2015), gait recognition (Tran et al., 2017),

keystroke (Rahman et al., 2021), ear authentica-

tion (Wang et al., 2021) and palm-vein recogni-

tion (Cho et al., 2021). Computing this distance may

inadvertently leak information that adversaries might

exploit to reconstruct the stored template. These

vulnerabilities may arise from implementation er-

rors, inherent ﬂaws, and server-level attacks such

as malware (Sharma et al., 2023), which can compromise system-wide security. Furthermore, Aydin and Aysu (Aydin and Aysu, 2024) and Hashemi et al. (Hashemi et al., 2024) have highlighted an increasing prevalence of side-channel attacks. Side-channel techniques, including timing, differential power analysis, cache-based, electromagnetic, acoustic, and thermal attacks, exploit various operational artifacts to extract sensitive information (Sharma et al., 2023).

[a] https://orcid.org/0000-0002-4420-1934
[b] https://orcid.org/0000-0002-1273-1658
[c] https://orcid.org/0000-0003-0042-8771

One of the concerns is the partial or total leakage

of distance computation information. Such informa-

tion leakage poses signiﬁcant security and privacy

risks, especially in sensitive applications like privacy-

preserving applications (e.g., biometric recognition

systems). In this paper, we focus on the following

attacks:

• Ofﬂine exhaustive search attacks refer to scenar-

ios for which a leaked yet obfuscated database is

available for an attacker. The attacker employs the

public transformation to verify a candidate vector.

This veriﬁcation may give additional information

beyond the minimal information leakage (‘yes’ or

‘no’), for example via side-channel attacks.

• Online exhaustive search attacks correspond to attacks for which an attacker must interact with the biometric system to infer information about the targeted vector. Then, the attacker needs to force the system to leak additional information beyond the minimal information leakage (‘yes’ or ‘no’), for example via a malware infection.

Chagnon, D., Durbet, A., Grollemund, P.-M. and Thiry-Atighehchi, K.
Exploit the Leak: Understanding Risks in Biometric Matchers.
DOI: 10.5220/0013250600003899
In Proceedings of the 11th International Conference on Information Systems Security and Privacy (ICISSP 2025) - Volume 2, pages 353-362
ISBN: 978-989-758-735-1; ISSN: 2184-4356

Related Works. To the best of our knowledge, two

papers investigate information leakage of biometric

systems using privacy-preserving distance. Pagnin et

al. (Pagnin et al., 2014) show that the output of a

privacy-preserving distance can be exploited to infer

the hidden input. This type of attack is considered the

most devastating for such systems, as evidenced by

Simoens et al. (Simoens et al., 2012). The work of

Pagnin et al. takes place in the minimal leakage sce-

nario, wherein only the binary output of the biometric

system is given to the attacker. The authors present

the Center Search Attack, designed to recover the hid-

den enrolled input for any ‘valid’ biometric template

in $\mathbb{Z}_q^n$, where ‘valid’ refers to inputs within a ball centered at the enrolled template and with a radius equal

to the decision threshold t. To efﬁciently locate a valid

input, the authors also examine the exhaustive search

attack, particularly its application on binary templates

(q = 2). They suggest implementing a sampling with-

out replacement strategy using their Tree algorithm

to streamline the identiﬁcation of a suitable input for

the Center Search Attack. This efﬁcient identiﬁcation

of a proper input requires a number of authentication

attempts that is exponential in the space dimension n

minus the threshold t. While their work focuses on

the minimal leakage scenario, our analysis includes

the consideration of multiple additional information

leaks that may arise during the matching operation.

Contributions. We analyze the impact of poten-

tial information leakage in distance evaluations. Our

contributions detail various leakage scenarios, their

corresponding generic attacks, and the computational

costs involved:

• We revisit the exhaustive search attack in the sce-

nario of a minimal (one-bit) information leak-

age, correcting a previous result (see (Pagnin

et al., 2014)) about the costs of optimal and near-

optimal strategies and include additional informa-

tion on cases that are not well-detailed in the lit-

erature.

• We introduce new attack strategies by malicious

clients that exploit various levels of non-minimal

information leaks from the system. Our complex-

ity results, which detail the cost of these attacks,

apply to both ofﬂine exhaustive search attacks that

leverage a leaked (yet obfuscated) database and

online exhaustive search attacks involving direct

interactions with the server.

• We investigate a novel attack, named accumula-

tion attack, where an honest-but-curious server

accumulates knowledge during client authentica-

tion. This type of attack occurs when there is a

minor, yet non-negligible, amount of information

leakage.

The complexities of the attacks, relying on differ-

ent scenarios, are summarized in Table 1.

Outline. Section 2 introduces notations and termi-

nologies and classiﬁes the different types of informa-

tion leakages. Section 3 begins by revisiting the ex-

haustive search attack in the minimal (one-bit) infor-

mation leakage scenario, including a correction of a

previously cited result concerning the costs of optimal

and near-optimal strategies. It then introduces new

strategies for attacks by malicious clients capturing

various other types of information leakages, cover-

ing both ofﬂine and online exhaustive search attacks,

with an emphasis on their computational costs. The

section concludes by examining accumulation attacks

performed by an "honest-but-curious" server during

client authentication, detailing the computational cost

involved. Section 4 provides a discussion of the pre-

sented results.

2 PRELIMINARIES

This section introduces the notations, the attacker model, and the list of information leakage scenarios considered.

2.1 Notations and Attacker Models

Let $\mathbb{Z}_q^n = \{0, \ldots, q-1\}^n$ be a metric space equipped with the Hamming distance $d$ and $\varepsilon \in \mathbb{N}$ a threshold. The Hamming distance is defined by

$$d(x, y) = \left|\{\, i \in \{1, \ldots, n\} \mid x_i \neq y_i \,\}\right|$$

for two vectors $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$ in $\mathbb{Z}_q^n$. Let $\mathrm{Match}_{x,\varepsilon}$ denote the oracle modeling the interaction between the biometric system using a privacy-preserving distance and the attacker. $\mathrm{Match}_{x,\varepsilon}$ receives the template selected by the attacker and compares it with the previously enrolled and stored template. If the distance is below the threshold ε, the oracle returns 1 and 0 otherwise. In a more formal way, $\mathrm{Match}_{x,\varepsilon}$ is a function defined as:

$$\mathrm{Match}_{x,\varepsilon} : \mathbb{Z}_q^n \longrightarrow \{0, 1\}, \qquad y \longmapsto \begin{cases} 1 & \text{if } d(x, y) \le \varepsilon, \\ 0 & \text{otherwise.} \end{cases}$$
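As an illustration of the definitions above, the following minimal Python sketch models the Hamming distance and the minimal-leakage oracle. The names `hamming_distance` and `make_match_oracle` are hypothetical helpers introduced here for illustration; templates are assumed to be plain sequences of integers in {0, ..., q − 1}.

```python
from typing import Callable, Sequence


def hamming_distance(x: Sequence[int], y: Sequence[int]) -> int:
    """Number of coordinates on which x and y differ."""
    if len(x) != len(y):
        raise ValueError("templates must have the same length")
    return sum(1 for xi, yi in zip(x, y) if xi != yi)


def make_match_oracle(x: Sequence[int], eps: int) -> Callable[[Sequence[int]], int]:
    """Build Match_{x,eps}: returns 1 if d(x, y) <= eps and 0 otherwise."""
    stored = tuple(x)  # the enrolled template stays hidden inside the closure

    def match(y: Sequence[int]) -> int:
        return 1 if hamming_distance(stored, y) <= eps else 0

    return match
```

In this sketch the attacker only ever sees the return value of `match`; the leakage scenarios discussed later would correspond to oracles returning more than this single bit.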

A privacy-preserving distance may leak additional information beyond its binary output. Under the specifications of each scenario, the oracle may display this additional information. The objective of the attacker is to find the hidden template x exploiting the oracle outputs. In the context of a biometric system, the objective of the attacker may be relaxed to simply find y that is close to x with respect to d and ε.

Table 1: Summary of all leakage exploits and their complexities with α such that the occurrence of the rarest error is $n^{-\alpha}$ with $\alpha \in \mathbb{R}_{\ge 1}$. The Distance-to-Threshold comparison determines if the leak occurs when d(x, y) ≤ ε (below) or when there is no distance requirement between x and y (both). For all the complexities, x and y are in $\mathbb{Z}_q^n$ with q ≥ 2 except for the minimal leakage where x and y are in $\mathbb{Z}_2^n$. The provided complexities represent worst-case scenarios, except for the accumulation attack where the result is the expectation.

| Distance-to-Threshold comparison | Leakage | Complexity type | Complexity in Big-Oh | Theorem |
|----------------------------------|---------|-----------------|----------------------|---------|
| Below | Distance | Exponential | $q^{n-\varepsilon} + q\varepsilon$ | 3.2 |
| Below | Positions | Exponential | $q^{n-\varepsilon} + q$ | 3.3 |
| Below | Positions and values | Exponential | $q^{n-\varepsilon}$ | 3.4 |
| Below | Positions and values (accumulation) | Linearithmic/Polynomial | $n^{\alpha} \log n$ | 3.9 |
| Both | Minimal | Exponential | $q^{n-\varepsilon} + n(q-1) + 2\varepsilon$ | 3.5 |
| Both | Distance | Linear | $nq$ | 3.6 |
| Both | Positions | Constant | $q$ | 3.7 |
| Both | Positions and values | Constant | $1$ | 3.8 |

Note that the Big-Oh complexity of the optimal exhaustive search strategy, in the worst case, is the same as the naive strategy as the minimum of $h_q(\cdot)$ is 0.

2.2 Typology of Information Leakage

In the context of a biometric system, a critical vul-

nerability arises when information is intercepted be-

tween the matcher and the decision module, as illus-

trated in Figure 1 (point 8). This ﬁgure, inspired by

Ratha et al. (Ratha et al., 2001), provides an overview

of the attack points in biometric systems while intro-

ducing both the decision module and two additional

attack points. Except for the accumulation attack, the

attacker exploits points 4 and 8 in all discussed sce-

narios. Point 4 allows the submission of a chosen

template, while point 8 grants access to additional in-

formation beyond the binary output. The accumula-

tion attack only necessitates control over the point 8.

For detailed insights into the remaining attack points,

readers are referred to Ratha et al. (Ratha et al., 2001).

There are three main categories of information leak-

age: Below the threshold; Above the threshold; Both

below and above the threshold.

In each of these categories, several sub-settings can be identified. The first one corresponds to the absence of any leakage, resulting in $\mathrm{Match}_{x,\varepsilon}$ yielding only the binary output. Then, the following information leakages are examined:

• The distance.

• The positions of the errors.

• Both the error positions and values.

• Both the distance and the positions of the errors.

• Both the distance and the positions and their cor-

responding erroneous values.

It is not relevant to consider that additional informa-

tion is leaked only above the threshold, as no scheme

has such behavior. As a consequence, solely scenar-

ios ‘below the threshold’ and ‘below and above the

threshold’ are examined. The Hamming distance is

a measure of the number of differing coordinates be-

tween two templates. Therefore, knowledge of the

erroneous coordinates implies knowledge of the dis-

tance itself. Hence, we do not consider all possible

scenarios.

3 EXPLOITING THE LEAKAGE

This section provides a comprehensive analysis of the

attacks that can be performed in each leakage sce-

nario, along with an evaluation of their complexity.

3.1 Active Attacks

This section focuses on active attacks, i.e., attacks where the attacker submits templates to the oracle $\mathrm{Match}_{x,\varepsilon}$.

3.1.1 Attack Complexity for the Minimal (One-Bit) Leakage

In this section, the attacker aims to ﬁnd a template

that lies in the ball of center x (the target template)

and radius ε (the threshold). To identify such a point,

several methods are available, each with its own set of

advantages and disadvantages.

[Figure 1: Attack points in a generic biometric recognition system. Components: Biometric Sensor, Feature Extractor, Matcher, System Database, Decision Module, Application Device. Attack points: 1. Override Sensor; 2. Replay Old Data; 3. Override Feature Extractor; 4. Channel Attack; 5. Override Matcher; 6. Modify Database; 7. Channel Attack; 8. Intercept or Falsify Matching Information; 9. Override Decision.]

Brute Force. The objective of this attack is to exhaustively test all possible templates until the oracle $\mathrm{Match}_{x,\varepsilon}$ yields 1. In the worst case, we test every template, which results in the examination of $q^n$ vectors. To obtain this result, we ignore the ε acceptance threshold. On the other hand, if we consider that only n − ε exact coordinates are needed to be accepted by the system, complexity decreases to $q^{n-\varepsilon}$ tests. Indeed, the attacker arbitrarily fixes ε coordinates that never change and enumerates all values of the n − ε remaining coordinates; one of these candidates agrees with x on those n − ε coordinates and is therefore accepted, which yields the result.
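The brute-force strategy with ε frozen coordinates can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the frozen coordinates are arbitrarily taken to be the last ε ones and set to 0, and the oracle is simulated locally.

```python
from itertools import product
from typing import Callable, Optional, Sequence, Tuple


def make_match_oracle(x: Sequence[int], eps: int) -> Callable[[Sequence[int]], int]:
    """Simulated minimal-leakage oracle: 1 if d(x, y) <= eps, else 0."""
    stored = tuple(x)

    def match(y: Sequence[int]) -> int:
        d = sum(1 for xi, yi in zip(stored, y) if xi != yi)
        return 1 if d <= eps else 0

    return match


def brute_force(match: Callable[[Sequence[int]], int], n: int, q: int,
                eps: int) -> Tuple[Optional[Tuple[int, ...]], int]:
    """Enumerate the first n - eps coordinates; the last eps stay frozen at 0.

    Returns an accepted template (or None) and the number of oracle queries,
    which never exceeds q ** (n - eps).
    """
    queries = 0
    for prefix in product(range(q), repeat=n - eps):
        candidate = prefix + (0,) * eps
        queries += 1
        if match(candidate) == 1:
            return candidate, queries
    return None, queries  # not reached when the oracle is consistent
```

The search stops at the first accepted candidate, so the bound $q^{n-\varepsilon}$ is only attained in the worst case.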

Random Sampling. The attacker randomly chooses a template in $\mathbb{Z}_q^n$ and tests it by querying the oracle $\mathrm{Match}_{x,\varepsilon}$. The precise complexity of this strategy has not been assessed in the literature. The worst case for the attacker occurs when the templates are uniformly distributed in $\mathbb{Z}_q^n$. The probability that a template submitted to $\mathrm{Match}_{x,\varepsilon}$ yields 1 is $\rho = |B_{q,\varepsilon}(x)| / q^n$. According to this naive strategy, we can assume that the tests are independent and that each is modeled as a Bernoulli experiment with a success probability of ρ. The number of tries needed to obtain the first success follows a geometric distribution. Hence, the expected number of tries for an attacker to get accepted by the system is $\rho^{-1}$. First, recall that the cardinal of $B_{q,\varepsilon}(x)$ is

$$|B_{q,\varepsilon}(x)| = \sum_{i=0}^{\varepsilon} \binom{n}{i} (q-1)^i$$

and that the q-ary entropy is $h_q(x) = x \log_q(q-1) - x \log_q x - (1-x) \log_q(1-x)$. Then, using the Stirling approximation (see (Timothée and Ramanna, 2016; Thomas and Joy, 2006)), the expected number of tries for an attacker is

$$\rho^{-1} = \frac{q^n}{|B_{q,\varepsilon}(x)|} = \frac{q^n}{\sum_{i=0}^{\varepsilon} \binom{n}{i} (q-1)^i} \le \frac{q^n}{q^{\,n h_q(\varepsilon/n) + o(n)}} = q^{\,n(1 - h_q(\varepsilon/n)) + o(n)}$$

if $\varepsilon/n \le 1 - 1/q$ holds, and if n is large enough.
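The quantities in this paragraph can be evaluated exactly for concrete parameters. The sketch below is an illustration only; `ball_size`, `expected_tries` and `log2_expected_tries` are hypothetical helper names. It computes the ball cardinality from the sum above and the expectation $\rho^{-1} = q^n / |B_{q,\varepsilon}(x)|$ with exact integer arithmetic.

```python
from fractions import Fraction
from math import comb, log2


def ball_size(n: int, q: int, eps: int) -> int:
    """|B_{q,eps}(x)|: number of vectors at Hamming distance <= eps from x."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(eps + 1))


def expected_tries(n: int, q: int, eps: int) -> Fraction:
    """Exact expectation 1/rho of the geometric distribution."""
    return Fraction(q ** n, ball_size(n, q, eps))


def log2_expected_tries(n: int, q: int, eps: int) -> float:
    """log2(1/rho), computed on integers to avoid floating-point overflow."""
    return n * log2(q) - log2(ball_size(n, q, eps))
```

For instance, `log2_expected_tries(2048, 2, 738)` evaluates the IrisCode setting with n = 2,048 and ε = 738 on a base-2 logarithmic scale.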

Random Sampling Without Point Replacement. As in random sampling, the attacker randomly chooses a template in a set $S \subseteq \mathbb{Z}_q^n$. At each step, if $\mathrm{Match}_{x,\varepsilon}$ returns 0, the tested vector b is removed from the set S. The probability of success does not remain constant throughout the experiment, unlike in the previous case. Consequently, the experiment follows a hypergeometric distribution. This game is equivalent to having an urn with $q^n$ objects where $|B_{q,\varepsilon}(x)|$ are considered ‘good’. Then, according to Ahlgren (Ahlgren, 2014), the expected number of queries to $\mathrm{Match}_{x,\varepsilon}$ before success is given by

$$\frac{q^n + 1}{|B_{q,\varepsilon}(x)| + 1} \approx \rho^{-1}.$$

This attack performs slightly better than the previous one, but it comes with an exponential memory cost that reduces its efficiency, making this version less interesting than the previous one.

Remark 3.1.1. In the case of random sampling, if

the value of n is large, it is preferable to select a draw

with replacement to save memory while maintaining

a high degree of performance. Indeed, the probability

of drawing a vector that has already been selected is

relatively small if n is sufﬁciently large.

Tree Search. This algorithm was proposed by Pagnin et al. (Pagnin et al., 2014). The underlying idea is to construct a tree of depth n such that each point of the space is considered to be a leaf. The tree structure is utilized to establish relative relations among the points of $\mathbb{Z}_q^n$ and to guarantee that after each unsuccessful trial, non-overlapping portions of the space $\mathbb{Z}_q^n$ can be removed. Specifically, if a point $p \in \mathbb{Z}_q^n$ does not satisfy the authentication, the algorithm removes not only the tested point p from the set of potential centers but also its sibling relatives generated by the common ancestor at height ε (i.e., the subtree of height ε covering these siblings is removed). At each attempt, the attacker can remove approximately $q^{\varepsilon}$ templates from the search space (for more details, please refer to (Pagnin et al., 2014)). The running time of the attack is the cost of exploring a q-ary tree of order n − ε.

Remark 3.1.2. It should be noted that, as intended,

the cost of all the presented attacks is exponential.

Optimal Solution. The optimal solution is to solve

the set-covering problem (Korte et al., 2011) using

balls of radius ε. The main idea is to cover the

space with the smallest number of balls of radius ε

to partition the space. The objective is to remove

an entire ball of radius ε if the query fails. This is

an instance of the set covering problems. Pagnin et

al. (Pagnin et al., 2014) claimed that the number of

points that the adversary needs to query is only a

factor of O(εln(n + 1)) more than the optimal cover.

However, the result is imprecise, as detailed below in

this remark, mainly because the optimal cover is not

given.

Theorem 3.1. Given a threshold ε, a vector $x \in \mathbb{Z}_q^n$, and $\mathrm{Match}_{x,\varepsilon}$, an attacker with the optimal strategy can retrieve x in $q^{\,n(1 - h_q(\varepsilon/n)) + o(n)}$ queries to $\mathrm{Match}_{x,\varepsilon}$.

Proof. The strategy between a bounded and an unbounded adversary may differ as detailed in the following:

• Unbounded Adversary: The adversary solves the NP-hard set covering problem (Korte et al., 2011) to find the optimal covering of $\mathbb{Z}_q^n$ using balls of radius ε. The adversary exhaustively searches x using at most $q^{\,n(1 - h_q(\varepsilon/n)) + o(n)}$ queries to $\mathrm{Match}_{x,\varepsilon}$. The number of vectors involved in a given optimal cover is $q^n / |B_{q,\varepsilon}(x)|$, which can be asymptotically approximated as detailed in what follows. Then, using bounds on the binomial coefficient (see (Thomas and Joy, 2006; Timothée and Ramanna, 2016)), the result follows if $\varepsilon/n \le 1 - 1/q$ holds and if n is large enough.

• Bounded Adversary: The adversary may use a greedy algorithm to find a non-optimal covering containing $H(n)\, q^n / |B_{q,\varepsilon}|$ vectors (Chvatal, 1979) with $H(n) = \sum_{i=1}^{n} i^{-1}$ the n-th harmonic number. The adversary then finds a solution with an exhaustive search in at most $H(n)\, q^n / |B_{q,\varepsilon}|$ queries. To provide a more intuitive value, notice that $H(n)\, q^n / |B_{q,\varepsilon}|$ can be bounded from above by $(\ln(n) + 1)\, q^n / |B_{q,\varepsilon}|$. As in the unbounded case, using the q-ary entropy and Stirling’s approximation, this non-optimal covering leads the attacker to make at most $q^{\,n(1 - h_q(\varepsilon/n)) + o(n)}$ queries, as $\log_q(\ln(n) + 1) = o(n)$.

Then, in both cases, the number of queries is $q^{\,n(1 - h_q(\varepsilon/n)) + o(n)}$ and the result follows. ■

Table 2: Expected number of calls to oracle for the exhaustive search method Random Sampling with Replacement (RSR). Examples with real biometric systems with q = 2.

| System | n | ε | RSR ($\log_2$) |
|--------|---|---|----------------|
| IrisCode (Daugman, 2009) | 2,048 | 738 | 121.37 |
| IrisCode (Daugman, 2009) | 2,048 | 656 | 199.94 |
| IrisCode (Daugman, 2009) | 2,048 | 574 | 300.24 |
| FingerCode (Harikrishnan et al., 2024) | 80 | 30 | 5.92 |
| BioHashing (Belguechi et al., 2013) | 180 | 60 | 17.74 |
| BioEncoding (Ouda et al., 2010) | 350 | 87 | 70.62 |
| BioEncoding (Ouda et al., 2010) | 350 | 105 | 45.18 |

Remark 3.1.3. The time required to configure the greedy algorithm is exponential, rendering the aforementioned attack impractical. Moreover, even if an attacker computes the optimal covering, an exponential number of queries to $\mathrm{Match}_{x,\varepsilon}$ is still needed to find a point close to x.

It is also interesting to note that the expected time

for an attacker to be accepted by the system using

the random sampling with and without replacement

method is equivalent to the worst case using the opti-

mal method.

Example of Expectations for the Random Sam-

pling. To illustrate the inﬂuence of the threshold

and the choice of q on exhaustive search, we calcu-

late the precise expectation of the number of attempts

required for an attacker to successfully impersonate

the user in different settings using the random sam-

pling method. The results are presented in Table 2.

Experimental results show that to increase the secu-

rity against exhaustive search, it is more interesting to

increase q than to decrease ε.

3.1.2 Attack Complexities for Leakage Below the Threshold

Leakage below the threshold is considered in this section. Given the hidden target x, querying y such that d(x, y) ≤ ε to the oracle $\mathrm{Match}_{x,\varepsilon}$ provides information beyond the binary output.

Leakage of the Distance. The first case occurs when the distance is given to the attacker as extra information.

Theorem 3.2. Given a threshold ε, a vector $x \in \mathbb{Z}_q^n$, and an oracle $\mathrm{Match}_{x,\varepsilon}$ that leaks the distance below the threshold, an attacker can retrieve x in the worst case in $O(q^{n-\varepsilon} + q\varepsilon)$ queries to $\mathrm{Match}_{x,\varepsilon}$.

Proof. The system, using the Hamming distance, requires a minimum of n − ε accurate coordinates to output 1. Since the attacker specifically targets n − ε coordinates (the attacker arbitrarily chooses ε coordinates that do not change), an exhaustive search attack is performed in at most $q^{n-\varepsilon}$ steps to get accepted by the system. Then, a hill-climbing attack runs on the remaining ε coordinates to minimize the distance at each step. Coordinate by coordinate, the attacker obtains the right value if the distance decreases. Since there are q different values to test on ε coordinates, determining the correct ones requires a maximum of (q − 1)ε steps. Then, the overall complexity is $O(q^{n-\varepsilon} + q\varepsilon)$. ■

Leakage of the Positions. The positions of the errors are the extra information given to the attacker, while their values remain secret.

Theorem 3.3. Given a threshold ε, a vector $x \in \mathbb{Z}_q^n$, and an oracle $\mathrm{Match}_{x,\varepsilon}$ that leaks the positions of the errors below the threshold, an attacker can retrieve x in the worst case in $O(q^{n-\varepsilon} + q)$ queries to $\mathrm{Match}_{x,\varepsilon}$.

Proof. As the leakage occurs solely below the threshold, the first step is to find a vector $y \in \mathbb{Z}_q^n$ such that d(x, y) ≤ ε. To identify such a vector, the attacker performs an exhaustive search attack in $q^{n-\varepsilon}$ steps, as previously shown. Since ε coordinates remain unknown, and each coordinate ranges from 0 to q − 1, every possibility must be examined. By testing all possibilities simultaneously (for instance, testing all coordinates at 0, then all coordinates at 1, and so forth up to q − 2 while retaining the correct values), the original vector can be identified in no more than q − 1 queries (refer to the example illustrated in Figure 2). Therefore, the complexity of the attack for recovering x is $O(q^{n-\varepsilon} + q)$. ■

Figure 2 gives a representation of the attack described above in the case $\mathbb{Z}_4^5$ where the hidden vector (or the missing coordinates) is (0, 1, 3, 2, 2). Note that the actual complexity is q − 1 since the final exchange is unnecessary, as the coordinates at q − 1 become known after q − 1 queries by inference.

Leakage of the Positions and the Values. When a vector below the threshold is given to the oracle $\mathrm{Match}_{x,\varepsilon}$, the attacker gets information about both error positions and their values. This is similar to an error-correction mechanism designed to correct errors below a given threshold. Note that in the binary case, this scenario is the same as the previous one, hence the only considered case is q > 2.

Theorem 3.4. Given a threshold ε, a vector $x \in \mathbb{Z}_q^n$, and an oracle $\mathrm{Match}_{x,\varepsilon}$ that leaks the positions and the values of the errors below the threshold, an attacker can retrieve x in $O(q^{n-\varepsilon})$ queries to $\mathrm{Match}_{x,\varepsilon}$.

Proof. First, an exhaustive search is performed to find a vector y for which the distance is below the threshold, for a cost of $O(q^{n-\varepsilon})$. Then, given the error positions and the corresponding error values, y immediately yields the recovery of x. In the end, the complexity of the attack is $O(q^{n-\varepsilon})$. ■

3.1.3 Leakage Below and Above the Threshold

The second scenario is considered in this section, which involves a leakage independent of the threshold. In other words, when a hidden vector x is targeted, any vector y queried to the oracle $\mathrm{Match}_{x,\varepsilon}$ results in the leak of additional information.

Minimal Leakage (a Single Bit of Information Leakage). The basic usage of the system is characterized by the minimal leakage scenario, where the binary output itself is considered a necessary leakage. This minimal leakage is indispensable for the system to work and is consistent across these scenarios as the system always responds. Note that if the server does not answer above the threshold, the non-answer gives the attacker the wanted information.

Theorem 3.5. Given a threshold ε, a vector $x \in \mathbb{Z}_q^n$, and an oracle $\mathrm{Match}_{x,\varepsilon}$ that does not leak any extra information, an attacker can retrieve x in $O(q^{n-\varepsilon} + n(q-1) + 2\varepsilon)$ queries to $\mathrm{Match}_{x,\varepsilon}$.

Proof. As in the previous cases, the attacker seeks a vector y below the threshold. Such a vector is found by exhaustive search in $q^{n-\varepsilon}$ steps. Then, the attacker performs the center search attack (Pagnin et al., 2014) (generalized to $\mathbb{Z}_q^n$) to retrieve the original data in at most n(q − 1) + 2ε queries. Indeed, the generalization does not change the cost of the edge detection but changes the cost of the center search from n to n(q − 1). The complexity of the attack to find x is therefore $O(q^{n-\varepsilon} + n(q-1) + 2\varepsilon)$, which reads $O(2^{n-\varepsilon} + n + 2\varepsilon)$ in the binary case. ■

Leakage of the Distance. In this case, d(x, y), the distance between the fresh template $y \in \mathbb{Z}_q^n$ and the old template $x \in \mathbb{Z}_q^n$, is leaked to the attacker regardless of the threshold.

[Figure 2: Exploiting the error position leaked in the case $\mathbb{Z}_4^5$ where the hidden vector (or the missing coordinates) is (0, 1, 3, 2, 2). The figure lists the successive constant queries (all coordinates at 0, then at 1, at 2, and at 3) and the solution (0, 1, 3, 2, 2).]

Theorem 3.6. Given a threshold ε, a vector $x \in \mathbb{Z}_q^n$, and an oracle $\mathrm{Match}_{x,\varepsilon}$ that leaks the distance, an attacker can retrieve x in O(nq) queries to $\mathrm{Match}_{x,\varepsilon}$.

Proof. As the attacker has access to the distance, it is possible to perform a hill-climbing attack, trying to minimize the distance at each step. The strategy is to find the vector x, coordinate by coordinate. As each coordinate has q possible values and there are n coordinates, this is done in O(nq) steps. ■
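The hill-climbing argument of this proof can be sketched as follows. This is a minimal illustration under stated assumptions: the leaky oracle is simulated by the hypothetical helper `make_distance_oracle`, which returns d(x, y) for any query, and the attacker starts from the all-zero vector.

```python
from typing import Callable, List, Sequence, Tuple


def make_distance_oracle(x: Sequence[int]) -> Callable[[Sequence[int]], int]:
    """Hypothetical leaky matcher: returns d(x, y) for any query y."""
    stored = tuple(x)

    def distance(y: Sequence[int]) -> int:
        return sum(1 for xi, yi in zip(stored, y) if xi != yi)

    return distance


def hill_climb(distance: Callable[[Sequence[int]], int], n: int,
               q: int) -> Tuple[List[int], int]:
    """Recover x coordinate by coordinate; returns (x, number of queries)."""
    y = [0] * n
    best = distance(y)
    queries = 1
    for i in range(n):
        if best == 0:
            break  # every remaining coordinate is already correct
        for v in range(1, q):
            y[i] = v
            d = distance(y)
            queries += 1
            if d < best:  # coordinate i went from wrong to right
                best = d
                break
        else:
            y[i] = 0  # no value lowered the distance, so 0 was correct
    return y, queries
```

Each coordinate costs at most q − 1 queries, so the sketch uses at most 1 + n(q − 1) queries, in line with the O(nq) bound.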

Leakage of the Positions. The extra information given to the attacker is the positions of the errors.

Theorem 3.7. Given a threshold ε, a vector $x \in \mathbb{Z}_q^n$, and an oracle $\mathrm{Match}_{x,\varepsilon}$ that leaks the positions of the errors, an attacker can retrieve x in O(q) queries to $\mathrm{Match}_{x,\varepsilon}$.

Proof. The attacker tries the vectors (0, ..., 0), (1, ..., 1), up to (q − 1, ..., q − 1) and keeps, for each coordinate, the value that is not reported as an error (see Figure 2). Hence, the complexity of the attack to recover x is O(q). ■
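The constant-vector strategy can be sketched on the example of Figure 2. The code below is an illustration only: `make_position_oracle` is a hypothetical simulation of a matcher that leaks the set of erroneous positions for any query, and the last constant vector is skipped because its coordinates are known by inference.

```python
from typing import Callable, List, Optional, Sequence, Set, Tuple


def make_position_oracle(x: Sequence[int]) -> Callable[[Sequence[int]], Set[int]]:
    """Hypothetical leaky matcher: returns the indices i where x_i != y_i."""
    stored = tuple(x)

    def positions(y: Sequence[int]) -> Set[int]:
        return {i for i, (xi, yi) in enumerate(zip(stored, y)) if xi != yi}

    return positions


def recover_from_positions(positions: Callable[[Sequence[int]], Set[int]],
                           n: int, q: int) -> Tuple[List[int], int]:
    """Query (v, ..., v) for v = 0..q-2 and keep the error-free coordinates."""
    recovered: List[Optional[int]] = [None] * n
    queries = 0
    for v in range(q - 1):
        errors = positions([v] * n)
        queries += 1
        for i in range(n):
            if recovered[i] is None and i not in errors:
                recovered[i] = v
    # Coordinates never matched must hold the last value, q - 1.
    return [q - 1 if r is None else r for r in recovered], queries
```

With the hidden vector (0, 1, 3, 2, 2) and q = 4, the three queries at 0, 1 and 2 are enough, and the coordinate equal to 3 is obtained without a fourth query.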

Leakage of the Positions and the Values. In this last case, the positions of the errors and corresponding values are leaked. Unlike the scenario of leakage below the threshold, such a leak provides an error-correcting code mechanism that operates irrespective of any distance and threshold.

Theorem 3.8. Given a threshold ε, a vector $x \in \mathbb{Z}_q^n$, and an oracle $\mathrm{Match}_{x,\varepsilon}$ that leaks the positions of the errors and their values, an attacker can retrieve x in O(1) queries to $\mathrm{Match}_{x,\varepsilon}$.

Proof. The submission of any vector gives the position of each error, and how to correct them, yielding a complexity in O(1). ■

Example of the Worst Case for Active Attacks De-

pending on the Leakage. To illustrate the inﬂuence

of the leakage type on the attack complexity, we com-

pute the number of attempts (in the worst case) re-

quired for an attacker to successfully impersonate the

user in different settings. The results are presented in

Table 3 and Table 4.

3.2 Accumulation Attack: A Passive Attack

During the client authentications, the attacker passively gathers information by observing errors leaked by the server. More specifically, the server leaks a list of positions and errors computed over the integers (i.e., $x_i - y_i$) made by a genuine client during each authentication. Such information gathered during one successful authentication attempt is called an observation. The attacker aims to partially or fully reconstruct x by exploiting these observations.

In the binary case (i.e., q = 2), the errors precisely yield the bits. If $x_i - y_i = 1$ then $x_i = 1$, and if $x_i - y_i = -1$ then $x_i = 0$. This attack is related to the Coupon Collector’s problem (Ferrante and Saltalamacchia, 2014), which involves determining the expected number of rounds required to collect a complete set of distinct coupons, with one coupon obtained at each round, and each coupon acquired with equal probability.

Example 3.2.1. Suppose a setting with a metric space $\mathbb{Z}_2^7$ equipped with the Hamming distance. A client seeks to authenticate to an honest-but-curious server that uses a scheme leaking d(x, y) and the corresponding errors if d(x, y) ≤ ε. As the client is legitimate, i.e., d(x, y) ≤ ε with a high probability, the attacker recovers the values of at most ε erroneous bits per session. The attacker needs to collect all the bits of the client, turning this problem into a Coupon Collector problem. For example, let us assume x = (0, 0, 1, 1, 0, 1, 0) and ε = 3. The attacker sets z = (?, ?, ?, ?, ?, ?, ?).

Session 1: The client authenticates with y = (1, 1, 0, 1, 0, 1, 0). In this case, d(x, y) = 3 ≤ ε. The values of the erroneous bits of the client are obtained, yielding z = (0, 0, 1, ?, ?, ?, ?).

Session 2: The client authenticates with y = (0, 0, 0, 0, 1, 1, 0). In this case, d(x, y) = 3 ≤ ε, and the attacker obtains the value of the erroneous bits of the client and updates z = (0, 0, 1, 1, 0, ?, ?).

At this point, replacing the unknown values with random bits gives a vector that lies inside the acceptance ball as the number of unknown coordinates is smaller than the threshold ε.
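The two sessions of this example can be replayed with a short sketch. It is an illustration under stated assumptions: `observe` is a hypothetical model of the binary leak that returns the differences $x_i - y_i$ at the erroneous positions only when d(x, y) ≤ ε, and `accumulate` is the attacker's bookkeeping.

```python
from typing import Dict, Iterable, List, Optional, Sequence


def observe(x: Sequence[int], y: Sequence[int], eps: int) -> Dict[int, int]:
    """Leak of one session: {position: x_i - y_i} if d(x, y) <= eps, else {}."""
    errors = {i: xi - yi for i, (xi, yi) in enumerate(zip(x, y)) if xi != yi}
    return errors if len(errors) <= eps else {}


def accumulate(observations: Iterable[Dict[int, int]], n: int) -> List[Optional[int]]:
    """Merge observations; in the binary case an error of +1 means x_i = 1
    and an error of -1 means x_i = 0. Unknown bits stay None."""
    z: List[Optional[int]] = [None] * n
    for obs in observations:
        for i, error in obs.items():
            z[i] = 1 if error == 1 else 0
    return z


x = (0, 0, 1, 1, 0, 1, 0)
eps = 3
sessions = [(1, 1, 0, 1, 0, 1, 0), (0, 0, 0, 0, 1, 1, 0)]
z = accumulate((observe(x, y, eps) for y in sessions), len(x))
```

After both sessions, `z` holds the five recovered bits and two unknown entries, so fewer than ε coordinates remain to be guessed.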

In biometrics, some errors happen more fre-

quently than others. In this setup, the Weighted

Coupon Collector’s Problem must be considered.

Each coupon (i.e., each error) has a probability p

to occur. Suppose that p

≤ p

≤ ··· ≤ p

and

∑

i=1

≤ 1 then, according to Berenbrink and Sauer-

wald (Berenbrink and Sauerwald, 2009) (Lemma

3.2), the expected number of round E is such that:

≤ E ≤

H(n)

(1)

with H(n) the n-th harmonic number. Since H(n) is upper-bounded by 1 + ln(n), the expected number of rounds required to complete the collection satisfies:

$$\frac{1}{p_1} \le E \le \frac{\ln(n) + 1}{p_1}. \qquad (2)$$

Table 3: Number of calls to the oracle depending on the leakage type (worst-case analysis). Examples with real biometric systems (for the leakage below the threshold) with q = 2. Complexities are given in $\log_2$.

| System | n | ε | Distance | Position | Distance and Position |
|---|---|---|---|---|---|
| IrisCode (Daugman, 2009) | 2,048 | 738 | 1,310 | 1,310 | 1,310 |
| FingerCode (Harikrishnan et al., 2024) | 80 | 30 | 50 | 50 | 50 |
| BioHashing (Belguechi et al., 2013) | 180 | 60 | 120 | 120 | 120 |
| BioEncoding (Ouda et al., 2010) | 350 | 87 | 263 | 263 | 263 |

Table 4: Number of calls to the oracle depending on the leakage type (worst-case analysis). Examples with real biometric systems (for the leakage both above and below the threshold) with q = 2. Complexities are given in $\log_2$.

| System | n | ε | Distance | Position | Distance and Position |
|---|---|---|---|---|---|
| IrisCode (Daugman, 2009) | 2,048 | 738 | 12.00 | 1 | 0 |
| FingerCode (Harikrishnan et al., 2024) | 80 | 30 | 7.32 | 1 | 0 |
| BioHashing (Belguechi et al., 2013) | 180 | 60 | 8.49 | 1 | 0 |
| BioEncoding (Ouda et al., 2010) | 350 | 87 | 9.45 | 1 | 0 |
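The entries of Tables 3 and 4 can be cross-checked numerically. The closed forms used below are not taken from the text of this section: they are assumptions inferred from the tabulated numbers, namely n − ε for every column of Table 3 and log2(2n) for the Distance column of Table 4. The function names are hypothetical.

```python
import math

# (system, n, eps) as listed in Tables 3 and 4, with q = 2.
SYSTEMS = [
    ("IrisCode", 2048, 738),
    ("FingerCode", 80, 30),
    ("BioHashing", 180, 60),
    ("BioEncoding", 350, 87),
]


def below_threshold_log2(n: int, eps: int) -> int:
    # Assumed closed form that matches the entries of Table 3.
    return n - eps


def distance_both_sides_log2(n: int) -> float:
    # Assumed closed form that matches the Distance column of Table 4.
    return round(math.log2(2 * n), 2)


for name, n, eps in SYSTEMS:
    print(name, below_threshold_log2(n, eps), distance_both_sides_log2(n))
```

Under these assumed closed forms, the sketch reproduces the tabulated values, which illustrates the gap between the two settings: an exponent of n − ε when the leakage is restricted to below the threshold, against roughly 2n oracle calls when the distance also leaks above it.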

However, while in the original problem one coupon is obtained at each round, the number of errors made by a client during an authentication session is variable, i.e., between 1 and ε. In this case, the expected number of rounds required before all the errors have been observed is smaller than in the case where only one error occurs at each round. Consequently, the expected number of rounds required to collect all the errors is still in $O(\log n / p_1)$.
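To make bounds (1) and (2) concrete, the following sketch computes the exact expectation for a small weighted instance in the one-coupon-per-round setting, which is the case the bounds are stated for. It relies on the standard inclusion-exclusion identity for the expected maximum of the collection times. The probabilities are arbitrary illustrative values, not measured error rates, and the function names are hypothetical.

```python
from itertools import combinations
from typing import Sequence, Tuple


def harmonic(n: int) -> float:
    """n-th harmonic number H(n) = 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))


def expected_rounds(p: Sequence[float]) -> float:
    """Exact expected number of rounds with one coupon per round:
    E = sum over non-empty subsets S of (-1)^(|S|+1) / sum_{i in S} p_i.
    The cost is exponential in len(p), so this is for small n only."""
    total = 0.0
    for size in range(1, len(p) + 1):
        sign = 1.0 if size % 2 == 1 else -1.0
        for subset in combinations(p, size):
            total += sign / sum(subset)
    return total


def bounds(p: Sequence[float]) -> Tuple[float, float]:
    """Lower and upper bounds of Equation (1), with p_1 the rarest coupon."""
    p1 = min(p)
    return 1.0 / p1, harmonic(len(p)) / p1


# Illustrative (assumed) error probabilities, sorted as p_1 <= ... <= p_n.
p = [0.1, 0.2, 0.3, 0.4]
lower, upper = bounds(p)
exact = expected_rounds(p)
print(f"{lower:.2f} <= {exact:.2f} <= {upper:.2f}")
```

For this instance the exact expectation is about 12.36 rounds, between the bounds 10 and about 20.83. When several errors are revealed per session, as discussed above, the number of sessions can only decrease, so the upper bound remains valid.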

Theorem 3.9. Let ε be a threshold and $x \in \mathbb{Z}_q^n$ a vector. If $\mathrm{Match}_{x,\varepsilon}$ leaks the positions of the errors and their values below the threshold, and assuming that the rarest coupon is obtained with probability $p_1 = n^{-\alpha}$ with $\alpha \in \mathbb{R}_{\ge 1}$, an attacker can retrieve x in $O(n^{\alpha} \log n)$ observations.

Proof. According to the Weighted Coupon Collector’s problem, and assuming that the rarest coupon is obtained with probability $p_1 = n^{-\alpha}$ with $\alpha \in \mathbb{R}_{\ge 1}$, the vector x is recovered in $O(n^{\alpha} \log n)$ observations. ■
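The step from bound (2) to the complexity stated in Theorem 3.9 can be written out explicitly. The only ingredient is the substitution $p_1 = n^{-\alpha}$. The lower bound is included for completeness and is our reading of (1), not a statement made in the proof:

```latex
\[
  E \;\le\; \frac{\ln(n) + 1}{p_1}
    \;=\; n^{\alpha}\,\bigl(\ln(n) + 1\bigr)
    \;=\; O\!\left(n^{\alpha} \log n\right),
  \qquad
  E \;\ge\; \frac{1}{p_1} \;=\; n^{\alpha}.
\]
```

For α = 1 this gives O(n log n), the order of the classical Coupon Collector bound with equally likely coupons.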

It is worth noting that in this scenario, the attacker does not control the error. If the attacker controls the error locations, then it is possible to obtain x in ⌈n/ε⌉ queries. This can happen during a fault attack, akin to side-channel attacks. It should also be noted that some coordinates of biometric data may be non-variable and, as a consequence, an attacker cannot recover them. This partial recovery attack is, therefore, a privacy attack, and leads to an authentication attack if the number of variable coordinates is sufficiently large (at least n − ε in the binary case).
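The ⌈n/ε⌉ figure for an attacker who controls the error locations can be illustrated with a small simulation. The fault model below is an assumption made for the sake of the sketch: in each session the attacker forces errors on a chosen block of at most ε coordinates, and the matcher leaks the positions and template values of exactly those errors. The names `faulty_session_leak` and `recover_with_chosen_errors` are hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def faulty_session_leak(
    x: Sequence[int], fault_positions: Sequence[int], eps: int
) -> Dict[int, int]:
    """Hypothetical faulted session: the fresh template differs from x exactly
    at the attacker-chosen positions, so the matcher leaks those positions and
    the template values there, provided at most eps errors occur."""
    if len(fault_positions) > eps:
        return {}  # above the threshold, nothing leaks in this scenario
    return {i: x[i] for i in fault_positions}


def recover_with_chosen_errors(
    x: Sequence[int], eps: int
) -> Tuple[List[Optional[int]], int]:
    """Sweep the coordinates in consecutive blocks of size eps."""
    n = len(x)
    recovered: List[Optional[int]] = [None] * n
    queries = 0
    for start in range(0, n, eps):
        block = list(range(start, min(start + eps, n)))
        for position, value in faulty_session_leak(x, block, eps).items():
            recovered[position] = value
        queries += 1
    return recovered, queries


x = [0, 0, 1, 1, 0, 1, 0]  # template of Example 3.2.1
eps = 3
recovered, queries = recover_with_chosen_errors(x, eps)
print(recovered, queries, math.ceil(len(x) / eps))
```

With n = 7 and ε = 3, three faulted sessions suffice, in line with ⌈n/ε⌉ = 3, whereas the passive attacker of Theorem 3.9 has to wait for the rarest error to occur.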

Remark 3.2.1. In the non-binary case, the value $x_i - y_i$ does not provide enough information. The exact value of $x_i$ can be determined in two cases. First, if $x_i - y_i = -q + 1$, then $x_i = 0$. Second, if $x_i - y_i = 2(q - 1)$, then $x_i = q - 1$. For all other cases, there is an ambiguity regarding the value of $x_i$ as $y_i$ is unknown. However, by knowing the distribution of $x_i$ and $y_i$, repeating observations yields a statistical attack.

Attacks for each type of leakage along with their

complexities are summarized in Figure 1.

4 CONCLUDING REMARKS

Our investigation into the information leakage of a

biometric system using privacy-preserving distance

has uncovered critical security vulnerabilities that

arise under various scenarios. By evaluating the im-

pact of different types of leakage, including distance,

error position, and error value, we have highlighted

the potential risks posed to data privacy and security.

Our analysis has encompassed ‘below the threshold’ and ‘below and above the threshold’ setups, allowing us to identify specific conditions under which information leakage can significantly affect the overall security of the system.

It is important to highlight that the leakage ‘be-

low the threshold’ does not notably harm the security

of the system, while the leakage of ‘both below and

above the threshold’ markedly decreases the security.

Indeed, the attacks exploiting the leakage ‘below the

threshold’ are primarily exponential, while those ex-

ploiting information ‘below and above the threshold’

are mainly constant.

The accumulation attack we investigated assumes errors uniformly distributed throughout each authentication session. The result of the accumulation attack could be further refined by considering a variable number of coupons, randomly drawn between 0 and ε in each round, while acknowledging the actual distribution of the errors. To the best of our knowledge, no previous study provides an analysis of the distribution of the errors for any system.

In practical scenarios, certain errors may occur more frequently than others, while some may never occur. A skewed distribution of errors will substantially increase the expected number of authentications required from the legitimate user for the server to recover the hidden template in its entirety. Future research should involve refining the accumulation attack as suggested above and exploring other distance metrics, such as $L_1$ (i.e., Manhattan distance) and L

ACKNOWLEDGEMENTS

The authors acknowledge the support of the French

Agence Nationale de la Recherche (ANR), under

grant ANR-20-CE39-0005 (project PRIVABIO).

REFERENCES

Ahlgren, J. (2014). The probability distribution for draws until first success without replacement.

Aydin, F. and Aysu, A. (2024). Leaking secrets in homo-

morphic encryption with side-channel attacks. Jour-

nal of Cryptographic Engineering, pages 1–11.

Belguechi, R., Cherrier, E., Rosenberger, C., and Ait-Aoudia, S. (2013). Operational bio-hash to preserve privacy of fingerprint minutiae templates. IET Biometrics, 2(2):76–84.

Berenbrink, P. and Sauerwald, T. (2009). The weighted

coupon collector’s problem and applications. In Ngo,

H. Q., editor, Computing and Combinatorics, pages

449–458, Berlin, Heidelberg. Springer Berlin Heidel-

berg.

Bernal-Romero, J. C., Ramirez-Cortes, J. M., Rangel-

Magdaleno, J. D. J., Gomez-Gil, P., Peregrina-

Barreto, H., and Cruz-Vega, I. (2023). A review on

protection and cancelable techniques in biometric sys-

tems. IEEE Access, 11:8531–8568.

Cho, S., Oh, B.-S., Kim, D., and Toh, K.-A. (2021). Palm-vein verification using images from the visible spectrum. IEEE Access, 9:86914–86927.

Chvatal, V. (1979). A greedy heuristic for the set-

covering problem. Mathematics of operations re-

search, 4(3):233–235.

Daugman, J. (2009). How iris recognition works. In The

essential guide to image processing, pages 715–739.

Elsevier.

Daugman, J. (2015). Information theory and the iriscode.

IEEE transactions on information forensics and secu-

rity, 11(2):400–409.

Dehkordi, A. B. and Abu-Bakar, S. A. (2015). Iris code

matching using adaptive hamming distance. In 2015

IEEE International Conference on Signal and Im-

age Processing Applications (ICSIPA), pages 404–

408. IEEE.

Ferrante, M. and Saltalamacchia, M. (2014). The coupon collector’s problem. MATerials MATemàtics, 2014:35.

Harikrishnan, D., Sunil Kumar, N., Joseph, S., and Nair, K. K. (2024). Towards a fast and secure fingerprint authentication system based on a novel encoding scheme. International Journal of Electrical Engineering & Education, 61(1):100–112.

Hashemi, M., Forte, D., and Ganji, F. (2024). Time is

money, friend! timing side-channel attack against gar-

bled circuit constructions. In International Confer-

ence on Applied Cryptography and Network Security,

pages 325–354. Springer.

He, R., Cai, Y., Tan, T., and Davis, L. (2015). Learning

predictable binary codes for face indexing. Pattern

recognition, 48(10):3160–3168.

Korte, B. H. and Vygen, J. (2011). Combinatorial optimization, volume 1. Springer.

Ouda, O., Tsumura, N., and Nakaguchi, T. (2010). Bioen-

coding: A reliable tokenless cancelable biometrics

scheme for protecting iriscodes. IEICE TRANS-

ACTIONS on Information and Systems, 93(7):1878–

1888.

Pagnin, E., Dimitrakakis, C., Abidin, A., and Mitrokotsa,

A. (2014). On the leakage of information in biometric

authentication. In International Conference on Cryp-

tology in India, pages 265–280. Springer.

Patel, V. M., Ratha, N. K., and Chellappa, R. (2015). Can-

celable biometrics: A review. IEEE signal processing

magazine, 32(5):54–65.

Rahman, A., Chowdhury, M. E., Khandakar, A., Kiranyaz, S., Zaman, K. S., Reaz, M. B. I., Islam, M. T., Ezeddin, M., and Kadir, M. A. (2021). Multimodal EEG and keystroke dynamics based biometric system using machine learning algorithms. IEEE Access, 9:94625–94643.

Ratha, N. K., Connell, J. H., and Bolle, R. M. (2001).

An analysis of minutiae matching strength. In Bi-

gun, J. and Smeraldi, F., editors, Audio- and Video-

Based Biometric Person Authentication, pages 223–

228, Berlin, Heidelberg. Springer Berlin Heidelberg.

Sharma, S., Saini, A., and Chaudhury, S. (2023). A sur-

vey on biometric cryptosystems and their applications.

Computers & Security, page 103458.

Simoens, K., Bringer, J., Chabanne, H., and Seys, S. (2012).

A framework for analyzing template security and pri-

vacy in biometric authentication systems. IEEE Trans-

actions on Information Forensics and Security, 7:833–

841.

Thomas, M. and Joy, A. T. (2006). Elements of information

theory. Wiley-Interscience.

Timothée, P. and Ramanna, S. C. (2016). Tutorial 10 for Information Theory.

Tran, L., Hoang, T., Nguyen, T., and Choi, D. (2017). Im-

proving gait cryptosystem security using gray code

quantization and linear discriminant analysis. In Inter-

national Conference on Information Security, pages

214–229. Springer.

Wang, Z., Yang, J., and Zhu, Y. (2021). Review of ear bio-

metrics. Archives of Computational Methods in Engi-

neering, 28(1):149–180.

Yang, H. and Wang, Y. (2007). A lbp-based face recog-

nition method with hamming distance constraint. In

Fourth international conference on image and graph-

ics (ICIG 2007), pages 645–649. IEEE.

ICISSP 2025 - 11th International Conference on Information Systems Security and Privacy
