PIUDI: Private Information Update for Distributed Infrastructure

Shubham Raj, Snehil Joshi and Kannan Srinathan

Centre for Security, Theory and Algorithmic Research, International Institute of Information Technology, Hyderabad, India

Keywords: Privacy, Blockchain, Private Information Retrieval, Private Information Retrieval-Writing, Distributed Database, Packed Secret Sharing, Privacy Enhancing Technology.

Abstract:

Encrypted data is susceptible to side-channel attacks such as usage and access analysis. Techniques like Oblivious RAM (ORAM) and private information retrieval and writing aim to hide a client's access pattern while it accesses encrypted data on an untrusted server. However, current techniques are constructed for a single-server model, making them unsuitable and inefficient for contemporary distributed architectures. In our work, we address this problem and provide a solution to private information update using packed secret sharing. Our protocol, named "Private Information Update for Distributed Infrastructure" (PIUDI), aims to mitigate the attacks to which PIR-Writing protocols are more susceptible in a distributed environment. Our scheme is secure in the presence of up to t + k − 1 compromised parties, where k is the size of the data set. We also provide an analysis of our protocol for computational efficiency and gas cost in blockchains.

1 INTRODUCTION

Encryption is one of the primary measures used to safeguard sensitive data stored in databases (Popa et al., 2011; Popa et al., 2014; Papadimitriou et al., 2016). However, inference and log analysis attacks pose significant threats to the privacy and security of encrypted data (Grubbs et al., 2017; Lacharité et al., 2018). For example, an attacker could use traffic analysis to infer when and how often two parties communicate, and also to distinguish between different types of encrypted data, such as emails and media files. Additionally, service providers can analyze the access patterns over a client's encrypted data to extract vital information.

Inference-based methods attempt to extract sensitive information by observing patterns in the access and operations over encrypted data. These kinds of attacks are highly dangerous and can often compromise the privacy of individuals and organizations. Due to efficiency concerns, a majority of current protocols unintentionally expose data access patterns (Papadimitriou et al., 2016). Log analysis, meanwhile, exploits database logs to gain unauthorized access to sensitive information and find correlations between encrypted columns via frequency attacks (Zolotukhin et al., 2014). The attackers use these correlations to gain insights about sensitive data, based on the type of data stored in the database or by mapping the data to publicly available data sets (Dwork et al., 2017). Thus attackers can often gain access to valuable information even without breaking the encryption.

In this paper, we address the challenges posed by these attacks. Our focus is primarily on mitigating correlation and frequency attacks. Correlation attacks involve finding correlations between different columns in the database. By analyzing these correlations, attackers can often infer sensitive information that they would not otherwise have access to. Frequency attacks, on the other hand, involve analyzing the frequency of particular data values. Attackers can often infer sensitive information by analyzing the frequency of different operations as well as the frequency of access queries of any type on data values, even if the data itself is encrypted.

In the current cryptography literature, there are mainly two methods for concealing a client's access patterns: Oblivious RAM (ORAM) and Private Information Retrieval (PIR). ORAM's traditional approach involves organizing data so that the client never accesses the same part twice without an intermediary process that removes the correlation between block locations (Goldreich and Ostrovsky, 1996; Stefanov et al., 2018). While ORAMs have low communication complexity and do not require any computation on the server, sometimes the client may have to download and reorganize the entire

Raj, S., Joshi, S. and Srinathan, K.

PIUDI: Private Information Update for Distributed Infrastructure.

DOI: 10.5220/0012087900003555

In Proceedings of the 20th International Conference on Security and Cryptography (SECRYPT 2023), pages 425-432

ISBN: 978-989-758-666-8; ISSN: 2184-7711

© 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)

database, which is not practical (Islam et al., 2012). In contrast to ORAM, Private Information Retrieval (PIR) hides the specific query being made, regardless of any previous queries. PIR uses homomorphic encryption and does not hide a sequence of accesses as a whole, but instead hides each access individually. The downside is that the server needs to compute over the entire database for each query, which can be impractical for large databases.

In addition to the above issues, current techniques have only been proposed for single-server settings, and distributed systems have generally been overlooked. As such, using them for a distributed database requires replicating the technique at each instance of the database. This makes these techniques inefficient and difficult to scale, and therefore impractical, in the presence of a large number of instances. They also do not take data sharding into account, even though sharding is commonly used in large-scale applications that store and process vast amounts of data in a distributed environment. Sharding is a technique that involves partitioning a large database into smaller, more manageable subsets called shards. Each shard contains a subset of the data and can be stored on a separate server. This technique can improve the scalability and performance of the database by allowing multiple servers to process data simultaneously. The current PIR-Writing techniques do not provide an efficient mechanism to handle sharded data. Our work aims to solve these problems.

1.1 Relevant Work and Motivation

The problem of Private Information Retrieval (PIR)

has been studied extensively in the ﬁeld of cryptog-

raphy and computer science. PIR protocols allow a

client to retrieve data from a database without reveal-

ing which item was accessed. This is particularly use-

ful when dealing with sensitive information that needs

to be kept private against access pattern analysis. The

first PIR protocol was proposed in the seminal work of Chor et al. (1998), and since then, most of the research in this area has aimed at developing more efficient and secure protocols (Gasarch, 2004).

PIR solves the problem of obliviously reading data, as it ensures that the client's privacy is protected during retrieval. However, the problem of PIR-Writing remains relevant due to statistical and inference-based attacks on access patterns and on database and system logs. Such attacks reveal information about the client's queries, even if the specific data accessed remains private. This poses a significant challenge in the design of PIR systems, as ensuring both data privacy and query privacy is crucial to protect clients' sensitive information.

Boneh et al. proposed the first PIR-Writing protocol with sublinear communication complexity, which uses a bilinear-pairing based cryptosystem (Boneh et al., 2007). Lipmaa and Zhang (2010) then proposed two new PIR-Writing protocols. The first is based on the cryptocomputing protocol PrivateBDD of Ishai and Paskin (2007). The second is based on a fully homomorphic cryptosystem. Both of these approaches rely on computational assumptions, namely the hardness of the problem underlying the respective cryptosystem. Our protocol, in contrast, relies on perfectly secure schemes. Even though our PIR-Writing scheme is homomorphic, it does not use homomorphic encryption for the private computational operations, which makes it beneficial when computation happens in expensive environments like public and permissionless blockchains.

1.2 Contributions

In our paper, we propose a novel protocol, "Private Information Update for Distributed Infrastructure" (PIUDI), that uses a combination of data sharding and secret sharing to mitigate the aforementioned attacks as well as improve scalability for practical applications.

• Our protocol is communication efficient, utilising packed secret sharing to encode multiple data elements into a single polynomial.

• It is also computation efficient, which in the context of blockchains saves gas costs.

• It is also more scalable in a distributed setting: as the number of database instances increases, our protocol ensures that the encoded data set grows at a constant rate.

• Our protocol also supports batch updates, which greatly simplifies the process of updating data in both blockchain networks and distributed databases and can lead to significant improvements in efficiency.

We also provide a detailed analysis of our pro-

posed solution, including a formal security proof

and a comparative analysis with existing protocols,

thereby demonstrating the effectiveness of our proto-

col.

1.3 Organization of the Paper

The first section introduces and motivates our work, surveys the literature, and outlines our contributions. The second section defines the communication and adversarial models, cryptographic assumptions, and underlying schemes we have utilized in our work. This provides the reader with the necessary background knowledge and technical details for our protocol. The third section describes our proposed protocol and its variations in detail. In the fourth section, we provide formal definitions of security and present a proof of our protocol's security guarantees. The fifth section presents a performance evaluation of our protocol and a comparative analysis against existing protocols. The sixth section describes potential use cases of our protocol in practical scenarios. Finally, the conclusion summarizes the contributions of our research, discusses the limitations of our study, and suggests directions for future research.

2 PRELIMINARIES

2.1 Communication and Adversary Model

In this paper, we will be examining the stand-alone

setting, which is characterized by a synchronous net-

work and perfectly private channels between all par-

ties involved in the protocol. The stand-alone setting

restricts our analysis to only a single protocol execu-

tion, as opposed to a repeated execution of the proto-

col with changing participants.

Furthermore, we assume static corruptions, in which the set of corrupted parties is fixed ahead of time and remains constant throughout the execution of the protocol. We also assume semi-honest adversaries, i.e., those who follow the protocol correctly but may attempt to learn information outside their purview without deviating from the protocol in any way. All parties are bounded to probabilistic polynomial-time computation.

2.2 Shamir Secret Sharing

Shamir Secret Sharing (SSS) (Shamir, 1979) is a cryptographic technique that allows a secret to be split into multiple shares and distributed among a group of participants in such a way that a predetermined number of shares is required to reconstruct the original secret, while fewer shares reveal nothing about it. This technique has found widespread use in a variety of applications, including secure communication, key management, and data storage.

2.3 Packed Secret Sharing

The packed Shamir secret sharing scheme (Franklin and Yung, 1992) is an extension of the original secret sharing scheme introduced by Shamir in 1979. This variant enables the sharing of a group of secrets using a single Shamir sharing, which is a more efficient and convenient approach. Specifically, if we have a vector $x \in F^k$ over a finite field $F$, then we can create a degree-$d$ packed Shamir sharing, denoted $[x]_d$, where $d$ is a value between $k - 1$ and $n - 1$. To reconstruct the secrets, at least $d + 1$ shares are required, and any $d - k + 1$ shares are independent of the underlying secrets. Packed secret sharing is linearly homomorphic and also has multiplicative properties.

Consider a field $F$ and let $\alpha_1, \ldots, \alpha_n$ be $n$ distinct elements in $F$. Let $pos = (p_1, p_2, \ldots, p_k)$ be another $k$ distinct elements in $F$ (also distinct from the $\alpha_i$). Suppose we wish to share a vector $x = (x_1, \ldots, x_k) \in F^k$ among $n$ parties such that each party receives a share of the vector. We can achieve this by constructing a degree-$d$ ($d \geq k - 1$) packed Shamir sharing of $x$, which is a vector $(w_1, \ldots, w_n)$ satisfying the following conditions:

There exists a polynomial $f(\cdot) \in F[X]$ of degree at most $d$ such that $f(p_i) = x_i$ for all $i \in \{1, 2, \ldots, k\}$. For all $i \in \{1, 2, \ldots, n\}$, $f(\alpha_i) = w_i$, where the $i$-th share $w_i$ is held by party $P_i$. In other words, the polynomial $f(\cdot)$ is used to encode the vector $x$, and the shares $w_1, \ldots, w_n$ are used to distribute the encoded vector among the parties. Any $d + 1$ parties can then pool their shares to reconstruct the vector $x$.
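The definition above can be made concrete with a minimal Python sketch. It is illustrative only and rests on assumptions that the text does not fix: the field is the prime field modulo 2^61 − 1, the share points are taken as α_i = i, the secret positions as p_j = −j, and the helper names (`interpolate_at`, `pss_share`, `pss_reconstruct`) are hypothetical.

```python
import secrets

P = 2**61 - 1  # assumed prime modulus; any prime larger than n + k would do


def interpolate_at(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    passing through the given (xi, yi) pairs, all arithmetic mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def pss_share(vec, n, d):
    """Return shares w_1..w_n of vec on a random polynomial of degree <= d."""
    k = len(vec)
    assert k - 1 <= d <= n - 1
    # Fix f at the k secret positions p_j = -j and at d - k + 1 random points,
    # which determines a polynomial of degree at most d.
    points = [(P - j, v % P) for j, v in enumerate(vec, start=1)]
    points += [(a, secrets.randbelow(P)) for a in range(1, d - k + 2)]
    return [interpolate_at(points, a) for a in range(1, n + 1)]


def pss_reconstruct(indexed_shares, k):
    """Recover the k secrets from at least d + 1 pairs (i, w_i)."""
    return [interpolate_at(indexed_shares, P - j) for j in range(1, k + 1)]


# Usage: k = 3 secrets, n = 7 parties, degree d = 4, so any 5 shares suffice.
shares = pss_share([5, 7, 11], n=7, d=4)
subset = [(i, shares[i - 1]) for i in (2, 3, 5, 6, 7)]
recovered = pss_reconstruct(subset, k=3)
```

Sampling the polynomial by fixing its value at d + 1 points is one convenient way to draw it uniformly among all polynomials that encode the secrets; other samplings are possible.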

2.3.1 Packed Secret Sharing Protocol (PSS)

Let $x = (x_1, \ldots, x_k) \in F^k$ be the vector that we want to share such that the protocol can tolerate up to $t$ adversaries. Let $pos = (i_1, \ldots, i_k) \in F^k$ be other field elements that serve as the indices at which the secret vector $x$ is encoded.

1. The dealer selects a random polynomial $f_x(\cdot)$ of degree at most $d = k - 1 + t$.

2. Encode each $x_j \in x$ in the polynomial $f_x(\cdot)$ as $f_x(i_j) = x_j \;\; \forall j = 1, \ldots, k$.

3. Distribute the shares $w_i = f_x(\alpha_i)$ to the $n$ parties such that $w_i \in F \;\; \forall i = 1, \ldots, n$.

Lemma 2.1. Suppose we have a secret vector $x$ and a random polynomial $f_x$, as defined previously. If we select a subset of no more than $t$ shares, then the distribution of those shares is independent of $x$. On the other hand, if we gather at least $k + t$ shares, we can use them to recover $x$.

Proof. To prove the first statement, consider the shares $f_x(\alpha_1)$ through $f_x(\alpha_t)$, without loss of generality. Using Lagrange interpolation, we can create a polynomial $h$ of degree at most $d = k - 1 + t$ such that $h(\alpha_1)$ through $h(\alpha_t)$ are all zero and $h(i_j) = -x_j$ for $j = 1, \ldots, k$. This means that for each polynomial $f_x(\cdot)$ that shares the secret vector $x$, there is exactly one polynomial $f_x(X) + h(X)$ that shares the all-zero vector and generates the same first $t$ shares. Since the choice of polynomials is random and uniform, we can conclude that the distribution of the $t$ shares is the same for all secret vectors, namely the distribution resulting from sharing the all-zero vector.

The second statement, on reconstruction, is straightforward and follows from Lagrange interpolation: $k + t = d + 1$ points determine a polynomial of degree at most $d$ uniquely.
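The masking polynomial h used in the proof can be checked numerically. The sketch below is only an illustration under assumed parameters (prime modulus 2^61 − 1, share points α_i = i, secret positions i_j = −j, and k = 3, t = 2); it builds f_x and h by interpolation and confirms that f_x + h encodes the all-zero vector while leaving the first t shares untouched.

```python
import secrets

P = 2**61 - 1  # assumed prime modulus


def interpolate_at(points, x):
    """Evaluate at x the polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


k, t = 3, 2
x = [5, 7, 11]
secret_pos = [P - j for j in range(1, k + 1)]  # assumed positions i_j = -j
alphas = list(range(1, t + 1))                 # assumed share points alpha_i = i

# f_x is pinned down by the k secrets plus t uniformly random share values.
f_pts = list(zip(secret_pos, x)) + [(a, secrets.randbelow(P)) for a in alphas]
# h vanishes on the first t share points and equals -x_j at the secret positions.
h_pts = [(p, -v % P) for p, v in zip(secret_pos, x)] + [(a, 0) for a in alphas]


def f_plus_h(z):
    return (interpolate_at(f_pts, z) + interpolate_at(h_pts, z)) % P


zero_secrets = [f_plus_h(p) for p in secret_pos]
same_shares = all(f_plus_h(a) == interpolate_at(f_pts, a) for a in alphas)
```

Both polynomials are defined by k + t points, so each has degree at most d = k − 1 + t, matching the bound in the proof.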

2.3.2 PSS Notations

Let $x$ be the vector of secrets we want to share using polynomial $f_x$. Let $y$ be the vector we want to securely add to vector $x$, and we share $y$ using polynomial $f_y$.

• $\mathrm{ENCODE}(x, f_x)$ : $[x, f_x] = (f_x(\alpha_1), \ldots, f_x(\alpha_n))$

• $\mathrm{ENCODE}(y, f_y)$ : $[y, f_y] = (f_y(\alpha_1), \ldots, f_y(\alpha_n))$

• $\mathrm{ADD}(x, y)$ : $[x, f_x] + [y, f_y] = [x + y, f_x + f_y]$

We can define the multiplicative properties of the shares in the same way:

• $\mathrm{MUL}(x, y)$ : $[x, f_x] * [y, f_y] = [x * y, f_x * f_y]$

We will focus only on the additively homomorphic property of the packed secret sharing scheme to keep the details of our protocol simple for analysis.
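The additive property can be verified in three lines. The derivation below is a sketch using the symbols of Section 2.3.1, where $i_j$ are the secret positions and $\alpha_i$ the share points; it assumes that both vectors are encoded at the same positions with polynomials of degree at most $d$.

```latex
\begin{align*}
(f_x + f_y)(i_j) &= f_x(i_j) + f_y(i_j) = x_j + y_j, & j &= 1, \ldots, k,\\
\deg(f_x + f_y) &\le \max(\deg f_x, \deg f_y) \le d, \\
(f_x + f_y)(\alpha_i) &= f_x(\alpha_i) + f_y(\alpha_i), & i &= 1, \ldots, n.
\end{align*}
```

The last line says that each party obtains its share of $x + y$ by adding its two local shares, with no interaction.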

2.4 Sharding

Data sharding is a technique used to partition a large database into smaller, more manageable chunks. Each such chunk of data is called a shard. Sharding can help distribute the load of database queries across multiple servers, allowing for faster and more efficient retrieval of data. This is particularly important for large-scale databases that require high performance and low latency (Bagui and Nguyen, 2015; Luu et al., 2016).

Sharding can be implemented in various ways, in-

cluding range-based sharding, hash-based sharding,

and directory-based sharding. In range-based shard-

ing, data is partitioned based on a speciﬁc range of

keys, such as timestamps or alphabetical characters.

In hash-based sharding, data is partitioned based on a

hashing function, which distributes data evenly across

shards. Directory-based sharding involves using a

central directory to map data to speciﬁc shards.
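The three sharding strategies can be sketched in a few lines of Python. The function names, the SHA-256 based hash, and the boundary convention are illustrative assumptions, not part of any particular database system.

```python
import hashlib
from bisect import bisect_right


def hash_shard(key: str, num_shards: int) -> int:
    """Hash-based: a stable hash of the key picks the shard."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def range_shard(key: str, boundaries: list[str]) -> int:
    """Range-based: sorted boundaries split the key space.
    Shard 0 holds keys below boundaries[0]; the last shard holds the rest."""
    return bisect_right(boundaries, key)


def directory_shard(key: str, directory: dict[str, int]) -> int:
    """Directory-based: a central lookup table maps each key to its shard."""
    return directory[key]


# Usage with two boundaries, i.e. three range shards.
print(range_shard("alice", ["h", "p"]))    # keys before "h"
print(range_shard("mallory", ["h", "p"]))  # keys from "h" up to "p"
print(hash_shard("record-42", 4))
```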

In this paper, we will shard a set of database ﬁelds

into smaller subsets such that every subset contains at

least one ﬁeld which has a higher access rate.

3 PROTOCOL

3.1 Overview

We propose three variations of our protocol to cover

the different kinds of use-cases as the efﬁciency will

differ widely depending on their applications in dif-

ferent cases. While the basic structure of the proto-

col will remain largely similar, there will be modiﬁca-

tions to it to make it more efﬁcient for each scenario.

Accordingly, our security deﬁnition will also vary for

each case.

All our cases assume a client who wants to maintain a database DB, with N records of C columns each, that is protected by a PIR-Writing protocol and replicated across m servers in some capacity. It is assumed that up to t servers can collude with a semi-honest adversary to learn more information about the records that have been accessed by the client in DB. We consider three different scenarios:

1. Column Hiding: The client wants to hide the column that was updated in a record. The adversary can learn which record was updated but cannot tell the specific column in that record that was changed. An example would be the engagement metrics for a YouTube channel, where the adversary will know that some values were updated for the channel, but not the exact fields.

2. Row Hiding: The client wants to hide the record that was updated. The adversary can learn which column was updated but cannot tell the specific record that was changed. A useful scenario would be updating an employee's salary: the adversary will know that someone's salary changed but will not know for how many employees or for whom.

3. Database Hiding: The client wants to hide any kind of update information. The adversary should learn neither the record nor the column that was updated. This can be the case for extremely sensitive data like healthcare data, which can be used to draw inferences about either an individual or a wider population.

3.2 Base Protocol

We ﬁrst present the base version of our protocol. The

variations are all derived from it and maintain the

same level of security.

Consider D to be a database that follows a tabular structure and $x = (x_1, \ldots, x_k)$ to be a set of values that is part of one of the columns of this database. The individual elements of $x$ are owned by different clients. The objective of this protocol is to ensure that when any element of $x$ is updated, no entity can obtain information about which particular element has been updated. This ensures that the privacy of the individual elements in $x$ is maintained.

PIUDI - PIR-Writing Protocol:

Common Input. A distributed database with $n$ instances.

Database Initialization. The database instances have been initialised with shares of zero such that $[0, f_0] \leftarrow \mathrm{ENCODE}(0, f_0)$.

Client's Input. $n$ shares of a vector $x$ of size $k$.

Database Output. Updated database.

1. Client chooses a random polynomial $f_x$ of degree $d$ such that $d = k - 1 + t$, where $k$ is the size of vector $x$.

2. $[x, f_x] \leftarrow \mathrm{ENCODE}(x, f_x)$ such that $n \leftarrow \mathrm{size}([x, f_x])$.

3. Client distributes the shares to the $n$ instances of the database, and each instance adds the received share to its initialised element: $[0, f_0] + [x, f_x] \leftarrow \mathrm{ADD}(0, x)$.

PIUDI - PIR-Writing Batch Update Protocol:

Common Input. A distributed database with $n$ instances.

Database State. The database instances hold shares of a vector $x$ such that $[x, f_x] \leftarrow \mathrm{ENCODE}(x, f_x)$.

Client's Input. $n$ shares of a vector $y$ of size $k$, where the client wants to add each element of $y$ to the element of $x$ at the same vector position: $[y, f_y] \leftarrow \mathrm{ENCODE}(y, f_y)$.

Database Output. Updated database after performing the following operation: $\mathrm{ADD}(x, y) \leftarrow [x, f_x] + [y, f_y] = [x + y, f_x + f_y]$.

1. Client chooses a random polynomial $f_y$ of degree $d$ such that $d = k - 1 + t$, where $k$ is the size of vector $y$.

2. $[y, f_y] \leftarrow \mathrm{ENCODE}(y, f_y)$ such that $n \leftarrow \mathrm{size}([y, f_y])$.

3. Client distributes the shares to the $n$ instances of the database, and each instance adds the received share to its stored share: $[x, f_x] + [y, f_y] \leftarrow \mathrm{ADD}(x, y)$.
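The base protocol and the batch update can be simulated end to end. The following is a minimal sketch, not the authors' implementation: the prime modulus, the evaluation points (α_i = i for shares, −j for secrets), and the names `encode` and `decode` are assumptions made for illustration, and each database instance is modelled as a single stored field element.

```python
import secrets

P = 2**61 - 1  # assumed prime modulus


def interpolate_at(points, x):
    """Evaluate at x the polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(vec, n, t):
    """ENCODE: n packed shares of vec on a random polynomial of degree <= k - 1 + t."""
    k = len(vec)
    assert n >= k + t
    points = [(P - j, v % P) for j, v in enumerate(vec, start=1)]
    points += [(a, secrets.randbelow(P)) for a in range(1, t + 1)]
    return [interpolate_at(points, a) for a in range(1, n + 1)]


def decode(shares, k, t):
    """Reconstruct the k packed values from the first k + t shares."""
    points = list(enumerate(shares, start=1))[: k + t]
    return [interpolate_at(points, P - j) for j in range(1, k + 1)]


n, k, t = 7, 3, 2
db = encode([0] * k, n, t)  # database initialisation with shares of zero

# Base protocol: each instance adds its share of x to its stored share.
db = [(s + w) % P for s, w in zip(db, encode([10, 20, 30], n, t))]
after_write = decode(db, k, t)

# Batch update: one share per instance changes positions 0 and 2 together.
db = [(s + w) % P for s, w in zip(db, encode([1, 0, 5], n, t))]
after_batch = decode(db, k, t)
```

Each instance performs a single field addition per update, regardless of how many of the k packed positions actually change.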

3.3 Protocol Variations

3.3.1 Column-Hiding PIUDI Protocol

The adversary should not learn which attribute/column was updated for a particular record in the database.

Common Input. Same as base protocol.

Database Initialization. Same as base protocol.

Client's Input. Same as base protocol.

Database Output. Same as base protocol.

1. Client chooses a random polynomial $f_x$ of degree $d$ such that $d = k + t$, where the vector $x$ contains a row of the table and $k = c$ is the number of columns in the database.

2. To update a value, $[x, f_x] \leftarrow \mathrm{ENCODE}(x, f_x)$ such that $n \leftarrow \mathrm{size}([x, f_x])$.

3. Client distributes the shares to the $n$ instances of the database, and each instance adds the received share to its stored share: $[x, f_x] + [y, f_y] \leftarrow \mathrm{ADD}(x, y)$.

As a result, all values in the updated row are refreshed with new shares across all the servers, while the other rows remain unchanged. The adversary cannot guess which value change caused this change in the row.
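A column-hiding update of one row can be sketched as follows. This is an illustrative sketch under several assumptions: the client knows the old value and submits the difference as a delta row that is zero everywhere except the target column; the polynomial degree follows the base protocol (k − 1 + t); and the field, evaluation points, and function names are hypothetical.

```python
import secrets

P = 2**61 - 1  # assumed prime modulus


def interpolate_at(points, x):
    """Evaluate at x the polynomial of degree < len(points) through points, mod P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(vec, n, t):
    """n packed shares of vec on a random polynomial of degree <= len(vec) - 1 + t."""
    assert n >= len(vec) + t
    points = [(P - j, v % P) for j, v in enumerate(vec, start=1)]
    points += [(a, secrets.randbelow(P)) for a in range(1, t + 1)]
    return [interpolate_at(points, a) for a in range(1, n + 1)]


def decode(shares, k, t):
    points = list(enumerate(shares, start=1))[: k + t]
    return [interpolate_at(points, P - j) for j in range(1, k + 1)]


def hidden_column_update(col, old_value, new_value, c, n, t):
    """Shares of a delta row that changes only column `col` of a c-column row."""
    delta = [0] * c
    delta[col] = (new_value - old_value) % P
    return encode(delta, n, t)


c, n, t = 4, 8, 2
stored = encode([100, 200, 300, 400], n, t)  # one row, packed across 8 servers
update = hidden_column_update(col=2, old_value=300, new_value=350, c=c, n=n, t=t)
stored = [(s + u) % P for s, u in zip(stored, update)]
updated_row = decode(stored, c, t)
```

Every server receives one field element and performs one addition, whichever column is targeted, so the message pattern is the same for all columns.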

3.3.2 Row-Hiding PIUDI Protocol

This protocol ensures that the adversary cannot learn which row/record was updated for a particular attribute/column update in the database.

Common Input. Same as base protocol.

Database Initialization. Same as base protocol.

Client's Input. Same as base protocol.

Database Output. Same as base protocol.

1. Client chooses a random polynomial $f_x$ of degree $d$ such that $d = k + t$, where the vector $x$ is the column to be updated and $k = r$ is the number of rows/records in the database.

2. To update a value, $[x, f_x] \leftarrow \mathrm{ENCODE}(x, f_x)$ such that $n \leftarrow \mathrm{size}([x, f_x])$.

3. Client distributes the shares to the $n$ instances of the database, and each instance adds the received share to its stored share: $[x, f_x] + [y, f_y] \leftarrow \mathrm{ADD}(x, y)$.

As a result, all values in the updated column are refreshed with new shares across all the servers, while the other columns remain unchanged. The adversary cannot guess which value change caused this change in the column.

3.3.3 Cell-Hiding PIUDI Protocol

This protocol ensures that the adversary cannot learn which cell was updated, i.e., neither the record nor the column. This is achieved by taking the vector $x$ to be the entire database, of size $r \times c$. The rest of the protocol is the same as the base protocol. As a result, all the columns of all the records are refreshed with new shares across all the servers. While this is less efficient than the other two variations, the cell-hiding protocol is still more efficient than previous PIR-Writing protocols in a distributed setting.
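The three variations differ mainly in what is packed into the vector x. The sketch below shows the packed vector length for each variation and how a single-cell change could be expressed as a delta vector over the whole table. The row-major flattening and the function names are assumptions made for illustration; the text does not prescribe an ordering.

```python
def packed_length(variant: str, r: int, c: int) -> int:
    """Size k of the packed vector for a table with r rows and c columns."""
    return {"column": c, "row": r, "cell": r * c}[variant]


def cell_position(row: int, col: int, c: int) -> int:
    """Index of cell (row, col) in a row-major flattening (assumed ordering)."""
    return row * c + col


def cell_delta(row: int, col: int, delta: int, r: int, c: int) -> list[int]:
    """Delta vector over the whole table that changes exactly one cell."""
    vec = [0] * (r * c)
    vec[cell_position(row, col, c)] = delta
    return vec


# Usage: a 2 x 3 table where cell (1, 2) is increased by 7.
delta_vector = cell_delta(1, 2, 7, r=2, c=3)
```

The delta vector would then be encoded and added share-wise exactly as in the base protocol.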

4 PROOF OF SECURITY

While our protocol variations differ in the number of shards and secret shares, their security depends on the base protocol. Without loss of generality, we define security as the following game between a challenger and an adversary for the base protocol. Let there be a challenger running the protocol over $n$ servers in the presence of an adversary $A$ such that up to $t$ servers can be compromised by $A$. Given this setup:

1. $A$ selects a database of $N = r \times c$ records and sends it to the challenger.

2. The challenger processes the database through the protocol and distributes the shares across the $n$ servers such that each share encodes a vector $x$ of $k$ secrets.

3. $A$ now selects two subsets of columns, $S_0$ and $S_1$, from the same row of the database to be modified. The new updated values are also selected by $A$. $A$ sends these subsets, along with their indices and the new values, to the challenger. The choice of the subsets is restricted based on the hiding variation of the protocol we have chosen.

4. The challenger chooses a bit $b \in \{0, 1\}$ uniformly at random and modifies only the subset $S_b$ in the database according to the protocol.

5. $A$ observes the modified database and outputs its guess for the value of $b \in \{0, 1\}$.

Let $P_b$ be the probability with which $A$ outputs 1 given $b \in \{0, 1\}$.

4.1 Deﬁnition

The database update $(k - t - N, S_b)$ is private if, for all semi-honest PPT adversaries $A$, $|P_0 - P_1|$ is negligible, i.e., $A$ is not able to guess the column that the client modified.

Proof. To prove our protocol's security, we show how it can be reduced to the packed secret sharing (PSS) protocol described in Section 2.3.1.

For the given database, the client first distributes the database values using a PSS scheme, with the packed secret represented by a vector $x$ of size $k$ for each shard. For example, in the row-hiding protocol, the client will have $|c|$ vectors, each of size $k = |r|$, to represent the values in the database as packed secrets. In the game with the adversary, the client then chooses one subset for updating and converts it into secret vectors $x_{new}$ of size $k$. These secret vectors depend on the protocol variation. For example, in the case of row hiding, the subset values represent parts of some columns, and all those columns are used as secret vectors.

Now, for each such vector $x_{new}$, the client produces a polynomial $f$ and creates shares $s_{ij}$ for each database hosting node $db_i$ that hosts the corresponding database shards, using PSS, such that all shares for that vector are updated across the distributed system. For any node $db_i$, given its current share value for the vector, $s_{current}$, its new share is updated to $s_{new} = s_{current} + s_{ij}$. The updated shares, when reconstructed using PSS, result in the updated values demanded by the adversary $A$. But since all the shares for the entire vector have been modified, the adversary cannot guess, with more than negligible advantage, which of the given subsets actually had its values changed.

Additionally, since we follow the PSS protocol to distribute and update the shares across the servers, the security definition holds, i.e., an adversary with control of only up to $t + k - 1$ servers is not able to learn anything about the new values from its shares alone.

For simplicity, the following restrictions are applied to the adversary in the security game for our three variations. For column and row hiding, all the values in each subset should (ideally) come from a single column/row. If this is not the case, the client needs to update the shares for all the rows/columns from which the two subsets draw their values. For cell hiding, the entire database has to be updated for a change in a single value. However, if we restrict our adversary to selecting the subsets in some pattern, then we can achieve much more efficient and privacy-preserving outcomes.

5 ANALYSIS

To the best of our knowledge, all other protocols in this field have been designed specifically for a single client-server model. That is to say, the focus of their design has been on improving the efficiency of the computation and communication aspects of the protocol in a non-distributed setup. Previous protocols like that of Lipmaa and Zhang (2010) result in a complete update of the entire database, i.e., all $n$ encrypted records are modified. For a distributed database, this would require updates in all the locations/shards, which would cost around $O(n^2)$ updates assuming there are $n$ instances of the database for high availability. In the particular scenario where the data is hosted on a blockchain, this would also result in immense gas costs. Others, like Ishai et al. (2004), have implemented a distributed setup somewhat similar to ours, but it only works for private information retrieval and not for updating. Since our protocol focuses on minimizing the number of records changed between the original and updated database, it needs only $O(n)$ updates, assuming there are $n$ instances of the database for high availability, because it updates only the shares related to the subset.

Batch Updates. The existing PIR-Writing protocols do not explicitly address the issue of batch updates, which is a critical consideration for reducing gas costs on public blockchains (Sguanci et al., 2021). The reason is that these protocols typically access and update individual elements of a distributed database one at a time, which can quickly become prohibitively expensive in terms of the gas fees required for each transaction. Our protocol takes a different approach by utilizing packed secret sharing, which allows us to pack an entire set of data into a single field element at each instance. We can therefore update multiple elements of the data set by modifying only this single field element at every instance of the distributed database. This approach drastically reduces the number of transactions required to update the entire data set, leading to significant savings in gas fees.

Privacy Improvement. Unsupervised sharding could lead to information leakage. However, sharding the data set based on a priori knowledge of user access patterns can be an efficient way to improve privacy in practical scenarios. A simple way to achieve this is to distribute frequently accessed data uniformly across the different shards.
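One simple way to spread frequently accessed fields across shards is to rank the fields by access rate and deal them out round-robin. The sketch below assumes that per-field access rates are known in advance; the function name and the example field names are hypothetical.

```python
def shard_by_access_rate(access_rates: dict[str, float], num_shards: int) -> list[list[str]]:
    """Deal fields to shards in descending order of access rate, so that
    the most frequently accessed fields land in different shards."""
    ranked = sorted(access_rates, key=access_rates.get, reverse=True)
    shards: list[list[str]] = [[] for _ in range(num_shards)]
    for idx, field in enumerate(ranked):
        shards[idx % num_shards].append(field)
    return shards


# Usage: two hot fields and two cold fields split over two shards.
rates = {"views": 900.0, "likes": 500.0, "bio": 2.0, "country": 1.0}
layout = shard_by_access_rate(rates, num_shards=2)
```

With at least as many fields as shards, every shard receives one of the most accessed fields, so an update to any shard is less informative about which field changed.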

6 APPLICATIONS

There are several important applications of our protocol in fields such as medical research, finance, and government, where sensitive data must be stored and manipulated securely and privately. Our protocol can be used as an add-on to existing protocols without modifying the system drastically. This provides an easy mechanism to protect existing protocols that are susceptible to side-channel attacks. The protocol is useful for protecting a client's data whenever it is updated on a remote, untrusted server, for example when medical information is stored on a hospital server. With the optimizations in place, it is very difficult for an adversary to glean information about which patient record was updated.

7 CONCLUSION

In this paper, we have presented a novel information-theoretic PIR-Writing protocol, PIUDI, that is suitable and efficient for distributed database settings. The protocol is designed to mitigate two main types of attacks, correlation attacks and frequency attacks, which have been identified as major vulnerabilities in existing PIR protocols.

The proposed protocol not only addresses these

vulnerabilities but also improves efﬁciency using

batch updates. This makes our protocol highly efﬁ-

cient and scalable, which is critical for distributed ar-

chitectures. Another key advantage of this protocol is

that it is highly suitable for public blockchains. With

the growing popularity of blockchain technology, re-

ducing the gas cost is a critical consideration for any

blockchain-based protocol. This protocol achieves

this by drastically reducing the gas cost, making it a

highly desirable solution for blockchain-based appli-

cations.

Future work in this area could explore further improvements to the protocol, such as more efficient methods of batch updates, or investigate additional attack types that may be mitigated through the use of this protocol. Another avenue of future research is reducing the gas cost of updates when the data is stored entirely on the blockchain, without the intervention of off-chain programs. We hope that our work provides a useful foundation for such research, and we anticipate it will have practical implications in the field of privacy-preserving distributed databases.

Table 1: Comparison of related schemes in the distributed database setting with n instances, each holding m data objects of l bits each.

Scheme          Lipmaa                               Boneh                    Chandran                                         PIUDI
Communication   $(\log n + l)k \cdot m$ (best case)  $O(l \cdot m \sqrt{n})$  $O(m \sqrt{n^{1+\alpha}}\,\mathrm{polylog}(n))$  $O(l \cdot n)$
Computation     $O(m(n \log n + n))$                 $O(l \cdot m \cdot n)$   $O(n \cdot m \cdot \mathrm{polylog}(n))$         $O(n)$
DB size change  None                                 None                     None                                             None

Table 2: Gas fee comparison between a trivial encrypted DB (best case), a homomorphically encrypted DB (best case), and our approach for updating n data objects, where c is the gas cost of a trivial addition operation.

Scheme                  Encrypted                  Homomorphic encrypted      PIUDI
Single update gas cost  $O(n \log n + n) \cdot c$  $O(n \log n + n) \cdot c$  $O(1) \cdot c$
Batch update gas cost   N.A.                       N.A.                       $O(1) \cdot c$

REFERENCES

Bagui, S. and Nguyen, L. T. (2015). Database sharding: to provide fault tolerance and scalability of big data on the cloud. International Journal of Cloud Applications and Computing (IJCAC), 5(2):36–52.

Boneh, D., Kushilevitz, E., Ostrovsky, R., and Skeith, W. E. (2007). Public key encryption that allows PIR queries. In Advances in Cryptology-CRYPTO 2007: 27th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 19-23, 2007. Proceedings 27, pages 50–67. Springer.

Chor, B., Kushilevitz, E., Goldreich, O., and Sudan, M. (1998). Private information retrieval. Journal of the ACM (JACM), 45(6):965–981.

Dwork, C., Smith, A., Steinke, T., and Ullman, J. (2017). Exposed! A survey of attacks on private data. Annual Review of Statistics and Its Application, 4:61–84.

Franklin, M. and Yung, M. (1992). Communication complexity of secure computation. In Proceedings of the twenty-fourth annual ACM symposium on Theory of computing, pages 699–710.

Gasarch, W. (2004). A survey on private information retrieval. Bulletin of the EATCS, 82(72-107):113.

Goldreich, O. and Ostrovsky, R. (1996). Software protection and simulation on oblivious RAMs. Journal of the ACM (JACM), 43(3):431–473.

Grubbs, P., Ristenpart, T., and Shmatikov, V. (2017). Why your encrypted database is not secure. In Proceedings of the 16th workshop on hot topics in operating systems, pages 162–168.

Ishai, Y., Kushilevitz, E., Ostrovsky, R., and Sahai, A. (2004). Batch codes and their applications. In Proceedings of the thirty-sixth annual ACM symposium on Theory of computing, pages 262–271.

Ishai, Y. and Paskin, A. (2007). Evaluating branching programs on encrypted data. In Theory of Cryptography: 4th Theory of Cryptography Conference, TCC 2007, Amsterdam, The Netherlands, February 21-24, 2007. Proceedings 4, pages 575–594. Springer.

Islam, M. S., Kuzu, M., and Kantarcioglu, M. (2012). Access pattern disclosure on searchable encryption: ramification, attack and mitigation. In NDSS, volume 20, page 12. Citeseer.

Lacharité, M.-S., Minaud, B., and Paterson, K. G. (2018). Improved reconstruction attacks on encrypted data using range query leakage. In 2018 IEEE Symposium on Security and Privacy (SP), pages 297–314. IEEE.

Lipmaa, H. and Zhang, B. (2010). Two new efficient PIR-writing protocols. In Applied Cryptography and Network Security: 8th International Conference, ACNS 2010, Beijing, China, June 22-25, 2010. Proceedings 8, pages 438–455. Springer.

Luu, L., Narayanan, V., Zheng, C., Baweja, K., Gilbert, S., and Saxena, P. (2016). A secure sharding protocol for open blockchains. In Proceedings of the 2016 ACM SIGSAC conference on computer and communications security, pages 17–30.

Papadimitriou, A., Bhagwan, R., Chandran, N., Ramjee, R., Haeberlen, A., Singh, H., Modi, A., and Badrinarayanan, S. (2016). Big data analytics over encrypted datasets with Seabed. In OSDI, volume 16, pages 587–602.

Popa, R. A., Redfield, C. M., Zeldovich, N., and Balakrishnan, H. (2011). CryptDB: Protecting confidentiality with encrypted query processing. In Proceedings of the twenty-third ACM symposium on operating systems principles, pages 85–100.

Popa, R. A., Stark, E., Helfer, J., Valdez, S., Zeldovich, N., Kaashoek, M. F., and Balakrishnan, H. (2014). Building web applications on top of encrypted data using Mylar. In NSDI, volume 14, pages 157–172.

Sguanci, C., Spatafora, R., and Vergani, A. M. (2021). Layer 2 blockchain scaling: A survey. arXiv preprint arXiv:2107.10881.

Shamir, A. (1979). How to share a secret. Communications of the ACM, 22(11):612–613.

Stefanov, E., Dijk, M. v., Shi, E., Chan, T.-H. H., Fletcher, C., Ren, L., Yu, X., and Devadas, S. (2018). Path ORAM: an extremely simple oblivious RAM protocol. Journal of the ACM (JACM), 65(4):1–26.

Zolotukhin, M., Hämäläinen, T., Kokkonen, T., and Siltanen, J. (2014). Analysis of HTTP requests for anomaly detection of web attacks. In 2014 IEEE 12th International Conference on Dependable, Autonomic and Secure Computing, pages 406–411. IEEE.

SECRYPT 2023 - 20th International Conference on Security and Cryptography
