HOMEFUS: A Privacy and Security-Aware Model for IoT Data Fusion
in Smart Connected Homes
Kayode S. Adewole (1,2,a) and Andreas Jacobsson (1,2,b)
1 Department of Computer Science and Media Technology, Malmö University, Malmö, Sweden
2 Internet of Things and People Research Center, Malmö University, Malmö, Sweden
Keywords:
Smart Homes, Internet of Things, Data Fusion, Security, Privacy, Federated Learning, Sensors Selection.
Abstract:
The benefit associated with the deployment of Internet of Things (IoT) technology is increasing daily. IoT
has revolutionized our ways of life, especially when we consider its applications in smart connected homes.
Smart devices at home enable the collection of data from multiple sensors for a range of applications and
services. Nevertheless, the security and privacy issues associated with aggregating multiple sensors’ data
in smart connected homes have not yet been sufficiently prioritized. To address this gap, this paper
proposes HOMEFUS, a privacy and security-aware model that leverages information theoretic correlation
analysis and gradient boosting to fuse multiple sensors’ data at the edge nodes of smart connected homes.
HOMEFUS employs federated learning, edge and cloud computing to reduce privacy leakage of sensitive
data. To demonstrate its applicability, we show that the proposed model meets the requirements for efficient
data fusion pipelines. The model guides practitioners and researchers on how to set up secure smart connected
homes that comply with privacy laws, regulations, and standards.
1 INTRODUCTION
The Internet of Things (IoT) has paved the way for connecting different sensors and smart devices to benefit from a range of applications and services including smart health, smart grids, intelligent transportation, smart manufacturing, autonomous driving, and smart agriculture. With over 7 billion connected IoT devices today, the number of smart devices that will be powered by IoT technology is expected to grow to 22 billion by 2025 (Gartner Inc., 2017). The vision of IoT is to seamlessly connect everything in the physical world over the Internet. This technology has extended into our homes, allowing us to connect smart home devices using sensors, actuators, and controllers that are equipped with wireless connectivity and cognitive computing technologies (Rahman et al., 2016). The main challenge is how to ensure privacy and security while also complying with laws, regulations and standards.
a https://orcid.org/0000-0002-0155-7949
b https://orcid.org/0000-0002-8512-2976

There are privacy laws and regulations such as the General Data Protection Regulation (GDPR) in Europe, the Health Insurance Portability and Accountability Act (HIPAA) in the United States, Japan’s Personal Information Protection Act, and China’s Cybersecurity Law. For instance, Article 23 of the GDPR requires data controllers to store and analyze only the data necessary to achieve the data collection objectives, and to limit access to sensitive data to authorized entities (Aïvodji et al., 2019). The GDPR mandates
data controllers to obtain informed consent from data
subjects before collecting and analyzing their data,
and to provide them with the right to access, review,
correct, and delete their data. Despite the existence of
laws, regulations and standards, IoT companies continue to release devices into the market, some of which are vulnerable (Bugeja et al., 2019; Rahman et al., 2016). Many users of IoT devices do not know how much data, or what type of data, their devices collect, or to what extent these data are used. This makes it difficult for users to make informed decisions regarding their privacy and how to control the data collection and distribution process.
Adewole, K. and Jacobsson, A.
HOMEFUS: A Privacy and Security-Aware Model for IoT Data Fusion in Smart Connected Homes.
DOI: 10.5220/0012437900003705
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 9th International Conference on Internet of Things, Big Data and Security (IoTBDS 2024), pages 133-140
ISBN: 978-989-758-699-6; ISSN: 2184-4976
Proceedings Copyright © 2024 by SCITEPRESS Science and Technology Publications, Lda.

The privacy and security risks are further complicated when multiple sensors collaborate and when the collected data have to be transmitted to the cloud via the Internet (Mena et al., 2022; Chimamiwa et al., 2021; Adewole and Torra, 2022b; Adewole and Torra, 2022a). For instance, activity recognition in smart homes may involve data collection about
home occupants’ behaviors, some of which are personally identifiable and sensitive. Some smart home devices, e.g., cameras, wearables, and health monitoring devices, may not only track occupancy or fitness activities, but also infer sensitive information about the users. The length and complexity of the policy documents that manufacturers ship with IoT devices also make it difficult for users to understand what data the devices collect and how these data are shared. As a result, complying with the GDPR requirements on informed consent is becoming very difficult in smart connected homes (Pathmabandu et al., 2023), because informed consent requires users to make sound decisions about the risk of sharing their personal data with third parties. Therefore, there is a need for models that enforce privacy and security for smart connected homes to minimize information disclosure about users. This is the goal of HOMEFUS. More specifically, this paper contributes in the following ways:
1. We propose a privacy and security-aware model that aggregates sensors’ data for effective IoT application and service delivery in smart connected homes.
2. We enforce privacy and security for sensor data collection, processing, and analysis both at the edge and in the cloud using federated learning.
3. The model leverages information theoretic correlation analysis and gradient boosting for sensor selection, fusion and modeling, which provides an efficient method at the edge nodes.
4. We show that the proposed model meets the requirements for an effective data fusion pipeline.
The remainder of this paper is organized as fol-
lows. Section 2 presents IoT ecosystems and charac-
teristics. In Section 3, related work in IoT data fusion
for smart connected homes is presented. Section 4
presents the requirements for data fusion architecture.
In Section 5, we present the proposed model. Section
6 focuses on the evaluation and risk assessment of the
model, and finally, Section 7 concludes the paper and
discusses future research directions.
2 IoT ECOSYSTEM AND
CHARACTERISTICS
The IoT is typically described as a network of het-
erogeneous devices and services. These devices are
characterised as resource-constrained, making im-
plementation of the traditional encryption and data
anonymization algorithms almost impossible. The
components of an IoT ecosystem have the capability
to communicate autonomously over the network with
unique identity. A number of architectures have been
proposed for the IoT, often with the consideration of three layers: perception, network and application layers (Aïvodji et al., 2019).
In Fig. 1, we present a view of an IoT ecosystem that highlights the different components with a focus on privacy and security. In this representation, we add two layers, thereby recognizing the crucial roles of the users, their applications, and the controlled management of their corresponding information. Layer 1 considers the "things" in the IoT and accommodates the smart connected devices. These devices are hardware components capable of sensing, actuating, controlling, communicating, and monitoring. Layer 2 represents the communication and network layer, where the communication infrastructure and standard protocols are used to establish interoperability among the connected smart devices of layer 1. Typical communication technologies and protocols include Zigbee, WiFi, Bluetooth, cellular (e.g., 5G), LoRaWAN, and MQTT. This layer is responsible for transmitting data to the computing and storage facilities, which are handled by layer 3. Cloud infrastructures are mostly leveraged in this layer. Layer 4 represents applications and accommodates the different software applications that render services to the end-users. Services include activity recognition of home occupants, occupancy sensing, energy consumption monitoring, surveillance, logistics, sport, business, transport, and so on.
We also identify the roles of users in the IoT ecosystem, which permeate the different layers. Three categories of users can be distinguished in connection to smart connected homes: residents, guests, and malicious actors. The residents and guests are considered legitimate users. In smart connected homes, attackers or malicious actors are uninvited guests or users who aim to establish an unauthorized connection to the home network to infer sensitive information. An attacker’s goal is typically to gain unauthorized access to the IoT ecosystem and to launch different attacks or privacy invasions. There are also external attackers who may want to launch more complicated attacks, such as denial-of-service (DoS) and malware infections, to disrupt services or to infer information from the target network or from smart devices connected to the Internet. In layer 5, we emphasise the need for security and privacy assessment, which should consider all the layers of the IoT
ecosystem. Enforcing privacy and security in this re-
spect will help security analysts and IoT practitioners
to conduct thorough risk analysis of the IoT.
Figure 1: IoT ecosystem layers.
3 RELATED WORK
Information fusion is the study of efficient automatic or semi-automatic methods for transforming data from multiple sources into an improved representation for better decision making (Ding et al., 2019). In the domain of IoT, data fusion involves the aggregation of data from multiple sensors to extract useful information for service improvement. This research domain has been widely studied in recent years. This section presents related studies in the area of IoT data fusion with a specific focus on smart connected home research. Data fusion approaches in smart connected homes can be categorized as intrusive or non-intrusive based on the types of sensors aggregated. We discuss each category below.
3.1 Intrusive Approaches
Intrusive approaches use smart devices such as cameras, wearables, and the GPS and microphones in smartphones, to mention a few. These approaches offer better accuracy for smart connected home applications, particularly for occupant monitoring. For instance, Monti et al. (2022) investigate the use of multiple cameras for occupant counting. Kommey (2022) designs an automated ceiling fan regulator for smart connected homes based on the fusion of data from web cameras and temperature sensors. Chaaraoui et al. (2014) propose a weighted feature fusion approach that relies on camera data. The fusion of passive infrared sensors, which detect human presence, with contact, camera, and microphone sensors has been studied by Chahuara et al. (2013).
Although intrusive approaches have the potential to offer better performance for smart connected home applications, they compromise the privacy of the home occupants. Our proposed model addresses the privacy challenges in the existing studies by combining federated learning with a hybrid data fusion approach to reduce privacy leakage of sensitive user information and to improve the sensor fusion method.
3.2 Non-Intrusive Approaches
Developing smart home applications that offer performance comparable to intrusive approaches poses research challenges. To address the privacy concerns of intrusive monitoring, non-intrusive approaches consider the fusion of environmental sensors, such as carbon dioxide (CO2), total volatile organic compounds (TVOC), air temperature, and air humidity sensors, as well as smart meters. Sayed et al. (2023) proposed the fusion of temperature, humidity, pressure, light level, motion, sound, and CO2 sensors for occupancy detection in smart connected homes. Dutta and Roy (2022) combined different environmental sensors with contextual information to improve home occupancy detection. For a detailed discussion of data fusion and of the methods used in the fusion pipeline, the reader is referred to the comprehensive review by Ding et al. (2019).
Non-intrusive approaches reduce privacy violations of home occupant information; nevertheless, they suffer from shortcomings in terms of efficiency and reliability. Moreover, studies have shown that the behavior of home occupants can still be monitored by aggregating data from multiple sensors, even in a non-intrusive way (Pathmabandu et al., 2023). The question of which sensors should be aggregated to provide better performance, given the limited capability of smart home devices, remains an open research issue for smart connected home applications. Additionally, risk assessment of fusion methods has not been considered in the existing studies. Our proposed model addresses the challenges in the non-intrusive domain by adopting a hybrid lightweight sensor selection approach to improve the efficiency and reliability of the fusion pipeline. In particular, HOMEFUS leverages federated learning to further preserve the privacy of sensitive data in smart connected homes.
4 REQUIREMENTS FOR SMART
HOME DATA FUSION
To benefit from IoT applications in smart connected homes and to make accurate predictions and decisions, considering multiple sensors is usually advisable. Given the heterogeneous nature of sensor data from multiple sources, distributing all the data over the network will increase network bandwidth usage, increase power consumption, decrease the longevity of battery-driven devices, and thus increase the costs of the system. Therefore, in data fusion, it becomes necessary to extract vital information from diverse arrays of collected sensor data to improve data quality and facilitate decision making. The question of which sensors should be aggregated to improve predictive performance remains unsolved. In this context, Ding et al. (2019) provide the basic requirements for data fusion models in IoT scenarios. We briefly discuss them below.
1. Context-Awareness. This is the ability of the data fusion pipeline to accommodate background information in addition to the sensor data in order to develop intelligent fusion methods. The context does not necessarily relate directly to the sensors, but it provides additional detail that can further improve smart home applications. As an example, Dutta and Roy (2022) fuse environmental sensor data with 16 types of context data, including AC status, fan status, door status, heat isolation, location type, area type, and so on, to improve occupancy detection.
2. Reliability. This deals with the reliability of
the results from the application of the data fu-
sion model. Because fusion results can have se-
rious impact in decision making for smart con-
nected home applications, such as activity recog-
nition, diagnoses, emergency predictions, fall de-
tections, fire outbreak predictions, energy con-
sumption predictions, and surveillance, the out-
put of these applications must be reliable to avoid
critical situations. Reliability in data fusion can
be assessed using the established metrics in information
retrieval. For instance, metrics for classification
tasks include accuracy, precision, recall,
F-measure, and AUC-ROC. For clustering tasks, reliability
can be assessed using metrics like the Rand
index, Silhouette score, Davies-Bouldin index,
Calinski-Harabasz index, and mutual information.
Regression tasks are evaluated using metrics
like the coefficient of determination, root mean
square error, and mean absolute error.
3. Robustness. Data fusion needs to be robust to re-
sist different cyberattacks, such as data injection
and other forms of malicious software exploita-
tion. Since existing solutions utilize the Internet
protocol to transmit raw sensor data to the cloud,
the possibility of data or code injection attacks
cannot be underestimated. Robustness can be as-
sessed based on the level of security offered by the
fusion method.
4. Efficiency. This attribute deals with the ability of
data fusion methods to scale irrespective of the
size of the data required for the modeling pro-
cess. Efficiency can be assessed in terms of train-
ing time, testing time, communication cost, and
so on.
5. Verifiability. Data fusion results should be verifi-
able to ascertain if the contributing sensors actu-
ally improve the final fusion results. This can be
assessed by checking the quality of the data col-
lected from each sensor in relation with the smart
connected home application’s objective.
6. Security and Privacy. Data fusion architectures
must ensure that data transmission to the fusion
center is secured and that the data are handled
in conformity with privacy regulation. The re-
sults of the data fusion should also be protected
from unauthorized access or modification, and the
availability of the fusion results should not be
compromised.
7. Real-Time. Data fusion methods should deliver results to the end-users in real-time. Existing solutions leverage cloud computing capability, which offers a near real-time architecture for IoT applications. Thus, a data fusion model needs to consider real-time delivery of fusion results for timely decision making.
The above highlighted quality attributes guide the
formulation of our proposed model. HOMEFUS is a
privacy and security-aware model for data fusion tar-
geted toward smart connected homes. Section 5 pro-
vides a detailed discussion of the proposed model.
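
The classification metrics named under the reliability requirement can be computed directly from true and predicted labels. The following is a minimal sketch for a binary task such as occupancy detection; the function name `binary_metrics` and the sample labels are illustrative assumptions and are not part of the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F-measure for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical occupancy labels: 1 = occupied, 0 = empty.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
```

In practice a library implementation would normally be preferred; the point here is only to make the definitions behind the reliability requirement concrete.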
5 HOMEFUS - THE PROPOSED
MODEL
In this section, we discuss the different components of the proposed model and walk through them formally. HOMEFUS achieves privacy and security, aligns with the requirements for data fusion stated previously, and includes risk assessment and mitigation strategies that support the evaluation of
privacy and security threats directed to the smart con-
nected home (see Fig. 2). The proposed model con-
siders the sensitivity of the data generated and has
the ability to use both intrusive and non-intrusive data
sources since the raw data of the home occupants are
not transmitted to the cloud. It also advocates secure
storage and computation in the cloud to prevent infer-
ence attacks on the local federated machine learning
model that is transmitted from each smart connected
home.
Formally, the proposed approach models a smart connected home ecosystem as a tuple $(H, U, C, N, A, F, L_m, G_m, P)$, where $H$: smart connected homes, $U$: users, $C$: context data, $N$: connected nodes or devices, $A$: smart connected home applications, $F$: mapping function, $L_m$: local model, $G_m$: global model, and $P$: policy. We provide the details as follows.
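
As a reading aid for the component definitions, the tuple can be pictured as a small set of record types. This is only a sketch under our own assumptions: the class names `Home` and `Ecosystem`, the field names, and the `join`/`leave` helpers are hypothetical and are not prescribed by the model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Home:
    """One smart connected home h_i with its per-home components."""
    home_id: str
    users: List[str] = field(default_factory=list)            # U for this home
    context: Dict[str, object] = field(default_factory=dict)  # C for this home
    nodes: List[str] = field(default_factory=list)            # N for this home
    local_model: Optional[object] = None                      # L_m


@dataclass
class Ecosystem:
    """The tuple (H, U, C, N, A, F, L_m, G_m, P) seen from the cloud side."""
    homes: Dict[str, Home] = field(default_factory=dict)      # H
    applications: List[str] = field(default_factory=list)     # A
    # F: maps candidate sensors and an application to the selected sensors.
    mapping: Optional[Callable[[List[str], str], List[str]]] = None
    global_model: Optional[object] = None                     # G_m
    policies: Dict[str, str] = field(default_factory=dict)    # P

    def join(self, home: Home) -> None:
        # The number of homes n is dynamic, so homes may join at any time.
        self.homes[home.home_id] = home

    def leave(self, home_id: str) -> None:
        self.homes.pop(home_id, None)


eco = Ecosystem(applications=["activity_recognition", "fire_detection"])
eco.join(Home("h1", users=["u1", "u2"], nodes=["fridge", "smartphone", "edge"]))
eco.join(Home("h2", users=["u1"]))
eco.leave("h2")
```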
5.1 Smart Connected Homes (H)
This refers to the set of smart connected homes that have agreed to implement the data fusion pipeline offered by the proposed model. Formally, we identify a set of smart connected homes $H = \{h_1, h_2, \ldots, h_n\}$, where $n$ is the number of homes. To allow for scalability of the model, we assume the value of $n$ is dynamic.
5.2 Users (U)
Users represent smart connected home occupants, including family members and invited guests. The set of users in home $h_i$, $h_i \in H$, is denoted as $U^{h_i} = \{u^{h_i}_1, u^{h_i}_2, \ldots, u^{h_i}_m\}$, $u_i \in U$. The value of $m$ is also considered to be dynamic, because a user $u_i$ can join or leave the home network at any time.
5.3 Context Data (C)
$C^{h_i}$ represents the context data generated in each $h_i$, $h_i \in H$, at a particular timestamp. Similar to Dutta and Roy (2022), we distinguish two types of context data: static (e.g., location type, area type) and dynamic (e.g., fan status, awake status). Thus, $C^{h_i} = \{C^{h_i}_s \cup C^{h_i}_d\}$.
5.4 Nodes (N)
This represents smart connected devices or nodes. We identify $N^{h_i}$, the set of devices used by user $u_i$ in home $h_i$, $u_i \in U$ and $h_i \in H$. These devices have the capability to sense, actuate, process, and transmit data. Usually, $N^{h_i} = \{C_n \cup M_n \cup P_n\}$, where $C_n$ are connected devices with a fixed location (e.g., a washing machine, a fridge, etc.), $M_n$ are mobile devices (e.g., a smartphone, a laptop, etc.), and $P_n$ are the processing nodes (e.g., the edges or the cloud). $P_n$ has additional features, which include storage, processing, and analytic capability. From a security perspective, the processing nodes are also equipped with an intrusion detection and prevention system (IDPS). Edge nodes are located inside the home $h_i$, while the cloud connection is outside the home. Both edge and cloud nodes are connected using a smart home gateway. This gateway is maintained by the middleware layer. From a security perspective, we assume that the gateway uses standard protocols with standard security configurations. A typical example is the use of MQTT, which operates on the publish and subscribe principle and can work with SSL/TLS encryption. When a sending node publishes a message to a topic, the receiving node can securely subscribe to it. HOMEFUS leverages homomorphic encryption to secure the local model to be transmitted via the gateway to the cloud.
Each device in $C_n$ and $M_n$, with $C_n, M_n \subset N^{h_i}$, can have one or more sensors $S^{h_i}_j$, $j = 1, 2, \ldots, k$ and $i = 1, 2, \ldots, n$, attached to it. Our proposed model aims to identify the sensors $S^{h_i}_j$ that should be fused with the context data $C^{h_i}$ to improve smart connected home applications and services. To this end, we propose filter-based sensor selection using information gain and correlation-based methods. These methods are computationally less expensive than the existing state-of-the-art approaches. We also chose them to take advantage of edge computing and to offer a solution with low computational demands.
5.5 Applications (A)
Every activity or behavior of user $u^{h_i}_i \in U$ can signal a specific smart connected home application $a_i \in A$, where $A$ denotes the set of smart home applications or services, i.e., $A = \{a_1, a_2, \ldots, a_\ell\}$. Typical applications or services include activity recognition (e.g., cooking, washing, etc.), fire detection, surveillance, energy management, remote monitoring, and so on.
5.6 Mapping Function (F)
The role of the mapping function $F$ is to map the sensors $S^{h_i}_j$, $j = 1, 2, \ldots, k$, to a specific application or service $a_i \in A$. This mapping is denoted as $F(\{S^{h_i}_1, S^{h_i}_2, \ldots, S^{h_i}_k\}, a_i)$. The mapping function computes the relevance of each sensor to the IoT application or service $a_i$. $F$ relies on both information gain and correlation-based sensor selection for the mapping.
Figure 2: Proposed model for data fusion in smart connected home.
Formally, given $\ell$ applications, information gain (IG) can be obtained from entropy as follows.

$$\mathrm{Entropy}(A) = -\sum_{i=1}^{\ell} p(a_i)\,\log_2\big(p(a_i)\big) \tag{1}$$

where $p(a_i)$ is the probability that a record in $A$ belongs to application $a_i$. The contribution of each sensor is estimated according to Eqn. 2.

$$F_{IG}(A, S^{h_i}_j) = \mathrm{Entropy}(A) - \sum_{v \in S^{h_i}_j} \frac{|S^{h_i}_j|}{|A|} \cdot \mathrm{Entropy}(S^{h_i}_j) \tag{2}$$

where the sum runs over the values $v$ of the sensor and $F_{IG}(A, S^{h_i}_j)$ is a mapping function that computes the usefulness of sensor $S^{h_i}_j$ for the applications $A$ based on information theory. This quantifies the information gained by this sensor. The higher the value of $F_{IG}(A, S^{h_i}_j)$, the more useful the sensor $S^{h_i}_j$ is.
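
The entropy and information gain computations of Eqns. 1 and 2 can be sketched in a few lines. The sketch assumes that sensor readings have already been discretized and follows the standard definition of information gain, in which the entropy inside the sum is evaluated on the records that share a sensor value $v$; the function names and the sample readings are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of application labels (Eqn. 1)."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(sensor_values, labels):
    """Entropy of the labels minus their weighted entropy per sensor value."""
    total = len(labels)
    groups = {}
    for value, label in zip(sensor_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical, already-discretized readings from two sensors in one home.
labels = ["cooking", "cooking", "sleeping", "sleeping"]
motion = ["high", "high", "low", "low"]    # separates the two activities
humidity = ["mid", "low", "mid", "low"]    # carries no information here

gains = {"motion": information_gain(motion, labels),
         "humidity": information_gain(humidity, labels)}
ranking = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
```

Sensors at the top of `ranking` are the candidates an edge node would keep, which is why a filter of this kind suits resource-constrained devices: it needs only counts, not model training.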
Similarly, a merit score $F_M$ can be obtained for the sensors using the correlation-based sensor selection method, as in Eqn. 3.

$$F_M(\{S^{h_i}_1, S^{h_i}_2, \ldots, S^{h_i}_k\}, A) = \frac{d\,\bar{r}_{AS}}{\sqrt{d + d(d-1)\,\bar{r}_{SS}}} \tag{3}$$

where $d$ is the number of sensors in the subset, $\bar{r}_{AS}$ is the mean of the application-to-sensor relevance correlation, and $\bar{r}_{SS}$ is the mean of the sensor-to-sensor relevance correlation. The subset of sensors with the highest merit is selected as the output of the correlation-based sensor selection approach.
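
The merit score of Eqn. 3 can be illustrated as follows. The paper does not state which correlation measure is used, so this sketch assumes the absolute Pearson correlation on numeric data; `pearson`, `merit`, and the sample readings are illustrative names and values only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def merit(sensors, target):
    """Merit of a sensor subset: d * r_AS / sqrt(d + d(d-1) * r_SS)."""
    d = len(sensors)
    r_as = sum(abs(pearson(s, target)) for s in sensors) / d
    pairs = list(combinations(sensors, 2))
    # With a single sensor there are no pairs, so the redundancy term is 0.
    r_ss = sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs) if pairs else 0.0
    return d * r_as / sqrt(d + d * (d - 1) * r_ss)


# Hypothetical readings: the target encodes the application label numerically.
target = [0, 0, 1, 1]
co2 = [1, 2, 3, 4]

single = merit([co2], target)
redundant = merit([co2, co2], target)   # a duplicate sensor adds no merit
```

The denominator penalizes subsets whose sensors are correlated with each other, so adding a fully redundant sensor leaves the merit unchanged instead of raising it.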
We then define $F^{h_i}_{IGM} = \{F_{IG}(\cdot) \cup F_M(\cdot)\}$, with any duplicates removed, which represents the sensors selected by the information gain and correlation-based sensor selection approaches considered in the proposed model. $F^{h_i}_{IGM}$ is then merged with the context data $C^{h_i}$, i.e., $F^{h_i}_S = \{F^{h_i}_{IGM} \cup C^{h_i}\}$, where $F^{h_i}_S$ denotes the data used to develop the local federated learning model on the edge of each home, $h_i \in H$.
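
The two unions above amount to a duplicate-free merge of feature names. A minimal sketch, assuming the selected sensors and the context attributes are identified by string names (the function name and the sample names are hypothetical):

```python
def select_and_merge(ig_selected, merit_selected, context):
    """Union of both sensor selections (duplicates removed), then context."""
    # dict.fromkeys keeps the first occurrence of each name and preserves order.
    fused = list(dict.fromkeys(list(ig_selected) + list(merit_selected)))
    return fused + [c for c in context if c not in fused]


features = select_and_merge(
    ["co2", "motion"],        # sensors chosen by information gain
    ["motion", "sound"],      # sensors chosen by the merit score
    ["fan_status"],           # context data of the home
)
```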
5.7 Federated Learning
The proposed model employs federated learning to enhance the privacy of home occupants. In this setting, the raw data of home users $U$ are not transmitted directly to the cloud; instead, only the model trained locally in each home $h_i \in H$ that has subscribed to participate in the ecosystem is transmitted. This local model is encrypted using homomorphic encryption, which allows computation on encrypted data in the cloud. In our proposed context, federated learning allows multiple homes to collaboratively train a machine learning model without sharing their data with each other. For the machine learning model, we propose gradient boosting methods for the following reasons: they are accurate, train faster, especially on large datasets, and are capable of handling noisy data (Mwiti, 2023). In practice, a federated XGBoost algorithm can serve this purpose (IBM, 2023).
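
To make "computation on encrypted data" concrete, the sketch below uses a toy Paillier cryptosystem, which is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The paper does not name a specific scheme, so Paillier is our assumption for illustration. The primes are far too small for real use, values are assumed to be integer-encoded, and the code requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy key generation with tiny primes (illustration only, not secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    # mu is the inverse of L(g^lam mod n^2) modulo n, with L(x) = (x - 1) // n.
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    n, g = pub
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def add_encrypted(pub, c1, c2):
    """Homomorphic addition: the product of ciphertexts encrypts the sum."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


pub, priv = keygen()
# Two homes encrypt integer-encoded update values; the cloud adds them blindly.
total = add_encrypted(pub, encrypt(pub, 120), encrypt(pub, 305))
recovered = decrypt(pub, priv, total)
```

The cloud node only ever handles ciphertexts, while the holder of the private key recovers the aggregate. How this maps onto the statistics exchanged by a federated gradient boosting implementation is left open here.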
Formally, let $L_m$ denote the local model trained at the edge node and $G_m$ represent the global model updated at the cloud node. The learning objective is given in Eqn. 4.

$$G_m = \arg\min_{w} \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{(F^{h_i}_{s_j},\, a_j) \sim L^{h_i}_m} \left[ \mathcal{L}(w, F^{h_i}_{s_j}, a_j) \right] \tag{4}$$

where $w$ denotes the model parameters, $n$ is the number of homes, $L^{h_i}_m$ is the probability distribution of data at home $h_i$, $\mathcal{L}(\cdot)$ is the loss function, and $\mathbb{E}(\cdot)$ is the expected value. The goal of the model is to obtain the global parameter $G_m$ that minimizes the loss function.
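
The objective in Eqn. 4 averages the expected loss over the $n$ homes with equal weight. The sketch below illustrates that idea with plain federated averaging on a one-parameter linear model trained by gradient descent. This is a deliberate simplification: it is not the federated gradient boosting proposed in the paper, and all function names, data, and hyperparameters are assumptions chosen for a short runnable example.

```python
def local_update(w, data, lr=0.05, epochs=20):
    """Train y = w * x on one home's data, starting from the global w."""
    for _ in range(epochs):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_round(w_global, homes):
    """Each home trains locally; the cloud averages the returned parameters."""
    local_models = [local_update(w_global, data) for data in homes]
    return sum(local_models) / len(local_models)


def train(homes, rounds=10, w0=0.0):
    w = w0
    for _ in range(rounds):
        w = federated_round(w, homes)
    return w


# Hypothetical per-home datasets of (feature, target) pairs; raw data never
# leave local_update, only the trained parameter is returned.
homes = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, 2.0), (3.0, 6.0)],
]
w_global = train(homes)
```

Both homes follow the same underlying relation in this toy data, so the averaged parameter converges to it even though neither home sees the other's records.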
5.8 Policy (P)
Policy regulates the different security and privacy aspects of the model. Policies are the rules that can be implemented to ensure security and privacy. Policy is denoted as $P = \{p_1, p_2, \ldots, p_r\}$, where $p_i \in P$ may include a fusion policy, data transfer policy, data retention policy, encryption policy, key management policy, storage policy, and model update policy. For instance, the model update policy may include a rule that specifies when a local model should be updated. For this policy, a timestamp attribute can be used to automate the update process. The fusion policy will include a rule that ensures that only authenticated sensor nodes are allowed to connect. Other policies, $p_i \in P$, will also have specific rules that guide the model in a collaborative manner.
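
Two of the rules mentioned above, the timestamp-driven model update and the authenticated-node check of the fusion policy, can be sketched as small policy objects. The class names, fields, and the 24-hour threshold are hypothetical; the paper does not prescribe a policy format.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ModelUpdatePolicy:
    """Retrain the local model once it is older than max_age."""
    max_age: timedelta

    def needs_update(self, last_trained: datetime, now: datetime) -> bool:
        return now - last_trained >= self.max_age


@dataclass(frozen=True)
class FusionPolicy:
    """Only authenticated sensor nodes may take part in the fusion."""
    authenticated_nodes: frozenset

    def may_connect(self, node_id: str) -> bool:
        return node_id in self.authenticated_nodes


update_policy = ModelUpdatePolicy(max_age=timedelta(hours=24))
fusion_policy = FusionPolicy(authenticated_nodes=frozenset({"co2-1", "motion-1"}))

stale = update_policy.needs_update(datetime(2024, 1, 1), datetime(2024, 1, 2, 1))
allowed = fusion_policy.may_connect("co2-1")
```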
6 EVALUATION AND RISK
ASSESSMENT
The proposed model is formulated taking into con-
sideration the requirements for data fusion. The fol-
lowing privacy and security threats are identified. We
briefly highlight how they are addressed in the pro-
posed model.
6.1 Privacy Threats
Linkage. An attacker wants to reveal information
that is not disclosed by home occupants by linking
data from different sources. Since raw data of the
users are not transmitted, this threat is minimized.
Localization and Tracking. An attacker wants to record the location of home occupants and track their movements. The proposed model preserves users’ privacy because the IoT devices are configured to transmit data to the edge nodes, which prevents this type of attack.
Model Inversion. An attacker wants to determine
if a particular home or entity has been used to de-
velop the local or global model. The countermea-
sure here is that the communication between the
edge and the cloud is encrypted, making it diffi-
cult to compromise the model.
Profiling. A hacker wants to collect and correlate data to generate new data about home occupants. The main countermeasure is that raw data transmission is not permitted by the data transfer policy.
Inventory Attack. An attacker wants to gain unau-
thorized access to the home network and gather
occupant data from an edge node. Since the edge
node is equipped with IDPS, this will prevent
unauthorized data collection.
6.2 Security Threats
Confidentiality. A hacker wants to gain knowledge of the data in transit and the data in storage to compromise data confidentiality. The countermeasure here is that the policy governing the model does not permit raw data transfer between the gateway and the cloud. The cloud provides secured storage and computation for the model results, making it difficult for an attacker to succeed in this respect. Authentication and access control mechanisms are also provided in the cloud and the edge nodes, which further strengthens confidentiality.
Data Integrity and Data Poisoning. An attacker
wishes to modify the data in transit and to inject
fake data to compromise data integrity. The possi-
bility of this attack is limited since authentication
and access control mechanisms are provided. In
addition, only the model trained is transmitted in
encrypted form, which also reinforces the means
to ensure data integrity.
Eavesdropping. This is also known as sniffing or
snooping. In this type of attack, the hacker re-
lies on unsecured network communications to ac-
cess data in transit between nodes or devices. The
possibility of this attack is limited since there is a
secured connection between the gateway and the
cloud for the model transfer.
Denial of Service. A hacker wants to compromise
availability of the data fusion pipeline by flood-
ing the network with superfluous requests. This
attack is prevented by the IDPS that is running on
the edge and the cloud. In addition, the middle-
ware also ensures that only the secured connec-
tions and sessions are routed, and that requests are
time-bound.
7 CONCLUSION
In this paper, a privacy and security-aware model for smart connected home applications is proposed. It advocates privacy and security for IoT data fusion in smart pervasive living spaces where a lot of personal data is generated, stored, and distributed. The model considers the requirements for an efficient data fusion pipeline and adopts federated learning to protect home occupants’ data and improve predictive analysis. Edge nodes are considered for local model training and deployment, and a secure connection is established between the edge and the cloud. We show that the proposed model meets the requirements for efficient data fusion and that it can be applied to a variety of smart connected home applications and services. Future work will consider an empirical analysis of the performance of the proposed model and of its different components.
ACKNOWLEDGMENT
This work was partially funded by the Knowledge Foundation (Stiftelsen för kunskaps- och kompetensutveckling, KK-stiftelsen) via the Synergy project Intelligent and Trustworthy IoT Systems (Grant number 20220087).
REFERENCES
Adewole, K. S. and Torra, V. (2022a). Dftmicroagg: a dual-
level anonymization algorithm for smart grid data. In-
ternational Journal of Information Security, pages 1–
23.
Adewole, K. S. and Torra, V. (2022b). Privacy is-
sues in smart grid data: From energy disaggregation
to disclosure risk. In International Conference on
Database and Expert Systems Applications, pages 71–
84. Springer.
Aïvodji, U. M., Gambs, S., and Martin, A. (2019). Iot-fla: A secured and privacy-preserving smart home architecture implementing federated learning. In 2019 IEEE security and privacy workshops (SPW), pages 175–180. IEEE.
Bugeja, J., Vogel, B., Jacobsson, A., and Varshney, R.
(2019). Iotsm: an end-to-end security model for iot
ecosystems. In 2019 IEEE International Conference
on Pervasive Computing and Communications Work-
shops (PerCom Workshops), pages 267–272. IEEE.
Chaaraoui, A. A., Padilla-López, J. R., Ferrández-Pastor, F. J., Nieto-Hidalgo, M., and Flórez-Revuelta, F. (2014). A vision-based system for intelligent monitoring: human behaviour analysis and privacy by context. Sensors, 14(5):8895–8925.
Chahuara, P., Portet, F., and Vacher, M. (2013). Making
context aware decision from uncertain information in
a smart home: A markov logic network approach.
In Ambient Intelligence: 4th International Joint Con-
ference, AmI 2013, Dublin, Ireland, December 3-5,
2013. Proceedings 4, pages 78–93. Springer.
Chimamiwa, G., Alirezaie, M., Pecora, F., and Loutfi, A.
(2021). Multi-sensor dataset of human activities in a
smart home environment. Data in Brief, 34:106632.
Ding, W., Jing, X., Yan, Z., and Yang, L. T. (2019). A sur-
vey on data fusion in internet of things: Towards se-
cure and privacy-preserving fusion. Information Fu-
sion, 51:129–144.
Dutta, J. and Roy, S. (2022). Occupancysense: Context-
based indoor occupancy detection & prediction using
catboost model. Applied Soft Computing, 119:108536.
Gartner Inc. (2017). Gartner Says 8.4 Billion Connected "Things" Will Be in Use in 2017, Up 31 Percent From 2016. https://www.gartner.com/en/newsroom/press-releases/2017-02-07-gartner-says-8-billion-connected-things-will-be-in-use-in-2017-up-31-percent-from-2016. Accessed: 5th June, 2023.
IBM (2023). Federated learning xgboost tutorial for ui. https://www.ibm.com/docs/en/cloud-paks/cp-data/4.6.x?topic=samples-xgboost-tutorial. Accessed: 5th June, 2023.
Kommey, B. (2022). Automatic ceiling fan control using
temperature and room occupancy. JITCE (Journal of
Information Technology and Computer Engineering),
6(01):1–7.
Mena, A. R., Ceballos, H. G., and Alvarado-Uribe, J.
(2022). Measuring indoor occupancy through envi-
ronmental sensors: A systematic review on sensor de-
ployment. Sensors, 22(10):3770.
Monti, L., Tse, R., Tang, S.-K., Mirri, S., Delnevo, G.,
Maniezzo, V., and Salomoni, P. (2022). Edge-based
transfer learning for classroom occupancy detection in
a smart campus context. Sensors, 22(10):3692.
Mwiti, D. (2023). Gradient boosted decision trees [guide]: a conceptual explanation. https://neptune.ai/blog/gradient-boosted-decision-trees-guide. Accessed: 5th June, 2023.
Pathmabandu, C., Grundy, J., Chhetri, M. B., and Baig, Z.
(2023). Privacy for iot: Informed consent manage-
ment in smart buildings. Future Generation Computer
Systems, 145:367–383.
Rahman, A. F. A., Daud, M., and Mohamad, M. Z. (2016).
Securing sensor to cloud ecosystem using internet of
things (iot) security framework. In Proceedings of
the International Conference on Internet of things and
Cloud Computing, pages 1–5.
Sayed, A. N., Bensaali, F., Himeur, Y., and Houchati, M.
(2023). Edge-based real-time occupancy detection
system through a non-intrusive sensing system. En-
ergies, 16(5):2388.