
These technologies enhance automation, efficiency,
and innovation within Industry 4.0. However, the volume
and diversity of data generated by this environment
present significant challenges, such as transmission
noise, device malfunctions, and instability.
To address these challenges, we propose a data quality
monitoring pipeline that integrates seamlessly into the core
process, ensuring continuous management of data
quality as part of the operational workflow, thus im-
proving data reliability and process efficiency. Met-
rics specifically tailored for IoT scenarios are used to
monitor data quality, allowing real-time assessment
with minimal configuration and eliminating the need
for complex, custom solutions.
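As an illustration of how such metrics can be evaluated over a window of readings, the following minimal sketch computes a completeness and a timeliness score. The `Reading` structure, the function names, and the parameters `expected_count` and `max_delay_s` are assumptions made for this example; they are not the metric definitions used in the pipeline.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Reading:
    """Hypothetical sensor reading used only for this illustration."""
    timestamp: float        # seconds since epoch when the value was produced
    received_at: float      # seconds since epoch when the pipeline received it
    value: Optional[float]  # None models a missing measurement


def completeness(readings: Sequence[Reading], expected_count: int) -> float:
    """Share of the expected readings that arrived with a non-null value."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    present = sum(1 for r in readings if r.value is not None)
    return min(present / expected_count, 1.0)


def timeliness(readings: Sequence[Reading], max_delay_s: float) -> float:
    """Share of received readings delivered within max_delay_s seconds."""
    if not readings:
        return 0.0
    on_time = sum(
        1 for r in readings if r.received_at - r.timestamp <= max_delay_s
    )
    return on_time / len(readings)


# One window in which five readings were expected but only four arrived.
window = [
    Reading(0.0, 0.2, 21.5),
    Reading(1.0, 1.1, None),   # arrived, but without a value
    Reading(2.0, 4.5, 21.7),   # arrived 2.5 s late
    Reading(3.0, 3.3, 21.6),
]
window_completeness = completeness(window, expected_count=5)
window_timeliness = timeliness(window, max_delay_s=1.0)
```

Both scores lie in [0, 1], so a single threshold per metric is enough to raise an alert, which keeps the configuration small.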
Data profiling is a fundamental component of this
pipeline, providing insights into the structure, dis-
tribution, and relationships within datasets. Profil-
ing tasks, such as detecting null values, extreme val-
ues, data types, and dependencies, generate metadata
crucial for assessing data quality dimensions such as
Accuracy, Completeness, Consistency, and Timeliness.
By taking a proactive profiling approach, we enable
rapid responses to quality issues, ensuring high
data quality over time. Moreover, integrating data
profiling into the monitoring pipeline helps address
common IoT challenges, such as sensor malfunctions
and data gaps, which could otherwise affect opera-
tional performance and product quality. The profiling
outputs allow for automated checks, reducing human
intervention and enabling timely adjustments to main-
tain process stability.
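The profiling tasks and automated checks described above can be sketched as follows. This is a minimal example under stated assumptions: the function names `profile_column` and `check_profile`, the interquartile-range rule for extreme values, and the 10% null threshold are illustrative choices, not the pipeline's actual implementation.

```python
import statistics
from typing import Any, Dict, List, Optional


def profile_column(values: List[Optional[Any]]) -> Dict[str, Any]:
    """Collect basic profiling metadata for one column of readings."""
    non_null = [v for v in values if v is not None]
    # bool is a subclass of int, so it is excluded explicitly.
    numeric = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    profile: Dict[str, Any] = {
        "count": len(values),
        "null_count": len(values) - len(non_null),
        "types": sorted({type(v).__name__ for v in non_null}),
        "min": min(numeric) if numeric else None,
        "max": max(numeric) if numeric else None,
        "extreme_values": [],
    }
    if len(numeric) >= 4:
        # Assumed rule: flag values beyond 1.5 * IQR from the quartiles.
        q1, _, q3 = statistics.quantiles(numeric, n=4)
        iqr = q3 - q1
        low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
        profile["extreme_values"] = [
            v for v in numeric if v < low or v > high
        ]
    return profile


def check_profile(profile: Dict[str, Any],
                  max_null_ratio: float = 0.1) -> List[str]:
    """Turn profiling metadata into a list of automated findings."""
    issues = []
    if profile["count"] and (
        profile["null_count"] / profile["count"] > max_null_ratio
    ):
        issues.append("too many nulls")
    if profile["extreme_values"]:
        issues.append("extreme values detected")
    if len(profile["types"]) > 1:
        issues.append("mixed data types")
    return issues


# Hypothetical temperature readings with two gaps and one faulty spike.
temperatures = [21.4, 21.6, None, 21.5, 21.7, 250.0, 21.5, None, 21.6]
temperature_profile = profile_column(temperatures)
temperature_issues = check_profile(temperature_profile)
```

Separating profiling from checking means the same metadata can feed several quality dimensions (for example, null counts for Completeness and extreme values for Accuracy) without rescanning the data.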
Future work will focus on improving both perfor-
mance and outcomes by incorporating advanced tech-
niques such as sketching methods (e.g., t-digest (Dun-
ning, 2021)).
ACKNOWLEDGEMENTS
This work has been supported by the European Union
under the Next Generation EU, through a grant of
the Portuguese Republic’s Recovery and Resilience
Plan (PRR) Partnership Agreement, within the scope
of the project PRODUTECH R3 – “Agenda Mobilizadora
da Fileira das Tecnologias de Produção para a
Reindustrialização”, Total project investment:
166.988.013,71 Euros; Total Grant: 97.111.730,27
Euros.
REFERENCES
Abedjan, Z., Golab, L., Naumann, F., and Papenbrock, T.
(2018). Data profiling. Synthesis Lectures on Data
Management, 10:1–154.
Agolla, J. E. (2021). Smart Manufacturing: Quality Control
Perspectives. IntechOpen.
Azeroual, O., Saake, G., and Schallehn, E. (2018). Analyz-
ing data quality issues in research information systems
via data profiling. International Journal of Informa-
tion Management, 41:50–56.
Bandara, K., Bergmeir, C., and Smyl, S. (2020). Fore-
casting across time series databases using recurrent
neural networks on groups of similar series: A clus-
tering approach. Expert Systems with Applications,
140:112896.
Batini, C. and Scannapieco, M. (2016). Data and Informa-
tion Quality. Springer International Publishing.
Byabazaire, J., O’Hare, G., and Delaney, D. (2020). Us-
ing trust as a measure to derive data quality in data
shared IoT deployments. In 2020 29th International
Conference on Computer Communications and Net-
works (ICCCN), pages 1–9. IEEE.
Cichy, C. and Rass, S. (2019). An overview of data quality
frameworks. IEEE Access, 7:24634–24648.
Dunning, T. (2021). The t-digest: Efficient estimates of
distributions. Software Impacts, 7:100049.
Goknil, A., Nguyen, P., Sen, S., Politaki, D., Niavis, H.,
Pedersen, K. J., Suyuthi, A., Anand, A., and Ziegen-
bein, A. (2023). A systematic review of data quality in
CPS and IoT for Industry 4.0. ACM Computing Surveys,
55(14s):1–38.
Groover, M. P. (2010). Fundamentals of modern manufac-
turing: materials, processes, and systems. John Wiley
& Sons.
Heine, F., Kleiner, C., and Oelsner, T. (2019). Automated
Detection and Monitoring of Advanced Data Quality
Rules, pages 238–247. Springer, Cham.
Hu, C., Sun, Z., Li, C., Zhang, Y., and Xing, C. (2023).
Survey of time series data generation in IoT. Sensors,
23.
ISO/IEC 25012:2008 (2008). Software engineering - Software
product Quality Requirements and Evaluation
(SQuaRE) - Data quality model. Standard, International
Organization for Standardization, Geneva, CH.
Khan, J., Dalu, R., and Gadekar, S. (2014). Defects in ex-
trusion process and their impact on product quality.
International journal of mechanical engineering and
robotics research, 3(3):187.
Kusumasari, T. F. and Fitria (2016). Data profiling for data
quality improvement with OpenRefine. In 2016 International
Conference on Information Technology Systems
and Innovation (ICITSI), pages 1–6. IEEE.
Liu, C., Peng, G., Kong, Y., Li, S., and Chen, S. (2021).
Data quality affecting big data analytics in smart fac-
tories: research themes, issues and methods. Symme-
try, 13(8):1440.
Loshin, D. (2011). The Practitioner’s Guide to Data Qual-
ity Improvement. Elsevier.
Real-Time Manufacturing Data Quality: Leveraging Data Profiling and Quality Metrics