to a digital replica of physical assets such as processes,
locations, systems and devices, in which ML-generated
models based on measured and monitored data from the
real world form the basis for further analysis. These are
often based on IoT-generated data, with enhanced models
and functions provided through machine learning. We plan
to investigate in more depth the complexity of these digital
twins and the respective quality concerns that would apply.
As further future work, our ultimate goal is to close
the loop by mapping functional problems back to their
origins: identifying the symptoms of low quality
precisely and mapping these to the root causes of the
deficiencies.
A Layered Quality Framework for Machine Learning-driven Data and Information Models