
ACKNOWLEDGEMENTS
This work has been supported by the French government under the "France 2030" program, as part of the SystemX Technological Research Institute within the Confiance.ai Program (www.confiance.ai).
MBSE-AI Integration 2024 - Workshop on Model-based System Engineering and Artificial Intelligence