themselves remains opaque. The “black box” nature
of DL models used for predicting these events limits
our ability to fully understand and interpret the occur-
rence of the BEs.
In future work, we will explore the as-yet unvali-
dated approaches of using ML to select FTs, to gen-
erate FTs, or to do both. We
will investigate methodologies for employing ML al-
gorithms to automate the selection of appropriate
FTs based on observed symptoms or failure modes.
This will involve developing algorithms that navi-
gate through multiple failure scenarios to identify the
most suitable FTs for RCA. Furthermore, we will in-
vestigate how ML can be utilized to automate the
generation of FTs based on observational or histor-
ical data. This will involve developing algorithms that
construct FTs that accurately represent the complex
failure mechanisms within cloud computing systems,
while also ensuring interpretability and relevance for
effective fault diagnosis. By pursuing these paths, we
aim to enhance fault diagnosis by fully leveraging the
integration of ML with FTs. Additionally, we will
explore the implementation of our approach in real-
world settings to evaluate its applicability and robust-
ness across various cloud computing environments.
Through these efforts, we seek to unlock advanced ca-
pabilities for more precise analysis and understanding
of system failures.
Our investigation into integrating ML with FTA
presents a significant advancement in fault detection
methodologies for cloud computing systems. By con-
centrating on the prediction of BEs and the subse-
quent calculation of TE probability, we not only en-
hance the precision of fault diagnosis but also in-
crease the system’s interpretability and transparency.
Although our experimental validation focused on
this particular approach, we also discussed the theoretical
framework and potential benefits of using ML for se-
lecting and generating FTs. Future work will explore
these unvalidated approaches to further refine and ex-
pand our understanding of integrating ML with FTA,
aiming to develop more robust and intuitive fault di-
agnosis tools for complex computing environments.
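
To make the step from predicted BEs to the TE probability concrete, the following is a minimal sketch, not the implementation used in our experiments. It assumes that the BEs are statistically independent, that no BE is repeated across branches of the FT, and that only AND and OR gates occur. The class names, the function name `top_event_probability`, the event names, and the probability values are hypothetical and chosen purely for illustration; in practice the BE probabilities would come from the ML predictor.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class BasicEvent:
    """Leaf of the fault tree; its probability is supplied externally."""
    name: str


@dataclass
class Gate:
    """Inner node of the fault tree: an AND or OR gate over child nodes."""
    kind: str        # "AND" or "OR"
    children: list   # BasicEvent or Gate instances


def top_event_probability(node, be_probs):
    """Propagate BE probabilities bottom-up to the given node.

    Assumption: all BEs are independent and each BE appears only once
    in the tree; otherwise these gate formulas are not exact.
    """
    if isinstance(node, BasicEvent):
        return be_probs[node.name]
    child_probs = [top_event_probability(c, be_probs) for c in node.children]
    if node.kind == "AND":
        # All children must occur.
        return prod(child_probs)
    if node.kind == "OR":
        # At least one child occurs: complement of "none occurs".
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unsupported gate kind: {node.kind}")


# Hypothetical FT: TE = (cpu_overload AND memory_leak) OR disk_failure
tree = Gate("OR", [
    Gate("AND", [BasicEvent("cpu_overload"), BasicEvent("memory_leak")]),
    BasicEvent("disk_failure"),
])

# Hypothetical BE probabilities, standing in for ML predictions.
predicted = {"cpu_overload": 0.5, "memory_leak": 0.4, "disk_failure": 0.1}

te_probability = top_event_probability(tree, predicted)
print(f"TE probability: {te_probability:.2f}")
```

With these illustrative values, the AND gate yields 0.5 × 0.4 = 0.2 and the OR gate yields 1 − (1 − 0.2)(1 − 0.1) = 0.28. Because every intermediate gate value remains visible, the path from the individual BE predictions to the TE stays traceable, which is the interpretability benefit noted above.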
This research was funded by the Deutsche
Forschungsgemeinschaft (DFG, German Research
Foundation), under grant DFG-GZ: RE 2881/6-1
and the French Agence Nationale de la Recherche
(ANR), under grant ANR-22-CE92-0007.
