
Table 5: Sample set of counterfactuals generated using iXGB from the Flight Delay dataset.
                         Change in Feature Values                          Change in Target
ts leg to   ts flight duration leg   ts to ta leg   ta leg   ts ifp to ts
   -118              -56                  +9           +4         +3             -10%
    +2               +2                   +2           +2         +2              -5%
   -14               +7                  +11          -36        -89              -5%
   -21               +35                 +25           -5        -34              +5%
   +29               +46                 -25           -8        -49              +5%
maintaining operational relevance.
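A counterfactual row of Table 5 can be read as a set of feature deltas applied to a flight instance. The sketch below illustrates this reading only; the feature names echo the table's column labels, but the baseline values of the instance are invented for illustration and do not come from the Flight Delay dataset.

```python
# Illustrative sketch: applying one counterfactual's feature deltas
# (first row of Table 5) to a flight instance. Baseline values are
# invented; only the deltas come from the table.

def apply_counterfactual(instance, deltas):
    """Return a new instance with each feature shifted by its delta."""
    return {f: instance[f] + deltas.get(f, 0) for f in instance}

flight = {"ts_leg_to": 300, "ts_flight_duration_leg": 120,
          "ts_to_ta_leg": 45, "ta_leg": 60, "ts_ifp_to_ts": 30}

# First counterfactual row of Table 5: -118, -56, +9, +4, +3
deltas = {"ts_leg_to": -118, "ts_flight_duration_leg": -56,
          "ts_to_ta_leg": 9, "ta_leg": 4, "ts_ifp_to_ts": 3}

cf = apply_counterfactual(flight, deltas)
print(cf)
```

Under this reading, the modified instance is the input that the model would map to the changed target (here, a 10% reduction in predicted delay).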
5 CONCLUSION AND FUTURE WORK
XGBoost is widely adopted for regression tasks
because it achieves higher accuracy than other
tree-based ML models, at the cost of interpretability.
Interpretability is typically added to XGBoost
through various XAI methods, many of which
(e.g., LIME) rely on perturbed samples to explain
XGBoost predictions. In this paper, iXGB is
proposed, which utilises the internal structure of
XGBoost to generate rule-based explanations and
counterfactuals from the same data on which the
model is trained. The proposed approach is
functionally evaluated on three different datasets in
terms of local accuracy and quality of the rules,
demonstrating that iXGB can reasonably improve
the interpretability of XGBoost. Future research
directions include a theoretically grounded
evaluation of the proposed approach on more diverse
datasets and different real-world problems. Further
investigation is also required to adapt iXGB to
binary and multi-class classification tasks.
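The rule-based explanations described above are derived from the internal tree structure of the trained model. As a minimal sketch of the general idea, the snippet below walks a single regression tree from root to leaves, collecting the split conditions along each path into a human-readable rule. The tree is a hand-written toy with illustrative feature names, not an actual XGBoost model, and this is not the iXGB algorithm itself, only the kind of structural traversal it builds on.

```python
# Sketch of reading decision rules off a regression tree's internal
# structure. The tree below is a hand-written toy; feature names and
# leaf values are illustrative.

def extract_rules(node, conditions=()):
    """Yield (rule, leaf_value) for every root-to-leaf path."""
    if "leaf" in node:
        yield " AND ".join(conditions) or "TRUE", node["leaf"]
        return
    f, t = node["feature"], node["threshold"]
    yield from extract_rules(node["left"], conditions + (f"{f} < {t}",))
    yield from extract_rules(node["right"], conditions + (f"{f} >= {t}",))

tree = {"feature": "flight_duration", "threshold": 90,
        "left": {"leaf": -0.05},
        "right": {"feature": "ta_leg", "threshold": 60,
                  "left": {"leaf": 0.02},
                  "right": {"leaf": 0.08}}}

rules = list(extract_rules(tree))
for rule, value in rules:
    print(f"IF {rule} THEN delta = {value}")
```

For a real XGBoost model, the analogous node information (split feature, threshold, children, leaf value) is exposed by the public API, e.g. `Booster.trees_to_dataframe()` or `Booster.get_dump()`.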
ACKNOWLEDGEMENTS
This study was supported by the following projects:
i) ARTIMATION (Transparent Artificial Intelligence
and Automation to Air Traffic Management Systems),
funded by the SESAR JU under the European Union's
Horizon 2020 Research and Innovation programme
(Grant Agreement No. 894238), and ii) xApp
(Explainable AI for Industrial Applications), funded
by VINNOVA (Sweden's Innovation Agency)
(Diary No. 2021-03971).
ICAART 2024 - 16th International Conference on Agents and Artificial Intelligence