Authors:
Kerstin Wagner 1; Henrik Volkening 2; Sunay Basyigit 2; Agathe Merceron 1; Petra Sauer 1 and Niels Pinkwart 3
Affiliations:
1 Berliner Hochschule für Technik, Berlin, Germany; 2 Deutsches Zentrum für Luft- und Raumfahrt, Berlin, Germany; 3 Deutsches Forschungszentrum für Künstliche Intelligenz, Berlin, Germany
Keyword(s):
Predicting Dropouts, Global / Local Feature Set, Evaluation, Balanced Accuracy, Explainability, Fairness.
Abstract:
To predict whether students will drop out of their degree program at a mid-sized German university, we investigate five algorithms, three explainable and two not, along with two different feature sets. It turns out that the models obtained with Logistic Regression (LR), an explainable algorithm, perform best. This is an important finding, as it will allow explanations to be generated for stakeholders in future work. The models trained with a local feature set and those trained with a global feature set show similar performance. Further, we study whether the models built with LR are fair with respect to male and female students as well as to the study programs considered in this study. Unfortunately, this is not always the case, which might be due to differences in the dropout rates between subpopulations. This limitation should be taken into account in practice.
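The sketch below illustrates the kind of evaluation the abstract describes: training a Logistic Regression dropout model and comparing balanced accuracy overall and per subgroup (e.g., by gender) to probe fairness. It is not the authors' code; the synthetic data, feature matrix, and the gender column are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's pipeline): Logistic Regression
# dropout prediction with overall and per-group balanced accuracy.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 5))            # stand-in for student features (hypothetical)
gender = rng.integers(0, 2, size=n)    # 0 = male, 1 = female (illustrative only)
y = (X[:, 0] + 0.5 * gender + rng.normal(size=n) > 0.8).astype(int)  # 1 = dropout

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, gender, test_size=0.3, random_state=0, stratify=y
)

model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_pred = model.predict(X_te)

# Overall performance, then the same metric restricted to each subgroup;
# a large gap between groups would indicate a fairness concern.
print("overall balanced accuracy:", balanced_accuracy_score(y_te, y_pred))
for g in (0, 1):
    mask = g_te == g
    print(f"group {g} balanced accuracy:",
          balanced_accuracy_score(y_te[mask], y_pred[mask]))
```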