of state-of-the-art pre-trained natural language processing
embedding models based on BERT (Huang et al.,
2019; Alsentzer et al., 2019) for downstream tasks,
which can be used in this context. The explainability
of the model is also a crucial issue here, as explanations
can be utilized by different stakeholders (Lipton, 2017).
In future work, we will also explore this explainability issue.

We investigated missingness and different time window
sizes in extremely sparse EHR data obtained
from a Swedish university hospital for the task of
early prediction of sepsis using a deep learning-based
LSTM model. We showed that the window size
has a significant impact on the predictive performance
of the models. We also observed that treating
missing data as missing not at random can in some
cases lead to better predictive performance than
assuming that it is missing at random.
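
To make the two factors above concrete, the following minimal sketch shows one way irregular EHR events could be aggregated into fixed time windows, together with a binary mask recording which features were observed in each window. It is an illustration under stated assumptions, not the pipeline used in this study: the function names (`bin_events`, `with_indicators`), the feature names, the per-window mean aggregation, and the zero placeholder for unobserved values are all hypothetical choices. Feeding the mask to the model alongside the values is one common way to let it treat missingness as informative (missing not at random), whereas using the imputed values alone corresponds more closely to a missing-at-random assumption.

```python
from collections import defaultdict


def bin_events(events, features, window_hours, horizon_hours):
    """Aggregate irregular (time, feature, value) events into fixed windows.

    Returns (values, mask), each a list with one row per window and one
    column per feature. mask[w][j] is 1 if feature j was observed in
    window w, else 0. Unobserved cells hold 0.0 as a placeholder
    (an assumption; any imputation scheme could be substituted here).
    """
    n_windows = int(horizon_hours // window_hours)
    col = {name: j for j, name in enumerate(features)}

    # Collect all observations that fall into each (window, feature) cell.
    buckets = defaultdict(list)
    for t, name, value in events:
        w = int(t // window_hours)
        if 0 <= w < n_windows and name in col:
            buckets[(w, col[name])].append(value)

    values = [[0.0] * len(features) for _ in range(n_windows)]
    mask = [[0] * len(features) for _ in range(n_windows)]
    for (w, j), obs in buckets.items():
        values[w][j] = sum(obs) / len(obs)  # mean of observations in the window
        mask[w][j] = 1
    return values, mask


def with_indicators(values, mask):
    """Append the missingness indicators to the values of each window."""
    return [v + [float(m) for m in row] for v, row in zip(values, mask)]


# Hypothetical events: (hours since admission, feature name, measured value).
events = [(0.5, "hr", 90.0), (1.5, "hr", 110.0), (5.0, "lactate", 2.0)]

# The same events binned with two different window sizes.
values_2h, mask_2h = bin_events(events, ["hr", "lactate"], 2, 6)
values_3h, mask_3h = bin_events(events, ["hr", "lactate"], 3, 6)

# Input rows for a sequence model that also sees the missingness pattern.
rows_2h = with_indicators(values_2h, mask_2h)
```

With 2-hour windows the middle window contains no observations at all, while with 3-hour windows every window has at least one observed feature. This illustrates how the window size changes both the sequence length and the degree of sparsity that the model sees.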
