to September 2022. However, the study focused only
on data from April 2020 to July 2020, since this was
the period when the identified version of the FASSSTER
model was heavily adopted, and thus the model parameter
values are more appropriate. The individual case data
were aggregated to get the daily number of new cases
in the Philippines. Although the FASSSTER model
was also used for modeling COVID-19 in smaller
administrative areas, this study focused only on the
national level. The national daily case data were used
for model fitting/training and evaluation.
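The aggregation step can be illustrated with a minimal sketch. The study itself used R; the Python below is only an illustration, and the function name, the toy line list, and the choice to fill days without cases with zero are assumptions rather than details taken from the study.

```python
from collections import Counter
from datetime import date, timedelta


def daily_new_cases(report_dates, start, end):
    """Count cases per calendar day from start to end (inclusive).

    Days with no recorded case get a count of zero (an assumption).
    """
    counts = Counter(report_dates)
    n_days = (end - start).days + 1
    days = [start + timedelta(days=i) for i in range(n_days)]
    return [(day, counts.get(day, 0)) for day in days]


# Hypothetical line list: one date per individual case record.
line_list = [date(2020, 4, 1), date(2020, 4, 1), date(2020, 4, 3)]
series = daily_new_cases(line_list, date(2020, 4, 1), date(2020, 4, 3))
for day, n_cases in series:
    print(day.isoformat(), n_cases)
```

Filling missing days with zero keeps the series evenly spaced, which both a compartmental model fit and a time-series model generally expect.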
3.2 Model Development and Simulation
The objective of the study is to evaluate various pa-
rameter estimation algorithms in a compartmental
disease model, and the ARIMA time-series model.
Both the compartmental and ARIMA time-series
models were developed using the R programming lan-
guage.
The FASSSTER model was adopted for the
COVID-19 compartmental model. The initial conditions,
known parameters, and parameters to be estimated
were also based on the FASSSTER study
(de Lara-Tuprio et al., 2022). Three parameter
estimation algorithms, namely Nelder-Mead, Simulated
Annealing, and L-BFGS-B, were evaluated for the
compartmental model. The models were fitted to the
actual case data by minimizing the negative log
likelihood (NLL) under a Poisson distribution.
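As a sketch of the fitting objective, the Poisson NLL of observed daily counts y_t given model-predicted counts mu_t is the sum over t of mu_t - y_t log(mu_t) + log(y_t!). The Python function below is an illustration under that standard definition; it is not the study's R code, and the function name and toy numbers are assumptions.

```python
import math


def poisson_nll(observed, expected):
    """Negative log likelihood of observed counts under Poisson(expected)."""
    nll = 0.0
    for y, mu in zip(observed, expected):
        if mu <= 0:
            raise ValueError("expected counts must be positive")
        # -log P(y | mu) = mu - y*log(mu) + log(y!), with log(y!) = lgamma(y + 1)
        nll += mu - y * math.log(mu) + math.lgamma(y + 1)
    return nll


# Toy data (hypothetical): a closer prediction yields a lower NLL.
observed = [10, 12, 15]
print(poisson_nll(observed, [10.0, 12.0, 15.0]))
print(poisson_nll(observed, [5.0, 20.0, 30.0]))
```

A parameter estimation algorithm would repeatedly simulate the compartmental model for candidate parameter values and keep the values that minimize this quantity.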
Both the compartmental disease models and
ARIMA time-series models were fitted/trained using
training data consisting of 30 days of daily COVID-19
case counts, starting from April 1, 2020 to June 30,
2020, for a total of 91 time periods. The resulting
models were then used to predict cases up to 30 days
ahead.
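The date arithmetic can be checked with a short sketch: April 1 to June 30, 2020, counted inclusively, spans 91 days. The window layout below (a 30-day training window for each start date, followed immediately by a 30-day forecast) is only one possible reading of the setup and is an assumption, as are all names in the code.

```python
from datetime import date, timedelta

TRAIN_DAYS = 30      # training window length stated in the text
FORECAST_DAYS = 30   # forecast horizon stated in the text

first_start = date(2020, 4, 1)
last_start = date(2020, 6, 30)
n_periods = (last_start - first_start).days + 1  # inclusive count of start dates


def window(start):
    """Return (train_start, train_end, forecast_start, forecast_end).

    Assumes the forecast begins the day after the training window ends.
    """
    train_end = start + timedelta(days=TRAIN_DAYS - 1)
    forecast_start = train_end + timedelta(days=1)
    forecast_end = forecast_start + timedelta(days=FORECAST_DAYS - 1)
    return start, train_end, forecast_start, forecast_end


print(n_periods)
print([d.isoformat() for d in window(first_start)])
```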
3.3 Model Evaluation
The bench package of R was used to measure the
median run time in seconds, total memory allocation in
megabytes (MB), and iterations per second over 5
iterations of model fitting/training. The 30-day
model predictions were then compared with the
actual case data as reported by DOH Philippines. The
NLL was computed to measure the accuracy of the
case predictions.
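The kind of measurement described above can be sketched in Python with the standard library. This is a rough analogue, not the bench package: it records peak traced memory rather than total allocation, and it takes iterations per second as the reciprocal of the median time, which may differ from how bench derives it. The function and key names are assumptions.

```python
import statistics
import time
import tracemalloc


def benchmark(func, iterations=5):
    """Time func over several iterations and track peak traced memory."""
    times = []
    peak_bytes = 0
    for _ in range(iterations):
        tracemalloc.start()
        t0 = time.perf_counter()
        func()
        times.append(time.perf_counter() - t0)
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        peak_bytes = max(peak_bytes, peak)
    median_s = statistics.median(times)
    return {
        "median_s": median_s,
        "itr_per_s": 1.0 / median_s if median_s > 0 else float("inf"),
        "peak_mb": peak_bytes / 1e6,
        "n_itr": iterations,
    }


# Stand-in workload (hypothetical); a real run would call the model-fitting routine.
result = benchmark(lambda: sum(i * i for i in range(100_000)))
print(result)
```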
One-way Analysis of Variance (ANOVA) was
used to test whether there are statistically significant
differences among the model outputs. Further, the
Tukey Honestly Significant Difference (Tukey HSD)
test was applied to determine which specific pairs of
modeling techniques differ significantly. Tukey HSD
incorporates a correction for multiple comparisons,
which becomes necessary when several pairs are tested
for differences. An alternative correction is used in
the Bonferroni test, which is probably the simplest
among post hoc tests; it is conservative on Type I
errors but more prone to Type II errors.
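To make the test statistic concrete, the sketch below computes the one-way ANOVA F statistic by hand and shows the Bonferroni-adjusted threshold for the six pairwise comparisons among four methods. It is an illustration only: the toy groups and the significance level of 0.05 are assumptions, and the p-value and the Tukey HSD critical values, which require the F and studentized range distributions, are not computed here.

```python
from itertools import combinations


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Toy groups (hypothetical), not the study's NLL values.
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_w)

methods = ["NM", "SANN", "L-BFGS-B", "ARIMA"]
pairs = list(combinations(methods, 2))
alpha = 0.05  # assumed significance level
bonferroni_alpha = alpha / len(pairs)  # per-comparison threshold
print(len(pairs), bonferroni_alpha)
```

Dividing the significance level by the number of comparisons is what makes Bonferroni conservative: each pair must clear a stricter threshold, which lowers the Type I error rate at the cost of power.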
4 RESULTS AND DISCUSSIONS
4.1 Negative Log Likelihood
Table 1 summarizes the NLL values based on the
30-day predictions of new cases using the different
disease modeling methods. The ARIMA model had the
lowest mean NLL value at 13074.85 and the lowest
median value at 9178.40. Although the overall
minimum NLL value was produced using Simulated
Annealing at 1070.86, the minimum NLL for ARIMA,
1104.75, is only slightly higher. ARIMA
also had the lowest standard deviation with a value
of 13442.28, signifying that the NLL values computed
from the ARIMA model predictions are clustered more
tightly around the mean value. Among the group,
L-BFGS-B had the highest mean NLL value of 43883.18,
followed by Nelder-Mead, with a mean NLL value of
39248.45. Nelder-Mead produced the overall maximum
NLL value of 1041587.03 and the highest NLL
standard deviation at 112435.38. L-BFGS-B and
Nelder-Mead also had the highest median NLL
values at 13434.54 and 13432.61, respectively.
Table 1: Summary of negative log likelihood values for each
method (NM: Nelder-Mead; SANN: Simulated Annealing).

method      mean      median    min      max         stdev
NM          39248.45  13432.61  3477.48  1041587.03  112435.38
SANN        27801.88  11956.06  1070.86   454141.05   57790.56
L-BFGS-B    43883.18  13434.54  2976.53   391733.97   77462.12
ARIMA       13074.85   9178.40  1104.75    81025.72   13442.28
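The columns of Table 1 are ordinary summary statistics over a set of NLL values. A minimal sketch of how such a row could be computed is given below; the input values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which form was used.

```python
import statistics


def summarize(nll_values):
    """Return the summary statistics reported per method in Table 1."""
    return {
        "mean": statistics.mean(nll_values),
        "median": statistics.median(nll_values),
        "min": min(nll_values),
        "max": max(nll_values),
        "stdev": statistics.stdev(nll_values),  # sample standard deviation (assumed)
    }


# Hypothetical NLL values with one large outlier.
toy_nll = [1200.0, 9500.0, 11000.0, 80000.0]
print(summarize(toy_nll))
```

A mean well above the median, as in every row of Table 1, is what a few very large NLL values would produce, which is also why the mean can never exceed the maximum.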
A box plot of the NLL values on a
logarithmic scale for each method is shown in Figure
3. The resulting NLL values suggest that the case
predictions using Nelder-Mead and L-BFGS-B have the
lowest likelihood. ARIMA mostly provided the best
(lowest) NLL values, which suggests that ARIMA should
be preferred for predicting new cases with the highest
likelihood. However, it might still be worth
investigating whether the parameters, initial conditions,
and hyperparameters that were used for the parameter
estimation algorithms are approximately optimal.
A one-way ANOVA was performed to compare
the NLL values of the four disease-modeling meth-
ods. The one-way ANOVA revealed that there was
a significant difference in the mean NLL scores be-
HEALTHINF 2023 - 16th International Conference on Health Informatics