response as well as to set priorities for the Covid-19 crisis
(Grasselli et al., 2020).
Forecasting can be used to predict future case
counts, and thus to support an initial response to
Covid-19 cases in Indonesia as well. Rustam et al.
(2020) applied forecasting based on machine
learning methods to Covid-19 cases using the Johns
Hopkins dataset (Petropoulos & Makridakis, 2020).
Machine learning combines statistical
techniques with Big Data analytics to yield
knowledge about the future, so-called data-driven
knowledge (Frické, 2018).
In the Big Data field, the term "analysis" is
different from "analytics". The fundamental
difference is that analysis is carried out to find
information in the available data that explains
patterns that occurred in the past and are useful for
current decision making, whereas analytics is an
activity that finds patterns in the data and interprets
them to predict what will happen in the future
(Frické, 2018). In other
countries, e.g. China, Singapore and Canada, the use
of Big Data and Artificial Intelligence played an
important role in Covid-19 action planning and
mitigation. In contrast, in Indonesia the
information presented was mostly an analysis of
daily and aggregate data in the form of
statistics and percentages related to Covid-19
cases. As a matter of fact, not many prediction
models have been produced, because they require
forecasting methods and pattern recognition on
time-series data based on Artificial Intelligence and
Machine Learning methods.
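The distinction between analysis and analytics can be
illustrated with a minimal Python sketch. The daily case
counts below are made-up numbers, not real Indonesian
data, and the function names `describe` and
`forecast_linear` are hypothetical. The straight-line
trend is only a stand-in for a forecasting model and is
not the method proposed in this paper.

```python
from statistics import mean


def describe(cases):
    """Analysis: summarize what has already happened in the series."""
    daily_growth = [
        (today - yesterday) / yesterday * 100
        for yesterday, today in zip(cases, cases[1:])
    ]
    return {
        "total": sum(cases),
        "mean_daily": mean(cases),
        "mean_growth_pct": mean(daily_growth),
    }


def forecast_linear(cases, horizon):
    """Analytics: fit a least-squares line and extrapolate it forward."""
    n = len(cases)
    days = range(n)
    x_bar, y_bar = mean(days), mean(cases)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(days, cases))
        / sum((x - x_bar) ** 2 for x in days)
    )
    intercept = y_bar - slope * x_bar
    # Predict the next `horizon` days after the last observed day.
    return [intercept + slope * (n + k) for k in range(horizon)]


# Illustrative (made-up) daily new-case counts, not real data.
daily_cases = [100, 120, 140, 160, 180]

summary = describe(daily_cases)                        # looks backwards
prediction = forecast_linear(daily_cases, horizon=3)   # looks forwards
print(summary)
print(prediction)
```

Here `describe` only reports totals and growth rates of
the observed days, whereas `forecast_linear` uses the
pattern in those days to estimate values that have not
yet been observed.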
Therefore, this paper proposes a Big Data
analytics approach for predicting future cases of
Covid-19 in Indonesia based on machine learning
methods. Our contribution is a prediction model of
Covid-19 cases in Indonesia built with Big Data
analytics tools. This contribution is important since
the Covid-19 pandemic curve has not yet flattened;
therefore all efforts, including scientific ones, must
be made to support decisions related to Covid-19
mitigation planning and handling. We
organized this paper to provide a
comprehensive picture of the Covid-19 problem
in Indonesia, the proposed prediction methods, and
the discussion of the results.
2 COVID-19 GLOBAL PANDEMIC
2.1 Covid-19
The coronavirus disease that first appeared in 2019,
Covid-19 for short, is an infectious disease
caused by the severe acute respiratory syndrome
coronavirus 2 (SARS-CoV-2) (Sanche
et al., 2020). Common symptoms of this
disease are fever, coughing, and shortness of breath.
Other symptoms include fatigue, muscle aches,
diarrhoea, sore throat, loss of sense of smell, and
stomach pain. In some reported cases, mild symptoms
develop quickly into severe ones, including pneumonia
and multi-organ failure, if the patient is in the comorbid
group (Radulescu & Cavanagh, 2020). The comorbid group
comprises patients at risk of severe symptoms
because of pre-existing conditions such as
diabetes, hypertension, and heart disease, as well as
pregnant women and smokers. As of May 28th, 2020,
more than 5.7 million cases had been reported in more
than 200 countries and regions; the disease had also
caused more than 350 thousand deaths, and more
than two million people had recovered (Worldometer,
n.d.). Covid-19 was declared a global
pandemic by the WHO in March 2020.
The Covid-19 virus spreads through close
contact via small droplets produced when an infected
person coughs, sneezes or talks (Sanche et al., 2020).
These droplets are also produced when breathing,
but they quickly fall to the ground or onto surfaces and
generally do not spread over long distances. People can
also be infected by touching a contaminated surface and
then touching their face. The virus can survive on
surfaces for up to 72 hours. The disease is most
contagious during the first three days after the
onset of symptoms, although transmission may
occur before symptoms appear. The time from
exposure to the onset of symptoms is usually
around 5 days, but can range from 2 to 14 days
(Radulescu & Cavanagh, 2020). The standard method
for diagnosing Covid-19 is real-time reverse
transcription polymerase chain reaction (rRT-PCR)
on a nasopharyngeal swab, a sample taken from the
lining of the nose and throat (Long et al., 2020).
Infections can also be diagnosed from a combination
of symptoms, comorbid risk factors, and chest CT
scans that show signs of pneumonia.
Researchers in the UK and Germany found that
the Covid-19 virus has mutated into three types,
which they labelled A, B, and C.
The type A virus is the earliest type found in Wuhan