Forest Fire Area Estimation using Support Vector Machine as an
Approximator
Nittaya Kerdprasop¹, Pumrapee Poomka¹, Paradee Chuaybamroong² and Kittisak Kerdprasop¹
¹Data and Knowledge Engineering Research Unit, School of Computer Engineering,
Suranaree University of Technology, Nakhon Ratchasima, Thailand
²Department of Environmental Science, Thammasat University, Thailand
Keywords: Forest Fire Area Estimation, Machine Learning, Data Modelling, Approximator, Support Vector Machine,
Kernel Function.
Abstract: Forest fires are a critical environmental issue that can cause severe damage. Fast detection and accurate
estimation of the burned area can help firefighters control the damage effectively. The purpose of this paper is
therefore to apply a state-of-the-art data modeling method to estimate the burned area of forest fires, using the
support vector machine (SVM) algorithm as a tool for area approximation. The dataset consists of real forest fire
data from the Montesinho natural park in the northeast region of Portugal. The original dataset comprises
517 records with 13 attributes. We randomly sample the data 10 times to obtain 10 data subsets for building
estimation models using two kinds of SVM kernel: radial basis function and polynomial function. The
obtained models are compared against other proposed techniques to assess performance based on two
metrics: mean absolute error (MAE) and root mean square error (RMSE). The experimental
results show that our SVM predictor using the polynomial kernel function can estimate forest fire
burned area with MAE and RMSE as low as 6.48 and 7.65, respectively. These errors are lower than those of
other techniques reported in the literature.
1 INTRODUCTION
Forest fires are a severe disaster for humans and
wildlife. Whether started intentionally by people
or arising naturally, such fires are unwanted and
should be brought under control as fast as possible
in order to reduce loss. Accurately predicting the
spread of a fire is one effective way to control
and limit the burned area. In practice, controlling the
fire area relies on the experience of firefighters.
With the advance of computational modeling
methods, the burned area can now be estimated
more accurately.
The efficiency of computational modeling is mainly
due to advances in machine learning technology.
The invention of the support vector machine
(Cortes and Vapnik, 1995; Vapnik, 2013) has
enabled machines to learn both linear and
non-linear classification models through the
application of specific kinds of kernel functions.
The support vector machine, or SVM, has proven
to be an efficient learner and has been applied
extensively in environmental science and numerous
other research areas. Examples of SVM applications
in the study of natural phenomena include the
estimation of horizontal global solar radiation (Baser
and Demirhan, 2017), the assessment of
rainfall-induced landslides (Lin et al., 2017), and the
prediction of wind power (Yuan et al., 2017).
However, applying SVM successfully is not
straightforward in every domain, because SVM is
a parametric learning approach whose parameters
must be set to suit each specific data domain.
Data analysts therefore need some experience and
prior knowledge of SVM before they can apply it
efficiently.
In this work, we present an empirical study of
applying SVM to estimate the burned area of forest
fires in the largest natural park of Portugal, named
Montesinho. We show in our experimental setting
that different kinds of kernel function yield
different results. We explain the major characteristics
of SVM as background knowledge for general
readers in the next section. We then explain our
modeling method and SVM setting in section 3. The
experimental results are shown in section 4 and the
conclusion is provided in section 5.
Kerdprasop, N., Poomka, P., Chuaybamroong, P. and Kerdprasop, K.
Forest Fire Area Estimation using Support Vector Machine as an Approximator.
DOI: 10.5220/0007224802690273
In Proceedings of the 10th International Joint Conference on Computational Intelligence (IJCCI 2018), pages 269-273
ISBN: 978-989-758-327-8
Copyright © 2018 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
2 SVM CHARACTERISTICS
SVM is a very fast and effective algorithm for
learning a classification model. The term model
means a concise form that can be used to classify
future data into their correct class. SVM learns to
build a model and represents it as a hyperplane,
which in two dimensions is a straight line. This line
is called a classifier. Figure 1 illustrates the idea of
learning an SVM classification model from available
data of mixed classes: one class is represented by
dark dots and the other by light dots. The learning
objective of SVM is to find a line that correctly
separates the data of one class from the other.
In this simple example, the classification model is
represented as a blue straight line in the middle of
the figure. Many lines qualify as a classifier, but
only one is optimal. Optimality is judged by the
largest distance between the classification line and
the data at the border lines. In Figure 1, the dashed
lines on both sides of the classification line are the
boundaries used to select the optimal model, such
that the distance from the classification line to both
borders is the widest possible. Data points on the
dashed lines are called support vectors.
Figure 1: SVM learning on linearly separable data.
Figure 2: The application of a kernel function to learn a
classifier for non-linearly separable data.
For data that cannot easily be separated by a
straight line, a transformation function is needed
to map the data into a form that a straight line can
separate. Figure 2 illustrates the idea of data
transformation. The function that transforms data
from the original space into a higher-dimensional
feature space is called a kernel function. With the
power of data transformation through a proper
kernel function, SVM can efficiently learn a
classifier for non-linearly separable data.
There are many possible kernel functions for
transforming data into a higher-dimensional feature
space in which SVM can separate the data linearly.
Among the many existing functions, the most widely
applied one is the radial basis function. Its
computation (Cristianini and Shawe-Taylor, 2000) is
shown in (1) and (2). In our work, we also consider a
simpler kernel function, called polynomial, as shown
in (3).

,
=
−

(1)
=−
1
2
(2)

,
=
,
+

(3)
where γ is gamma parameter, X
i
is a vector of input
variables, X
j
is the target variable, σ is a free
variable, q is the degree of polynomial function, and
θ
is the bias.
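The two kernels described above can be sketched in plain Python as follows. This is only an illustrative sketch, not the implementation used in the experiments. The function names `rbf_kernel` and `polynomial_kernel` are hypothetical, the Gaussian form is written directly in terms of σ so that the kernel value decays with squared distance, and the polynomial form assumes that γ scales the inner product before the bias θ is added.

```python
import math


def rbf_kernel(x, y, sigma):
    """Gaussian radial basis kernel: exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def polynomial_kernel(x, y, q, gamma=1.0, theta=1.0):
    """Polynomial kernel of degree q: (gamma * <x, y> + theta)^q."""
    inner = sum(a * b for a, b in zip(x, y))
    return (gamma * inner + theta) ** q


# Hypothetical two-dimensional inputs, for illustration only.
a = [0.2, 0.5]
b = [0.1, 0.9]
k_rbf = rbf_kernel(a, b, sigma=1.0)
k_poly = polynomial_kernel(a, b, q=7, gamma=1.0, theta=1.0)
```

Both functions return a single similarity score for a pair of records. The radial basis kernel equals 1 for identical inputs and shrinks toward 0 as the inputs move apart, whereas the polynomial kernel grows with the inner product of the inputs.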
3 RESEARCH METHODOLOGY
3.1 Study Area and Data Preparation
The forest fire data used in our study are historical
events that occurred at the Montesinho natural park
(Figure 3).
Figure 3: Location of Montesinho natural park in the
northeast of Portugal.
IJCCI 2018 - 10th International Joint Conference on Computational Intelligence
This park covers 748 km², or 74,229 ha, in the
mountainous region of northeastern Portugal, with
altitudes ranging from 438 m in the lower valley to
1481 m at the mountain top (Castro et al., 2010).
The forest fire data are publicly available at the
UCI machine learning repository
(https://archive.ics.uci.edu/ml/datasets/forest+fires).
The fire data were collected from January 2000 to
December 2003 and comprise 517 records with 13
attributes in each record. In our study, we select only
9 attributes for use in the modeling process. The
attribute details are summarized in Table 1.
Table 1: Forest fire data attributes and meaning.
Attribute name   Description                Unit
FFMC             Fine Fuel Moisture Code    --
DMC              Duff Moisture Code         --
DC               Drought Code               --
ISI              Initial Spread Index       --
Temp             Temperature                °C
RH               Relative Humidity          %
Wind             Wind speed                 km/h
Rain             Rain volume                mm/m²
Area             Total burned area          ha
The attributes FFMC, DMC, DC, and ISI are major
components used to compute the danger rating
scales of forest fires (Taylor and Alexander, 2006).
The FFMC reflects the influence of litter on the
ignition and spread of fire. The DMC and DC
relate to fire intensity, while the ISI correlates with
the velocity of fire spread. The other four attributes
(temp, RH, wind, rain) are meteorological data that
can also affect fire spread. The target of our
modeling is the last attribute, area.
3.2 Modeling Technique
Prior to modeling the fire area estimation, we
explore the distribution of our data. From this
exploration, we find that of the 517 records, 247
(almost 48%) have a burned area of zero. This is due
to the data collection threshold: a burned area of
less than 100 m² is not recorded. We therefore
rescale the burned area with the formula shown
in (4).
burned_area = ln(area + 1) (4)
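The rescaling in (4) can be illustrated with a short Python sketch. The function names and the sample values below are hypothetical. The inverse function is not described in the text and is included only on the assumption that estimates may need to be read back in hectares.

```python
import math


def to_log_area(area_ha):
    """Rescale a burned area (ha) as ln(area + 1), following formula (4)."""
    return math.log1p(area_ha)


def from_log_area(log_area):
    """Invert the rescaling to recover the burned area in ha (an assumption)."""
    return math.expm1(log_area)


# Hypothetical example values; zero-area records stay at zero after rescaling.
areas = [0.0, 0.5, 10.0, 200.0]
log_areas = [to_log_area(a) for a in areas]
recovered = [from_log_area(v) for v in log_areas]
```

Adding 1 before taking the logarithm keeps the many zero-area records well defined, since ln(0 + 1) = 0.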
The original burned area and the transformed one
are compared graphically in Figure 4. The
transformation makes the data less skewed and hence
can improve the accuracy of burned area
prediction.
Figure 4: Distributions of the burned area in the original
data (above) compared to the area after logarithmic scaling
(below), where vertical axis is frequency of fires and
horizontal axis is the burned area.
We then randomly select ten data subsets of equal
size for ten iterations of training SVM to build a
model and testing the built model (10-fold cross
validation). For SVM learning with the radial basis
kernel function, we set the gamma (γ) parameter to
80. For SVM training with the polynomial kernel
function, the learning parameters are q = 7, γ = 1,
and θ = 1.
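The ten-fold partitioning described above can be sketched as follows. The text does not state how the random split was produced, so the shuffling scheme, the seed, and the function names here are assumptions made for illustration. Because 517 is not divisible by 10, the folds in this sketch hold either 51 or 52 records.

```python
import random


def k_fold_indices(n_records, k=10, seed=0):
    """Shuffle record indices and deal them into k near-equal folds."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def train_test_splits(folds):
    """Yield (train, test) index lists, holding out one fold per iteration."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 517 records, as in the forest fire dataset; the seed is arbitrary.
folds = k_fold_indices(517, k=10)
splits = list(train_test_splits(folds))
```

Each of the ten iterations trains on nine folds and tests on the remaining one, so every record is used for testing exactly once.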
The model testing has been performed ten times
and the model’s performances are evaluated with the
mean absolute error (MAE) and root mean square
error (RMSE) metrics. The computations (Al Janabi,
Al Shourbaji, and Salman, 2017) of MAE and
RMSE are shown in (5) and (6), respectively.
=

−

(5)
=

−


(6)
where
is the actual value of burned area,
is
the estimated burned area, and n is the number of
data records.
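The two error metrics defined above can be computed with a few lines of Python. This is a minimal sketch with hypothetical function names and made-up sample values, not the evaluation code used in the experiments.

```python
import math


def mean_absolute_error(actual, estimated):
    """Average of the absolute differences between actual and estimated values."""
    n = len(actual)
    return sum(abs(y - y_hat) for y, y_hat in zip(actual, estimated)) / n


def root_mean_square_error(actual, estimated):
    """Square root of the average squared difference."""
    n = len(actual)
    squared = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, estimated))
    return math.sqrt(squared / n)


# Hypothetical values, for illustration only.
actual = [1.0, 2.0, 3.0]
estimated = [2.0, 2.0, 5.0]
mae = mean_absolute_error(actual, estimated)      # (1 + 0 + 2) / 3
rmse = root_mean_square_error(actual, estimated)  # sqrt((1 + 0 + 4) / 3)
```

Because squaring weights large deviations more heavily, RMSE is never smaller than MAE on the same data, and a large gap between the two points to a few very large errors.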
4 RESULTS AND DISCUSSION
The results of forest fire burned area prediction from
the ten iterations of the SVM learning algorithm using
polynomial and radial basis kernel functions are
listed in Table 2. For this specific application of
natural phenomenon prediction, the polynomial
kernel produces more accurate estimates than the
radial basis function. The prediction results are
shown graphically in Figure 5.
Table 2: Error evaluation results from the ten iterations of
SVM-polynomial and SVM-radial basis kernel functions.
        MAE                                  RMSE
No.     SVM-polynomial   SVM-radial basis    SVM-polynomial   SVM-radial basis
1       6.4814           11.0621             7.6575           56.0906
2       6.4813           11.0619             7.6577           56.0906
3       6.4814           11.0618             7.6578           56.0906
4       6.4814           11.0620             7.6576           56.0906
5       6.4813           11.0624             7.6575           56.0907
6       6.4813           11.0621             7.6574           56.0906
7       6.4812           11.0619             7.6574           56.0906
8       6.4814           11.0623             7.6577           56.0907
9       6.4816           11.0620             7.6577           56.0906
10      6.4815           11.0620             7.6577           56.0907
Avg.    6.4814           11.0620             7.6576           56.0906
Figure 5: Comparative plots showing estimation errors of
radial basis kernel (above) and polynomial kernel (below).
From the prediction plots in Figure 5, it is
noticeable that the radial basis kernel cannot
correctly predict burned areas larger than 100 ha. To
analyze the absolute errors, we show a boxplot in
Figure 6; the errors made by the radial basis kernel
come from estimates that are too high.
SVM learning using exactly the same set of
forest fire data also appears in the literature (Al
Janabi, Al Shourbaji, and Salman, 2017; Cortez and
Morais, 2007), but the kernel application, the data
attribute selection, and the SVM parameter setting
differ from our work. The prediction results of
our work compared to the others are summarized
in Table 3.
From the comparative results, our SVM with the
polynomial kernel gives the most accurate
prediction of forest fire burned area.
Figure 6: Boxplot showing absolute errors of the polynomial
kernel (left) against the radial basis kernel (right).
Table 3: Comparative performance of SVM predictors.
Modeling method                      RMSE     MAE
Our SVM with polynomial kernel       7.65     6.48
Our SVM with radial basis kernel     56.09    11.06
SVM by Cortez and Morais (2007)      64.70    12.71
SVM by Al Janabi et al. (2017)       54.00    282.40
5 CONCLUSIONS
In this work, we study the performance of the support
vector machine (SVM) algorithm when applied to
the environmental domain to predict the burned
area of forest fires. SVM and other computational
models such as logistic regression, artificial neural
networks, and particle swarm intelligence have
recently been applied to the modeling of forest fire
spread and intensity. The advantage of accurate
prediction with computational models is more
efficient control of the damage caused by forest
fires.
Many research teams have reported that SVM
yields the most promising results. Most applications
of SVM, however, employ a sophisticated radial
basis function as the kernel. We demonstrate in our
experiment that for some specific applications, a
simpler kernel such as the polynomial function
performs better than the more complex one. The
polynomial SVM predicts the burned area with a
root mean square error as low as 7.65, whereas
the radial basis kernel yields a higher error of 56.09.
ACKNOWLEDGEMENTS
This work was financially supported by grants from
the Thailand Toray Science Foundation, the National
Research Council of Thailand, and Suranaree
University of Technology through the funding of the
Data and Knowledge Engineering Research Unit.
REFERENCES
Al Janabi, S., Al Shourbaji, I., Salman, M. A., 2017.
Assessing the suitability of soft computing approaches
for forest fires prediction. Applied Computing and
Informatics, Available at
https://doi.org/10.1016/j.aci.2017.09.006
Baser, F., Demirhan, H., 2017. A fuzzy regression with
support vector machine approach to the estimation of
horizontal global solar radiation. Energy, vol. 123, pp.
229-240.
Castro, J., de Figueiredo, T., Fonseca, F., Castro, J. P.,
Nobre, S., Pires, L. C., 2010. Montesinho natural park:
general description and natural values. In N.
Evelpidou, T. Figueiredo, F. Mauro, V. Tecim, and A.
Vassilopoulos (eds), Natural Heritage from East to
West, Springer.
Cortes, C., Vapnik, V., 1995. Support-vector networks.
Machine Learning, vol. 20, no. 3, pp. 273-297.
Cortez, P., Morais, A., 2007. A data mining approach to
predict forest fires using meteorological data. In 13th
EPIA – Portuguese Conference on Artificial
Intelligence, pp. 512-523.
Cristianini, N., Shawe-Taylor, J., 2000. An Introduction to
Support Vector Machines and Other Kernel-based
Learning Methods, Cambridge University Press.
Lin, G. F., Chang, M. J., Huang, Y. C., Ho, J. Y., 2017.
Assessment of susceptibility to rainfall-induced
landslides using improved self-organizing linear
output map, support vector machine, and logistic
regression. Engineering Geology, vol. 224, pp. 62-74.
Taylor, S. W., Alexander, M. E., 2006. Science,
technology, and human factors in fire danger rating:
the Canadian experience. International Journal of
Wildland Fire, vol. 15, no. 1, pp. 121-135.
Yuan, X., Tan, Q., Lei, X., Yuan, Y., Wu, X., 2017. Wind
power prediction using hybrid autoregressive
fractionally integrated moving average and least
square support vector machine. Energy, vol. 129, pp.
122-137.
Vapnik, V. 2013. The Nature of Statistical Learning
Theory, Springer Science & Business Media.