The national weather service reported a total of
107 days with surface thermal inversions during
2010, the highest in the past 13 years. Most were
recorded during the winter months, when the long
and cold nights favor their formation. During the
dry season months, thermal inversions were reported
on 40% of days. The months of April and December
had the largest number of events, with 16 and 17
days, respectively. The influence of high pressure
systems during the months of March to May was
responsible for the formation of surface thermal
inversions (NWM, 2012).
In this research we propose prediction models of
hourly PM2.5 concentrations, based on data obtained
in downtown Mexico City. We show results obtained
with two different methods, both of which use past
values of PM2.5 as input. The simplest method is
persistence, which assigns to the hours of the next
day the values observed on the present day. We then
used the fuzzy inductive reasoning approach, a
non-linear methodology based on fuzzy logic and
pattern recognition. We used data recorded over
four yearly periods, each lasting six months
starting on December 1st. As explained before, the
months from December to May are the ones with
higher PM2.5 concentrations in the Mexico City
metropolitan area.
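The persistence baseline described above can be
sketched in a few lines. This is only a minimal
illustration, assuming a complete series of hourly
values with 24 readings per day and an hour-by-hour
copy of the present day; the function name
persistence_forecast and the parameter hours_per_day
are hypothetical and do not come from the original
study.

```python
def persistence_forecast(hourly_values, hours_per_day=24):
    """Return the forecast for the next day under persistence.

    Assumption: hourly_values is a gap-free sequence of hourly
    PM2.5 readings whose last `hours_per_day` entries form the
    present day. The forecast for hour h of the next day is the
    value observed at hour h of the present day.
    """
    if len(hourly_values) < hours_per_day:
        raise ValueError("need at least one full day of hourly values")
    # Copy the most recent day unchanged.
    return list(hourly_values[-hours_per_day:])


# Synthetic placeholder readings (not measured data): two days
# encoded as 0..23 and 24..47 so the copied day is easy to spot.
two_days = list(range(48))
next_day = persistence_forecast(two_days)
```

Because persistence needs no training, it serves
only as a reference against which the fuzzy models
can be compared.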
Section 2 introduces some basic concepts of the
fuzzy inductive reasoning approach. Section 3
describes the methodology used, i.e. the data, the
development of the fuzzy models and their
evaluation. Section 4 describes the results obtained.
Finally, the conclusions of this research are given.
2 FUZZY INDUCTIVE REASONING (FIR)
The conceptualization of the FIR methodology
arises from the General System Problem Solving
(GSPS) approach proposed by Klir (Klir and Elias,
2002). This methodology of modeling and
simulation is able to obtain good qualitative
relations between the variables that compose the
system and to infer future behavior of that system.
It has the ability to describe systems that cannot
easily be described by classical mathematics or
statistics, i.e. systems for which the underlying
physical laws are not well understood.
The Fuzzy Inductive Reasoning (FIR) methodology
offers a model-based approach to predicting either
univariate or multivariate time series (Nebot et
al., 2003; Carvajal and Nebot, 1998). A FIR model
is a qualitative, non-parametric, shallow model
based on fuzzy logic. Fuzzy logic-based methods
have not been applied extensively in environmental
science; however, some interesting research can be
found in the area of pollutant modeling (Mintz et
al., 2005; Ghiaus, 2005; Morabito and Versaci,
2003; Heo and Kim, 2004; Yildirim and Bayramoglu,
2006; Peton et al., 2000; Onkal-Engin et al.,
2004), where different hybrid methods that make
use of fuzzy logic are presented for this task.
Visual-FIR is a tool based on the Fuzzy Inductive
Reasoning (FIR) methodology that runs under the
Matlab environment and offers a new perspective on
the modeling and simulation of complex systems.
Visual-FIR designs process blocks that handle the
model identification and prediction phases of the
FIR methodology in a compact, efficient and
user-friendly manner (Escobet et al., 2008).
The FIR model consists of its structure (relevant
variables) and a set of input/output relations
(historical behavior) that are defined as if-then
rules. Feature selection in FIR is based on the
maximization of the model's forecasting power,
quantified by a Shannon entropy-based quality
measure. The Shannon entropy measure is used to
determine the uncertainty associated with
forecasting a particular output state given any
legal input state. The overall entropy of the FIR
model structure studied, $H_s$, is computed as
described in equation 1.
$$H_s = -\sum_{\forall i} p(i) \cdot H_i , \qquad (1)$$
where $p(i)$ is the probability that the $i$-th
input state occurs and $H_i$ is the Shannon entropy
relative to the $i$-th input state. A normalized
overall entropy $H_n$ is defined in equation 2.
$$H_n = 1 - \frac{H_s}{H_{\max}} \qquad (2)$$
$H_n$ is a real-valued number in the range between
0.0 and 1.0, where higher values indicate an
improved forecasting power. The model structure
with the highest $H_n$ value generates forecasts
with the smallest amount of uncertainty.
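As a rough illustration of equations 1 and 2, the
sketch below estimates the overall entropy and its
normalized form from a list of observed (input
state, output state) pairs. It rests on assumptions
that are not stated in the text: probabilities are
estimated as relative frequencies, the entropy term
of each input state is taken as the sum of
p(o|i) log2 p(o|i) over output states (so that the
leading minus sign of equation 1 yields a
non-negative value), and the maximum entropy is
taken as log2 of the number of output classes. The
function name normalized_entropy and its parameters
are hypothetical.

```python
import math
from collections import Counter, defaultdict


def normalized_entropy(records, n_output_classes):
    """Estimate H_n = 1 - H_s / H_max from (input_state, output_state) pairs.

    Assumptions (not taken from the paper): frequencies stand in
    for probabilities, and H_max = log2(n_output_classes), which
    requires n_output_classes >= 2.
    """
    # Count how often each output state follows each input state.
    by_input = defaultdict(Counter)
    for input_state, output_state in records:
        by_input[input_state][output_state] += 1
    total = sum(sum(c.values()) for c in by_input.values())

    h_s = 0.0
    for outputs in by_input.values():
        n_i = sum(outputs.values())
        p_i = n_i / total  # p(i): relative frequency of input state i
        # H_i: non-positive under the sign convention assumed here.
        h_i = sum((n / n_i) * math.log2(n / n_i) for n in outputs.values())
        h_s -= p_i * h_i  # equation 1

    h_max = math.log2(n_output_classes)  # assumed definition of H_max
    return 1.0 - h_s / h_max  # equation 2


# Synthetic examples: a fully deterministic rule base and a
# fully ambiguous one, each with two output classes.
deterministic = [("low", 0), ("low", 0), ("high", 1)]
ambiguous = [("low", 0), ("low", 1)]
```

Under these assumptions the deterministic example
gives the best possible score and the ambiguous one
the worst, which matches the reading that higher
values mean less forecasting uncertainty.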
Once the most relevant variables are identified,
they are used to derive the set of input/output
relations from the training data set, defined as a
set of if-then rules. This set of rules captures
the behavior of the system. Using the
five-nearest-neighbors (5NN) fuzzy inferencing
algorithm, the five rules with the smallest
distance measure are selected and a
distance-weighted average of their
SIMULTECH 2012 - 2nd International Conference on Simulation and Modeling Methodologies, Technologies and Applications