Experiments and Design of an Inference Fuzzy System

F. Benmakrouha, C. Hespel, E. Monnier and D. Quichaud

Computer Sciences INSA, Rennes, France

Keywords:

Fuzzy System, Diabetes, Datum Plane Covering.

Abstract:

The aim of this paper is to propose a criterion for assessing the design of a fuzzy inference system built from experimental data, when these data are sparse. This lack of data is an important issue and may improve the generalisation ability of fuzzy systems (Isao Ishibuchi, 2002).

Several methods have been proposed to obtain fuzzy rules automatically from sparse training data. In (Cruz Vega Israel, 2010), the authors first construct fuzzy rules from the collected data. Then, they use kernel regression to generate training data.

Another technique, used when classical inference methods produce sparse fuzzy rules, is a diffusion procedure based on interpolation to initialize incomplete rules (Benmakrouha, 1997), (Glorennec, 1999), (Baranyi, 1996). Our method has the advantage of being applied before the initialization step, and therefore of avoiding unfired rules, which make it difficult to produce an accurate output.

1 INTRODUCTION

The lack of data is an important issue, pointed out by M. Lutaud-Brunet in (Lutaud, 1996). Sugeno and Yasukawa underline in (Sugeno and Yasukawa, 1993) how difficult it is to build a fuzzy model when data are scarce and the membership functions do not cover the whole universe of discourse. This may improve the generalisation ability of fuzzy systems (Isao Ishibuchi, 2002).

Several methods have been proposed to obtain fuzzy rules automatically from sparse training data.

In (Cruz Vega Israel, 2010), the authors first construct fuzzy rules from the collected data. Then, they use kernel regression to generate training data.

Another technique, used when classical inference methods produce sparse fuzzy rules, is a diffusion procedure based on interpolation to initialize incomplete rules (Benmakrouha, 1997), (Glorennec, 1999), (Baranyi, 1996).

The objective of this paper is to measure the impact of datum plane covering on the outcome of a fuzzy inference system. Most optimization methods assume that the datum plane is sufficiently covered. If this assumption no longer holds, we will see that these methods cannot work, since they require that, before optimization, the fuzzy system already gives acceptable results. In (Benmakrouha et al., 2010), we analysed the relationship between the learning set Ω and the labels of the fuzzy inference system. In this paper, we take into account the distribution of data density, by measuring the number of available data on intervals of each input variable domain.

All these techniques take place during the optimisation step, whereas our method is used before the initialization step. This in turn allows one to isolate unfired rules and to proceed, if necessary, to a partial remodelling of the FIS before training and optimization.

2 THE TAKAGI-SUGENO MODEL

The model under consideration in this section is a Takagi-Sugeno model, which corresponds to discretized linear models of order 1, combined with nonlinear functions.

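\[
y(t) = \sum_{i=1}^{r} \mu_i(z(t)) \, \bigl( a_i \, y(t - idecal) + b_i \, u(t - idecal) \bigr)
\]

where

y(t) is the output,

u(t) is the input,

r is the number of linear models,

z(t) is a vector which depends, linearly or not, on the state,

$\mu_i(z(t)) \geq 0$, $i = 1, \cdots, r$, are nonlinear functions verifying the convex sum property $\sum_{i=1}^{r} \mu_i(z(t)) = 1$.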


Benmakrouha F., Hespel C., Monnier E. and Quichaud D..

Experiments and Design of an Inference Fuzzy System.

DOI: 10.5220/0004148904200423

In Proceedings of the 4th International Joint Conference on Computational Intelligence (FCTA-2012), pages 420-423

ISBN: 978-989-8565-33-4

 2012 SCITEPRESS (Science and Technology Publications, Lda.)

idecal is the time lag between an input and its effect, which is of particular interest in our application.

We have measurements about every five minutes, and we assume that the effect of the insulin considered in our application is fast: it is noticeable from ten minutes later (idecal = 2) up to half an hour later (idecal = 6).
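As a rough illustration of the model equation and of the lag arithmetic, the following Python sketch evaluates the weighted sum of local linear models at one time index. It assumes that the weights of the r local models have already been computed and satisfy the convex sum property. The function names `lag_in_minutes` and `ts_output`, the argument layout and the 5-minute sampling default are illustrative choices, not taken from the original implementation.

```python
def lag_in_minutes(idecal, sampling_minutes=5):
    """Convert a lag expressed in samples into minutes."""
    return idecal * sampling_minutes


def ts_output(y, u, t, idecal, weights, a, b):
    """Weighted sum of r first-order linear models at time index t.

    y, u    : past outputs and inputs, indexed by sample number
    weights : r non-negative values summing to 1 (convex sum property)
    a, b    : r coefficients each, one pair per local linear model
    """
    if t - idecal < 0:
        raise ValueError("not enough history for this lag")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    y_lag, u_lag = y[t - idecal], u[t - idecal]
    return sum(w * (a_i * y_lag + b_i * u_lag)
               for w, a_i, b_i in zip(weights, a, b))
```

With measurements every five minutes, `lag_in_minutes(2)` gives 10 and `lag_in_minutes(6)` gives 30, which matches the delays quoted above.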

The unknown parameters $a_i$ and $b_i$ are determined by the recursive least squares algorithm.
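The recursive least squares estimation can be sketched as follows for a single local model y(t) = a·y(t − idecal) + b·u(t − idecal). This is a minimal textbook version, assuming no forgetting (factor 1) and a large initial covariance. The weighting by the nonlinear functions is left out, and the names `rls_update` and `fit_rls` are assumptions of this sketch rather than the code used in the experiments.

```python
def rls_update(theta, P, phi, target, forgetting=1.0):
    """One recursive least squares step for the model target ~ phi . theta.

    theta : current parameter estimates (list of floats)
    P     : current covariance matrix (list of lists, symmetric)
    phi   : regressor vector for this sample
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    error = target - sum(phi[i] * theta[i] for i in range(n))
    new_theta = [theta[i] + gain[i] * error for i in range(n)]
    # P is symmetric, so the row vector phi^T P equals (P phi)^T.
    new_P = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return new_theta, new_P


def fit_rls(y, u, idecal, p0=1e6):
    """Estimate (a, b) in y(t) = a*y(t-idecal) + b*u(t-idecal)."""
    theta = [0.0, 0.0]
    P = [[p0, 0.0], [0.0, p0]]
    for t in range(idecal, len(y)):
        phi = [y[t - idecal], u[t - idecal]]
        theta, P = rls_update(theta, P, phi, y[t])
    return theta
```

On noise-free synthetic data generated with known coefficients, the estimates returned by `fit_rls` should approach those coefficients, provided the input varies enough to excite both regressors.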

3 APPLICATION TO THE INSULIN/GLYCAEMIA BEHAVIOR OF DIABETICS

3.1 The Available Data

The correlated "insulin infusion delivery/glycaemia" data were provided by the team of Pr. Pinget, CHU of Strasbourg. They concern the same person and the same insulin. The insulin was infused by the intra-peritoneal route and the glycaemia was monitored by a subcutaneous sensor. Glycaemia was measured every five minutes for 7 days, which corresponds to 1700 measurements.

A bolus is a dose of insulin infused manually, in addition to the basal dose, since postprandial glycaemia cannot otherwise be regulated satisfactorily. The insulin file contains raw data about basal insulin doses as well as boluses. A preprocessing of the insulin file was therefore necessary to produce a file of insulin delivery for the same person every five minutes.

3.2 Experiments and Validation of the Model

The learning set is composed of the first measurements (280 points), which correspond to the insulin infusion and blood glucose concentration of a patient during one day. We take 7 linear models (r = 7), considering that each model is valid for about three and a half hours. The mean square error (MSE) is calculated on the totality of the measurements (1700 points).

We carry out experiments by varying the parameter idecal of the model, the time lag between an input and its effect.

The test of our modeling method shows that we can predict the glycaemia over a long period (7 days) from the glycaemia and insulin delivery 15 minutes (resp. 30 minutes) earlier, with an error of about 6% (resp. 16%), which is a good result compared with current results. However, the results obtained are not so good in the last case (idecal = 24), when we consider slow-acting insulin (with a 2-hour delay). In this case, our model has to be refined by increasing its order.

Table 1: MSE obtained for different time lags idecal.

r    idecal   MSE
7    2        0.04
7    3        0.06
7    6        0.16
7    24       1.02
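The error measure reported in Table 1 can be computed as in the sketch below, which also converts each tested value of idecal into minutes under the 5-minute sampling stated earlier. The function names are hypothetical, and the sketch does not reproduce the reported MSE values, which depend on the clinical data.

```python
def mean_square_error(predicted, observed):
    """Average of the squared differences between two equal-length series."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


def lags_in_minutes(idecal_values, sampling_minutes=5):
    """Map each lag in samples to the corresponding delay in minutes."""
    return {idecal: idecal * sampling_minutes for idecal in idecal_values}
```

For the lags of Table 1, `lags_in_minutes([2, 3, 6, 24])` maps them to 10, 15, 30 and 120 minutes, the last one being the 2-hour delay of the slow-acting case.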

4 DATUM PLANE COVERING

We propose a measure used to pre-validate a fuzzy model. We suppose that there exists a learning set $\Omega = \{(x^i, d^i)\}$, where $x^i$ is an input vector and $d^i$ the corresponding output. We also assume that the desired function $f$ is defined in

\[
V = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]
\]

Usually, to validate a fuzzy inference system, the mean square error (MSE) is calculated on a test set. If the MSE exceeds a threshold, then training is done, using a gradient method. This consists in modifying the parameters of the system at each presentation of an example, from the error $(y(x^i) - d^i)$.

Unfortunately, when the model is invalidated, we cannot identify the never-learned rules that cause the gap between the model and the real system. Moreover, if the datum plane is insufficiently covered, training and a finer splitting of the input space are inefficient and useless. With the criterion proposed below, we estimate the datum plane coverage and we are able to isolate inactivated rules. Partial remodelling of the fuzzy inference system is then possible. The study investigates the relationship between a quantitative variable X, the number of available data for each input, and a qualitative variable Y, the labels of the membership functions.
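To make the role of never-learned rules concrete, here is a minimal sketch of one gradient step. It assumes a zero-order Takagi-Sugeno system whose output is the weighted sum of rule conclusions, and it assumes that only the conclusions are tuned. Both assumptions, as well as the names `gradient_step`, `firing` and `learning_rate`, are made for illustration and may differ from the training scheme actually used.

```python
def gradient_step(conclusions, firing, desired, learning_rate=0.1):
    """One gradient update of the rule conclusions for a single example.

    conclusions : current conclusion value of each rule
    firing      : normalized firing degree of each rule for this example
    desired     : desired output d for this example
    Returns the updated conclusions and the error y(x) - d.
    """
    output = sum(w * c for w, c in zip(firing, conclusions))
    error = output - desired
    # Gradient of 0.5 * error**2 with respect to conclusion j is error * w_j.
    updated = [c - learning_rate * error * w
               for w, c in zip(firing, conclusions)]
    return updated, error
```

In this sketch, a rule whose firing degree is zero for every example of the learning set never has its conclusion updated, which is the situation the proposed criterion aims to detect before training.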

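When designing a fuzzy system, we attribute to each input $I$ $r$ modalities (or labels), noted $y_1, \cdots, y_r$. We note $X_I$ the variable for the input $I$, of average $\bar{x}_I$ and variance $\sigma_I^2$. We note $\Omega_I$ the corresponding learning set. Each label $y_l$ of $I$ defines a subset $\Omega_l$ of $\Omega_I$: we obtain a partition of $\Omega_I$ into $m$ classes. We note $n_I = \mathrm{card}(\Omega_I)$ and $n_l = \mathrm{card}(\Omega_l)$. We have $n_I = \sum_{l=1}^{m} n_l$. Then, if we consider the restriction of $X_I$ to $\Omega_l$ ($l = 1, \cdots, m$), we may define the average (noted $\bar{x}_l$) and the variance (noted $\sigma_l^2$) on this subset:

\[
\bar{x}_l = \frac{1}{n_l} \sum_{\omega \in \Omega_l} X_I(\omega),
\qquad
\sigma_l^2 = \frac{1}{n_l} \sum_{\omega \in \Omega_l} \bigl( X_I(\omega) - \bar{x}_l \bigr)^2
\]

We have an index of connection between the datum plane coverage (for an input $I$) and the learning set, defined by:

\[
\eta_I^2 = \frac{\sigma_B^2}{\sigma_I^2}
\]

where

\[
\sigma_I^2 = \sigma_B^2 + \sigma_W^2
\]

and

\[
\sigma_B^2 = \frac{1}{n_I} \sum_{l=1}^{m} n_l \, ( \bar{x}_l - \bar{x}_I )^2
\]

and

\[
\sigma_W^2 = \frac{1}{n_I} \sum_{l=1}^{m} n_l \, \sigma_l^2
\]

and

\[
\bar{x}_I = \frac{1}{n_I} \sum_{l=1}^{m} n_l \, \bar{x}_l
\]

Here $\sigma_B^2$ is the variance between the classes and $\sigma_W^2$ the variance within the classes.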
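A minimal sketch of such an index in Python, assuming that it is the ratio of the between-class variance to the total variance (the classical correlation ratio between a quantitative and a qualitative variable). The function name `connection_index` and the representation of the classes as lists of values are illustrative assumptions.

```python
def connection_index(groups):
    """Ratio of between-class variance to total variance.

    groups : list of lists; groups[l] holds the values of X for the data
             attached to label l. Empty classes contribute nothing to the
             sums, since their size n_l is zero.
    """
    n = sum(len(g) for g in groups)
    if n == 0:
        raise ValueError("the learning set is empty")
    mean = sum(sum(g) for g in groups) / n
    between = 0.0
    within = 0.0
    for g in groups:
        if not g:
            continue
        n_l = len(g)
        mean_l = sum(g) / n_l
        var_l = sum((x - mean_l) ** 2 for x in g) / n_l
        between += n_l * (mean_l - mean) ** 2
        within += n_l * var_l
    between /= n
    within /= n
    total = between + within
    if total == 0.0:
        return 0.0  # all values identical: no variance to explain
    return between / total
```

Under this assumption the index lies between 0 and 1: it is 0 when all classes share the same average and tends to 1 when the classes are well separated with little spread inside each class.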

This index of connection aims at detecting relationships between the number of data of the learning set $\Omega_I$ of the input $I$ and its $r$ labels. The index is low if the features of these labels are not very different (Test 1). When the index is high, it indicates a bad distribution of the membership functions (Test 3). It thus gives information about the distribution of the data of the learning set among the membership functions.

4.1 Experiments

We have made 3 tests for the first input (with two triangular membership functions) using our application. Tables 2 and 3 give the features of these functions. We obtain a low index (0.035) for the first test, a medium index (0.28) for the second and a high index (0.84) for the last. In the third test, the first membership function is useless and the corresponding rules are inactive, so we can suppress them without affecting the results. We have also made a 4th test, in which three (out of four) membership functions and the associated rules were unnecessary.

Table 2: First membership function.

Test   Center   Left corner   Right corner
1      1.4      1.6           1.0
2      1.0      0.6           1.0
3      0.2      0.2           0.2

Table 3: Second membership function.

Test   Center   Left corner   Right corner
1      2.6      1.0           0.9
2      2.6      1.0           0.9
3      3.0      2.8           0.5
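The detection of inactive labels can be sketched as follows. This is only an illustration under a loud assumption: the "Left corner" and "Right corner" columns are read here as the left and right widths of each triangle around its center, which the tables do not state explicitly. The function names are hypothetical, and any data passed to them in an example would be made up rather than taken from the clinical measurements.

```python
def triangular(x, center, left_width, right_width):
    """Membership degree of x in a triangle given by its center and widths."""
    if x == center:
        return 1.0
    if x < center:
        if left_width <= 0:
            return 0.0
        return max(0.0, 1.0 - (center - x) / left_width)
    if right_width <= 0:
        return 0.0
    return max(0.0, 1.0 - (x - center) / right_width)


def count_per_label(data, labels):
    """Count the data attached to each label (highest positive degree wins).

    labels : list of (center, left_width, right_width) tuples
    """
    counts = [0] * len(labels)
    for x in data:
        degrees = [triangular(x, *label) for label in labels]
        best = max(range(len(labels)), key=lambda l: degrees[l])
        if degrees[best] > 0.0:
            counts[best] += 1
    return counts


def inactive_labels(data, labels):
    """Indices of the labels to which no datum is attached."""
    return [l for l, c in enumerate(count_per_label(data, labels)) if c == 0]
```

For instance, with the two Test 3 rows read as widths and the made-up inputs 1.0, 2.0, 3.0 and 3.2, every datum falls under the second label and `inactive_labels` returns the index of the first one, in line with the observation that the first membership function is useless in that test.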

4.2 Graphic Representation

We represent the sets of data by box-and-whisker plots to underline the relation between the number of data and the labels of the membership functions. Figure 1 (resp. Figure 2 and Figure 3) corresponds to Test 1 (resp. Test 2 and Test 3).

Figure 1.
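The quantities drawn by a box-and-whisker plot can be obtained per label as in the sketch below. The quartile convention (the default "exclusive" method of `statistics.quantiles`) is an assumption, since the tool used to draw the figures is not stated, and the function names are illustrative.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for at least two values."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4)
    return ordered[0], q1, median, q3, ordered[-1]


def summaries_per_label(groups):
    """Map each label index to its summary, or None below two data points."""
    return {label: (five_number_summary(g) if len(g) >= 2 else None)
            for label, g in enumerate(groups)}
```

A label whose summary is `None` has too few data to draw a box, which is one visual sign of poor datum plane coverage for that label.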

5 CONCLUSIONS

We have proposed a measure for detecting useless rules and thus pre-validating a fuzzy inference system. When the model is not pre-validated, there is no need to carry out the subsequent steps, in particular the optimization step.

We have shown that this criterion gives useful information about datum plane coverage.

IJCCI2012-InternationalJointConferenceonComputationalIntelligence

422

Figure 2.

Figure 3.

REFERENCES

Baranyi, P. K. (1996). A general and specialised solid cutting method for fuzzy rule interpolation. J. BUSEFAL URA-CNRS, 66:13–22.

Benmakrouha, F. (1997). Parameter identification in a fuzzy system with insufficient data. In Sixth IEEE International Conference on Fuzzy Systems, volume 542, page 1:537.

Benmakrouha, F., Hespel, C., and Monnier, E. (2010). An algorithm for rule selection on fuzzy rule-based system applied to the treatment of diabetics and detection fraud in electronic payment. In FUZZ-IEEE.

Cruz Vega Israel, W. Y. (2010). Multiple fuzzy neural networks modeling with sparse data. Neurocomputing, pages 2446–2453.

Glorennec, P. Y. (1999). Algorithmes d'apprentissage pour systèmes d'inférence floue. Paris, France: Hermès.

Isao Ishibuchi, T. Y. (2002). Performance evaluation of fuzzy partitions with different fuzzification grades. In Proceedings of the 2002 IEEE International Conference Fuzzy Systems FUZZ-IEEE'02.

Lutaud, M. (1996). Identification et Contrôle de processus par réseaux Neuro-Flous. PhD thesis, Université d'Evry Val d'Essonne.

Sugeno, M. and Yasukawa, T. (1993). A fuzzy-logic-based approach to qualitative modeling. IEEE Trans. on Fuzzy Systems, 1(1).

ExperimentsandDesignofanInferenceFuzzySystem

423