A STEPWISE PROCEDURE TO SELECT VARIABLES IN A FUZZY LEAST SQUARE REGRESSION MODEL

Francesco Campobasso and Annarita Fanizzi

Department of Statistical Sciences “Carlo Cecchi”, University of Bari, Bari, Italy

Keywords: Fuzzy least square regression, Multivariate generalization, Asymmetric fuzzy intercept, Total sum of squares, Goodness of fit, Stepwise procedure.

Abstract: Fuzzy regression techniques can be used to fit fuzzy data into a regression model. Diamond treated the case of a simple model by introducing a metric on the space of triangular fuzzy numbers. In previous works we provided some theoretical results about the estimates of a multiple regression model with a non-fuzzy intercept; in this paper we show that the sum of squares of the dependent variable can be decomposed in exactly the same way as in the classical OLS estimation procedure only when the intercept is fuzzy asymmetric. Such a decomposition allows us to introduce a stepwise procedure which simplifies, in computational terms, the identification of the most significant independent variables in the model.

1 INTRODUCTION

Modalities of quantitative variables are commonly given as exact single values, although sometimes they cannot be precise. The imprecision of measuring instruments and the continuous nature of some observations, for example, prevent researchers from obtaining the corresponding true values.

On the other hand qualitative variables are

commonly expressed using common linguistic

terms, which also represent verbal labels of sets with

uncertain borders.

The appropriate way to manage such an

uncertainty of observations is provided by using

fuzzy numbers.

In 1988 P. M. Diamond introduced a metric onto

the space of triangular fuzzy numbers and derived

the expression of the estimated coefficients in a

simple fuzzy regression of an uncertain dependent

variable on a single uncertain independent variable.

Starting from a multivariate generalization of this

regression, we provided in previous works some

results on the decomposition of the deviance of the

dependent variable according to Diamond’s metric.

2 THE FUZZY LEAST SQUARE REGRESSION

A triangular fuzzy number $\tilde{X} = (x, x_L, x_R)_T$ for the variable X is characterized by a function $\mu_{\tilde{X}}: X \to [0,1]$, like the one represented in Fig. 1, that expresses the membership degree of any possible value of X to $\tilde{X}$.

The accumulation value x is considered the core of the fuzzy number, while $\xi_L = x - x_L$ and $\xi_R = x_R - x$ are considered the left spread and the right spread respectively.

Figure 1: Representation of a triangular fuzzy number.

Note that x belongs to $\tilde{X}$ with the highest degree (equal to 1), while the other values included between the left extreme $x_L$ and the right extreme $x_R$ belong to $\tilde{X}$ with a gradually lower degree.

Campobasso F. and Fanizzi A.
A STEPWISE PROCEDURE TO SELECT VARIABLES IN A FUZZY LEAST SQUARE REGRESSION MODEL.
DOI: 10.5220/0003720504170426
In Proceedings of the International Conference on Evolutionary Computation Theory and Applications (FCTA-2011), pages 417-426
ISBN: 978-989-8425-83-6
© 2011 SCITEPRESS (Science and Technology Publications, Lda.)

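The set of triangular fuzzy numbers is closed under addition: given two triangular fuzzy numbers $\tilde{X} = (x, x_L, x_R)_T$ and $\tilde{Y} = (y, y_L, y_R)_T$, their sum $\tilde{Z}$ is still a triangular fuzzy number

$$\tilde{Z} = \tilde{X} + \tilde{Y} = (x + y,\; x_L + y_L,\; x_R + y_R)_T .$$

Moreover the opposite of a triangular fuzzy number $\tilde{X} = (x, x_L, x_R)_T$ is

$$-\tilde{X} = (-x,\; -x_R,\; -x_L)_T .$$

It follows that, given n fuzzy numbers $\tilde{X}_i = (x_i, x_{Li}, x_{Ri})_T$, i = 1, 2, ..., n, their average is

$$\bar{\tilde{X}} = \left( \frac{\sum_i x_i}{n},\; \frac{\sum_i x_{Li}}{n},\; \frac{\sum_i x_{Ri}}{n} \right)_T .$$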

Diamond (1988) introduced a metric onto the space of triangular fuzzy numbers; according to this metric, the squared distance between $\tilde{X}$ and $\tilde{Y}$ is

$$d(\tilde{X}, \tilde{Y})^2 = d\big((x, x_L, x_R)_T, (y, y_L, y_R)_T\big)^2 = (x - y)^2 + (x_L - y_L)^2 + (x_R - y_R)^2 .$$
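The operations introduced so far (sum, opposite, average and Diamond's squared distance) can be illustrated with a minimal Python sketch. The class name `TriFuzzy` and the helper names are hypothetical and chosen only for this example; each number is stored as (core, left extreme, right extreme).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriFuzzy:
    """Triangular fuzzy number stored as (core, left extreme, right extreme)."""
    core: float
    left: float
    right: float

    def __add__(self, other):
        # Cores, left extremes and right extremes add component-wise.
        return TriFuzzy(self.core + other.core,
                        self.left + other.left,
                        self.right + other.right)

    def __neg__(self):
        # The opposite exchanges the extremes: -(x, xL, xR) = (-x, -xR, -xL).
        return TriFuzzy(-self.core, -self.right, -self.left)


def fuzzy_mean(numbers):
    """Average of n triangular fuzzy numbers, taken component by component."""
    n = len(numbers)
    return TriFuzzy(sum(f.core for f in numbers) / n,
                    sum(f.left for f in numbers) / n,
                    sum(f.right for f in numbers) / n)


def sq_distance(a, b):
    """Diamond's squared distance between two triangular fuzzy numbers."""
    return ((a.core - b.core) ** 2
            + (a.left - b.left) ** 2
            + (a.right - b.right) ** 2)
```

For instance, with X = (2, 1, 4) and Y = (5, 3, 6) the squared distance is 9 + 4 + 4 = 17.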

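The same Author treated the fuzzy regression model of a dependent variable $\tilde{Y}$ on a single independent variable $\tilde{X}$, which can be written as

$$\tilde{Y}^* = a + b\tilde{X}, \quad a, b \in \mathbb{R},$$

when the intercept a is non-fuzzy, as well as

$$\tilde{Y}^* = \tilde{A} + b\tilde{X}, \quad a, b \in \mathbb{R},$$

when the intercept $\tilde{A} = (a, a_L, a_R)_T$ is fuzzy, where it is $a_L = a - \gamma_L$, $a_R = a + \gamma_R$ and $\gamma_L, \gamma_R > 0$.

The expression of the corresponding parameters is derived from minimizing the sum $\sum_i d(\tilde{Y}_i^*, \tilde{Y}_i)^2$ of the squared distances between theoretical and empirical values in n observed units of the fuzzy dependent variable $\tilde{Y}$ with respect to a and b.

Such a sum takes different forms according to the sign of the coefficient b, as the product of a fuzzy number $\tilde{X} = (x, x_L, x_R)_T$ and a real number k depends on whether the latter is positive or negative: for k > 0 it is $k\tilde{X} = (kx, kx_L, kx_R)_T$, while for k < 0 the extremes are exchanged, $k\tilde{X} = (kx, kx_R, kx_L)_T$, so that the left extreme of the product is obtained by subtracting the right spread (rescaled by |k|) from the core.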

Diamond demonstrated that the optimization

problem has a unique solution under certain

conditions.

In previous works we provided some theoretical

results about the estimates of the regression

coefficients and about the decomposition of the sum

of squares of the dependent variable (Campobasso,

Fanizzi and Tarantini, 2009) in a multiple regression

model. In particular we treated the case of a non-

fuzzy intercept, as well as the case of a fuzzy

intercept, which seems more appropriate

(Campobasso and Fanizzi, 2011) for some reasons

which will be clearer later.

3 A MULTIVARIATE GENERALIZATION OF THE REGRESSION MODEL

3.1 A Generalization of the Model Including a Non-fuzzy Intercept

Let us assume that we observe a fuzzy dependent variable $\tilde{Y}_i = (y_i, y_{Li}, y_{Ri})_T$ and two fuzzy independent variables, $\tilde{X}_i = (x_i, x_{Li}, x_{Ri})_T$ and $\tilde{Z}_i = (z_i, z_{Li}, z_{Ri})_T$, on a set of n units. The linear regression model is given by

$$\tilde{Y}_i^* = a + b\tilde{X}_i + c\tilde{Z}_i, \quad i = 1, 2, \ldots, n; \quad a, b, c \in \mathbb{R}.$$

The corresponding parameters are determined by minimizing the sum of Diamond’s squared distances between theoretical and empirical values of the dependent variable

$$\sum_i d(a + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 \qquad (1)$$

with respect to a, b and c. As we stated above, such a sum assumes different expressions according to the signs of the regression coefficients b and c. This generates the following four cases.

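Case 1: b>0, c>0

$$\sum_i d(a + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a - bx_{Li} - cz_{Li})^2 + (y_{Ri} - a - bx_{Ri} - cz_{Ri})^2\big]$$

Case 2: b<0, c>0

$$\sum_i d(a + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a - bx_{Ri} - cz_{Li})^2 + (y_{Ri} - a - bx_{Li} - cz_{Ri})^2\big]$$

Case 3: b>0, c<0

$$\sum_i d(a + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a - bx_{Li} - cz_{Ri})^2 + (y_{Ri} - a - bx_{Ri} - cz_{Li})^2\big]$$

Case 4: b<0, c<0

$$\sum_i d(a + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a - bx_{Ri} - cz_{Ri})^2 + (y_{Ri} - a - bx_{Li} - cz_{Li})^2\big]$$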

FCTA 2011 - International Conference on Fuzzy Computation Theory and Applications


Let’s consider, as an example, case 3 and let’s express it in matrix form. The expression to be minimized is given by

$$G(\beta) = \|y - X\beta\|^2 + \|y_L - X_L\beta\|^2 + \|y_R - X_R\beta\|^2 = (y - X\beta)'(y - X\beta) + (y_L - X_L\beta)'(y_L - X_L\beta) + (y_R - X_R\beta)'(y_R - X_R\beta) \qquad (2)$$

where

$y = [y_i]$ is the n-dimensional vector of cores of the dependent variable;

$y_L = [y_{Li}]$ and $y_R = [y_{Ri}]$ are the n-dimensional vectors of lower extremes and upper extremes of the dependent variable respectively;

$X$ is the n×3 matrix of cores of the independent variables, formed by vectors 1, $x = [x_i]$, $z = [z_i]$;

$X_L$ is the n×3 matrix of lower bounds of the independent variables, formed by vectors 1, $x_L = [x_{Li}]$, $z_R = [z_{Ri}]$ (the right extremes of Z enter here because c < 0);

$X_R$ is the n×3 matrix of upper bounds of the independent variables (analogous to $X_L$), formed by vectors 1, $x_R = [x_{Ri}]$, $z_L = [z_{Li}]$;

$\beta$ is the vector $(a, b, c)'$.

The estimates of the regression coefficients are derived from minimizing $G(\beta)$ with respect to $\beta$, i.e. from seeking the solutions of the system

$$[X'X + X_L'X_L + X_R'X_R]\,\beta - [X'y + X_L'y_L + X_R'y_R] = 0$$

and in particular we obtain

$$\beta^* = [X'X + X_L'X_L + X_R'X_R]^{-1}\,[X'y + X_L'y_L + X_R'y_R].$$

Similarly to the OLS estimation procedure, the optimization problem admits a single and finite solution if $[X'X + X_L'X_L + X_R'X_R]$ is invertible and the Hessian matrix is positive definite.

The solution found, $\beta^* = (a^*, b^*, c^*)'$, is admissible if the signs of the regression coefficients are coherent with the basic assumptions ($b^* > 0$, $c^* < 0$).

In the remaining three cases the expression (2) to be minimized is obtained after replacing $z_R$ by $z_L$ in $X_L$ and $z_L$ by $z_R$ in $X_R$ (case 1), $x_L$ by $x_R$ and $z_R$ by $z_L$ in $X_L$ and also $x_R$ by $x_L$ and $z_L$ by $z_R$ in $X_R$ (case 2), $x_L$ by $x_R$ in $X_L$ and $x_R$ by $x_L$ in $X_R$ (case 4) respectively.

The optimal solution is the admissible one that yields the smallest value of (1).

The generalization of such a procedure to the case of several independent variables is immediate; note, however, that the number of solutions to analyse in order to identify the optimal one grows exponentially with the number of variables considered. For example, if the model includes k independent variables, $2^k$ possible cases must be taken into account, which derive from combining the signs of the regression coefficients.
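The enumeration of sign cases described above can be sketched in pure Python as follows. This is only an illustrative sketch, not the authors' Matlab function: the names `solve` and `fit_crisp_intercept` are hypothetical, each fuzzy observation is assumed to be a (core, left, right) triple, and the normal equations are built by stacking the core, left and right rows, which yields the same matrix as X'X + X_L'X_L + X_R'X_R.

```python
from itertools import product


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            raise ValueError("singular normal-equation matrix")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_crisp_intercept(y, xs):
    """y: list of (core, left, right) triples; xs: list of k variables, each a
    list of such triples. Returns (objective, beta) for the best admissible
    sign case, or None if no case is admissible."""
    n, k = len(y), len(xs)
    best = None
    for signs in product((1, -1), repeat=k):          # the 2**k sign cases
        rows, targets = [], []
        for i in range(n):
            for comp in (0, 1, 2):                    # core, left, right of y
                row = [1.0]
                for var, s in zip(xs, signs):
                    # A negative coefficient pairs the left extreme of y with
                    # the right extreme of the regressor, and vice versa.
                    j = comp if (s > 0 or comp == 0) else 3 - comp
                    row.append(var[i][j])
                rows.append(row)
                targets.append(y[i][comp])
        p = k + 1
        A = [[sum(r[u] * r[v] for r in rows) for v in range(p)] for u in range(p)]
        rhs = [sum(r[u] * t for r, t in zip(rows, targets)) for u in range(p)]
        try:
            beta = solve(A, rhs)
        except ValueError:
            continue
        # Admissible only if the estimated signs agree with the assumed case.
        if any(s * coef <= 0 for s, coef in zip(signs, beta[1:])):
            continue
        obj = sum((t - sum(c * v for c, v in zip(beta, r))) ** 2
                  for r, t in zip(rows, targets))
        if best is None or obj < best[0]:
            best = (obj, beta)
    return best
```

The loop over `product((1, -1), repeat=k)` makes the exponential cost explicit: each additional regressor doubles the number of linear systems to solve.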

3.2 A Generalization of the Model Including a Fuzzy Intercept

Now we analyze an extension of the model with a

fuzzy intercept, which seems more appropriate than

the non-fuzzy one as it expresses the average value

of the dependent variable (which is also

fuzzy) when

the independent variables equal zero.

For this purpose we start from the results obtained

by Diamond in the case of the univariate regression

model with a fuzzy intercept.

3.2.1 The Univariate Model

Let’s regress, for example, the dependent variable $\tilde{Y}_i = (y_i, y_{Li}, y_{Ri})_T$ on a single independent variable $\tilde{X}_i = (x_i, x_{Li}, x_{Ri})_T$ in a set of n units. If we consider a symmetric fuzzy intercept $\tilde{A} = (a, a_L, a_R)_T$, where $a_L = a - \gamma$, $a_R = a + \gamma$ and $\gamma > 0$ (if $\gamma = 0$, $\tilde{A}$ would no longer be fuzzy), the model assumes the following expression:

$$\tilde{Y}_i^* = \tilde{A} + b\tilde{X}_i, \quad i = 1, 2, \ldots, n; \quad a, b \in \mathbb{R}.$$

The fuzzy regression parameters are determined by minimizing the sum of the squared Diamond’s distances between theoretical and empirical fuzzy values of the dependent variable

$$\sum_i d(\tilde{A} + b\tilde{X}_i, \tilde{Y}_i)^2$$

with respect to a, b and $\gamma$.

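The function to minimize assumes different expressions according to the sign of the regression coefficient b. Supposing that b > 0, the estimates of a, b and $\gamma$ are obtained as solutions $a^*$, $b^*$ and $\gamma^*$ of the system of equations

$$\begin{cases}
a\sum_i (x_i + x_{Li} + x_{Ri}) + b\sum_i (x_i^2 + x_{Li}^2 + x_{Ri}^2) + \gamma\sum_i (x_{Ri} - x_{Li}) = \sum_i (x_i y_i + x_{Li} y_{Li} + x_{Ri} y_{Ri}) \\
2n\gamma = \sum_i \big[(y_{Ri} - y_{Li}) - b(x_{Ri} - x_{Li})\big] \\
3na = \sum_i (y_i + y_{Li} + y_{Ri}) - b\sum_i (x_i + x_{Li} + x_{Ri}).
\end{cases}$$

Otherwise, supposing b < 0, the estimates of a, b and $\gamma$ are obtained as solutions $a^*$, $b^*$ and $\gamma^*$ of the system of equations

$$\begin{cases}
a\sum_i (x_i + x_{Li} + x_{Ri}) - \gamma\sum_i (x_{Ri} - x_{Li}) + b\sum_i (x_i^2 + x_{Li}^2 + x_{Ri}^2) = \sum_i (x_i y_i + x_{Ri} y_{Li} + x_{Li} y_{Ri}) \\
2n\gamma = \sum_i \big[(y_{Ri} - y_{Li}) + b(x_{Ri} - x_{Li})\big] \\
3na = \sum_i (y_i + y_{Li} + y_{Ri}) - b\sum_i (x_i + x_{Li} + x_{Ri}).
\end{cases}$$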


As Diamond shows (1988), the solution to such a

problem of minimization exists and is unique if the

following conditions occur simultaneously:

either b

< 0 or b

> 0;

0)yy(

)yy()xx(

)xx(

LiRiLiRiLiRiLiRi

≥

∑

⎥

⎦

⎤

⎢

⎣

⎡

−−−

⎥

⎦

⎤

⎢

⎣

⎡

−−−

;

> b

3.2.2 The Multivariate Model

Now we generalize the regression model with a fuzzy intercept to the case of more than a single independent variable.

Assuming that we regress a dependent variable $\tilde{Y}_i = (y_i, y_{Li}, y_{Ri})_T$ on two independent variables $\tilde{X}_i = (x_i, x_{Li}, x_{Ri})_T$ and $\tilde{Z}_i = (z_i, z_{Li}, z_{Ri})_T$ in a set of n units, the linear regression model including a fuzzy asymmetric intercept $\tilde{A} = (a, a_L, a_R)_T$, where $a_L = a - \gamma_L$, $a_R = a + \gamma_R$ and $\gamma_L, \gamma_R > 0$ (if $\gamma_L = \gamma_R = 0$, $\tilde{A}$ would no longer be fuzzy), assumes the following expression:

$$\tilde{Y}_i^* = \tilde{A} + b\tilde{X}_i + c\tilde{Z}_i, \quad i = 1, 2, \ldots, n; \quad a, b, c \in \mathbb{R}.$$

Note that the asymmetric intercept is more appropriate than the symmetric one: the latter is the special case $\gamma_L = \gamma_R$, so it cannot fit the data more closely.

The corresponding estimates of the parameters are again determined by minimizing the sum of the squared Diamond’s distances between empirical and theoretical values of the dependent variable

$$\sum_i d(\tilde{A} + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 \qquad (3)$$

with respect to a, b, c, $\gamma_L$ and $\gamma_R$. The function to minimize assumes different expressions according to the signs of the regression coefficients b and c.

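Case 1: b>0, c>0

$$\sum_i d(\tilde{A} + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a_L - bx_{Li} - cz_{Li})^2 + (y_{Ri} - a_R - bx_{Ri} - cz_{Ri})^2\big]$$

Case 2: b<0, c>0

$$\sum_i d(\tilde{A} + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a_L - bx_{Ri} - cz_{Li})^2 + (y_{Ri} - a_R - bx_{Li} - cz_{Ri})^2\big]$$

Case 3: b>0, c<0

$$\sum_i d(\tilde{A} + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a_L - bx_{Li} - cz_{Ri})^2 + (y_{Ri} - a_R - bx_{Ri} - cz_{Li})^2\big]$$

Case 4: b<0, c<0

$$\sum_i d(\tilde{A} + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a_L - bx_{Ri} - cz_{Ri})^2 + (y_{Ri} - a_R - bx_{Li} - cz_{Li})^2\big]$$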

Let’s consider, as an example, case 3 and let’s express it in matrix form. The expression to be minimized is given by

$$G(\beta) = \|y - X\beta\|^2 + \|y_L - X_L\beta\|^2 + \|y_R - X_R\beta\|^2 = (y - X\beta)'(y - X\beta) + (y_L - X_L\beta)'(y_L - X_L\beta) + (y_R - X_R\beta)'(y_R - X_R\beta) \qquad (4)$$

where

$y = [y_i]$ is the n-dimensional vector of cores of the dependent variable;

$y_L = [y_{Li}]$ and $y_R = [y_{Ri}]$ are the n-dimensional vectors of lower extremes and upper extremes of the dependent variable respectively;

$X$ is the n×5 matrix of cores of the independent variables, formed by vectors 1, $x = [x_i]$, $z = [z_i]$ and two vectors 0;

$X_L$ is the n×5 matrix of lower bounds of the independent variables, formed by vectors 1, $x_L = [x_{Li}]$, $z_R = [z_{Ri}]$ and -1, 0;

$X_R$ is the n×5 matrix of upper bounds of the independent variables (analogous to $X_L$), formed by vectors 1, $x_R = [x_{Ri}]$, $z_L = [z_{Li}]$ and 0, 1;

$\beta$ is the vector $(a, b, c, \gamma_L, \gamma_R)'$.

The estimates of the regression coefficients are derived from minimizing $G(\beta)$ with respect to $\beta$, i.e. from seeking the solutions of the system

$$[X'X + X_L'X_L + X_R'X_R]\,\beta - [X'y + X_L'y_L + X_R'y_R] = 0$$

and in particular we obtain

$$\beta^* = [X'X + X_L'X_L + X_R'X_R]^{-1}\,[X'y + X_L'y_L + X_R'y_R].$$

Similarly to the OLS estimation procedure, the optimization problem admits a single and finite solution if $[X'X + X_L'X_L + X_R'X_R]$ is invertible and the Hessian matrix is positive definite.

The solution found, $\beta^* = (a^*, b^*, c^*, \gamma_L^*, \gamma_R^*)'$, is admissible if the signs of the regression coefficients are coherent with the basic assumptions, that is $b^* > 0$, $c^* < 0$ and $\gamma_L^*, \gamma_R^* > 0$.

In the remaining three cases the expression (4) to

be minimized is obtained after appropriately


replacing the vectors of the left and right extremes in

the matrices as described above, according to the

case considered. The optimal solution is the admissible one that yields the smallest value of (3).

When the intercept is symmetric, we estimate one parameter fewer than in the previous model, because the left and right spreads coincide (Campobasso and Fanizzi, 2011). Note that the matrices $X$, $X_L$ and $X_R$ relative to the independent variables, and the vector of parameters $\beta$, change their expression. In particular we have that

$X$ is the n×4 matrix of cores of the independent variables, formed by vectors 1, $x = [x_i]$, $z = [z_i]$ and 0;

$X_L$ is the n×4 matrix of lower bounds of the independent variables, formed by vectors 1, $x_L = [x_{Li}]$, $z_R = [z_{Ri}]$ and -1;

$X_R$ is the n×4 matrix of upper bounds of the independent variables (analogous to $X_L$), formed by vectors 1, $x_R = [x_{Ri}]$, $z_L = [z_{Li}]$ and 1;

$\beta$ is the vector $(a, b, c, \gamma)'$.
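As an illustration of how the extra columns encode the intercept, the following Python sketch builds, for a single unit and for case 3 (b > 0, c < 0), the rows of the core, lower and upper matrices under the three intercept types. The function name `design_rows_case3` and its `intercept` argument are hypothetical, introduced only for this example.

```python
def design_rows_case3(x, z, intercept="asymmetric"):
    """x, z: (core, left, right) triples of one unit, for the case b > 0, c < 0.
    Returns the unit's rows of the core, lower and upper design matrices."""
    # Extra columns that multiply the spread parameter(s) of the intercept.
    extra = {
        "crisp": ([], [], []),                                 # beta = (a, b, c)
        "symmetric": ([0.0], [-1.0], [1.0]),                   # beta = (a, b, c, gamma)
        "asymmetric": ([0.0, 0.0], [-1.0, 0.0], [0.0, 1.0]),   # (a, b, c, gamma_L, gamma_R)
    }[intercept]
    core_row = [1.0, x[0], z[0]] + extra[0]
    left_row = [1.0, x[1], z[2]] + extra[1]    # c < 0: right extreme of z
    right_row = [1.0, x[2], z[1]] + extra[2]   # c < 0: left extreme of z
    return core_row, left_row, right_row
```

Multiplying each row by the parameter vector gives the theoretical core, lower extreme and upper extreme of that unit; for example, the lower row times (a, b, c, γ_L, γ_R)' equals a - γ_L + b x_L + c z_R.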

4 DECOMPOSITION OF THE TOTAL SUM OF SQUARES OF THE DEPENDENT VARIABLE

In this section two important theoretical results will be demonstrated: the first one regards the inequality between theoretical and empirical averages of the fuzzy dependent variable (unlike in the classical OLS estimation procedure); the second one regards the decomposition of the total sum of squares of the dependent variable, which involves two further additive components besides the regression and the residual sum of squares.

4.1 The Model Including a Non-fuzzy Intercept

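Let’s consider, only for example, the sum of Diamond’s squared distances between theoretical and empirical values of the dependent variable in case 3:

$$\sum_i d(a + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a - bx_{Li} - cz_{Ri})^2 + (y_{Ri} - a - bx_{Ri} - cz_{Li})^2\big].$$

Setting equal to 0 the derivatives of this sum with respect to a, b and c, we obtain the following system of equations:

$$\begin{cases}
-2\sum_i \big[(y_i - a - bx_i - cz_i) + (y_{Li} - a - bx_{Li} - cz_{Ri}) + (y_{Ri} - a - bx_{Ri} - cz_{Li})\big] = 0 \\
-2\sum_i \big[(y_i - a - bx_i - cz_i)x_i + (y_{Li} - a - bx_{Li} - cz_{Ri})x_{Li} + (y_{Ri} - a - bx_{Ri} - cz_{Li})x_{Ri}\big] = 0 \\
-2\sum_i \big[(y_i - a - bx_i - cz_i)z_i + (y_{Li} - a - bx_{Li} - cz_{Ri})z_{Ri} + (y_{Ri} - a - bx_{Ri} - cz_{Li})z_{Li}\big] = 0
\end{cases}$$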

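Such a system can be written as

$$\begin{cases}
\sum_i \big[(a + bx_i + cz_i) + (a + bx_{Li} + cz_{Ri}) + (a + bx_{Ri} + cz_{Li})\big] = \sum_i (y_i + y_{Li} + y_{Ri}) \\
\sum_i \big[(a + bx_i + cz_i)x_i + (a + bx_{Li} + cz_{Ri})x_{Li} + (a + bx_{Ri} + cz_{Li})x_{Ri}\big] = \sum_i (y_i x_i + y_{Li} x_{Li} + y_{Ri} x_{Ri}) \\
\sum_i \big[(a + bx_i + cz_i)z_i + (a + bx_{Li} + cz_{Ri})z_{Ri} + (a + bx_{Ri} + cz_{Li})z_{Li}\big] = \sum_i (y_i z_i + y_{Li} z_{Ri} + y_{Ri} z_{Li})
\end{cases}$$

Recalling that the theoretical values of the fuzzy dependent variable are $y_i^* = a + bx_i + cz_i$, $y_{Li}^* = a + bx_{Li} + cz_{Ri}$ and $y_{Ri}^* = a + bx_{Ri} + cz_{Li}$, we obtain

$$\begin{cases}
\sum_i (y_i^* + y_{Li}^* + y_{Ri}^*) = \sum_i (y_i + y_{Li} + y_{Ri}) \\
\sum_i (y_i^* x_i + y_{Li}^* x_{Li} + y_{Ri}^* x_{Ri}) = \sum_i (y_i x_i + y_{Li} x_{Li} + y_{Ri} x_{Ri}) \\
\sum_i (y_i^* z_i + y_{Li}^* z_{Ri} + y_{Ri}^* z_{Li}) = \sum_i (y_i z_i + y_{Li} z_{Ri} + y_{Ri} z_{Li})
\end{cases} \qquad (5)$$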

The first equation of the system (5) shows that

the total sum of lower extremes, cores and upper

extremes of the theoretical values of the dependent

variable coincides with the same amount referred to

the empirical values. This equation does not allow us

to say that theoretical and empirical averages of the

fuzzy dependent variable coincide.

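Let’s examine how the total sum of squares of the dependent variable

$$\text{Tot SS} = \sum_i \big[(y_i - \bar{y})^2 + (y_{Li} - \bar{y}_L)^2 + (y_{Ri} - \bar{y}_R)^2\big]$$

can be decomposed according to Diamond’s metric.

Adding and subtracting the corresponding theoretical value within each square and developing all the squares, the total deviance can be expressed as:

$$\begin{aligned}
\text{Tot SS} &= \sum_i \big[(y_i - y_i^* + y_i^* - \bar{y})^2 + (y_{Li} - y_{Li}^* + y_{Li}^* - \bar{y}_L)^2 + (y_{Ri} - y_{Ri}^* + y_{Ri}^* - \bar{y}_R)^2\big] \\
&= \sum_i \big[(y_i - y_i^*)^2 + (y_i^* - \bar{y})^2 + 2(y_i - y_i^*)(y_i^* - \bar{y}) \\
&\qquad + (y_{Li} - y_{Li}^*)^2 + (y_{Li}^* - \bar{y}_L)^2 + 2(y_{Li} - y_{Li}^*)(y_{Li}^* - \bar{y}_L) \\
&\qquad + (y_{Ri} - y_{Ri}^*)^2 + (y_{Ri}^* - \bar{y}_R)^2 + 2(y_{Ri} - y_{Ri}^*)(y_{Ri}^* - \bar{y}_R)\big].
\end{aligned}$$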

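Adding and subtracting the theoretical average values of the lower extremes, of the cores and of the upper extremes of the dependent variable within each square and developing all the squares, the previous expression becomes

$$\begin{aligned}
\text{Tot SS} &= \sum_i \big[(y_i - y_i^*)^2 + (y_i^* - \bar{y}^* + \bar{y}^* - \bar{y})^2 + 2(y_i - y_i^*)(y_i^* - \bar{y}) \\
&\qquad + (y_{Li} - y_{Li}^*)^2 + (y_{Li}^* - \bar{y}_L^* + \bar{y}_L^* - \bar{y}_L)^2 + 2(y_{Li} - y_{Li}^*)(y_{Li}^* - \bar{y}_L) \\
&\qquad + (y_{Ri} - y_{Ri}^*)^2 + (y_{Ri}^* - \bar{y}_R^* + \bar{y}_R^* - \bar{y}_R)^2 + 2(y_{Ri} - y_{Ri}^*)(y_{Ri}^* - \bar{y}_R)\big] \\
&= \sum_i \big[(y_i - y_i^*)^2 + (y_i^* - \bar{y}^*)^2 + (\bar{y}^* - \bar{y})^2 + 2(y_i^* - \bar{y}^*)(\bar{y}^* - \bar{y}) + 2(y_i - y_i^*)(y_i^* - \bar{y}) \\
&\qquad + (y_{Li} - y_{Li}^*)^2 + (y_{Li}^* - \bar{y}_L^*)^2 + (\bar{y}_L^* - \bar{y}_L)^2 + 2(y_{Li}^* - \bar{y}_L^*)(\bar{y}_L^* - \bar{y}_L) + 2(y_{Li} - y_{Li}^*)(y_{Li}^* - \bar{y}_L) \\
&\qquad + (y_{Ri} - y_{Ri}^*)^2 + (y_{Ri}^* - \bar{y}_R^*)^2 + (\bar{y}_R^* - \bar{y}_R)^2 + 2(y_{Ri}^* - \bar{y}_R^*)(\bar{y}_R^* - \bar{y}_R) + 2(y_{Ri} - y_{Ri}^*)(y_{Ri}^* - \bar{y}_R)\big]
\end{aligned}$$

where:

$$\text{Reg SS} = \sum_i d(\tilde{Y}_i^*, \bar{\tilde{Y}}^*)^2 = \sum_i \big[(y_i^* - \bar{y}^*)^2 + (y_{Li}^* - \bar{y}_L^*)^2 + (y_{Ri}^* - \bar{y}_R^*)^2\big]$$

represents the regression sum of squares, while

$$\text{Res SS} = \sum_i d(\tilde{Y}_i, \tilde{Y}_i^*)^2 = \sum_i \big[(y_i - y_i^*)^2 + (y_{Li} - y_{Li}^*)^2 + (y_{Ri} - y_{Ri}^*)^2\big]$$

represents the residual sum of squares, and

$$n\, d(\bar{\tilde{Y}}^*, \bar{\tilde{Y}})^2 = n\big[(\bar{y}^* - \bar{y})^2 + (\bar{y}_L^* - \bar{y}_L)^2 + (\bar{y}_R^* - \bar{y}_R)^2\big]$$

represents the distance between theoretical and empirical average values of the dependent variable.

Synthetically the expression of Tot SS can be written as:

$$\text{Tot SS} = \text{Reg SS} + \text{Res SS} + n\, d(\bar{\tilde{Y}}^*, \bar{\tilde{Y}})^2 + \eta$$

where:

$$\begin{aligned}
\eta &= 2\sum_i \big[(y_i^* - \bar{y}^*)(\bar{y}^* - \bar{y}) + (y_{Li}^* - \bar{y}_L^*)(\bar{y}_L^* - \bar{y}_L) + (y_{Ri}^* - \bar{y}_R^*)(\bar{y}_R^* - \bar{y}_R)\big] \\
&\quad + 2\sum_i \big[(y_i - y_i^*)(y_i^* - \bar{y}) + (y_{Li} - y_{Li}^*)(y_{Li}^* - \bar{y}_L) + (y_{Ri} - y_{Ri}^*)(y_{Ri}^* - \bar{y}_R)\big].
\end{aligned}$$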

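As the sums of deviations of each component from its average equal zero, then it is

$$\sum_i \big[(y_i^* - \bar{y}^*)(\bar{y}^* - \bar{y}) + (y_{Li}^* - \bar{y}_L^*)(\bar{y}_L^* - \bar{y}_L) + (y_{Ri}^* - \bar{y}_R^*)(\bar{y}_R^* - \bar{y}_R)\big] = 0$$

and the amount η is reduced to

$$\begin{aligned}
\eta &= 2\sum_i \big[(y_i - y_i^*)(y_i^* - \bar{y}) + (y_{Li} - y_{Li}^*)(y_{Li}^* - \bar{y}_L) + (y_{Ri} - y_{Ri}^*)(y_{Ri}^* - \bar{y}_R)\big] \\
&= 2\sum_i \big[(y_i - y_i^*)y_i^* + (y_{Li} - y_{Li}^*)y_{Li}^* + (y_{Ri} - y_{Ri}^*)y_{Ri}^*\big] - 2\sum_i \big[(y_i - y_i^*)\bar{y} + (y_{Li} - y_{Li}^*)\bar{y}_L + (y_{Ri} - y_{Ri}^*)\bar{y}_R\big].
\end{aligned}$$

Moreover, as it is $y_i^* = a + bx_i + cz_i$, $y_{Li}^* = a + bx_{Li} + cz_{Ri}$ and $y_{Ri}^* = a + bx_{Ri} + cz_{Li}$, by replacing the expressions of the theoretical values in the first sum we obtain

$$\begin{aligned}
\eta &= 2\Big[a\sum_i (y_i + y_{Li} + y_{Ri}) - a\sum_i (y_i^* + y_{Li}^* + y_{Ri}^*) \\
&\qquad + b\sum_i (y_i x_i + y_{Li} x_{Li} + y_{Ri} x_{Ri}) - b\sum_i (y_i^* x_i + y_{Li}^* x_{Li} + y_{Ri}^* x_{Ri}) \\
&\qquad + c\sum_i (y_i z_i + y_{Li} z_{Ri} + y_{Ri} z_{Li}) - c\sum_i (y_i^* z_i + y_{Li}^* z_{Ri} + y_{Ri}^* z_{Li})\Big] \\
&\quad - 2\sum_i \big[(y_i - y_i^*)\bar{y} + (y_{Li} - y_{Li}^*)\bar{y}_L + (y_{Ri} - y_{Ri}^*)\bar{y}_R\big].
\end{aligned}$$

According to the conditions (5) the terms multiplying a, b and c vanish, and the last expression can be reduced to

$$\eta = -2\sum_i \big[(y_i - y_i^*)\bar{y} + (y_{Li} - y_{Li}^*)\bar{y}_L + (y_{Ri} - y_{Ri}^*)\bar{y}_R\big].$$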

Note that, if the residual sum of squares equals zero, η and $d(\bar{\tilde{Y}}^*, \bar{\tilde{Y}})$ also equal zero, because theoretical and empirical values of the dependent variable coincide for each observation, and so do their averages.

Therefore:

- if the regression sum of squares equals zero, then the model has no forecasting capability, because the sum of the components of the i-th theoretical value equals the sum of the components of the empirical average value (i = 1, ..., n). In fact, since in this case every theoretical value coincides with the theoretical average, it is for each i

$$\sum_i (y_i^* + y_{Li}^* + y_{Ri}^*) = \sum_i (y_i + y_{Li} + y_{Ri}) \;\Rightarrow\; n\bar{y}^* + n\bar{y}_L^* + n\bar{y}_R^* = n\bar{y} + n\bar{y}_L + n\bar{y}_R \;\Rightarrow\; y_i^* + y_{Li}^* + y_{Ri}^* = \bar{y} + \bar{y}_L + \bar{y}_R;$$

- if the residual sum of squares equals zero, the relationship between the dependent variable and the independent ones is well represented by the estimated model. In this case, the total sum of squares is entirely explained by the regression sum of squares.

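4.2 The Model Including a Fuzzy Intercept

Let’s consider, only for example, the sum of Diamond’s squared distances between theoretical and empirical values of the dependent variable in case 3 for a model with fuzzy intercept:

$$\sum_i d(\tilde{A} + b\tilde{X}_i + c\tilde{Z}_i, \tilde{Y}_i)^2 = \sum_i \big[(y_i - a - bx_i - cz_i)^2 + (y_{Li} - a_L - bx_{Li} - cz_{Ri})^2 + (y_{Ri} - a_R - bx_{Ri} - cz_{Li})^2\big].$$

By minimizing such a quantity with respect to a, b, c, $\gamma_L$ and $\gamma_R$ (remember that $a_L = a - \gamma_L$ and $a_R = a + \gamma_R$) we obtain the following system of equations

$$\begin{cases}
-2\sum_i \big[(y_i - a - bx_i - cz_i) + (y_{Li} - a + \gamma_L - bx_{Li} - cz_{Ri}) + (y_{Ri} - a - \gamma_R - bx_{Ri} - cz_{Li})\big] = 0 \\
-2\sum_i \big[(y_i - a - bx_i - cz_i)x_i + (y_{Li} - a + \gamma_L - bx_{Li} - cz_{Ri})x_{Li} + (y_{Ri} - a - \gamma_R - bx_{Ri} - cz_{Li})x_{Ri}\big] = 0 \\
-2\sum_i \big[(y_i - a - bx_i - cz_i)z_i + (y_{Li} - a + \gamma_L - bx_{Li} - cz_{Ri})z_{Ri} + (y_{Ri} - a - \gamma_R - bx_{Ri} - cz_{Li})z_{Li}\big] = 0 \\
2\sum_i (y_{Li} - a + \gamma_L - bx_{Li} - cz_{Ri}) = 0 \\
-2\sum_i (y_{Ri} - a - \gamma_R - bx_{Ri} - cz_{Li}) = 0
\end{cases}$$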

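Such a system can be written as

$$\begin{cases}
\sum_i \big[(a + bx_i + cz_i) + (a - \gamma_L + bx_{Li} + cz_{Ri}) + (a + \gamma_R + bx_{Ri} + cz_{Li})\big] = \sum_i (y_i + y_{Li} + y_{Ri}) \\
\sum_i \big[(a + bx_i + cz_i)x_i + (a - \gamma_L + bx_{Li} + cz_{Ri})x_{Li} + (a + \gamma_R + bx_{Ri} + cz_{Li})x_{Ri}\big] = \sum_i (y_i x_i + y_{Li} x_{Li} + y_{Ri} x_{Ri}) \\
\sum_i \big[(a + bx_i + cz_i)z_i + (a - \gamma_L + bx_{Li} + cz_{Ri})z_{Ri} + (a + \gamma_R + bx_{Ri} + cz_{Li})z_{Li}\big] = \sum_i (y_i z_i + y_{Li} z_{Ri} + y_{Ri} z_{Li}) \\
\sum_i (a - \gamma_L + bx_{Li} + cz_{Ri}) = \sum_i y_{Li} \\
\sum_i (a + \gamma_R + bx_{Ri} + cz_{Li}) = \sum_i y_{Ri}
\end{cases}$$

Recalling that the theoretical values of the fuzzy dependent variable are $y_i^* = a + bx_i + cz_i$, $y_{Li}^* = a - \gamma_L + bx_{Li} + cz_{Ri}$ and $y_{Ri}^* = a + \gamma_R + bx_{Ri} + cz_{Li}$ respectively, we obtain

$$\begin{cases}
\sum_i (y_i^* + y_{Li}^* + y_{Ri}^*) = \sum_i (y_i + y_{Li} + y_{Ri}) \\
\sum_i (y_i^* x_i + y_{Li}^* x_{Li} + y_{Ri}^* x_{Ri}) = \sum_i (y_i x_i + y_{Li} x_{Li} + y_{Ri} x_{Ri}) \\
\sum_i (y_i^* z_i + y_{Li}^* z_{Ri} + y_{Ri}^* z_{Li}) = \sum_i (y_i z_i + y_{Li} z_{Ri} + y_{Ri} z_{Li}) \\
\sum_i y_{Li}^* = \sum_i y_{Li} \\
\sum_i y_{Ri}^* = \sum_i y_{Ri}
\end{cases} \qquad (6)$$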

The first equation shows that the total sum of

cores and extremes of the theoretical values of the

dependent variable coincides with the same amount

referred to the empirical values. The combination of

the first equation with the last two allows us to state

that theoretical and empirical values of the average

fuzzy dependent variable coincide, like it happens in

the classic OLS estimation procedure.

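Let’s examine how the total sum of squares of the dependent variable can be decomposed according to Diamond’s metric:

$$\text{Tot SS} = \sum_i \big[(y_i - \bar{y})^2 + (y_{Li} - \bar{y}_L)^2 + (y_{Ri} - \bar{y}_R)^2\big].$$

Adding and subtracting the corresponding theoretical value within each square and developing all the squares, the total deviance can be expressed as:

$$\begin{aligned}
\text{Tot SS} &= \sum_i \big[(y_i - y_i^* + y_i^* - \bar{y})^2 + (y_{Li} - y_{Li}^* + y_{Li}^* - \bar{y}_L)^2 + (y_{Ri} - y_{Ri}^* + y_{Ri}^* - \bar{y}_R)^2\big] \\
&= \sum_i \big[(y_i - y_i^*)^2 + (y_i^* - \bar{y})^2 + 2(y_i - y_i^*)(y_i^* - \bar{y}) \\
&\qquad + (y_{Li} - y_{Li}^*)^2 + (y_{Li}^* - \bar{y}_L)^2 + 2(y_{Li} - y_{Li}^*)(y_{Li}^* - \bar{y}_L) \\
&\qquad + (y_{Ri} - y_{Ri}^*)^2 + (y_{Ri}^* - \bar{y}_R)^2 + 2(y_{Ri} - y_{Ri}^*)(y_{Ri}^* - \bar{y}_R)\big].
\end{aligned}$$

Adding and subtracting the theoretical average values of the lower extremes, of the cores and of the upper extremes of the dependent variable within each square and developing all the squares, the previous expression becomes

$$\begin{aligned}
\text{Tot SS} &= \sum_i \big[(y_i - y_i^*)^2 + (y_i^* - \bar{y}^* + \bar{y}^* - \bar{y})^2 + 2(y_i - y_i^*)(y_i^* - \bar{y}) \\
&\qquad + (y_{Li} - y_{Li}^*)^2 + (y_{Li}^* - \bar{y}_L^* + \bar{y}_L^* - \bar{y}_L)^2 + 2(y_{Li} - y_{Li}^*)(y_{Li}^* - \bar{y}_L) \\
&\qquad + (y_{Ri} - y_{Ri}^*)^2 + (y_{Ri}^* - \bar{y}_R^* + \bar{y}_R^* - \bar{y}_R)^2 + 2(y_{Ri} - y_{Ri}^*)(y_{Ri}^* - \bar{y}_R)\big] \\
&= \sum_i \big[(y_i - y_i^*)^2 + (y_i^* - \bar{y}^*)^2 + (\bar{y}^* - \bar{y})^2 + 2(y_i^* - \bar{y}^*)(\bar{y}^* - \bar{y}) + 2(y_i - y_i^*)(y_i^* - \bar{y}) \\
&\qquad + (y_{Li} - y_{Li}^*)^2 + (y_{Li}^* - \bar{y}_L^*)^2 + (\bar{y}_L^* - \bar{y}_L)^2 + 2(y_{Li}^* - \bar{y}_L^*)(\bar{y}_L^* - \bar{y}_L) + 2(y_{Li} - y_{Li}^*)(y_{Li}^* - \bar{y}_L) \\
&\qquad + (y_{Ri} - y_{Ri}^*)^2 + (y_{Ri}^* - \bar{y}_R^*)^2 + (\bar{y}_R^* - \bar{y}_R)^2 + 2(y_{Ri}^* - \bar{y}_R^*)(\bar{y}_R^* - \bar{y}_R) + 2(y_{Ri} - y_{Ri}^*)(y_{Ri}^* - \bar{y}_R)\big]
\end{aligned}$$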

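where:

$$\text{Reg SS} = \sum_i d(\tilde{Y}_i^*, \bar{\tilde{Y}}^*)^2 = \sum_i \big[(y_i^* - \bar{y}^*)^2 + (y_{Li}^* - \bar{y}_L^*)^2 + (y_{Ri}^* - \bar{y}_R^*)^2\big]$$

represents the regression sum of squares, while

$$\text{Res SS} = \sum_i d(\tilde{Y}_i, \tilde{Y}_i^*)^2 = \sum_i \big[(y_i - y_i^*)^2 + (y_{Li} - y_{Li}^*)^2 + (y_{Ri} - y_{Ri}^*)^2\big]$$

represents the residual sum of squares. Moreover, according to the conditions (6), it is

$$\begin{aligned}
\sum_i \big[&(\bar{y}^* - \bar{y})^2 + 2(y_i^* - \bar{y}^*)(\bar{y}^* - \bar{y}) + 2(y_i - y_i^*)(y_i^* - \bar{y}) \\
&+ (\bar{y}_L^* - \bar{y}_L)^2 + 2(y_{Li}^* - \bar{y}_L^*)(\bar{y}_L^* - \bar{y}_L) + 2(y_{Li} - y_{Li}^*)(y_{Li}^* - \bar{y}_L) \\
&+ (\bar{y}_R^* - \bar{y}_R)^2 + 2(y_{Ri}^* - \bar{y}_R^*)(\bar{y}_R^* - \bar{y}_R) + 2(y_{Ri} - y_{Ri}^*)(y_{Ri}^* - \bar{y}_R)\big] = 0.
\end{aligned}$$

Therefore the expression of the total sum of squares of the dependent variable can be reduced to

$$\text{Tot SS} = \text{Reg SS} + \text{Res SS}.$$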

Ultimately the total sum of squares consists only

of two addends, the regression sum of square and the

residual one, like in the classic OLS estimation

procedure, when the intercept has the same form of

the dependent variable.
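The two-term decomposition can be checked numerically on a small example. The sketch below is a simplification and not the authors' procedure: it assumes a single regressor, the case b > 0, and an asymmetric fuzzy intercept, so that the three intercepts (a, a_L, a_R) are free and the least squares solution has a closed form with one pooled slope. The positivity of the spreads (a_L < a < a_R) is not enforced and must be verified afterwards. Function names are hypothetical.

```python
def fit_univariate_asymmetric(y, x):
    """y, x: lists of (core, left, right) triples; assumes the case b > 0.
    Returns (b, (a, a_L, a_R)) minimizing the sum of squared Diamond distances."""
    n = len(y)
    ybar = [sum(t[c] for t in y) / n for c in range(3)]
    xbar = [sum(t[c] for t in x) / n for c in range(3)]
    # Pooled slope: deviations are taken from each component's own mean.
    sxy = sum((x[i][c] - xbar[c]) * (y[i][c] - ybar[c])
              for i in range(n) for c in range(3))
    sxx = sum((x[i][c] - xbar[c]) ** 2 for i in range(n) for c in range(3))
    b = sxy / sxx
    intercepts = tuple(ybar[c] - b * xbar[c] for c in range(3))
    return b, intercepts


def sums_of_squares(y, x, b, intercepts):
    """Return (Tot SS, Reg SS, Res SS) computed with Diamond's metric."""
    n = len(y)
    ybar = [sum(t[c] for t in y) / n for c in range(3)]
    fitted = [[intercepts[c] + b * x[i][c] for c in range(3)] for i in range(n)]
    fbar = [sum(f[c] for f in fitted) / n for c in range(3)]
    tot = sum((y[i][c] - ybar[c]) ** 2 for i in range(n) for c in range(3))
    reg = sum((fitted[i][c] - fbar[c]) ** 2 for i in range(n) for c in range(3))
    res = sum((y[i][c] - fitted[i][c]) ** 2 for i in range(n) for c in range(3))
    return tot, reg, res
```

With these assumptions the theoretical and empirical averages coincide component by component, so Tot SS equals Reg SS plus Res SS up to rounding error.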

Note that, when the intercept does not have the same form as the dependent variable, theoretical and empirical average values of the latter do not coincide; rather, the total sum of lower extremes, cores and upper extremes of the theoretical values coincides with the same amount referred to the empirical values:

$$\begin{cases}
\sum_i (y_i^* + y_{Li}^* + y_{Ri}^*) = \sum_i (y_i + y_{Li} + y_{Ri}) \\
\sum_i (y_i^* x_i + y_{Li}^* x_{Li} + y_{Ri}^* x_{Ri}) = \sum_i (y_i x_i + y_{Li} x_{Li} + y_{Ri} x_{Ri}) \\
\sum_i (y_i^* z_i + y_{Li}^* z_{Ri} + y_{Ri}^* z_{Li}) = \sum_i (y_i z_i + y_{Li} z_{Ri} + y_{Ri} z_{Li}) \\
\sum_i (y_{Ri}^* - y_{Li}^*) = \sum_i (y_{Ri} - y_{Li})
\end{cases}$$

(the last equation arises only when the intercept is fuzzy symmetric).

In this case the total sum of squares of the

dependent variable consists of two other components

in addition to the regression sum of square and the

residual one: the first is residual in nature and is

characterized by an uncertain sign, the second is

equal to n times the distance between theoretical and

empirical average values of the dependent variable.

5 A FUZZY MODEL FIT INDEX

We have just demonstrated that the total sum of

squares of the dependent variable consists only of

two addends, the regression sum of square and the

residual one, when the intercept is fuzzy

asymmetric. This is because theoretical and

empirical average values of the dependent variable

coincide and, therefore, both the total sum of squares

and the regression one can be expressed in terms of

distance between empirical values and their

averages.

Under these circumstances, the greater the

regression sum of squares the better the model fits

the data.

When there are more addends of the total sum of

squares than those just mentioned, an increase in the

regression sum of square does not necessarily imply

a better fit to observed data: this is because the

theoretical average value, from which the regression

sum of squares is calculated, may be very different

from the empirical one. On the contrary a decrease

in the residual sum of squares necessarily implies a

better fit to observed data.

In order to assess the goodness of fit of the regression model, we propose the following index, for simplicity called Fuzzy Fit Index (FFI), which is common to all three models:

$$FFI = 1 - \frac{\text{Res SS}}{\text{Tot SS}} = 1 - \frac{\sum_i d(\tilde{Y}_i, \tilde{Y}_i^*)^2}{\sum_i d(\tilde{Y}_i, \bar{\tilde{Y}})^2}$$

where $\bar{\tilde{Y}}^* = (\bar{y}^*, \bar{y}_L^*, \bar{y}_R^*)_T$ and $\bar{\tilde{Y}} = (\bar{y}, \bar{y}_L, \bar{y}_R)_T$ denote the fuzzy theoretical average and the fuzzy empirical average of the dependent variable respectively.

The closer this index is to 1, the smaller the residual sum of squares is and the better the model fits the observed data.
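A minimal Python sketch of the index follows, assuming observed and fitted values are given as lists of (core, left, right) triples; the function name `fuzzy_fit_index` is hypothetical.

```python
def fuzzy_fit_index(observed, fitted):
    """FFI = 1 - Res SS / Tot SS, with both sums based on Diamond's metric.
    observed, fitted: lists of (core, left, right) triples of equal length."""
    n = len(observed)
    # Empirical average of each component (core, left extreme, right extreme).
    ybar = [sum(t[c] for t in observed) / n for c in range(3)]
    res_ss = sum((o[c] - f[c]) ** 2
                 for o, f in zip(observed, fitted) for c in range(3))
    tot_ss = sum((o[c] - ybar[c]) ** 2 for o in observed for c in range(3))
    return 1.0 - res_ss / tot_ss
```

A perfect fit gives FFI = 1, while a model that always predicts the empirical fuzzy average gives FFI = 0.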

With specific reference to the model with a

symmetric (both fuzzy and not) intercept, if the

residual sum of squares decreases, also the distance

between theoretical and empirical fuzzy averages of

the dependent variable decreases, as well as the

component η of the total sum of squares. It follows

ultimately that the forecasting capability of the

model increases.

6 A STEPWISE FORWARD PROCEDURE TO SELECT INDEPENDENT VARIABLES

The selection of the most significant independent

variables presents greater difficulties from a

computational point of view in the case of a fuzzy

regression model than in the classic one.

In classical regression analysis, if the number p of independent variables is limited, the optimal subset of them can be selected by examining in succession at most $\sum_{k=1}^{p} \frac{p!}{k!\,(p-k)!}$ models, from the simple ones (k = 1) to the saturated one (k = p).

The fuzzy approach makes the search for optimal

combinations of explanatory variables more

complex from a computational point of view.

The total number of potential hyperplanes to be tested increases exponentially with the number p of starting variables considered: in fact, for each subset of q ≤ p variables, $2^q$ different hyperplanes result from all combinations of the signs assumed by the corresponding regression coefficients.

In order to avoid complications related to the

above checks, we introduce a stepwise procedure

which enables us to find the optimal combination of

the starting variables by including only one of them

at a time. At each iteration the procedure selects the

variable which helps to explain the total sum of

squares of the dependent variable more than the

other variables not yet included in the model and

which is also less correlated with the ones already

included. This allows us to estimate at most $\sum_{k=0}^{p-1} (p-k)\,2^{k+1}$ models.
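To give a feeling for the orders of magnitude involved, the following Python sketch counts the fits required by each strategy. The stepwise count is an assumption made for this example: it supposes that, with k variables already in the model, each of the p - k candidates is tried under all 2^(k+1) sign combinations. Function names are hypothetical.

```python
from math import comb


def classical_models(p):
    """All non-empty subsets of p regressors: sum of C(p, k) = 2**p - 1."""
    return sum(comb(p, k) for k in range(1, p + 1))


def fuzzy_exhaustive_fits(p):
    """Each subset of q regressors requires 2**q sign cases: 3**p - 1 fits."""
    return sum(comb(p, q) * 2 ** q for q in range(1, p + 1))


def fuzzy_stepwise_fits(p):
    """Assumed upper bound for the forward procedure: at the step with k
    variables already included, p - k candidates times 2**(k + 1) sign cases."""
    return sum((p - k) * 2 ** (k + 1) for k in range(p))
```

For p = 5 these functions return 31, 242 and 114 respectively; the gap between the exhaustive and the stepwise count widens quickly as p grows.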

More specifically, in the first step the variable $\tilde{X}^{(1)}$ is included in the equation if it presents the highest correlation with the dependent variable $\tilde{Y}$; in the q-th step the variable $\tilde{X}^{(q)}$ is selected to enter the model if its explanatory contribution to the sum of squares of $\tilde{Y}$ is higher than that of the other variables not yet included and also than an arbitrary threshold value. Such a contribution can be measured as the increase in the FFI due to the introduction of $\tilde{X}^{(q)}$ into the equation, equal to $FFI_{y;1,2,\ldots,q} - FFI_{y;1,2,\ldots,q-1}$ (where the two terms of the subtraction represent the proportion of the sum of squares of $\tilde{Y}$ explained by the model with and without $\tilde{X}^{(q)}$ respectively). The higher the threshold value, the more readily the procedure inhibits the entry of new independent variables, because a larger increase in the explained fraction of the total variability is required.

Once $\tilde{X}^{(q)}$ is selected, its originality is evaluated through the so-called tolerance $T_q = 1 - FFI_{q;1,2,\ldots,q-1}$, where $FFI_{q;1,2,\ldots,q-1}$ represents the share of variability of $\tilde{X}^{(q)}$ explained by the q-1 independent variables already in the model. The tolerance ranges between 0 and 1, depending on the degree of linear correlation of $\tilde{X}^{(q)}$ with the other variables; therefore, only if $T_q$ exceeds a threshold between 0 and 1 will $\tilde{X}^{(q)}$ become part of the model. A high value of the threshold allows only very original variables to be selected, but it can also stop the process right from the initial steps; on the contrary, a low value allows most of the variables to enter the equation, provided that they explain a significant fraction of the variability of $\tilde{Y}$. The described procedure stops when none of the variables not yet included in the equation can make a significant contribution to the model, or when none of the candidate variables is significantly original.
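The forward procedure can be outlined in Python as follows. This is a hedged sketch rather than the authors' Matlab function: `ffi_y` and `ffi_x` are hypothetical callables supplied by the user (for example built on a fuzzy least squares fit), the thresholds `gain_min` and `tol_min` are illustrative defaults, the first variable is chosen by FFI gain instead of correlation, and a candidate that fails the tolerance check is assumed to be discarded for good.

```python
def forward_stepwise(candidates, ffi_y, ffi_x, gain_min=0.01, tol_min=0.1):
    """Forward selection driven by the Fuzzy Fit Index.

    candidates: list of variable names.
    ffi_y(subset): FFI of the model regressing Y on the variables in `subset`.
    ffi_x(name, subset): FFI of the model regressing variable `name` on `subset`.
    Returns the list of selected variable names, in order of entry.
    """
    selected, current = [], 0.0
    remaining = list(candidates)
    while remaining:
        # Explanatory contribution: increase in FFI if the candidate enters.
        gains = {v: ffi_y(selected + [v]) - current for v in remaining}
        best = max(remaining, key=lambda v: gains[v])
        if gains[best] <= gain_min:
            break  # no remaining variable contributes significantly
        # Tolerance: share of the candidate NOT explained by the selected ones.
        tolerance = 1.0 if not selected else 1.0 - ffi_x(best, selected)
        remaining.remove(best)
        if tolerance > tol_min:
            selected.append(best)
            current = ffi_y(selected)
    return selected
```

Passing the two FFI evaluators as arguments keeps the selection logic independent of how the fuzzy regression itself is estimated.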

For an application of this procedure see

Montrone, Campobasso, Perchinunno and Fanizzi,

2011, which elaborates on data revealed by the EU-

SILC survey of 2006 regarding the perception of

poverty by Italian families. For this purpose, by

using the editor of Matlab, we generated a function

which requires, as input parameters, the matrices of

cores, left extremes and right extremes both of the

dependent and of the independent fuzzy variables.

A more accurate procedure provides the possibility of eliminating at each iteration variables already included in the model, whose explanatory contribution is superseded by the combination of the independent variables introduced later.

In particular, unlike the procedure just described, we can verify at each iteration that the explanatory contribution of the variable $\tilde{X}^{(i)}$ (i = 1, 2, ..., q-1) is still significant once the candidate variable $\tilde{X}^{(q)}$ is inserted. In the q-th step such a contribution can be measured by the reduction of FFI caused by the elimination of the variable $\tilde{X}^{(i)}$ from the model, equal to $FFI_{y;1,2,\ldots,q} - FFI_{y;1,2,\ldots,q\,(-i)}$ (where the two terms of the subtraction represent the proportion of the sum of squares of $\tilde{Y}$ explained by the model including all the variables and by the model without the variable $\tilde{X}^{(i)}$, respectively). So, the variable $\tilde{X}^{(i)}$ remains in the model if this reduction is positive and also exceeds an arbitrary threshold value.

7 CONCLUSIONS

In this work we first derive the explicit expressions of the estimated parameters of a multivariate fuzzy regression model with a fuzzy asymmetric intercept. Such an intercept is more appropriate than a non-fuzzy one, as it is to be estimated by the average value of the dependent variable (which is also fuzzy) when the independent variables equal zero.

Moreover we verify that the sum of squares of the dependent variable consists simply of the regression sum of squares and the residual one, as happens in the classic OLS estimation procedure, only when the intercept is fuzzy asymmetric triangular. Conversely, when the intercept is symmetric (both fuzzy and not), the analysis of the forecasting capability of the model is more difficult.

This happens because of the presence of two

additional components of the sum of squares: the

first one which is related to the difference between

the theoretical and the empirical average values of

the dependent variable, the second one which is

residual in nature and is characterized by an

uncertain sign.

The selection of the most significant independent variables in a fuzzy regression model presents computational difficulties due to the large number of potential hyperplanes to be tested. We propose to overcome such difficulties through a stepwise procedure, based on a fuzzy version of the $R^2$ index.

In each step a single variable, chosen among the starting ones, is included according to two basic criteria: its explanatory contribution to the model and its originality with respect to the other variables already included in the model.

A more accurate procedure allows variables already included in the model to be eliminated at each iteration, when their explanatory contribution is superseded by the combination of the independent variables introduced later.

The forecasting capability of the proposed fuzzy

regression model has been successfully verified in a

recent application to data revealed by the EU-SILC

survey of 2006, regarding the perception of poverty

by Italian families. In that application we used the Matlab editor and, in particular, we

generated a function which requires, as input

parameters, the matrices of cores, left extremes and

right extremes both of the dependent and of the

independent fuzzy variables.

Possible improvements to the model mainly concern membership functions whose shape differs from the triangular one.

REFERENCES

Bilancia, M., Campobasso, F., Fanizzi, A., 2010. The pricing of risky securities in a Fuzzy Least Square Regression model. In Advances in Data Analysis and Classification 2010. Springer Berlin-Heidelberg-New York.

Campobasso, F., Fanizzi, A., Tarantini, M., 2009. Some results on a multivariate generalization of the Fuzzy Least Square Regression. In Proceedings of the International Conference on Fuzzy Computation, Madeira.

Campobasso, F., Fanizzi, A., 2011. A Fuzzy Approach To The Least Squares Regression Model With A Symmetric Fuzzy Intercept. In Proceedings of the 14th Applied Stochastic Model and Data Analysis Conference, Roma.

Campobasso, F., Perchinunno, P., Fanizzi, A., 2008. Homogenous Urban Poverty Clusters within the city of Bari. In Lecture Notes in Computer Science ICCSA 2008. Springer.

Diamond, P. M., 1988. Fuzzy Least Square. In Information Sciences.

Kao, C., Chyu, C. L., 2003. Least-squares estimates in fuzzy regression analysis. In European Journal of Operational Research.

Montrone, S., Campobasso, F., Perchinunno, P., Fanizzi, A., 2011. A Fuzzy Approach to the Small Area Estimation of Poverty in Italy. In Advances in Intelligent Decision Technologies – Proceedings of the Second KES International Symposium IDT 2010, Springer.

Montrone, S., Campobasso, F., Perchinunno, P., Fanizzi, A., 2011. An Analysis of Poverty in Italy through a fuzzy regression model. In Lecture Notes in Computer Science ICCSA 2011, Springer.

Montrone, S., Perchinunno, P., Di Giuro, A., Torre, C. M., Rotondo, F., 2011. Identification of hot spot of social and housing difficulty in urban areas. In Lecture Notes in Computer Science ICCSA 2011, Springer.

Takemura, K., 2005. Fuzzy least squares regression analysis for social judgment study. In Journal of Advanced Intelligent Computing and Intelligent Informatics.

FCTA 2011 - International Conference on Fuzzy Computation Theory and Applications
