An Architecture-Altering and Training Methodology for

Neural Logic Networks: Application in the banking

sector

Athanasios Tsakonas

1

and Georgios Dounias

2

1

Department of Production and Management Engineering, Demokritus University of Thrace,12

Vas.Sofias St.,Xanthi,Greece

2

Department of Financial and Management Engineering, University of the Aegean, 31 Fostini

St., Chios, Greece

Abstract. Artificial neural networks have been universally acknowledged for

their ability on constructing forecasting and classifying systems. Among their

desirable features, it has always been the interpretation of their structure, aiming

to provide further knowledge for the domain experts. A number of

methodologies have been developed for this reason. One such paradigm is the

neural logic networks concept. Neural logic networks have been especially

designed in order to enable the interpretation of their structure into a number of

simple logical rules and they can be seen as a network representation of a

logical rule base. Although powerful by their definition in this context, neural

logic networks have performed poorly when used in approaches that required

training from data. Standard training methods, such as the back-propagation,

require the network’s synapse weight altering, which destroys the network’s

interpretability. The methodology in this paper overcomes these problems and

proposes an architecture-altering technique, which enables the production of

highly antagonistic solutions while preserving any weight-related information.

The implementation involves genetic programming using a grammar-guided

training approach, in order to provide arbitrarily large and connected neural

logic networks. The methodology is tested in a problem from the banking sector

with encouraging results.

1 Introduction

Being a methodology that aims to the integration between artificial neural networks

and AI’s rule-based systems [10], the neural logic networks paradigm [16] has been

developed and applied nowadays, in a number of domains [9]. Neural logic networks

have been proved successful when used in the AI framework, by providing network

representations for nearly every logical rule base. The potential of using neural logic

networks within the computational intelligence framework has been proposed since

their first presentation [16]. Various training methods have been developed since

then. In [16], a training methodology related to back-propagation was proposed.

Later, the Supervised Clustering and Matching (SCM) algorithm [15] was introduced.

Tsakonas A. and Dounias G. (2005).

An Architecture-Altering and Training Methodology for Neural Logic Networks: Application in the banking sector.

In Proceedings of the 1st International Workshop on Artiﬁcial Neural Networks and Intelligent Information Processing, pages 82-93

DOI: 10.5220/0001192100820093

Copyright

c

SciTePress

However, when used within the CI framework, the extracted networks almost always

resulted into non-interpretable forms, eliminating this way their potential advantage

over common neural network models. These early training methodologies used

refinements of the edge weights, which resulted in non-interpretable solutions. This

drawback led the research to alternative solving methodologies such as the genetic

programming [1]. However, their system provided a model that was capable of

producing only a limited set of neural logic network representations [1], which were

actually neural logic binary trees. The latter problem is solved in the methodology of

this paper, by providing indirect representation of neural logic network architectures

within the genetic programming framework. The indirect representation is based on

the cellular encoding [2], and is applied into the genetic programming [6] using a

context-free grammar system. The problem where the system is applied comes from

the banking sector and involves the loan management for enterprises.

The paper is organized as follows. In section 2 we introduce the theoretical

background of the neural logic networks and the genetic programming framework.

The paragraph covers also a review in grammar-guided methodologies for genetic

programming and the cellular encoding advances for connectionist systems

representation within genetic programming individuals. Section 3 contains the design

and the implementation description of the proposed system. In section 4 we include

the description of the problem domain, our system configuration and the obtained

results together with a discussion. Finally, section 5 contains our conclusion regarding

this work and proposes future directions for this domain.

2 Background

2.1 Neural Logic Networks

A neural logic network is defined as a finite directed graph. It consists of a set of

input nodes and an output node. In its 3-valued form, the possible value for a node

can be one of three ordered pair activation values (1,0) for true, (0,1) for false and

(0,0) for don't know. Every synapse (edge) is also assigned a an ordered pair weight

(x,y) where x and y are real numbers. An neural logic network and its output value

(a,b) of node P is shown in Figure 1. It is possible to map any rule of conventional

knowledge into a neural logic network by using different sets of weights, which

enable the representation of different logical operations. In Figure 2, some examples

Fi

g

.1.Neural lo

g

ic network and its out

p

ut.

(α

1

,β

1

) Q

1

(α

2

,β

2

) Q

2

(α

k

,β

k

) Q

k

(x

1

,y

1

)

(x

2

,y

2

)

(x

k

,y

k

)

P

11

11

(1, 0) 1

(,) (0,1) 1

(0,0)

kk

jj jj

jj

kk

jj jj

jj

if a x b y

ab if ax by

otherwise

==

==

⎧

−≥

⎪

⎪

⎪

⎪

=−≤−

⎨

⎪

⎪

⎪

⎪

⎩

∑∑

∑∑

83

of these logical operators and their implementation into neural logic networks are

presented. The neural logic networks have not applied in many domains within the CI

framework, although they are powerful by their definition. The main reason can be

located in the fact that for the known training methodologies [16], [15], the

adjustment of the synapse weights eliminates the ability to interpret the extracted

solution into a number of logical rules, thus depriving these networks from their

valuable feature. In Figure 3(a) it is shown an easily interpretable neural logic

network, while in Figure 3(b), a network with adjusted synapse weights fails to be

interpreted. Some steps for the preservation of the interpretability have been

performed by [1], which however proposed a system that lacks the ability to express

arbitrarily large and connected neural logic networks. In Figure 4(a), a network that is

produced by the methodology of by [1] is shown. In Figure 4(b), it is shown a neural

logic network, which performs the fundamental logical operation of XOR, and it

cannot be represented using the direct encoding of [1]. In order to make the networks

able to handle any real value (i.e. not only 3-valued pairs), the fuzzy extension for

neural logic networks has been proposed [16], a design that is also adapted here for

the needs of the problem examined in this paper.

2.2 Genetic Programming

Genetic programming [6] is an evolutionary computation approach, which in its

canonical form enables the automatic generation of mathematical expressions or

programs. Genetic programming retains a significant position among successful

evolutionary computation approaches due to its valuable characteristics, such as the

Fig. 2. Example logical operations in neural logic networks

(2,1/2)

(2,1/2)

(1/2,2)

(1/2,2)

(1/k,0)

(1/k,0)

(1/k,0)

Ρ

1

Ρ

2

Ρ

n

V

k

( Ρ

1

,

Ρ

2

,..,P

n

)

(0,1/k)

(0,1/k)

(0,1/k)

Ρ

1

Ρ

2

Ρ

n

Λ

k

( Ρ

1

,

Ρ

2

,..,P

n

)

(2

n-1

,2

n-1

)

(2

n-2

,2

n-2

)

(2

0

,2

0

)

Ρ

1

Ρ

2

Ρ

n

∆ ( Ρ

1

,

Ρ

2

,..,P

n

)

(1/k,1/k)

(1/k,1/k)

(1/k,1/k)

Ρ

1

Ρ

2

Ρ

n

M

k

( Ρ

1

,

Ρ

2

,..,P

n

)

12

PP∨

at least k-yes

at least k-false

k-majority

12

PP∨

84

flexible variable-length solution representation and the absence of population

convergence tendency. In most implementations, a population of candidate solutions

is maintained, and after a generation is accomplished, the population increases its

fitness for a given problem. The genetic operators that are mostly used in these

algorithms are reproduction, recombination (crossover) and mutation. The first

operation copies an individual without affecting it, the recombination exchanges

genetic material between two individuals and mutation alters a part of a randomly

selected genetic material. A genetic programming training cycle usually includes the

following steps:

1. Initialise a population of individuals at random.

2. Evaluate randomly an individual and compare its fitness to other (this fitness

determines how closely is an individual to the desired goal).

3. Modify an individual with a relatively high fitness using a genetic operator.

4. Repeat steps 2-3 until a termination criterion is met.

Common termination criterions are the accomplishment of a number of generations,

the achievement of a desired classification error, etc.

As previously stated, the genetic programming has been proved an advance over

traditional genetic algorithms due to its ability to construct functional trees of variable

length. This property enables the search for very complex solutions that are usually in

the form of a mathematical formula - an approach that is commonly known as

(a) (b)

Fig. 3. (a) Interpretation of a neural logic network into logical rules. (b) a network with refi-

ned edge weights that cannot be interpreted.

(1/2,0)

(1/2,2)

(1/2,2)

V

4

P

1

P

0

V

8

P

2

(1/2,0)

(2,2)

(1,1)

Q1 ⇐ At least 2-yes(V

4

,V

8

)

Q2

⇐ Conjunction(V

4

,V

8

)

Q3

⇐ Priority(Q1,Q2)

(1,3)

(1,4)

(-2,3)

V

4

P

1

P

0

V

8

P

2

(2,0)

(3/2,-2)

(0,1)

?

(a) (b)

Fig. 4. (a) Tree-like neural logic networks as generated in [1]. (b) a neural logic network that

performs the XOR operation needs the general structure of a finite graph and cannot be

described directly by the approach of [1].

V

0

V

4

(1/2,1/2)

(2,1/2)

(2,1/2)

P

1

P

0

P

2

(1/2,1/2)

(-2,0)

(1,1)

V

1

V

3

(1/2,1/2)

(2,1/2)

(2,1/2)

V

0

P

1

P

0

V

1

P

2

(1/2,1/2)

(-2,0)

(1,1)

V

0

85

symbolic regression. Although later paradigms extended this concept to calculate any

boolean or programming expression, the task of implementing complex intelligent

structures into genetic programming functional sets in not rather straightforward. The

function set that composes an intelligent system retains a specific hierarchy that must

be traced in the genetic programming tree permissible structures. This writing offers

two advantages. First, the search process avoids candidate solutions that are

meaningless or, at least, obscure. Second, the search space is reduced significantly

among only valid solutions. Therefore, when the syntax form of the desired solution

is well defined, it is useful to restrain the genetic programming from searching

solutions with different syntax forms [3], [8]. One of the advantageous methods to

implement such restrictions is to apply syntax constraints to genetic programming

trees, usually with the help of a context-free grammar [2], [5], [13]. The execution of

massively parallel processing intelligent systems such as the neural logic networks

within the genetic programming framework is not considered a straightforward task.

In order to explore variable sized solutions, it is required the application of an indirect

encoding system. The most common one is the cellular encoding [2], in which a

genotype - a point in the search space- can be realised as a descriptive phenotype - a

point in the solution space. More specifically, within such a function set, there are

elementary functions that modify the system architecture together with functions that

calculate tuning variables. Related implementations include encoding for feedforward

and Kohonen neural networks [2], [4], [17] and fuzzy Petri-nets [18]. A similar

technology, called edge encoding, developed by [7] is also today used with human

competitive results in a wide area of applications.

3 Design and Implementation

As mentioned in the previous paragraph, the characteristic feature of neural logic

networks should be the ability to interpret any network architecture into a set of

logical rules. For this reason, the cellular encoding is used in our model to represent

the candidate solutions into genetic programming trees. One cellular encoding scheme

includes (I) functions for architecture altering and (II) functions for parameter tuning.

The functions for parameter tuning have common properties with the usual genetic

programming functions, which operate as procedures or program elements. The

functions that are used for architecture altering however are not used in standard

genetic programming systems. They comprise a function set that alters an embryonic

neural logic network, by entering nodes sequentially or in parallel onto an initial

(elementary) neural logic network, in order to form the final / desirable architecture.

Hence, among the architecture altering functions we may discriminate between (I-a)

functions that enter a node serially, and (I-b) functions that enter a node in parallel.

The problem that has arisen during the prime implementations of cellular encoding

concerns the grammar description, which enabled the existence of networks without

inputs [2], a situation that can easily lead into premature population convergence.

86

Thus, in order to be able for a system to include at least one input, we incorporated

different functions for the architecture altering on the system inputs, than those used

for the architecture altering on internal nodes. Hence, we may further divide the

architecture altering functions into two additional sub-classes, (I-1) functions that are

applied on the system inputs and (I-2) functions that are applied on internal nodes. To

conclude with, we use a function that enters a node in serial to an input node (S1), a

function that enters a node in parallel to an input node (P1), a function that enters a

node in serial to an internal node (S2) and, finally, a function that enters a node in

parallel to an internal node (P2). In Figure 5, we demonstrate the operation regarding

function S1. The application of this function on the B node in Figure 5(a), results in

the construction of the network shown in Figure 5(b). The grey-coloured arrow shows

the running cursor, which marks the point from which any further network expansion

will occur. Figure 6 illustrates the operation of P1 function. Application of this

function on the B node in Figure 6(a), results to the network shown in Figure 6(b).

Table 1 depicts in short the functions that we used in order to describe the

evolutionary neural logic networks. The system grammar is presented in Table 2.

Initial symbol (root) of a tree can be a node of a type

<PROG>. The logical functions

that construct in this work the operator set for the neural logic networks expressed in

our system are shown in Table 3.

A

B

C

A B

(a) (b)

Fig. 5. Application of the function S1 on the Β node.

(a) (b)

Fig. 6. Application of the function P1 on the Β node.

A

B A B

C

87

Table 1. GP- function set for neural logic networks.

Class Functi

on

Operation

P1 Enters a node in parallel to an input node

S1 Enters a node in serial to an input node

P2 Enters a node in parallel to an internal node

S2 Enters a node in serial to an internal node

Architecture

altering

PRO

G

Initial function. Creates the embryonic network

CNR Applies logical operator based on the CNRSEL and

Κ values

LNK Cuts links based on the NUM and CUT values

IN Enters an input parameter value into the network

NUM Supplementary to the LNK function, selects the link

to cut

CUT Supplementary to the LNK function, determines

whether the link will be cut or not

CNR

SEL

Supplementary to the CNR function, its value

determines the operator that will be applied to a

node

Parameter tuning

Κ Supplementary to the CNR, function, its value

determines the logical operator’s parameter

(determined by CNRSEL).

Table 2. Context-free grammar for neural logic networks.

<PROG> : = PROG <PLACE1><SYNAPSE>

<PLACE1> : = S1 <PLACE1><SYNAPSE><PLACE2>

| P1 <PLACE1><PLACE1>

| IN

IN : = Data attribute (system input)

<PLACE2> : = S2 <PLACE2><SYNAPSE><PLACE2>

|P2 <PLACE2><SYNAPSE><PLACE2>

| E

E : =

∅

<SYNAPSE> : = LNK <NUM><CUT><SYNAPSE>

| CNR <CNRSEL><K>

<NUM> : = NUM

<CUT> : = CUT

<CNRSEL> : = CNRSEL

<K> : = K

NUM : = Integer in [1,256]

CUT : = Integer in [0,1]

CNRSEL : = Integer in [0,10]

K : = Integer in [0,9]

88

4 Results and discussion

The proposed system was applied in a banking sector aiming to both provide high

classification rates and the ability to interpret the extracted solution. This data is

acquired by an Australian bank [12]. The features in this data are given as simple

elements and their interpretation is not known, since this data set has been considered

as classified information. However, the application of our methodology to this

problem may provide useful conclusions regarding the system’s effectiveness, since

enough successful applications of other approaches exist [11], [14], in related

literature. The data is consisted of 688 records from which 344 were randomly used as

training set, 172 as validation set and another 172 as test set (unknown data). There

are 67 missing values in total, which were substituted with zeroes. Table 4 presents

the related attributes and the encoding that was used. This data is primarily composed

by discrete attributes, which are encoded into independent binary features for further

processing by our system. The percentage of cases for which the application was

finally accepted, reaches 44.5 % of the total (307 records). A total of 37 records have

one or more missing values (5% of total data). After the training process was

accomplished (200 genetic programming generations), our algorithm generated the

neural logic network shown in Figure 7.

Table 3. Implemented logical functions.

Conjunction (AND)

Disjunction (OR)

Priority

At least k-true

At least k-false

Majority influence

Majority influence of k

2/3 majority

Unanimity

IF-Then (Kleene’ s model)

(logical) Difference

Exclusive OR (XOR)

Negative exclusive OR (XΝOR), or equivalence (EQV)

Negative conjunction (NAND)

Negative disjunction (ΝOR)

Exactly k-true

89

Table 4. Feature description for the Australian bank credit-scoring problem.

Variable Values Encoded features

T1-Τ3 discrete (b, a, ?), 12

missing values

1 (discrete) Æ 3

binaries, 1 of 3 (b="1 0

0", a="0 1 0", ?="0 0 1")

T4-Τ5 continuous

(13.75...80.25)

1 (continuous) Æ 2 ( 1

continuous 0..1, 1

binary, 1: absent 0:

non-absent)

Τ6 continuous (0...28) 1 (continuous) Æ 0..1

Τ7-Τ11 discrete (u, y, l, t,

?), 6 missing values

1 (discrete) Æ 5

binaries, 1 of 5

Τ12-Τ15 discrete (g, p, gg, ?),

6 missing values

1 (discrete) Æ 4

binaries, 1 of 4

Τ16-Τ30 discrete (c, d, cc, i, j,

k, m, r, q, w, x, e,

aa, ff, ?), 9 missing

values

1 (discrete) Æ 15

binaries, 1 of 15

Τ31-Τ40 discrete (v, h, bb, j,

n, z, dd, ff, o, ? ), 9

missing values

1 (discrete) Æ 10

binaries, 1 of 10

Τ41 continuous

(0...28.5)

1 (continuous) Æ 0..1

Τ42 binary 1 (binary) Æ 0..1

Τ43 binary 1 (binary) Æ 0..1

Τ44 continuous (0...67) 1 (continuous) Æ 0..1

Τ45 binary 1 (binary) Æ 0..1

Τ46-Τ48 discrete (g, p, s) 1 (discrete) Æ 3

binaries, 1 of 3

Τ49-Τ50 continuous

(0...2000),13

missing values

1 (continuous) Æ 2 ( 1

continuous 0..1,

1 binary, 1: absent 0:

non-absent)

Τ51 continuous

(0...10000), 13

missing values

1 (continuous) Æ 0..1

The classification rate of this solution in the test set (unknown data) reaches 89.53%

(154/172) which is higher than those reported in literature [12]. The corresponding

classification rates in the training and the validation set were 85.47% (294/344) and

87.21% (150/172) respectively. The generated neural logic network can be described

by the following extremely simple decision rule:

Q ← Conjunction (ΑΝD) (Τ12, Τ15, Τ15, Τ15, Τ15, Τ27, Τ42, Τ42, Τ42, Τ43, Τ44,

Τ49, Τ49)

90

The main conclusions regarding this extracted decision-rule follow:

· Only one simple attribute conjunction (AND operation) is contained in the

resulting network, a result showing that no complex rules should necessarily be

expected for building up an effective decision – making strategy from the bank’s

viewpoint, regarding credit applicants’ evaluation.

· Getting into greater detail, attribute (feature) T15 seems to be of significant

importance for the overall decision-making process. This feature is binary. The initial

data feature from which T15 was derived, receives values from the set {g, p, gg, ?},

with T15 corresponding to the last value.

· Feature T42 is also of significant importance for the decision-making process.

(CNLN (P1 (P1 (In T42) (P1 (In T15) (P1 (In T15) (P1 (P1 (In T42) (P1 (In T44) (P1 (P1

(P1 (In T42) (P1 (In T43) (P1 (In T15) (In T27)))) (In T15)) (In T12)))) (In T49))))) (In

T49)) (Rule 0 0))))

Fig. 7. Neural logic network (description and representation) generated for the

Australian credit-scoring problem.

91

Table 5. Comparison among various approaches for the Australian bank data [14].

Methodology Classification rate in unknown data (%)

Cal5 86.9

Itrule 86.3

LogDisc 85.9

Discrim 85.9

Dipol92 85.9

Radial 85.5

Cart 85.5

Castle 85.2

Bayes 84.9

IndCart 84.8

BackProp 84.6

C4.5 84.5

Smart 84.2

BayTree 82.9

KNN 81.9

Ac2 81.9

NewId 81.9

LVQ 80.3

Alloc80 79.9

Cn2 79.6

QuaDisc 79.3

Default 56.0

This work (NLN) 89.5

· Another significant feature is the T49, which in the initial data set receives values

within the range [0,2000].

The Australian bank data problem is particularly interesting, since it enables the

comparison of our model with a number of competitive statistical and intelligent

approaches. Table 5 presents our results as compared with 22 other competitive

approaches presented in [14]. Among the provided methodologies of Table 5, the

proposed approach succeeded in obtaining the highest classification score using the

specific neural logic network.

5 Conclusions and Further Research

Neural logic networks are a family of artificial neural networks designed with the aim

to provide network interpretation into simple logical rules. Although successful in

representing any set of logical rules into a network structure, the opposite has been

proved unsuccessful. This result derived from the training procedures used so far

which, based on edge tuning, destroyed the network’s interpretation. Aiming to

successfully use the neural logic networks family in the CI framework, this paper

presented an architecture-altering and training methodology. This methodology

overcomes the problems encountered in previous approaches and succeeds in

producing highly accurate and interpretable neural logic networks. The methodology

is tested in a problem from the banking domain with very encouraging results. It is

acknowledged that further experiments are needed in similar or different domains in

order to obtain a wider valuation of the examined methodology.

Further research includes the application of the system into more domains, especially

from the financial sector where the neural networks have been proved an effective

classifying and forecasting methodology and at the same time, the interpretation of a

network has always been a desirable target. Moreover, the research will be driven to

92

implement higher order recursive neural logic network, in an attempt to develop

neural logic networks for financial problems dealing with time-series data.

References

1. Chia H.W-K., Tan C-L., 2001. Neural logic network learning using genetic programming,

Intl. Journal of Comp. Intelligence and Applications, 1:4, pp 357-368

2. Gruau F., 1994. Neural Network Synthesis using Cellular Encoding and the Genetic

Algorithm, Ph.D. Thesis, Ecole Normale Superieure de Lyon, ftp:lip.ens-lyon.fr

(140.77.1.11) pub/Rapports/PhD PhD94-01-E.ps.Z

3. Gruau F., Whitley D., Pyeatt L., 1996. A Comparison between Cellular Encoding and

Direct Encoding for Genetic Neural Networks, in Koza J.R., Goldberg D.E., Fogel D.B.,

Riolo R.L. (eds.), Genetic Programming 1996: Proceedings of the First Annual Conf., pp

81-89, Cambridge, MA, MIT Press

4. Hussain T., Browse R., 1998. Attribute Grammars for Genetic Representations of Neural

Networks and Syntactic Constraints of Genetic Programming, in AIVIGI’98:, Workshop on

Evol.Comp., Vancouver BC.

5. Janikow C.Z., 1996. A Methodology for Processing Problem Constraints in Genetic

Programming, in Computers Math.Applic. Vol.32:8,pp 97-113.

6. Koza J.R., 1992. Genetic Programming: On the Programming of Computers by Means of

Natural Selection, Cambridge, MA, MIT Press.

7. Koza J.R., Keane M.A., Streeter M.J., 2003. Genetic Programming Human-Competitive

Results, IEEE Intelligent Systems, May/June 2003, pp 25-31.

8. Montana D.J., 1995. Strongly Typed Genetic Programming, Evolutionary Computation,

vol. .3, no. 2.

9. Quah T-S., Tan C-L., Teh H-H.,Sriniivasan B., 1995. Utilizing a Neural Logic Expert

system in Currency Option Trading, Expert Systems with Applications, 9:2, pp 213-222.

10. Quah T-S., Tan C-L., Raman K., Sriniivasan B., 1996. Towards integrating rule-based

expert systems and neural networks, Decision Support Systems, 17, pp 99-118.

11. Quinlan, J.R., 1987. Simplifying decision trees. International Journal of Man-Machine

Studies, 27, December: pp. 221-234.

12. Quinlan J.R. 1992. C4.5: Programs for Machine Learning, Morgan Kaufmann.

13. Ryan C., Collins J.J., O'Neil M. , 1998. Grammatical Evolution: Evolving Programs for an

Arbitrary Language, in W.Banzhaf, R.Poli, M.Schoenauer, T.C.Fogarty (Eds.), Genetic

Programming, Lecture Notes in Computer Science, Springer.

14. Statlog 2005. Statlog Use, Test of Australian Credit Scoring data, available online:

http://www.liacc.up.pt/ML/statlog/datasets/australian/australian.use.html, date of last

access:05/22/05.

15. Tan A-H., Teow L-N., 1997. Inductive neural logic network and the SCM algorithm,

Neurocomputing 14, pp 157-176

16. Teh H-H., 1995. Neural Logic Networks, World Scientific.

17. Tsakonas A. Dounias G., 2002. A Scheme for the Evolution of Feedforward Neural

Networks using BNF-Grammar Driven Genetic Programming, in Proc. of Annual

Workshop of European Network of Excellence on Smart Adaptive Systems 2002, Eunite-

02, Algarve.

18. Wong M.L., 2001. A flexible knowledge discovery system using genetic programming and

logic grammars, Decision Support Systems, 31, pp 405-428

93