IMPROVED ADAPTIVE META-NETWORK DESIGN

EMPLOYING GENETIC ALGORITHM TECHNIQUES

Ben McElroy and Gareth Howells

School of Engineering and Digital Arts, University of Kent, Canterbury, Kent, CT2 7NT, U.K.

Keywords: Weightless Neural Network, Meta-Networks, Genetic Algorithm, Neural Architectures, Ultrasonic Sensors.

Abstract: This paper investigates the employment of a Genetic Algorithm to optimally configure the parameters of a

class of weightless artificial neural network architectures. Specifically, the Genetic Algorithm is used to

vary the parameters of the architecture and reduce the rigidity of the mutation algorithm to allow for a more

varied population and avoidance of local minima traps. An exemplar of the system is presented in the form

of an obstacle avoidance system for a mobile robot equipped with ultrasonic sensors.

1 INTRODUCTION

The optimisation problem for classes of artificial

neural networks has been investigated over a number

of years (Kordík, et al., 2010) (Abraham, 2004)

(Abraham, 2002). This paper investigates the

optimisation problem for a class of simple yet

flexible artificial networks termed RAM based or

weightless networks. The specific problem

considered in this paper is that of enabling a mobile

robot to determine the optimal direction of travel

using weightless neural networks. This work looks

at using simple measures fed into the network to

determine if it can settle on an appropriate obstacle

avoiding response/route. Adaptive systems offer the

ability to learn and generalize from a set of known

examples allowing them to recognize previously

unseen inputs based on their similarity of

characteristics with previously seen examples.

RAM-based weightless neural networks are a type of

neural network well suited to the expression of

solutions to such logical problems (McElroy, et al.,

2010). Such neural networks possess many

advantages over conventional weighted neural

networks since they allow the possibility of:-

• One shot learning – this is an object categorization problem of current research interest. Machine learning based object categorization algorithms usually require extensive training on hundreds or thousands of items; one-shot learning attempts to minimize this by obtaining sufficient information about the object categories from just one, or only a few, training images/items.

• Arbitrary mappings from inputs to outputs – the

system is more robust as it does not require

specific pathways to be assigned from input to

output.

• Easier direct hardware implementation.

Manual optimisation of the configuration parameters

of Artificial Neural Networks (ANNs) for a specific

problem domain currently relies heavily on human

experts with sufficient knowledge of the different

aspects of the network as well as the problem

domain itself (Yao, 1999). As the complexity of the

problem domain increases and when near-optimal

networks are desired, manual searching becomes

more difficult and unmanageable. This paper seeks

to evaluate the potential of using weightless ANNs

within a meta-network structure to determine the

direction of a robot when it encounters an obstacle

and improve the accuracy of the response generated.

The paper is structured as follows. Section 1.1

briefly outlines evolutionary neural networks and

their advantages. Section 2 describes the robot being

used and the setup of the sensors. Section 3 goes into

detail about weightless neural networks and how

they differ from their weighted counterparts. Section

4 covers the problem this paper is exploring while

section 5 explains how the meta-network operates in

detail. Section 6 shows the experimental setup and

how the data was encoded, and section 7 displays

the results. Finally, section 8 concludes the paper

with finishing remarks.


McElroy B. and Howells G.

IMPROVED ADAPTIVE META-NETWORK DESIGN EMPLOYING GENETIC ALGORITHM TECHNIQUES.

DOI: 10.5220/0003539401420148

In Proceedings of the 8th International Conference on Informatics in Control, Automation and Robotics (ICINCO-2011), pages 142-148

ISBN: 978-989-8425-74-4

© 2011 SCITEPRESS (Science and Technology Publications, Lda.)

1.1 Evolutionary Artificial Neural

Networks

Evolutionary Artificial Neural Networks (EANNs) are a specific type of ANN in which evolution is used as another form of adaptation which supplements learning (Abraham, 2004) (Abraham, 2002) (Slowik, et al., 2008). One distinct feature of EANNs is their adaptability to a dynamic environment. EANNs can adjust to a multitude of scenarios as well as changes in the environment.

There are three significant areas to which evolution

has been introduced: the connection weights, the

learning rules and the architectures (Yao, 1999). In

this paper we deal primarily with the evolution of

architectures, which enables ANNs to adapt their

topologies to different tasks without human

intervention and thus provides an approach to

automatic weightless ANN design, a class of

architectures to which this technique has not

previously been applied.

2 ULTRASONIC SENSORS

A robot equipped with seven ultrasonic sensors was

used in the investigation. Although further sensors

may prove necessary for a practical system, this

configuration was used for the purposes of

evaluating the proposed network configuration as it

formed a balance between complexity and utility of

the data. The ultrasonic sensors that adorn the robot

are shown in figure 1. Each sensor has an associated

ID which was used to identify them in the data. The

grey area in Figure 1 shows the approximate cone

that the sensors operate within. The sensors have a

maximum range of four metres, and return the

distance to the closest object within their cone of

detection. The sensors are set to return data every

half a second with a distance given in millimetres.

Ultrasonic sensors work in a similar way to radar

or sonar which assess characteristics of a target by

interpreting the return signal, or ‘echoes’, from radio

or sound waves respectively. These sensors typically

produce high frequency sound waves and look for

the echo from them. The sensors then measure the time delay between sending the pulse and receiving the echo and use this to calculate the distance to the object. Sonar

sensors are used successfully in a wide variety of

applications, such as medical imaging,

non-destructive testing and vehicle ranging systems.
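The time-of-flight calculation described above can be sketched as follows. This is a minimal illustration rather than the firmware of the sensors used here: the speed of sound (343 m/s, dry air at about 20 °C) and the function name are assumptions.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value: dry air at about 20 degrees C

def echo_delay_to_distance_mm(delay_s: float) -> float:
    """Convert a round-trip echo delay (seconds) to a one-way distance (mm)."""
    # The pulse travels to the object and back, so the path length is halved.
    return SPEED_OF_SOUND_M_PER_S * delay_s / 2.0 * 1000.0

# A round trip of about 23.3 ms corresponds to roughly the 4 m maximum range.
print(round(echo_delay_to_distance_mm(0.0233)))
```

Under this assumption, the half-second reporting interval is far longer than the echo delay at maximum range, so the interval is not limited by the time of flight itself.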

Figure 1: Robot Sensor Setup.

3 WEIGHTLESS ARTIFICIAL

NEURAL ARCHITECTURES

As a class of architectures, Weightless Neural

Networks were first introduced by Bledsoe and

Browning in 1959 (Bledsoe, et al., 1959). They

consist of 'weightless' neurons - neurons that have no

weight between the input and the node – with the

inputs and outputs expressed as simple binary

values. While their weighted counterparts require a lot of training, weightless networks can be trained very quickly and installed on much simpler hardware. Further, whereas other neural network models are trained by adjusting weights, weightless neural networks (WsNNs) are trained by modifying the contents of their look-up tables.
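To make the look-up table idea concrete, a generic RAM-based neuron can be sketched as below. The class and method names are hypothetical, and this is not the GCN training algorithm itself (for which see Howells, et al., 1995); it only illustrates that training writes to a table instead of adjusting weights.

```python
class RamNeuron:
    """A weightless neuron: a look-up table addressed by its binary inputs."""

    def __init__(self, default=None):
        self.table = {}         # address (tuple of bits) -> stored symbol
        self.default = default  # returned for addresses never seen in training

    def train(self, bits, symbol):
        # Training writes the symbol at the addressed location; no weights exist.
        self.table[tuple(bits)] = symbol

    def recall(self, bits):
        # Recall is a single table look-up.
        return self.table.get(tuple(bits), self.default)

neuron = RamNeuron()
neuron.train([1, 0, 1], 1)
print(neuron.recall([1, 0, 1]))  # the trained address returns the stored symbol
print(neuron.recall([0, 0, 0]))  # an unseen address returns the default
```

A single presentation of a pattern is enough to store it, which is why such neurons lend themselves to one-shot learning and to direct implementation in memory hardware.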

The architecture to be employed in this

investigation is the Generalised Convergent Network

(GCN) (Howells, et al., 1995). It employs layers of

neurons each independently attached to a given

sample pattern whose distinct outputs are merged via

a further layer to produce an output matrix equal in

dimension to the original sample pattern. A varying

number of groups of such layers arranged

sequentially may be employed by the various

architectures.

An example GCN architecture is illustrated in

Figure 2 and it possesses the following general

properties:-

• Network neurons are typically arranged in a two

dimensional layer where each row represents a

component of the input data.

• Each element within the pattern is associated

with a corresponding neuron within each layer.

• The layers comprising the network are arranged

in two groups, termed the Pre group and the

Main group.


Figure 2: GCN example layer setup.

• A further Merge layer exists after each group

whose function is to combine the outputs of the

constituent layers of the group. The Merge

operation is performed on the corresponding

neurons from each layer within the group and

for each position within the layer shown in

figure 3. The number of inputs of a neuron

comprising a Merge layer is thus equal to the

number of layers within the group to which it

pertains.

• The Merged output of the Main Group is fed

back, unmodified, to the inputs of each layer

comprising the group.

• The number of layers within each group and the

connectivity of the neurons within differing

layers are arbitrary and the optimal number of

each varies with the application area. It is the

determination of these parameters which forms

the research focus of this paper.

• The constituent layers of a group differ in the

selection of elements attached to the inputs of

their constituent neurons (termed the

connectivity pattern). Again, this is a major

parameter to be optimised within the network

configuration.

• However, neurons within a given layer possess

the same connectivity pattern relative to their

position within the matrix and this is not a

network parameter which may be modified.

The connectivity patterns for neurons of differing

layers take various forms and hence represent an

area for optimisation. Further, for each neuron, the

values comprising the input set are calculated

modulo the dimensions of the pattern. Therefore

neurons which, for example, are situated at the right

edge of a pattern and are virtually clamped to the ‘x’

bits on their right will actually be clamped to ‘x’ bits

at the left side of the pattern. Some layers take inputs

from non-local areas of the pattern. For the code we are using, which has been parsed into a matrix, this means that each layer will be looking at a slightly different ‘picture’ of the same sample, allowing for a high chance of recognition should there be a pattern.

Figure 3: Sample potential layer connectivities.
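The modulo calculation of input positions can be illustrated with a short sketch. The function name, the (row, column) ordering and the 7-row by 6-column pattern size are assumptions made for the example.

```python
def wrapped_inputs(row, col, offsets, n_rows, n_cols):
    """Return the pattern positions feeding the neuron at (row, col).

    offsets: relative (d_row, d_col) pairs describing the connectivity pattern.
    """
    # Positions are taken modulo the pattern dimensions, so they wrap around.
    return [((row + dr) % n_rows, (col + dc) % n_cols) for dr, dc in offsets]

# A neuron on the right edge, connected to the two elements on its right,
# is actually attached to the two leftmost elements of the same row.
print(wrapped_inputs(3, 5, [(0, 1), (0, 2)], n_rows=7, n_cols=6))
```

Because every neuron in a layer shares the same list of offsets, one such list is enough to define the connectivity of a whole layer.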

The GCN network architecture employs RAM-based neurons with varying sized symbol sets, which are the values stored within, rather than the more conventional Boolean symbol sets employed by alternative RAM-based networks (Howells, et al., 1994). The symbol set is extended to allow a

symbol to represent each pattern class under

consideration. For example, if ten pattern classes

representing numerals were being considered, the

symbols ‘0’ through ‘9’ could be employed

representing the numerals 0 through 9 respectively.

In the case of this paper, these ‘symbols’ are

represented by the sensor data collected from the

robot. As the network only takes binary as an input,

this data has to be specially formatted – this is

discussed in detail in a later section. These symbols

are referred to as the base symbols of the network.

A comprehensive description of this architecture

and its associated training and recognition

algorithms may be found in (Howells, et al., 1995).

4 THE PROBLEM DOMAIN

The robot begins in an open area with a few static

obstacles placed at strategic points in its path. Only

one obstacle will be negotiated at any given time for

these experiments. In order to correctly negotiate the

robot around obstacles, the system must be able to

identify not only where the object is in relation to

the robot, but also which way to turn in order to

avoid it. With data that could be in constant flux

when in tight areas such as small corridors or

cluttered environments, the ability to modify the

neural network to deal with this is incredibly useful.

These experiments should show that the network can

identify an object in proximity and correctly


determine a new direction in order to avoid the

obstacle. This is a simple task but the key aim is to

demonstrate the meta-network’s ability to improve

the decision making ability of the robot by

modifying the architecture of the network.

Weightless Neural Networks typically take

binary as inputs and as such the data must be

parsed – this is shown in a later section. For this

particular experiment, the network will be analysing

the distances and determining a direction from this.

As a comparison, a GCN network using randomly

generated architectures will be employed to see how

effective the meta-network is at improving the

accuracy of the results.

5 META-NETWORK

As stated above, to combat the problem of finding a

practically employable architecture for a weightless

neural network, a genetic algorithm was chosen. The

framework used in the present work is a layered

process of an evolutionary search of ANNs, as

illustrated in Figure 4.

Figure 4: Basic Diagram depicting the various processes

involved.

As mentioned in the introduction, architecture

design is crucial to the successful application of a

weightless ANN because it has significant impact on

the network’s information processing capabilities.

With a variable number of layers, neurons per layer, and neuron placement, the number of possible architectures is very large. For a particular learning scenario, a network with relatively few connections may be unable to perform the task due to its limited capability. Conversely, a network with a large number of connections may add noise to the training data and fail to generalise appropriately; a balance must be struck. To combat the problem of finding a

practical architecture for a weightless neural

network, the following Genetic Algorithm was

employed.

There are several variables that can be modified

when using the GCN architecture, including input

size, number of layers, number of neurons (per

layer), and the size of the training set. For these

experiments, the size of the training set and the input

size were set to seven and 6x7 respectively. The

reason for the input size is described in a later

section. This paper investigates the use of a Genetic

Algorithm to optimise the parameter configuration

for employment in an obstacle avoidance task.

There were, however, some inherent problems with using a standard genetic algorithm as a base:

5.1 Inputs

Typically the inputs to a genetic algorithm are

strings or numbers which are the parameters the

genetic algorithm can modify. However, for this experiment, numbers would not suffice due to the complexity of the problem: the algorithm needed to modify three components of the architecture. The first is

the number of layers, the second is the number of

neurons within each of these layers, and the third is

the placement of these neurons. As such, a custom

input was defined as shown in Figure 5. On the left,

Figure 5 shows 3 ‘layers’ – each pair of zeros

represents the relative coordinates for a neuron from

which its inputs will be derived, remembering that

the dimensions of the layer and input pattern are

identical. As described in the previous section

discussing the weightless neural architecture, the

pattern ‘wraps’ around, meaning that neurons on the

right side of the pattern are virtually clamped to

those on the left. If a coordinate given exceeds the

boundaries of the matrix, it simply wraps around. So

the layer on the right in Figure 3 translates as a

straight line of neurons for that layer for the element

in the centre of the layer.
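One way to hold such an input in code is as a nested list of coordinate pairs, sketched below. The variable and function names are hypothetical and the offsets are invented for illustration; only the overall shape (layers containing pairs of relative coordinates) is taken from the description above.

```python
# One individual: a list of layers; each layer is a list of relative
# (d_row, d_col) coordinate pairs.
individual = [
    [(0, 0), (0, 0), (0, 0)],   # all offsets still zero, as on the left of Figure 5
    [(0, -1), (0, 0), (0, 1)],  # a straight horizontal line through the centre element
    [(-1, 0), (1, 0)],          # the elements directly above and below
]

def describe(individual):
    """Return the three components the genetic algorithm is allowed to modify."""
    return {
        "n_layers": len(individual),                          # number of layers
        "pairs_per_layer": [len(layer) for layer in individual],
        "placements": individual,                             # the offsets themselves
    }

print(describe(individual)["n_layers"])
print(describe(individual)["pairs_per_layer"])
```

A flat string or vector of numbers would lose the grouping into layers, which is why a nested structure of this kind is convenient for the crossover and mutation operators.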

5.2 Initial Population

The initial population is created using a random

generator for both the number of layers and how

many neurons will be in each individual layer.

Subsequently, tests are carried out on the data and

the error rate is returned for each individual. The error rates are then multiplied by a complexity factor: each layer adds 0.05 to a multiplier, starting from 1, that is applied to the error rate, so that smaller networks with similar results will edge out larger ones. For example, if there are

two architectures – A and B - each with an error


rate of 10%, but A has 3 layers while B has 5, A has

a better fitness rating (10*1.15=11.5) than B

(10*1.25=12.5) and as such, the genetic algorithm

will favour it in selection. These fitness ratings are

then ranked, and the bottom 50% are pruned.
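The population creation, complexity-weighted fitness and pruning steps can be sketched as follows. Function names are hypothetical, and `max_pairs` and `max_offset` are illustrative bounds that are not given in the paper; the 0.05 penalty per layer and the 12-layer limit are taken from the text.

```python
import random

def random_individual(rng, max_layers=12, max_pairs=6, max_offset=3):
    """Random architecture: random layer count and random pairs per layer."""
    n_layers = rng.randint(1, max_layers)
    return [
        [(rng.randint(-max_offset, max_offset), rng.randint(-max_offset, max_offset))
         for _ in range(rng.randint(1, max_pairs))]
        for _ in range(n_layers)
    ]

def fitness(error_rate_percent, n_layers, penalty_per_layer=0.05):
    """Error rate scaled by a complexity factor; lower is better."""
    return error_rate_percent * (1.0 + penalty_per_layer * n_layers)

def prune_bottom_half(scored):
    """Keep the better half of a list of (fitness, individual) pairs."""
    ranked = sorted(scored, key=lambda pair: pair[0])
    return ranked[: len(ranked) // 2]

# The worked example from the text: equal error rates, different layer counts.
print(round(fitness(10, 3), 2))  # architecture A, 3 layers
print(round(fitness(10, 5), 2))  # architecture B, 5 layers
```

Since a lower value is better, architecture A survives pruning ahead of B even though both classify equally well.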

Figure 5: (Left) Shows input design for the genetic

algorithm. (Right) Shows graphic translation of

coordinates to layer.

5.3 Crossover

Crossover takes selected individuals and ‘mates’

them by taking the first half of each layer of the first

individual and the second half of each layer of the

second individual, as shown in figure 6. This creates

a new individual (representing an architecture)

which is then added to the next generation for

testing.
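A minimal sketch of this crossover is given below, using a list of coordinate pairs for each layer. The paper does not state how parents with different numbers of layers are combined, so pairing layers by position and dropping unpaired layers is an assumption, as is the function name.

```python
def crossover(parent_a, parent_b):
    """Child layer i = first half of A's layer i + second half of B's layer i."""
    child = []
    # Assumption: layers are paired by position; unpaired layers are dropped.
    for layer_a, layer_b in zip(parent_a, parent_b):
        head = layer_a[: len(layer_a) // 2]
        tail = layer_b[len(layer_b) // 2:]
        child.append(head + tail)
    return child

a = [[(0, 0), (0, 1), (0, 2), (0, 3)]]
b = [[(1, 0), (1, 1), (1, 2), (1, 3)]]
print(crossover(a, b))
```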

Figure 6: During Crossover, top of parent A combined

with bottom half of parent B for each layer.

5.4 Mutation

Mutation is applied to a few individuals via slight modifications to create new architectures. This is achieved by adding, to each layer, a matrix of integers that range between -1 and 1, as shown in figure 7.
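The mutation step can be sketched as adding a random integer from {-1, 0, 1} to every coordinate of every layer. The representation of a layer as a list of coordinate pairs, the function name and the fixed seed are assumptions for the purpose of illustration.

```python
import random

def mutate(individual, rng):
    """Add an integer drawn from {-1, 0, 1} to every coordinate of every layer."""
    return [
        [(dr + rng.randint(-1, 1), dc + rng.randint(-1, 1)) for dr, dc in layer]
        for layer in individual
    ]

rng = random.Random(42)  # fixed seed only to make the sketch repeatable
parent = [[(0, 0), (2, -1)], [(1, 1)]]
child = mutate(parent, rng)
print(child)
```

Each coordinate moves by at most one position, so a mutated architecture stays close to its parent while still adding variety to the population.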

Figure 7: Each layer has a matrix of the same size added to it. This matrix has values between -1 and 1.

6 EXPERIMENTATION

The robot needed to be able to identify an obstacle

and turn in the correct direction in order to avoid it.

As such, data needed to be collected from the sensors

so that it could be analysed and parsed. If the object

is a medium distance away for example, it should

take less drastic action than if it was a short distance

away.

Data was collected from the sensors during a test

run of the robot which lasted for 37 seconds. 518

readings were taken during this time, or 74 per

sensor. The data format is shown below in Table 1.

From this, simulated data was created representing different scenarios the robot would encounter.

Table 1: Example data collected.

Sensor ID    Distance (mm)
48           1982
50           2967
52           3001
54           4371
56           2489
58           1001
60           443

The distance needed to be converted into a Gray code (Gilbert, 1958) so that it could be fed into the network. This was necessary because codes representing similar values are required to possess similar binary encodings. This is not the case with natural binary due to the severe changes evident when going between certain numbers, for example 31 to 32, as shown in figure 8.
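The 31 to 32 example can be checked with a few lines of code. The sketch assumes the standard binary-reflected Gray code; the function names are illustrative.

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

for n in (31, 32):
    # value, 6-bit natural binary, 6-bit Gray code
    print(n, format(n, "06b"), format(to_gray(n), "06b"))

print(hamming(31, 32))                    # natural binary: all six bits change
print(hamming(to_gray(31), to_gray(32)))  # Gray code: a single bit changes
```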

Figure 8: Difference between normal binary code and gray

code.

Since the network works by finding similarities

in the patterns, similar distances must be represented

by similar encoded data, which is not the case in

binary. Figure X shows how the data was quantized

into 64 equidistant parts ranging between 0(0) –

4500(63), representing the range of distances found

in the data. This number was then encoded using the

above method for each of the sensors, forming a 6x7

matrix displayed in figure 9.
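The quantization and encoding of one set of readings can be sketched as follows, using the seven distances from Table 1. The exact binning rule is not given in the text, so equal-width bins with clipping at 4500 mm are an assumption, as are the function names.

```python
MAX_DISTANCE_MM = 4500  # upper end of the quantization range given in the text
LEVELS = 64             # 64 levels, i.e. 6 bits per sensor

def quantize(distance_mm: int) -> int:
    """Map a distance in [0, 4500] mm to a level in [0, 63]."""
    clipped = max(0, min(distance_mm, MAX_DISTANCE_MM))
    # Assumption: equal-width bins, with 4500 mm falling in the top level.
    return min(LEVELS - 1, clipped * LEVELS // (MAX_DISTANCE_MM + 1))

def encode_reading(distance_mm: int) -> list[int]:
    """6-bit Gray-coded row for one sensor reading."""
    level = quantize(distance_mm)
    gray = level ^ (level >> 1)  # binary-reflected Gray code
    return [int(bit) for bit in format(gray, "06b")]

# The seven readings from Table 1, in sensor-ID order 48 to 60.
readings = [1982, 2967, 3001, 4371, 2489, 1001, 443]
matrix = [encode_reading(d) for d in readings]
for row in matrix:
    print("".join(map(str, row)))
```

The result has one row of six bits per sensor, which gives the seven-row by six-column pattern presented to the network.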


Figure 9: Showing how the data is encoded.

Each row in the above binary directly relates to a

sensor and the distance recorded at that particular

interval.

Five classes were created each representing a

direction the robot should take. The reason for using

five is to show that the network can discern the

difference between taking a small change in

direction as opposed to a large one. These were: left (-90° to -45°), slightly left (-44° to -1°), straight (0°), slightly right (1° to 44°) and right (45° to 90°). Each category was

allocated a training set containing 7 encoded

matrices. These included examples of an obstacle at

varying distances from the sensor in different

locations. A test set for each direction was also derived from the data collected; however, only ‘simple’ examples were selected, where there could logically be only one obstacle (adjacent sensors ‘hit’).
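The five direction classes can be written as a simple mapping from a turning angle to a label. The sketch assumes whole-degree angles, since the stated ranges leave gaps between, for example, -45 and -44; the function name is hypothetical.

```python
def direction_class(angle_deg: int) -> str:
    """Map a whole-degree change of direction in [-90, 90] to one of five classes."""
    if not -90 <= angle_deg <= 90:
        raise ValueError("angle outside the range covered by the five classes")
    if angle_deg <= -45:
        return "left"
    if angle_deg < 0:
        return "slightly left"
    if angle_deg == 0:
        return "straight"
    if angle_deg < 45:
        return "slightly right"
    return "right"

print(direction_class(-60), direction_class(-10), direction_class(0),
      direction_class(30), direction_class(90))
```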

7 RESULTS AND DISCUSSION

The meta-network was applied to the training data

derived from the mobile robot. The initial

population was set to 30, and the architectures were

limited to a maximum of 12 layers. The reason for

this is to ensure that the genetic algorithm does not create overly large, time-consuming networks. The

results show that the system finds a result that has an

error rate of 4.75% in just 21 generations. The

fitness value on the Y axis represents error rate

multiplied by a factor of the architecture’s complexity, as described in section 5.2.

To further determine the usefulness of this

approach, the test set for each class was increased to

3, meaning that each architecture would see a total

of 15 different scenarios. The results from this can

be seen in figure 11, which show little change in the

best individual.

However, as the generation average is much lower

than that in figure 10, it is possible the initial system

configuration was in a near optimal state, meaning

that there was little room for improvement. A

comparison can be seen in Table 2, which shows the

differences between the experiments. The initial

means differ quite substantially, meaning that the

initial population from the second experiment outperformed that from the first.

Figure 10: Results from initial experiments.

Figure 11: Results from experiment with increased test set.

Table 2: Comparison of results.

Comparison of experiments

# of test cases   Initial mean   Initial best individual   Final individual   Standard deviation of best individuals
1                 41             18.5                      5.9                3.4
3                 29.8           10.9                      9.1                0.6

It was found that increasing the number of generations past 30 did not yield better results, and the search would eventually stop due to there being no improvement in fitness over a set number of generations. For both experiments the best

architecture found had 5 layers, indicating a balance

between network complexity and fitness (more


complex architectures with similar accuracies were

discarded).
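The stopping rule mentioned above, halting when fitness has not improved for a set number of generations, can be sketched as follows. The patience value is not given in the paper and is an assumption, as is the function name; the fitness trace in the example is invented purely to exercise the code.

```python
def run_until_stagnant(best_fitness_per_generation, patience=5, max_generations=30):
    """Return (generation, best fitness) at the point the search would stop.

    best_fitness_per_generation: best (lowest) fitness found in each generation.
    patience: assumed number of generations without improvement before stopping.
    """
    best = float("inf")
    stale = 0
    generation = 0
    for generation, fitness in enumerate(best_fitness_per_generation, start=1):
        if fitness < best:
            best, stale = fitness, 0   # improvement: reset the stagnation counter
        else:
            stale += 1                 # no improvement in this generation
        if stale >= patience or generation >= max_generations:
            break
    return generation, best

# Invented trace: the fitness stops improving after the second generation.
print(run_until_stagnant([10.0, 8.0, 8.0, 8.0, 8.0], patience=3))
```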

Figure 12: Using GCN without the meta-network.

When the meta-network is compared with the GCN network used without it, the differences are clear. Using the same data and test cases as the

second experiment, the GCN was used on its own.

The number of layers and the number of neurons per

layer must be set manually, and a random

architecture is derived from this. Figure 12 shows

that over a multitude of different layer and neuron

configurations, it fails to find an architecture that

comes close to that displayed in figures 10 and 11.

8 CONCLUSIONS

The paper has investigated the use of a Genetic

Algorithm to optimise the configuration of a simple

weightless neural architecture. The results indicate

that this approach possesses potential merit. A major

advantage is the simple nature of both the parsed

data input and the network being used.

ACKNOWLEDGEMENTS

This research is supported by the European Union

ERDF Interreg V scheme under the SYSIASS Grant.

REFERENCES

Abraham, A., 2004. Meta learning evolutionary artificial neural networks. Neurocomputing, Vol. 56, Issue 38, 0925-2312.

Abraham, A., 2002. Optimization of Evolutionary Neural Networks Using Hybrid Learning Algorithms. International Joint Conference on Neural Networks, 0-7803-7278-6.

Bledsoe W. and Browning I., 1959. Pattern recognition and reading by machine. AFIPS Joint Computer Conferences. Boston, Massachusetts.

Gilbert E., 1958. Gray codes and paths on the n-cube. Bell System Technical Journal, Vol. 37, 815-826.

Howells G., Fairhurst M. C., and Bisset D. L., 1994. BCN:

an architecture for weightless RAM-based neural

networks. IEEE International Conference on Neural

Networks. Orlando, FL. 0-7803-1901-X.

Howells G., Fairhurst M.C., and Bisset D. L., 1995. GCN:

the generalised convergent network. International

Conference on Image Processing and its Applications.

Edinburgh, 0-85296-642-3.

Kordík, P., Koutník, J., Drchal, J., Kovářík, O., Čepek, M. and Šnorek, M., 2010. Meta-learning approach to neural network optimization. Neural Networks, Vol. 23, 0893-6080.

McElroy B. and Howells G., 2010. Evaluating the Application of Simple Weightless Networks to Complex Patterns. International Conference on Emerging Security Technologies (EST), 978-0-7695-4175-4.

Slowik A. and Bialko M., 2008. Training of artificial neural networks using differential evolution algorithm. Conference on Human System Interactions, 978-1-4244-1542-7.

Yao, X., 1999. Evolving Artificial Neural Networks. Proceedings of the IEEE, Vol. 87, 0018-9219.
