Evaluation of a Self-organizing Heuristic for Interdependent Distributed Search Spaces

Christian Hinrichs¹, Michael Sonnenschein¹ and Sebastian Lehnhoff²
¹Department for Environmental Informatics, University of Oldenburg, Oldenburg, Germany
²R&D Division Energy, OFFIS Institute for Information Technology, Oldenburg, Germany
Keywords: Self-organization, Cooperation, Combinatorial Optimization, Smart Grid.

Abstract: Whenever multiple stakeholders try to optimize a common objective function in a distributed way, an adroit coordination mechanism is necessary. This contribution presents a formal model of distributed combinatorial optimization problems. Subsequently, a heuristic is introduced that uses self-organizing mechanisms to optimize a common global objective as well as individual local objectives in a fully decentralized manner. This heuristic, COHDA², is implemented in an asynchronous multi-agent system and is extensively evaluated by means of a real-world problem from the smart grid domain. We give insight into the convergence process and show the robustness of COHDA² against unsteady communication networks. We show that COHDA² is a very efficient decentralized heuristic that is able to tackle a distributed combinatorial optimization problem with regard to multiple local objective functions, as well as a common global objective function, without being dependent on centrally gathered knowledge.
1 INTRODUCTION

By exploiting the limitations and constraints that are inherent to the search space of valid solutions, many real-world optimization problems can be solved very efficiently. Such approaches, however, are based on global knowledge and cannot be directly transferred to decentralized systems, where the search space is distributed into disjoint subspaces. One possible approach might be to communicate the locally available information to a central place. However, this is not always desirable. For example, the global collection of data might violate privacy considerations. The gathering of such data might even be impossible, as is the case if local search spaces are partially unknown or cannot be enumerated (e.g. due to infiniteness). Another limitation is that distributed search spaces are often not independent. Such interdependencies require evaluating the search spaces in relation to each other. Thus, a parallel search for optimal solutions would incur a large communication overhead.
For instance, this type of problem is present in the smart grid domain. According to (Gellings, 2009), "a smart grid is the use of sensors, communications, computational ability and control in some form to enhance the overall functionality of the electric power delivery system". We focus on active power scheduling, which can be expressed as a combinatorial optimization problem. Here, a collective power profile is to be produced by a number of devices, which can then be sold as a product on a market, for example. However, individual constraints of the devices have to be considered, which may be known only to the concerning device. Due to temporal overlaps between device schedules, the quality of an individual schedule (with respect to the global goal of realizing the given power profile) usually depends on the current schedule selection of several other devices. These interdependencies can only be tackled through communication between devices during the search process.

For this purpose, a swarm-based method is developed which makes use of self-organization strategies. In hitherto existing population-based heuristics, each individual represents a solution to the given problem within a common search space. In our approach, however, an individual incorporates a local, dependent search space, and thus defines a partial solution that can only be evaluated with respect to all other individuals. The task of each individual is to find a partial local solution that, if combined with all other local solutions, leads to the optimal global solution.
The contribution is organized as follows. In Section 2, the MC-COP problem is recalled from the literature and subsequently extended to a distributed variant with multiple objectives. A self-organizing heuristic for this problem, COHDA², is introduced in Section 3. Following that, Section 4 gives an extensive evaluation of the heuristic. Section 5 relates the approach to existing work. The contribution concludes with a summary and an outlook in Section 6.

Hinrichs C., Sonnenschein M. and Lehnhoff S.: Evaluation of a Self-organizing Heuristic for Interdependent Distributed Search Spaces. DOI: 10.5220/0004227000250034. In Proceedings of the 5th International Conference on Agents and Artificial Intelligence (ICAART-2013), pages 25-34. ISBN: 978-989-8565-38-9. Copyright © 2013 SCITEPRESS (Science and Technology Publications, Lda.)
2 PROBLEM DEFINITION & MODEL

As a first approach, we restrict our point of view to combinatorial optimization problems (COP). Such problems can easily be modelled if we assume that each search space is discrete by nature, and that the elements within are known and may be enumerated. From a central perspective, these problems may be formulated with an integer programming model. In (Hinrichs et al., 2012), the multiple-choice combinatorial optimization problem (MC-COP) is described as:
\[
\min \Big\| c - \sum_{i=1}^{m} \sum_{j=1}^{n_i} (w_{ij} \cdot x_{ij}) \Big\|_1 \tag{1}
\]
\[
\text{subject to} \quad \sum_{j=1}^{n_i} x_{ij} = 1, \; i = 1 \dots m,
\]
\[
x_{ij} \in \{0, 1\}, \; i = 1 \dots m, \; j = 1 \dots n_i ,
\]
Here, m search spaces are defined, with each search space s_i containing n_i partial solutions. The j-th partial solution in search space s_i is described by an element j with a value w_ij. Note that this value may be a vector and thus may have any number of dimensions. The goal is to select a value w_ij from each search space s_i, so that the sum of these selected values approaches a given target c as closely as possible. This is a generalization of the well-known subset-sum problem, which does not allow solutions > c. Since from each search space exactly one element (no more, no less) has to be chosen for a feasible global solution, each element w_ij ∈ s_i in this model has an associated selection variable x_ij, which defines whether an element has been chosen (x_ij = 1) or not (x_ij = 0).
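The objective in (1) is straightforward to evaluate once a selection is fixed. The following is a minimal illustrative sketch (our notation, not the authors' implementation), encoding a selection as one chosen index j per search space s_i:

```python
import numpy as np

def mc_cop_objective(c, search_spaces, selection):
    """Evaluate the MC-COP objective: the 1-norm distance between the
    target c and the sum of the selected elements w_ij (one per space)."""
    total = sum(search_spaces[i][j] for i, j in enumerate(selection))
    return np.abs(np.asarray(c) - total).sum()  # ||c - sum||_1

# toy instance: m = 2 search spaces with vector-valued elements
spaces = [
    [np.array([1.0, 2.0]), np.array([3.0, 0.0])],  # s_1
    [np.array([2.0, 2.0]), np.array([0.0, 1.0])],  # s_2
]
c = np.array([3.0, 3.0])
print(mc_cop_objective(c, spaces, [0, 1]))  # selects w_11 and w_22 -> 2.0
```

Note that the selection vector already satisfies the "exactly one element per space" constraint by construction, which is why the x_ij variables do not appear explicitly.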
2.1 Distributed-objective Model

In the contribution at hand, we would like to extend the centrally driven MC-COP (1) to the distributed case. In our approach, each local search space s_i is represented by a single agent a_i, whose task is to select one of its elements w_ij with respect to a common global goal c. More formally, an agent a_i has to find an assignment of its own selection variables x_ij, such that the objective function in (1) is minimized.
Definition 1. A selection of an agent a_i is a tuple γ_i = ⟨i, j⟩, where i is the identifier of a_i and j identifies the selected element w_ij such that x_ij = 1 and Σ_{j=1}^{n_i} x_ij = 1.
In order to decide which of its local elements w_ij ∈ s_i yields the optimum, an agent has to take the selections of the other agents in the system into account.
Definition 2. A context is a set Γ = {γ_i, γ_k, …} of selections. A selection belonging to an agent a_i can appear in a context no more than once:

γ_i = ⟨i, j₁⟩ ∈ Γ ∧ γ_k = ⟨k, j₂⟩ ∈ Γ ⇒ i ≠ k
Note that this definition allows a context to be incomplete with regard to the population of agents in the system, which enables us to model the local view that an agent a_i has on the system. This is quite similar to the definition of context in (Modi et al., 2005).
Definition 3. A global context regarding the whole system is denoted by Γ_global = {γ_i | i = 1 … m}.
Definition 4. A perceived context of an agent a_i is a context Γ_i = {γ_k | a_i is aware of a_k}.
Assuming that an agent a_i is able to somehow perceive a context Γ_i containing information about other agents that a_i is aware of (we will address this in the following section), it may now select one of its own elements w_ij ∈ s_i with respect to the currently chosen elements of other agents in Γ_i and the optimization goal c.
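Definitions 1 to 4 map directly onto simple data structures. The following sketch (class and attribute names are ours, for illustration only) enforces the uniqueness property of Definition 2 by keying selections on the agent identifier:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Selection:
    agent_id: int    # i: identifier of agent a_i
    element_id: int  # j: index of the chosen element w_ij

class Context:
    """A context: at most one selection per agent (Definition 2)."""
    def __init__(self):
        self._by_agent = {}

    def add(self, sel: Selection):
        # a later selection of the same agent replaces the earlier one
        self._by_agent[sel.agent_id] = sel

    def selections(self):
        return list(self._by_agent.values())

ctx = Context()
ctx.add(Selection(agent_id=1, element_id=3))
ctx.add(Selection(agent_id=1, element_id=5))  # a_1 may appear only once
print(len(ctx.selections()))  # -> 1
```

A global context is then simply a context containing one selection for every agent, and a perceived context contains selections only for the agents a_i is aware of.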
Furthermore, we introduce local constraints, which impose a penalty value p_ij (i.e. a cost) on each element w_ij within the search space s_i of an agent a_i. These local constraints are known to the corresponding agent only, as described in the introductory example. Thus, each agent has two objectives: minimizing the common objective function as given in (1), and minimizing its local penalties that are induced by contributing a certain element w_ij. This compound optimization goal at agent level may be expressed with a utility function:
\[
z_i = \alpha_i \cdot z_i^1 + (1 - \alpha_i) \cdot z_i^2 \tag{2}
\]
Here, z_i^1 represents the common global objective function and z_i^2 incorporates the local constraints. The parameter α_i allows adjusting the importance of the global goal versus the local constraints of an agent a_i, and hence defines the degree of altruism at agent level.
From a global point of view, this yields the distributed-objective multiple-choice combinatorial optimization problem (DO-MC-COP):
\[
\min \sum_{i=1}^{m} z_i \tag{3}
\]
\[
\text{where} \quad z_i = \alpha_i \cdot z_i^1 + (1 - \alpha_i) \cdot z_i^2 ,
\]
\[
z_i^1 = \Big\| c - \Big( \sum_{j=1}^{n_i} (w_{ij} \cdot x_{ij}) + \sum_{w \in \Gamma_i} w \Big) \Big\|_1 ,
\]
\[
z_i^2 = \sum_{j=1}^{n_i} (p_{ij} \cdot x_{ij}) ,
\]
\[
\text{subject to} \quad \sum_{j=1}^{n_i} x_{ij} = 1, \; i = 1 \dots m,
\]
\[
x_{ij} \in \{0, 1\}, \; i = 1 \dots m, \; j = 1 \dots n_i ,
\]
\[
\alpha_i \in \mathbb{R}, \; 0 \le \alpha_i \le 1, \; i = 1 \dots m .
\]
Summarizing, in this model there are m decision makers (agents) a_i that pursue a common goal by each contributing one solution element w_ij from their associated local search space s_i, while at the same time minimizing the resulting local penalties p_ij. For that, an agent a_i evaluates its local search space with respect to the global target c as well as the perceived context Γ_i.
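An agent's compound utility from (2) and (3) can be sketched as follows, for a single agent with a fixed selection (a toy illustration under our naming; the perceived context is passed in as the list of currently selected values of the other agents):

```python
import numpy as np

def agent_objective(c, w_selected, context_values, p_selected, alpha):
    """DO-MC-COP utility of one agent: z_i = alpha*z1 + (1 - alpha)*z2.
    z1: 1-norm distance between target c and the agent's own selected
        element plus the perceived contributions of the other agents.
    z2: local penalty of the selected element."""
    total = np.asarray(w_selected) + sum(np.asarray(w) for w in context_values)
    z1 = np.abs(np.asarray(c) - total).sum()
    z2 = p_selected
    return alpha * z1 + (1 - alpha) * z2

c = np.array([10.0, 10.0])
z = agent_objective(c, [4.0, 5.0], [[5.0, 5.0]], p_selected=2.0, alpha=0.75)
print(z)  # 0.75 * 1 + 0.25 * 2 = 1.25
```

With α_i = 1 the agent behaves fully altruistically (global goal only); with α_i = 0 it only minimizes its local penalty.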
Obviously, a change in the selection γ_i made by an agent a_i changes the current global context Γ_global, as well as every perceived context Γ_k which contains γ_i. Thus, the definition of how an agent a_k perceives a context Γ_k, and how this relates to Γ_global, is crucial for solving the DO-MC-COP. The following section addresses these questions and describes a self-organizing approach to this distributed-objective problem.
3 SELF-ORGANIZING HEURISTIC

In nature, we find many examples of highly efficient systems which perform tasks in a completely decentralized manner: swarming behavior of schooling fish or flocking birds (Reynolds, 1987), foraging of ants (Hölldobler and Wilson, 1990) and nest thermoregulation of bees (Jones et al., 2004). Even processes within single organisms show such astonishing behavior, for instance the neurological development of the fruit fly (Kroeker, 2011) or the foraging of Physarum polycephalum, a single-celled slime mold (Tero et al., 2010), which both exhibit rules for adaptive network design. One of the core concepts in these examples is self-organization. From the perspective of multi-agent systems, this term can be defined as "the mechanism or the process enabling a system to change its organization without explicit external command during its execution time" (Serugendo et al., 2005). If such a process executes without any central control (i.e. neither external nor internal), it is called strong self-organization. From the perspective of complex systems theory, this is related to emergence, which can be defined as "properties of a system that are not present at the lower level [...], but are a product of the interactions of elements" (Gershenson, 2007).
The COHDA heuristic, as proposed in (Hinrichs et al., 2012), applies these concepts to create a self-organizing heuristic for solving distributed combinatorial optimization problems. Note that we deviate from the formal identifiers used in the referenced work, to reflect the extended problem description (3). Moreover, we include local objectives of agents into the search process, which yields an extended heuristic COHDA². In the following, we will first extend the definitions introduced in the previous section to meet the needs of a heuristic, and subsequently summarize the process in three steps.

In the considered heuristic, agents iteratively search for partial solutions. This yields an evolving process, hence we need to extend the notion of selection and context (Definitions 1 to 4) with a temporal component, and thus define state and configuration:
Definition 5. The state of an agent a_i is given by σ_i = ⟨γ_i, λ_i⟩, where γ_i is a selection containing an assignment of a_i's decision variables x_ij, and λ_i is a unique number within the history of a_i's states. Each time an agent a_i changes its current selection γ_i to γ́_i, the agent enters a new state σ́_i = ⟨γ́_i, λ́_i⟩ where λ́_i = λ_i + 1. This imposes a strict total order on a_i's selections, hence λ_i reflects the "age" of a selection.
Definition 6. A configuration Σ = {σ_i, σ_k, …} is a set of states. A state belonging to an agent a_i can appear in a configuration no more than once:

σ_i ∈ Σ ∧ σ_k ∈ Σ ⇒ i ≠ k
Definition 7. A global configuration regarding the whole system is denoted by Σ_global = {σ_i | i = 1 … m}.
Definition 8. A perceived configuration of an agent a_i is a configuration Σ_i = {σ_k | a_i is aware of a_k}.
In the DO-MC-COP model (3), an agent a_i creates an assignment of its decision variables x_ij based on its global objective z_i^1 as well as the local objective z_i^2. While the latter is locally defined at agent level, the former is realized by a perceived context Γ_i. For the COHDA² heuristic, we replace this by a perceived configuration Σ_i. This does not change the problem description, but enables us to describe the interactions of agents, and thus the ability to actually perceive information.
For that purpose, each agent a_i maintains a configuration Σ_i, which reflects the knowledge of a_i about the system. This configuration is initially empty, but is updated during the iterative process through information exchange with other agents. The COHDA² heuristic is inspired by swarming behavior, and defines a local view on the system for each agent through the use of neighborhood relations. This can be expressed with a graph G = (V, E), where each agent is represented by a vertex a_i ∈ V. Edges e = (a_i, a_k) ∈ E depict communication links. Usually, this graph is not fully connected. Thus, the neighborhood of an agent a_i is given by:

\[
N_i = \{ a_k \mid (a_i, a_k) \in E \} \tag{4}
\]
An agent may not communicate with any other agent outside of its neighborhood. Just like flocking birds, the agents now observe their local environment and react to changes within their perception range. That is, whenever an agent a_i enters a new state σ́_i by changing the assignment of its decision variables x_ij, its neighboring agents a_k ∈ N_i perceive this event. These agents now each update their current local view Σ_k on the system, and react to this event by re-evaluating their search spaces s_k and subsequently adapting their own decision variables. However, usually Σ_k ≠ Σ_global, hence an agent has to deal with incomplete, local knowledge.
Thus, for improving the local search at agent level, the COHDA² heuristic uses an information spreading strategy besides this reactive adaptation. Whenever a local change is published to the neighborhood, the publishing agent a_i includes information not only about its updated state σ_i, but also about the currently known configuration Σ_i of all other agents it is aware of. A receiving agent a_k now updates its existing knowledge base Σ_k with this two-fold information (Σ_i ∪ {σ_i}). In this update procedure, an element σ_y = ⟨γ_y, λ_y⟩ ∈ Σ_i of the sending agent a_i is added to Σ_k of the receiving agent a_k if and only if any of the following conditions holds:
1. Σ_k does not already contain a state of the agent a_y, such that

   ∀ σ_z ∈ Σ_k : z ≠ y

2. Σ_k already contains a state σ_z of the agent a_y, but with a lower value λ_z, such that

   ∃ σ_z = ⟨γ_z, λ_z⟩ ∈ Σ_k : z = y ∧ λ_z < λ_y

   In this case, σ_y replaces σ_z in Σ_k.
Using this information spreading strategy, agents build a complete representation Σ_global of the whole system over time, and take this information into account in their decision making as well. However, due to possibly rather long communication paths between any two agents, these global views on the system are likely to be outdated as soon as they are built, and represent beliefs about the system rather than facts. Nevertheless, they provide a valuable guide in the search for optimal local decisions.
In order to ensure convergence and termination, a third information flow is established on top of that. In addition to the currently known system configuration Σ_i (including the agent's own current state σ_i), each agent keeps track of the best known configuration Σ*_i = {σ*_i, σ*_k, …} it has seen during the whole process so far. That is, whenever an agent updates its Σ_i by means of received information, it compares this new configuration Σ_i to Σ*_i. If Σ_i yields a better solution quality than Σ*_i according to DO-MC-COP (3), Σ_i is stored as the new best known configuration Σ*_i. In addition to σ_i and Σ_i, an agent a_i also exchanges its Σ*_i with its neighbors every time it changes. Thus, when an agent a_k receives a Σ*_i from a neighbor a_i, the agent replaces its currently stored Σ*_k by Σ*_i, if the latter yields a better solution quality than the former.
Similar to (Hinrichs et al., 2012), the whole process can be summarized in the following three steps:

1. (update) An agent a_i receives information from one of its neighbors and imports it into its own knowledge base. That is, its belief Σ_i about the current configuration of the system is updated, as well as the best known configuration Σ*_i.

2. (choose) The agent now adapts its own decision variables x_ij according to the newly received information, while taking its own local objectives into account as well, scaled by the altruism parameter α_i. If it is not able to improve the believed current system configuration Σ_i, the state σ*_i stored in the currently best known configuration Σ*_i will be taken. The latter causes a_i to revert its current state σ_i to a previous state σ*_i that once yielded a better believed global solution.

3. (publish) Finally, the agent publishes its belief about the current system configuration Σ_i (including its own new state σ́_i), as well as the best known configuration Σ*_i, to its neighbors. Local objectives are not published to other agents, thus maintaining privacy.
Accordingly, an agent a_i has two behavioral options after receiving data from a neighbor. First, a_i will try to improve the currently believed system configuration Σ_i by choosing an appropriate w_ij, and subsequently adding its new state σ́_i to Σ_i. Yet, this only happens if the resulting Σ_i would yield a better solution quality than Σ*_i. In that case, Σ_i replaces Σ*_i, so that they are identical afterwards. If the agent cannot improve Σ_i over Σ*_i, however, the agent reverts its state to the one stored in Σ*_i. This state, σ*_i, is then added to Σ_i afterwards.
Thus, Σ_i always reflects the current view of a_i on the system, while Σ*_i always represents the currently pursued goal of a_i, since it is the best configuration the agent knows. In either case, Σ_i and Σ*_i both contain a_i's current state after step 2.
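The choose/revert logic can be sketched in strongly simplified form. The following toy version (our names; scalar element values, α_i = 1, the update step from before omitted) shows one reaction of an agent to received information:

```python
def quality(config, target):
    """Believed global quality of a configuration (scalar values, alpha = 1):
    distance between the target and the sum of all selected values."""
    return abs(target - sum(w for (w, _) in config.values()))

def cohda_step(agent_id, space, known, best, target):
    """One choose/publish reaction of agent a_i after the update step.
    `known` and `best` map agent id -> (selected value w, age lambda)."""
    # choose: pick the own element that best completes the believed config
    others = sum(w for aid, (w, _) in known.items() if aid != agent_id)
    lam = known.get(agent_id, (None, 0))[1] + 1          # new state's age
    w_new = min(space, key=lambda w: abs(target - (others + w)))
    candidate = dict(known)
    candidate[agent_id] = (w_new, lam)
    if best is None or quality(candidate, target) < quality(best, target):
        known, best = candidate, dict(candidate)         # improvement found
    else:
        known = dict(known)
        known[agent_id] = best[agent_id]                 # revert to best known
    return known, best                                   # would be published

known, best = cohda_step(1, [1, 2, 3], {2: (4, 1)}, None, target=6)
print(known[1][0], quality(best, 6))  # agent 1 picks w = 2, imbalance 0
```

This is a sketch of the control flow only; the real heuristic evaluates the full DO-MC-COP utility (3) including local penalties and the altruism parameter.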
4 EMPIRICAL EVALUATION

We implemented the proposed heuristic COHDA² in a multi-agent system (MAS). In our simulation environment, agents communicate asynchronously, using a network layer as communication backend. This backend may be a physical one, so as to be able to distribute the MAS over arbitrary machines. In our evaluation, however, we used a simulated network layer, in order to have full control over message travelling times, and to permit deterministic repetitions of simulation runs. For this, we used predefined seeds for the random number generators. This allows us to simulate unsteady communication layers with varying message delays. Basically, our simulation is event-driven. However, an event at agent level (i.e. the adaptation procedure as described in the previous section) is only triggered through a message by another agent. Hence we may call the minimal time that it takes in principle for a message to be transferred from the sender to the receiver a simulation step. We set this minimal possible message delay to 1 rather than 0, since a message cannot be received instantly in any physical communication network, no matter how fast it is. In particular, this means that a simulation step refers to one simulated unit of time, so that a sent message will be received in the next simulation step at the earliest (depending on its delay induced by the communication backend). Our implementation ensured that we were able to monitor all exchanged messages.
In the following experiments, each agent represents a simulated combined heat and power (CHP) device with an 800 l thermal buffer store. We used the simulation model of an EcoPower CHP as described in (Bremer and Sonnenschein, 2012). For each of those devices, the thermal demand for a four-family house during winter was simulated according to (Jordan and Vajen, 2001). The devices were operated in heat driven operation and thus primarily had to compensate the simulated thermal demand. Additionally, after shutting down, a device would have to stay off for at least two hours. However, due to their thermal buffer store and the ability to modulate the electrical power output within the range of [1.3 kW, 4.7 kW], the devices still had some degrees of freedom left.

Since we are focusing on combinatorial problems in the contribution at hand, for each conducted experiment a set of feasible electrical power output profiles was pre-generated from this simulation model. That is, the simulation model has been instantiated with a random initial temperature level of the thermal buffer store and a randomly generated thermal demand, for each CHP device separately. Subsequently, a number of feasible power profiles were generated from each of these simulation models. The resulting sets of power profiles are then used as local search spaces by the agents. The global goal c of the optimization problem was generated as a random electrical power profile, which was scaled to be feasible for the given population of CHP devices. However, we cannot guarantee that an optimal solution actually lies within the set of randomly enumerated search spaces. The task of the agents now was to select one element out of their given sets of power profiles each, so that the sum of all selected power profiles approximates the target profile c as exactly as possible.
4.1 General Behavior

As a first step, we examined the general behavior of the heuristic. In Figure 1, the results of a single simulation run (m = 30 devices with n = 2000 possible power profiles each) are visualized. The planning horizon was set to four hours in 15-minute intervals.

Figure 1: Optimization result of a single simulation run with 30 CHP (and local search spaces comprising 2000 feasible power profiles each), for a planning horizon of four hours in 15-minute intervals.

The upper chart shows the target profile (dashed line) and the resulting aggregated power output (solid line). The remaining power imbalance is shown in the middle chart, while the individual power output profiles of the devices are depicted in the lower chart. The latter is quite chaotic, which is due to the limited sets of available power output profiles per device. Nevertheless, the heuristic was able to select 30 profiles (one for each device), whose sum approximates the target profile with a remaining imbalance of less than 2.5 kW per time step in the planning horizon.
In Figure 2, the process of the heuristic for this simulation run is shown in detail.

Figure 2: Detailed illustration of the COHDA² heuristic during a simulation.

This data is visible to the simulation observer only; the individual agents still act upon local knowledge. The solid line depicts the global fitness value of the heuristic over time. This fitness represents the global solution quality according to equation (3), but has been normalized to the interval [0.0, 1.0], with 0.0 being the optimum. The normalization was done by taking an approximation for the worst combination of power profiles

\[
d_{worst} = \max \left( d \Big( c, \sum_{i=1}^{m} w_{i,min} \Big), \; d \Big( c, \sum_{i=1}^{m} w_{i,max} \Big) \right)
\]

as upper bound (with w_{i,min} and w_{i,max} being elements having the minimal/maximal value in class i), and assuming the existence of an optimal solution (no remaining imbalance) as lower bound. In order to examine convergence, the agent population was parametrized with the upper bound as initial solution.
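This normalization can be sketched as follows (our simplification: the minimal/maximal element of each vector-valued search space is taken as the profile with the smallest/largest total energy):

```python
import numpy as np

def normalized_fitness(c, spaces, selection):
    """Fitness in [0, 1]: imbalance of the chosen profiles, divided by
    d_worst = max(d(c, sum of minimal elements), d(c, sum of maximal
    elements)); 0.0 means no remaining imbalance (the optimum)."""
    d = lambda a, b: np.abs(np.asarray(a) - np.asarray(b)).sum()  # 1-norm
    chosen = sum(spaces[i][j] for i, j in enumerate(selection))
    w_min = sum(min(s, key=lambda w: np.sum(w)) for s in spaces)
    w_max = sum(max(s, key=lambda w: np.sum(w)) for s in spaces)
    d_worst = max(d(c, w_min), d(c, w_max))
    return d(c, chosen) / d_worst

spaces = [[np.array([0.0]), np.array([4.0])],
          [np.array([1.0]), np.array([3.0])]]
print(normalized_fitness(np.array([5.0]), spaces, [1, 1]))  # 4+3=7 -> 2/4 = 0.5
```

Note that a fitness of exactly 0.0 is only reachable if an optimal combination happens to be contained in the randomly enumerated search spaces.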
In general, the fitness decreases over time until it converges to a near-optimal solution. However, it is not strictly decreasing: temporary deteriorations (increases of the fitness value, which mean an increasing imbalance) are produced due to the decentralized nature of the heuristic. They can be explained by the distribution ratios of the locally best known configurations Σ* (see Section 3). For this we centrally observed the Σ*_i of every agent a_i in the network. Out of these sets, at each simulation step, we identified the Σ* which yielded the best overall solution quality, and measured the relative frequency of occurrence of this particular Σ* in the population. The filled area shows this distribution (the higher, the more agents are aware of this specific Σ*). Recall that an agent a_i inherits a received Σ*_k from a neighbor a_k if Σ*_k yields a better rating than the currently stored Σ*_i of the agent a_i. Thus, a Σ* with a very good rating prevails and spreads in the network, until a better rated Σ* is found somewhere. This effect can be seen during simulation steps 10 to 30, for instance. As the distribution of the currently best configuration, say Σ*₀, rises, the global fitness improves steadily. In simulation step 30, however, an even better configuration, say Σ*₁, has emerged somewhere, which begins to spread subsequently. As the agent population continually adopts this new Σ*₁, the fitness value temporarily deteriorates, before it steadily improves from simulation step 35 on, until in simulation step 45 another configuration, say Σ*₂, is found that yields a better fitness than Σ*₁. This process continues up to the point where no better rated Σ* can be found, and the heuristic terminates after 185 simulation steps. The final fitness value is 0.02, which amounts to a total remaining imbalance of 7.09 kW (about 0.7% of the targeted 1004.13 kW in total over the planning horizon).
Figure 3 shows the performance of the COHDA² heuristic under the same parametrization, aggregated over 100 simulation runs.

Figure 3: Aggregated performance of COHDA² over 100 simulation runs with 30 CHP (and local search spaces comprising 2000 feasible power profiles each), for a planning horizon of four hours in 15-minute intervals.

For each simulation run, the same CHP devices and thus the same local search spaces were used, but the communication network was initialized with different seeds for the random number generator. This yielded a different communication graph in each run, as well as different generated message delays. The solid line represents the mean fitness over time, while the shaded area around this line depicts the standard deviation. Obviously, the COHDA² heuristic is able to converge to near optimal solutions independently of the underlying communication backend. On average over all 100 simulation runs, each agent sent 1.5 ± 0.04 messages per simulation step. The boxplot visualizes simulation lengths, with 169.69 ± 28.38 being the mean.
4.2 Parameter Analysis

Subsequent to the inspection of the general behavior, we examined a number of input parameters of the heuristic with regard to simulation performance. The latter can be measured in terms of (a) the resulting fitness after termination, (b) the simulation length, or (c) the average number of exchanged messages per agent per simulation step during the process. So the influence of the input parameters on each of these numbers (a-c) has been analyzed. Each examined configuration was simulated 100 times. The presented results show mean values as well as standard deviations of the observed properties from these 100 simulations. If not stated otherwise, the experiments were conducted using a small world network topology with φ = 2.0 (see Section 4.2.2 for an explanation).
4.2.1 Message Delay

An important property of the simulated communication backend is its ability for delayed messages. In order to evaluate the robustness of the heuristic against a non-deterministic communication layer, we tested the approach with different amounts of message delays. To accomplish that, we defined an interval [1, msg_max], from which a random number is generated for each sent message. The message is then delayed for the according number of simulation steps. We evaluated msg_max ∈ {1, 2, 5, 7, 10}.
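Our reading of this delay model can be sketched as follows (function and parameter names are ours):

```python
import random

def delivery_step(sent_at, msg_max, rng=random):
    """Delay model of the experiments: each sent message is delayed by a
    uniform random number of simulation steps drawn from [1, msg_max],
    so with msg_max = 1 a message always arrives in the next step."""
    return sent_at + rng.randint(1, msg_max)

rng = random.Random(42)  # predefined seed -> deterministic repetitions
t = delivery_step(sent_at=10, msg_max=5, rng=rng)
print(1 <= t - 10 <= 5)  # -> True
```

Seeding the generator per run is what permits the deterministic repetitions of simulation runs mentioned above.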
Figure 4 shows the influence of message delays on the simulation performance, as defined in the previous paragraph (criteria a-c).

Figure 4: Influence of different message delays.

Fortunately, message delays have absolutely no influence on the final fitness produced by the heuristic (criterion a, top chart). This means that COHDA² is very stable against an unsteady communication network. The time until termination (criterion b, middle chart) consequentially rises linearly with increasing message delay. With regard to the amount of exchanged messages (criterion c, bottom chart), a strongly decreasing trend towards less than one sent message on average per agent per simulation step with increasing delay is visible. To reveal the best trade-off between simulation length and communication overhead, Figure 5 shows the number of messages per agent throughout a whole simulation run, depending on message delays.

Figure 5: Influence of message delay on the sum of messages per agent for the whole simulation.

We find a minimum of exchanged messages with msg_max = 2. Obviously, if compared to the absence of message delays (msg_max = 1), COHDA² does not only cope with, but even benefits from a slight variation at agent level introduced by message delays. However, in the examined scenario, this variation should be kept rather small in order to speed up convergence. Thus, the following experiments were conducted using a message delay msg_max = 2.
4.2.2 Network Density
The composition of an agents’ neighborhood is
directly coupled to the underlying communication
graph G = (E, V ). Preliminary experiments showed
a beneficial impact of random graphs with a low di-
ameter. Thus, we evaluated the following topologies:
Ring: The agents are inserted into a ring-shaped
list. Each agent is then connected to its predeces-
sor and successor.
Small World: This network comprises an ordered
ring with |V | · φ additional connections between
randomly selected agents, cf. (Strogatz, 2001).
We examined φ {0.1, 0.5, 1.0, 2.0, 4.0}.
In Figure 6, the results of these experiments are visualized. We ordered the plotted data according to the approximated average neighborhood size, which defines the overall density of the communication graph.

Figure 6: Influence of the network topology.
Similar to the previous section, there is no influence of
the network density on solution quality. Expectedly,
the message complexity increases with larger neigh-
borhoods. Similarly, simulation length decreases with
more connections. Again, the trade-off between run-time in terms of simulation steps and run-time in terms of exchanged messages is visible. A compari-
son of the number of messages per agent throughout a
whole simulation run against network topology shows
that, for the given scenario, a small world topology
with φ = 0.5 yields the fewest messages on average during a whole simulation (chart not shown here).
4.2.3 Planning Horizon
For real-world applications, it is interesting to know how long a planning horizon the heuristic can handle. Figure 7 shows the results for planning horizons with a length of {2, 4, 8, 12, 24} hours in 15-minute intervals.

Figure 7: Influence of the planning horizon.

The final fitness in the upper chart deteriorates almost linearly with larger planning horizons. Similarly, the number of simulation steps rises, whereas
the number of exchanged messages is not influenced.
While we expected the last of these, we did not expect the influence of the planning horizon on fitness and simulation length, so we examined these effects in more detail. After
several experiments with synthetic configurations (i.e.
carefully generated search space values according to
(Lust and Teghem, 2012)), it turned out to be a side effect of our use of the CHP simulation models: randomly enumerating a rather small number of feasible
power profiles does not yield a sufficient coverage of
the theoretically feasible action space of the devices.
We found that increasing the size of pre-generated lo-
cal search spaces significantly improves the final sim-
ulation fitness again, while leaving the number of sim-
ulation steps and the number of exchanged messages
unaffected.
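The remedy described above, pre-generating larger local search spaces, can be illustrated as follows. The actual feasibility of a power profile is determined by the CHP simulation model, which is not reproduced here; the modulation range [p_min, p_max] and the uniform sampling are purely illustrative assumptions:

```python
import random

def sample_profiles(n_profiles, horizon, p_min, p_max, seed=0):
    """Pre-generate a local search space of n_profiles power profiles,
    each covering `horizon` intervals.

    Illustrative stand-in for the CHP simulation model: a 'feasible'
    profile is simply a sequence of power values within the assumed
    modulation range [p_min, p_max]. A small n_profiles covers the
    theoretically feasible action space insufficiently (cf. the
    fitness deterioration discussed above), so n_profiles should be
    chosen generously.
    """
    rng = random.Random(seed)
    return [[rng.uniform(p_min, p_max) for _ in range(horizon)]
            for _ in range(n_profiles)]
```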
4.2.4 Population Size
Another interesting property regarding real-world ap-
plications is the influence of population size on the
heuristic.

Figure 8: Influence of the population size.

In Figure 8, a linear increase in simulation steps until termination can be seen. This is a consequence of the increased coordination complexity in larger networks. Yet, since the increase is at most linear, this shows that COHDA² is quite robust
against the number of participating individuals. In-
terestingly, the final fitness as well as the number of
exchanged messages per time step significantly im-
prove with larger population sizes. The former may
be related to the increased diversity, which could al-
ready be observed to be beneficial in the analysis of
the sizes of local search spaces in the previous sec-
tion. The latter can be attributed to an increased di-
ameter of the communication graph with larger pop-
ulation sizes. Here, information spreads more slowly,
and it takes a longer time for the system to converge.
4.3 Bi-objective Behavior
As described in Sections 2 and 3, we introduced local objective functions at agent level for the COHDA² heuristic. As a proof of concept, we conducted an experiment with randomly generated penalty values p_ij ∈ [0, max(c)]. Figure 9 shows the aggregated results of 100 simulation runs, using 30 CHP appliances
with 200 feasible power profiles each, over a planning horizon of four hours, using a small world topology with φ = 0.5 and a message delay of msg_max = 2. The altruism parameter was set to α_i = 0.5 for all agents, so that the local objectives were considered equally important to the global objective. The heuristic is able to minimize local penalties to a normalized value of 0.02 ± 0.01. Despite the rather difficult
setting of the altruism parameter, the global objective
fitness could effectively be optimized to a normalized
value of 0.15 ± 0.07, which amounts to a remaining imbalance of 33.12 kW ± 17.02 kW (0.06% ± 0.03 of the targeted 544.26 kW in total over the planning horizon).

Figure 9: Aggregated performance of COHDA² over 100 simulation runs with distributed local objective functions (∀i : α_i = 0.5).

ICAART 2013 - International Conference on Agents and Artificial Intelligence
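The equal weighting of local and global objectives under α_i = 0.5 can be sketched as a convex combination of the two normalized objective values. This is a minimal illustration only; the exact objective formulation of COHDA² is given in Sections 2 and 3, and the function below is an assumption, not the paper's definition:

```python
def combined_objective(global_fitness, local_penalty, alpha):
    """Convex combination of normalized objective values (both minimized).

    Illustrative assumption: alpha = 0.5 treats the global objective and
    the local penalty as equally important, matching the experiment above;
    alpha = 1 would make the agent fully altruistic.
    """
    return alpha * global_fitness + (1.0 - alpha) * local_penalty
```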
5 RELATED WORK
The problem stated in this contribution is formulated
as an instance of a distributed constraint optimization problem (DCOP). Algorithms for DCOP are usually optimal, i.e. they are guaranteed to find the optimal solution. The SynchBB approach introduced
by (Hirayama and Yokoo, 1997) incorporates a syn-
chronous branch&bound strategy and thus is quite
slow. ADOPT, as proposed in (Modi et al., 2005),
exploits inter-agent constraints to find an optimal so-
lution very efficiently. However, such a strategy is
not applicable to our problem setting, since every lo-
cal search space in principle depends on every other.
The constraint graph used in ADOPT would be fully
connected and hence would not provide any advantage. (Penya, 2006) proposes COBB, a constraint optimization algorithm based on broadcasting. This approach is somewhat similar to COHDA²; we believe, however, that the information spreading in COHDA² is more distributed in nature than the broadcasting used in the referenced work. Further, COBB is a synchronous algorithm, whereas our approach is asynchronous and thus truly decentralized.
Regarding the smart grid domain, micro-
economic approaches are often used for distributed
problem solving. For example, the PowerMatcher
(Kok et al., 2005) incorporates a local price formation
process within a hierarchical structure. Here, the
optimization goal is the optimal dispatch of a traded
commodity. The approach has some major draw-
backs. First, it is statically organized with central
components, and thus is not truly decentralized.
Second, since the architecture solely specifies the
pricing mechanism, all decision complexity lies
within the bidding strategies of participating agents.
This applies to several other approaches of this kind
as well.
With respect to self-organizing systems, there are
a few interesting approaches with varying concepts.
In (Li et al., 2010), a semi-distributed system for solv-
ing distributed combinatorial optimization problems
is proposed. Similar to the COHDA² approach, autonomous agents hold local search spaces and pursue a common goal. Coordination, however, happens through a central, blackboard-like communication area called StigSpace. Thus, the approach is
not truly decentralized. (Pournaras et al., 2010) intro-
duces EPOS, a bottom-up planning algorithm using
a tree overlay organization structure. In contrast to COHDA², the EPOS approach imposes hierarchical relations on the agents and thus again is not truly decentralized.
The applied methodology of cooperative problem solving in COHDA² is quite similar to the AMAS approach, and especially to the AMAS4Opt agent model as proposed in (Kaddoum, 2011). Our contribution therefore focuses on a problem-specific implementation rather than a general methodology.
6 CONCLUSIONS & FUTURE
WORK
In the contribution at hand, we presented COHDA²,
which is a self-organizing heuristic for solving dis-
tributed combinatorial optimization problems. We ap-
plied the heuristic to a problem from the smart grid
domain, and performed a thorough evaluation of the
performance under varying conditions. We imple-
mented an asynchronous multi-agent system with full
control over the communication backend. Regard-
ing our example application, it could be shown that
the heuristic exhibits convergence and termination,
and is robust against unsteady communication net-
works. The run-time of COHDA
2
, in terms of sim-
ulation steps, rises linearly with increasing popula-
tion sizes. Further, there is a trade-off between the
number of simulation steps until termination, and the
number of exchanged messages. This trade-off can
be adjusted through the density of the communication
network (i.e., the average size of the neighborhoods).
The evaluation of a bi-objective scenario showed the
ability of the heuristic to optimize local penalties as
well as a global objective in parallel.
In the present form, COHDA² needs a central operator that broadcasts the optimization goal and is able to detect the termination of the process (and thus has a global view on the system). But the actual optimization process is still performed in a truly decentralized manner! A fully decentralized variant of
COHDA², however, could be realized by including
the ability to detect termination in a self-organizing
way, as well as the capability to spontaneously nomi-
nate a spokesperson from the population of agents, in
order to announce the optimization result.
An important future subject will be to study the in-
fluence of the altruism parameter on the heuristic, i.e.:
How does the resulting global fitness depend on the
setting of α_i? If the agents are allowed to define this
value on their own, how can we guarantee that the sys-
tem does not collapse? Future work will also include
the analysis of adaptivity, i.e. spontaneously changing
decisions of agents in already converged configura-
tions, or repeatedly varying optimization targets. Additionally, we will address the embedding of the mathematical representation of devices' action spaces, as formulated in (Bremer and Sonnenschein, 2012), in order to circumvent the currently existing premise of enumerated local search spaces in COHDA², with its disadvantages as described in Section 4.2.3.
ACKNOWLEDGEMENTS
Due to the vast number of simulations needed, all experiments have been conducted on HERO, a multi-purpose cluster installed at the University of Oldenburg, Germany. We would like to thank the maintenance team of HERO for their valuable service. We also thank Ontje Lünsdorf for providing the asynchronous message passing framework used in our simulation environment, and Jörg Bremer for providing the CHP simulation model.
REFERENCES
Bremer, J. and Sonnenschein, M. (2012). A distributed
greedy algorithm for constraint-based scheduling of
energy resources. In SEN-MAS’2012 Workshop, Proc.
of the Federated Conference on Computer Science and
Information Systems, pages 1285–1292, Wrocław,
Poland. IEEE Catalog Number CFP1285N-ART.
Gellings, C. (2009). The Smart Grid: Enabling Energy Ef-
ficiency and Demand Response. The Fairmont Press,
Inc.
Gershenson, C. (2007). Design and Control of Self-
organizing Systems. Copit-Arxives.
Hinrichs, C., Lehnhoff, S., and Sonnenschein, M. (2012).
A Decentralized Heuristic for Multiple-Choice Combinatorial Optimization Problems. In Operations Research Proceedings 2012 – Selected Papers of the International Conference on Operations Research (OR 2012), Hannover, Germany. Springer.
Hirayama, K. and Yokoo, M. (1997). Distributed Partial
Constraint Satisfaction Problem. In Principles and
Practice of Constraint Programming, pages 222–236.
Hölldobler, B. and Wilson, E. O. (1990). The Ants. Belknap Press of Harvard University Press.
Jones, J. C., Myerscough, M. R., Graham, S., and Oldroyd,
B. P. (2004). Honey bee nest thermoregulation: di-
versity promotes stability. Science (New York, N.Y.),
305(5682):402–4.
Jordan, U. and Vajen, K. (2001). Influence Of The DHW
Load Profile On The Fractional Energy Savings: A
Case Study Of A Solar Combi-System With TRNSYS
Simulations. Solar Energy, 69:197–208.
Kaddoum, E. (2011). Optimization under Constraints of Distributed Complex Problems using Cooperative Self-Organization. PhD thesis, Université de Toulouse.
Kok, J. K., Warmer, C. J., and Kamphuis, I. G. (2005). Pow-
erMatcher. In Proceedings of the fourth international
joint conference on Autonomous agents and multia-
gent systems - AAMAS ’05, page 75, New York, New
York, USA. ACM Press.
Kroeker, K. L. (2011). Biology-inspired networking. Com-
munications of the ACM, 54(6):11.
Li, J., Poulton, G., and James, G. (2010). Coordination of
Distributed Energy Resource Agents. Applied Artifi-
cial Intelligence, 24(5):351–380.
Lust, T. and Teghem, J. (2012). The multiobjective multi-
dimensional knapsack problem: a survey and a new
approach. International Transactions in Operational
Research, 19(4):495–520.
Modi, P., Shen, W., Tambe, M., and Yokoo, M. (2005).
ADOPT: Asynchronous Distributed Constraint Opti-
mization with Quality Guarantees. Artificial Intelli-
gence, 161(1-2):149–180.
Penya, Y. (2006). Optimal Allocation and Scheduling of Demand in Deregulated Energy Markets. PhD thesis, Vienna University of Technology.
Pournaras, E., Warnier, M., and Brazier, F. M. (2010). Local
agent-based self-stabilisation in global resource utili-
sation. International Journal of Autonomic Comput-
ing, 1(4):350.
Reynolds, C. W. (1987). Flocks, herds and schools: A
distributed behavioral model. SIGGRAPH Comput.
Graph., 21(4):25–34.
Serugendo, G., Gleizes, M.-P., and Karageorgos, A. (2005).
Self-organisation in multi-agent systems. The Knowledge Engineering Review, 20(2):165–189.
Strogatz, S. H. (2001). Exploring Complex Networks. Na-
ture, 410(March):268–276.
Tero, A., Takagi, S., Saigusa, T., Ito, K., Bebber, D. P.,
Fricker, M. D., Yumiki, K., Kobayashi, R., and Nak-
agaki, T. (2010). Rules for biologically inspired
adaptive network design. Science (New York, N.Y.),
327(5964):439–42.
ICAART2013-InternationalConferenceonAgentsandArtificialIntelligence
34