solver – which would not address the exponential
nature of the problem itself – we can turn the
problem around and find a way to inject the missing
domain knowledge into the general assignment
mechanism, allowing it to transform the problem into
a variant that is easier to solve (i.e.,
digestible for the solver in terms of size and
complexity) without compromising the utility of the
system too much. In order to do that safely and
decide what form the domain knowledge should
take, we make some important assumptions about
the class of sCPS systems we want to support.
Assumption 1: Typically, many of the possible
configurations satisfying an ensemble’s membership
constraints are undesirable because of their low
fitness, which leads to subpar utility of the system.
In terms of our example, such configurations are
easy to spot, as they go against our intuitive
understanding of the fitness function – for example, a
configuration assigning the furthest repairers and
truck to a train. While the percentage of these
undesirable configurations highly depends on the
problem and the threshold, it can be generally
expected to grow with the scale of the system – for
example, the larger the area our railroad service
covers, the more repairers it has, and the more
possibilities for assigning repairers that are beyond a
reasonable operating radius there are. In fact, with
scale, the undesirable configurations can also
become more damaging to the system utility – it
would take the furthest repairers much longer to
reach the train.
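The growth argument can be illustrated with a little
arithmetic. The sketch below is only an illustration
under assumptions that are not part of the example
itself: a team of fixed size is drawn from all
repairers, a configuration counts as undesirable as
soon as it contains one repairer outside the operating
radius, and the number of repairers within the radius
stays constant while the covered area grows. The
function name and the numbers are hypothetical.

```python
from math import comb


def undesirable_fraction(total_repairers: int,
                         nearby_repairers: int,
                         team_size: int) -> float:
    """Fraction of teams containing at least one repairer outside the radius.

    Assumption (illustrative): every subset of `team_size` repairers is a
    valid configuration, and only teams drawn entirely from the nearby
    repairers are considered desirable.
    """
    all_configs = comb(total_repairers, team_size)
    desirable_configs = comb(nearby_repairers, team_size)
    return 1 - desirable_configs / all_configs


# The number of nearby repairers is held fixed while the system grows.
for total in (10, 100, 1000):
    fraction = undesirable_fraction(total, nearby_repairers=10, team_size=3)
    print(f"{total:5d} repairers: {fraction:.4%} of configurations undesirable")
```

Under these assumptions the share of undesirable
configurations approaches one as the system grows,
while the number of desirable ones stays the same.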
Assumption 2: A well-designed system is
influenced by ensemble formation only in terms of
system utility, not safety. As sCPS are inherently
distributed and only able to communicate with other
parts of the system via unreliable channels, such as
MANETs and other wireless networks, safety
guarantees cannot be built on top of communication.
Instead, the components must offer core safety
guarantees by design, so that a component is still
capable of safe operation even when cut off from
other parts of the system. As the ensembles are
essentially a communication abstraction, they cannot
be used to guarantee safety. In our example, this can
be seen in the design of the trains – regardless of any
network problems, the train is equipped with trusted
sensors and brakes to avoid any catastrophic
scenario. If no ensembles are ever formed, a
damaged train will hamper the system indefinitely,
but will not endanger any lives.
Disregarding Configurations. Based on the
assumptions outlined above, a useful conclusion can
be reached: It is possible to completely disregard the
undesirable configurations when deciding what
ensembles to form. While these configurations are
valid per se, they represent situations in which the
system is not performing well, and they should never
be needed under normal circumstances. Situations in
which selecting such a configuration would be the
right thing to do are therefore highly improbable.
The only moment these configurations should be
realized is when there are no other options for the
system to utilize – and at such a time, the difference
between forming badly performing ensembles and
forming no ensembles at all becomes negligible. As
ensemble formation only impacts utility and not
safety, dropping these configurations influences the
system utility minimally, and safety not at all.
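The effect of disregarding configurations can be
sketched on a toy instance. The code below is a
minimal illustration, not the assignment mechanism
itself: the one-dimensional repairer positions, the
fitness function (negative distance of the furthest
team member) and the threshold value are all
hypothetical, and a brute-force search stands in for
the solver.

```python
from itertools import combinations

# Hypothetical repairer positions in km along a single track.
REPAIRERS = {"r1": 2.0, "r2": 5.0, "r3": 8.0, "r4": 120.0, "r5": 140.0}


def fitness(team, train_pos):
    """Illustrative fitness: the closer the furthest member, the better."""
    return -max(abs(REPAIRERS[r] - train_pos) for r in team)


def best_team(train_pos, team_size, min_fitness=None):
    """Return the fittest team, or None if no configuration remains.

    When `min_fitness` is given, configurations below the threshold are
    dropped before the search, mimicking the removal of undesirable
    configurations.
    """
    candidates = combinations(sorted(REPAIRERS), team_size)
    if min_fitness is not None:
        candidates = (t for t in candidates
                      if fitness(t, train_pos) >= min_fitness)
    return max(candidates, key=lambda t: fitness(t, train_pos), default=None)


# Normal circumstances: filtering does not change the chosen team.
print(best_team(4.0, 2))                     # unfiltered search
print(best_team(4.0, 2, min_fitness=-10.0))  # same team after filtering

# Degenerate case: only badly performing teams exist, so none is formed.
print(best_team(60.0, 2, min_fitness=-10.0))
```

In the first two calls the result is identical, so the
filtering costs nothing in utility. In the last call no
ensemble is formed at all, which by Assumption 2
affects utility only, not safety.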
4 PROPOSED SOLUTION
4.1 Importance of Locality
Before we can drop undesirable configurations to
make the ensemble formation more scalable, we
must define what form this additional domain
knowledge should take, and introduce a suitable
concept to the EDL. In essence, we need a way to
allow the architect to specify exactly when a
configuration is to be deemed undesirable, and to
eliminate such configurations before handing the
problem description over to the solver.
In Sect. 3.2, we intuitively rated configurations
based on domain-specific properties of
individual components, e.g., saying that picking
repairers that are too far from the damaged train
does not make sense. More generally, this can be
interpreted as a distance metric expressed in terms of
the local knowledge of an individual component and
the domain data of the ensemble instance in question.
This metric correlates with the impact on instance
fitness (and, by extension, overall system utility) if
the component were to join. Assuming the existence of
such a metric
is fairly realistic – many sCPS applications tend to
be very large and deployed on physical entities. It is
therefore necessary to partition the system into
manageable parts, often by taking advantage of the
physical locality of the components both in design
and execution. This is most apparent when dealing
with geographical position, but can also take the
form of network distance or a similar property.
Once we have a metric with suitable properties,
we can assume that suitable and unsuitable
components can be separated by a preference function
based on their computed distance valuation. To
enable the configuration filtering, we must allow the
developer to specify this preference in the EDL.
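To make the roles of the metric and the preference
function concrete, the following sketch expresses
both as plain Python functions. It does not show EDL
syntax; the class names, the Euclidean metric and the
10 km radius are assumptions chosen purely for
illustration.

```python
from dataclasses import dataclass
from math import dist
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Repairer:
    name: str
    position: Tuple[float, float]  # local knowledge: (x, y) in km


@dataclass(frozen=True)
class DamagedTrain:
    position: Tuple[float, float]  # domain data of the ensemble instance


def distance(repairer: Repairer, train: DamagedTrain) -> float:
    """Hypothetical metric: Euclidean distance between the two positions."""
    return dist(repairer.position, train.position)


def preferred(components: List[Repairer],
              train: DamagedTrain,
              metric: Callable[[Repairer, DamagedTrain], float],
              preference: Callable[[float], bool]) -> List[Repairer]:
    """Keep only the components whose distance valuation is acceptable."""
    return [c for c in components if preference(metric(c, train))]


repairers = [
    Repairer("r1", (0.0, 3.0)),
    Repairer("r2", (4.0, 0.0)),
    Repairer("r3", (30.0, 40.0)),
]
train = DamagedTrain((0.0, 0.0))


def within_radius(d: float) -> bool:
    """Assumed preference: accept components at most 10 km away."""
    return d <= 10.0


candidates = preferred(repairers, train, distance, within_radius)
print([c.name for c in candidates])  # ['r1', 'r2']
```

Only the surviving candidates would be handed to the
solver. Keeping the metric and the preference as
separate functions means either can be swapped, for
instance replacing geographical distance with network
distance, without touching the filtering step.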