Figure 1: An artificial cortical column.
center providing the limbic input (i.e., a goal or, in
more psychological terms, a drive or desire).
The activity of the upper division is transmitted to
the lower division where it is subsequently
integrated with signals from other lower divisions
and the thalamic input. The upper divisions
constitute a network of units that propagate search
activity from the goal, while the lower divisions
constitute a network of threshold units that integrate
search and sensory signals, and generate sequences
of firing nodes. The output of the lower division is
the output of the whole node. An inhibition
mechanism prevents cycles and similar chaotic
behavior. Simply, a node stays desensitized for a
certain time after firing.
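The integration and inhibition behavior described above can be pictured with a minimal sketch. The class name, the additive integration of the three inputs, and the constants are assumptions made for illustration only; the actual Neurosolver node may combine its inputs differently.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical node; names and constants are illustrative assumptions."""
    threshold: float = 1.0      # firing threshold of the lower division
    refractory_steps: int = 3   # how long the node stays desensitized
    cooldown: int = 0           # remaining desensitized steps

    def step(self, upper: float, lateral: float, thalamic: float) -> bool:
        """Integrate the inputs to the lower division; return True on firing."""
        if self.cooldown > 0:
            # Still desensitized after a recent firing: ignore all input.
            self.cooldown -= 1
            return False
        if upper + lateral + thalamic >= self.threshold:
            self.cooldown = self.refractory_steps
            return True
        return False


node = Node()
# Constant supra-threshold input: the node fires, then stays silent for 3 steps.
fired = [node.step(0.4, 0.3, 0.5) for _ in range(6)]
print(fired)
```

With constant input above the threshold, the node fires only on every fourth step, which is the kind of damping that keeps activity from cycling.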
2.2 Neurosolver as a Forecaster
Normally, in goal-oriented problem solving, the
flow of activity from the upper to the lower division
is limited. This mode of operation can be described
as exploration of possibilities and looking for
environmental cues. The cues come as thalamic
input from the sensory apparatus. Often though, we
operate without far reaching goals forcing our brains
to make predictions based on the knowledge of the
past and the currently observed facts. In the
Neurosolver, a similar phenomenon may be observed
if the activity in the upper division is gradually
increased, and at the same time is allowed to be
transmitted in its entirety from the upper to the lower
division. Assuming that the activity is allowed to
grow above the firing threshold of the lower
division, a node may fire without extra signals
from the sensors, or even in the absence of any
thalamic input. In this paper, we explore this
capability to predict future outcomes based on the
statistical model built in the Neurosolver.
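The difference between the two modes can be sketched numerically. The function below is a hypothetical illustration, not the Neurosolver's update rule: it assumes the upper-division activity grows by a fixed increment per step and that a fraction `transmission` of it reaches the lower division, where it is added to the thalamic input.

```python
def steps_until_fire(threshold, increment, transmission=1.0, thalamic=0.0):
    """Count the steps of gradual upper-division growth needed before the
    lower division reaches its threshold (all parameters are assumptions)."""
    if increment <= 0 or transmission <= 0:
        raise ValueError("increment and transmission must be positive")
    upper = 0.0
    steps = 0
    while transmission * upper + thalamic < threshold:
        upper += increment
        steps += 1
    return steps


# Forecasting mode: full transmission, no sensory input at all.
print(steps_until_fire(1.0, 0.25))
# A sensory cue lets the node fire earlier.
print(steps_until_fire(1.0, 0.25, thalamic=0.5))
# Limited transmission (the goal-oriented mode) delays firing without cues.
print(steps_until_fire(1.0, 0.25, transmission=0.5))
```

Under these assumptions a node eventually fires from internally generated activity alone, which is the property exploited for forecasting.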
3 DATA SETS
I presented the idea of using the Neurosolver as a
forecaster at ISF'2008 (Bieszczad, 2008). I was
encouraged to test the idea on the data set that was
used for the NNx competition. The most recently
published data set is for NN5, which was held in 2008.
For this work, I assembled a research group that is
acknowledged in a later section of this paper.
The NN5 data set is a collection of records of
daily withdrawals from a number of ATM machines
in England over a two-year period. The set from
each machine is divided into a larger training part
collected over two years and a smaller test part
collected over two months. Each set is a
time series that represents a temporal usage pattern
of that particular machine. That temporal nature of
the patterns was what caught our attention in the
context of the Neurosolver.
We started by using the data in their raw
format by assigning each datum to a Neurosolver
node. In that sense, each datum is a state of the
system in the progression of states as specified by
the given time series. The Neurosolver therefore
learns the trajectory that corresponds to each training
time series, and over time generalizes the trajectories
to represent all time series through its adaptation rules.
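As a rough picture of what learning a trajectory means here, the sketch below treats every distinct datum as a node and records how often one node follows another across the training series. This transition-count table is only an illustrative stand-in chosen for this example; it is not the Neurosolver's adaptation rule, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def learn_transitions(series_list):
    """Count, for every value (node), which values followed it in training."""
    counts = defaultdict(Counter)
    for series in series_list:
        for current, following in zip(series, series[1:]):
            counts[current][following] += 1
    return counts


def forecast(counts, start, horizon):
    """Follow the most frequently observed successor for `horizon` steps."""
    predicted = []
    state = start
    for _ in range(horizon):
        if state not in counts:
            break  # this state was never followed by anything in training
        state = counts[state].most_common(1)[0][0]
        predicted.append(state)
    return predicted


training = [[10, 20, 30, 20, 30], [10, 20, 30]]
model = learn_transitions(training)
print(forecast(model, start=10, horizon=3))
```

Each additional training series updates the same table, so the stored statistics come to reflect all series rather than any single one.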
Due to the large number of data points and the
proximity of some of them, we also tried to cluster
the data with several cluster sizes. For that, we
approximated the k-neighbor algorithm with one that
is very straightforward in one dimension. We simply
decided on an arbitrary number of clusters, and then
recursively divided the data set in two, allocating the
number of clusters to each of the two divisions
according to the data density. An example of this
process is shown in Figure 2.
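One possible reading of this recursive procedure is sketched below. Several details are assumptions not stated in the text: the range is split at its midpoint, the cluster budget is shared in proportion to the number of points on each side, each side receives at least one cluster, and a cluster's center is the mean of its points.

```python
def density_clusters(values, k):
    """Return cluster centers found by recursive bisection of a 1-D data set."""
    data = sorted(values)
    if not data or k < 1:
        raise ValueError("need at least one value and k >= 1")
    centers = []

    def split(points, budget):
        lo, hi = points[0], points[-1]
        mid = (lo + hi) / 2
        left = [p for p in points if p <= mid]
        right = [p for p in points if p > mid]
        if budget <= 1 or not left or not right:
            # Nothing left to divide: the points form a single cluster.
            centers.append(sum(points) / len(points))
            return
        # Denser half receives proportionally more of the cluster budget.
        k_left = round(budget * len(left) / len(points))
        k_left = max(1, min(budget - 1, k_left))
        split(left, k_left)
        split(right, budget - k_left)

    split(data, k)
    return centers


print(density_clusters([1, 2, 3, 4, 50, 100], 3))
```

Because a half that contains identical points cannot be divided further, this sketch may return fewer than `k` centers.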
A simpler approach to clustering is to divide the
domain into a number of equal segments and then
create clusters based on the data membership in the
segments. However, the problem with this approach is
that it does not take the data distribution into
consideration. Therefore, some clusters might be empty,
while others are overcrowded.
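The weakness of equal segments is easy to reproduce on a small skewed sample. The helper below is a hypothetical illustration that counts how many points fall into each of `k` equal-width segments of the data range.

```python
def equal_width_counts(values, k):
    """Count the points falling into each of k equal-width segments."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [len(values)] + [0] * (k - 1)
    width = (hi - lo) / k
    counts = [0] * k
    for v in values:
        # The maximum value belongs to the last segment, not a k+1-th one.
        index = min(int((v - lo) / width), k - 1)
        counts[index] += 1
    return counts


print(equal_width_counts([1, 2, 3, 4, 50, 100], 4))
```

For this sample the first segment holds four of the six points while the third holds none.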
After the clustering stage, we assigned the centers
of the clusters to the Neurosolver’s nodes.
Subsequently, for each data point we activated the
node that represented the cluster to which the point
was assigned. The predicted sequences were likewise
built from the numbers that corresponded to the
centers of the clusters represented by the firing
nodes.
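The mapping between data points, nodes, and predicted values can be sketched as an encode and decode pair. The nearest-center assignment is an assumption made for this example, as are the function names and the sample centers.

```python
def nearest_center_index(value, centers):
    """Index of the node whose cluster center is closest to the value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))


def encode(series, centers):
    """Replace every data point with the index of the node to activate."""
    return [nearest_center_index(v, centers) for v in series]


def decode(node_ids, centers):
    """Turn a sequence of firing nodes back into numbers (cluster centers)."""
    return [centers[i] for i in node_ids]


centers = [2.5, 50.0, 100.0]          # illustrative cluster centers
node_ids = encode([3, 47, 90, 1], centers)
print(node_ids)
print(decode(node_ids, centers))
```

A consequence of this scheme is that forecasts can only take the values of the cluster centers, so the number of clusters bounds the resolution of the predictions.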