2 METHODS
2.1 Soar and Reinforcement Learning
Soar (Laird, 2012) is a cognitive architecture that
scalably integrates a rule-based system with many
other capabilities, including RL and long-term
memory. The main decision cycle involves rules that
propose new operators, as well as preferences for
selecting amongst them; an architectural operator-
selection process; and application rules that modify
agent state. The reinforcement learning module
(Soar-RL) adjusts the numeric preferences used for
selecting operators based on a reward signal that
comes from internal or external sources. Soar has
been used to model large-scale, complex cognitive
functions for warfighting processes such as those in
a kill chain (Jones et al., 1999).
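The propose/select/apply cycle with reward-driven
numeric preferences can be sketched in a few lines of
Python. This is only an illustration under stated
assumptions, not Soar code or the Soar-RL
implementation: the class name, the operator names,
the epsilon-greedy selection and the Q-learning-style
update are all hypothetical choices made for the
example.

```python
import random
from collections import defaultdict


class NumericPreferenceAgent:
    """Toy stand-in for a propose/select/apply cycle with learned preferences."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate (assumed value)
        self.gamma = gamma      # discount factor (assumed value)
        self.epsilon = epsilon  # exploration rate (assumed value)
        self.rng = random.Random(seed)
        # Numeric preference for each (state, operator) pair, starting at 0.0.
        self.pref = defaultdict(float)

    def propose(self, state):
        # Hypothetical proposal rules: the same three operators for any state.
        return ["declare-friendly", "declare-foe", "declare-unknown"]

    def select(self, state):
        # Epsilon-greedy selection over the proposed operators.
        ops = self.propose(state)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ops)
        return max(ops, key=lambda op: self.pref[(state, op)])

    def update(self, state, op, reward, next_state=None):
        # Temporal-difference update of the numeric preference.
        future = 0.0
        if next_state is not None:
            future = max(self.pref[(next_state, o)]
                         for o in self.propose(next_state))
        td_error = reward + self.gamma * future - self.pref[(state, op)]
        self.pref[(state, op)] += self.alpha * td_error
        return self.pref[(state, op)]
```

A caller would alternate select() and update(), with
the reward supplied by whatever internal or external
source scores the chosen operator.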
In this paper, we will show how to use Soar and
specifically the reinforcement learning (Soar-RL)
module to learn an effective combination of existing
CID features for decision-making, as identified by
experts and systems, in an operational environment.
2.2 Combat ID
There are many challenges in the CID process,
including 1) an extremely short time for fusion,
decision-making, and targeting; 2) uncertain and/or
missing data outside sensor (e.g., radar, radio)
ranges; 3) manual decision-making; 4)
heterogeneous data sources for decision-making; and
5) multiple decision-makers in the loop.
Existing CID methods, sensors, and systems fall
into the following basic categories and
methodologies:
1. Procedural. Procedural methods involve
analysis of a target’s “behaviors,” including
such things as flight profile and point of origin.
2. Non-cooperative. These methods gather ID
information on a target without that target’s
intentional cooperation/participation.
3. Cooperative. Cooperative CID requires active
participation on the part of the target. A
common example would be an identification
friend or foe (IFF) transponder.
4. Intelligence and ID Fusion methods.
Information derived from various networks
comprises the final CID method.
The existing methods involve a wide range of
participating platforms, such as destroyers, cruisers,
carriers, F/A-18s and E-2Ds; participating sensors,
such as radar, Forward Looking Infrared (FLIR),
Identification Friend or Foe (IFF), Precise
Participant Location and Identification (PPLI), and
National Technical Means (NTM); and participating
networks and systems, such as the Aegis combat
system, Cooperative Engagement Capability (CEC)
and Link-16. Diverse doctrines, rules of engagement
(ROE), knowledge databases and expert systems
serve as smart data in the current process. Many
existing rules, expert systems and smart data may be
obsolete, incomplete, or have low confidence levels.
Some models may conflict with each other, be
wrong, or fail to adapt to a local environment. There
is a critical need to research methodologies that
better use, fuse and improve on all these models to
advance the art of CID at a higher symbolic level.
This paper evaluates Soar-RL as a tool for this
purpose because it can train and fuse the system at
a symbolic level. The complex CID cognitive
functions, including decision-making, sensor fusion,
analytic processes and workflow, are first mapped to
models, and Soar-RL is then applied to integrate
them.
CID decision-making requires a fusion of
existing rules. For example, as shown in Figure 2, a
state at time t can be a track profile of a flying object
with observable data containing longitude/latitude
(x/y position), altitude (z), speed, acceleration, IFF,
point of origin, heading, type, class, etc. The goal is
to classify the CID of the object as friendly, foe or
unknown. So an existing model can be “if an
unknown object is at position (x, y), there is a
probability of $p_{11}$, $p_{12}$ or $p_{13}$ that the
object’s point of origin is A, B or C, respectively.”
Another model says “if an unknown object’s point of
origin is A, B or C, there is a probability of
$p_{21}$, $p_{22}$ or $p_{23}$, respectively, that the
object is a foe.” So when an object is observed at
(x, y), the probability of the object being a foe is
taken as the maximum of the nine combined products
$p_{1i} p_{2j}$, $i, j \in \{1, 2, 3\}$: $p_{11}p_{21}$,
$p_{11}p_{22}$, $p_{11}p_{23}$, $p_{12}p_{21}$,
$p_{12}p_{22}$, $p_{12}p_{23}$, $p_{13}p_{21}$,
$p_{13}p_{22}$ and $p_{13}p_{23}$.
Figure 2: Example of CID decision-making requiring
a fusion of existing rules.
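The rule-fusion example can be made concrete with a
short Python sketch. It takes the three point-of-origin
probabilities from the first model and the three foe
probabilities from the second model, forms all nine
products, and returns the largest one, as the text
describes. The function name and the numeric values
are hypothetical and chosen only for illustration; they
are not data from this paper.

```python
from itertools import product


def fuse_max_product(p_origin, p_foe):
    """Return the largest product p_origin[i] * p_foe[j] and its (i, j) pair.

    p_origin: probabilities that the point of origin is A, B or C.
    p_foe:    probabilities of "foe" stated by the second model.
    """
    pairs = product(range(len(p_origin)), range(len(p_foe)))
    # Pick the index pair whose combined product is largest.
    i, j = max(pairs, key=lambda ij: p_origin[ij[0]] * p_foe[ij[1]])
    return p_origin[i] * p_foe[j], (i, j)


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the paper).
    p_origin = [0.6, 0.3, 0.1]  # stands in for p11, p12, p13
    p_foe = [0.2, 0.7, 0.9]     # stands in for p21, p22, p23
    score, (i, j) = fuse_max_product(p_origin, p_foe)
    print(f"max combined probability {score:.2f} from p1{i + 1} * p2{j + 1}")
```

Returning the index pair alongside the score keeps
track of which two rules produced the fused value,
which is useful when the rules come from different
sources.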