Table 1: Most Frequent 6-Grams Found in the SPaT Messages
from a Controller. These Correspond to the Predominant
Signal Timing Plans Deployed at This Controller.
a sequence such as: ’2,5’, ’2,6’, ’4,8’, ’3,7’, ’1,5’,
and so on. After creating a representation of the SPaT
data in this manner, we identify cycles by computing
n-grams.
Originally used in computational linguistics and
probability, an n-gram is a contiguous sequence of n
elements in a text. We find n-grams in
our SPaT representation to automatically detect signal
timing patterns. For example, frequent repetition of
the 6-gram (’2,5’, ’2,6’, ’4,8’, ’3,7’, ’1,5’, ’1,6’)
indicates that it is the dominant signal timing pattern.
Figure 6 shows examples of n-grams in SPaT data.
Table 1 presents the most frequent 6-grams, or timing
patterns, for the intersection in Figure 2.
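The n-gram counting described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `count_ngrams` and the sample sequence `spat_sequence` are hypothetical, and the sequence is synthetic data built only to show one pattern dominating.

```python
from collections import Counter


def count_ngrams(sequence, n):
    """Count every contiguous run of n elements in the sequence."""
    windows = (tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1))
    return Counter(windows)


# Synthetic SPaT representation (assumed for illustration): each element
# names the pair of phases that were served together.
cycle = ["2,5", "2,6", "4,8", "3,7", "1,5", "1,6"]
irregular = ["2,6", "4,8", "3,7", "1,6"]  # a cycle that skipped some phases
spat_sequence = cycle * 4 + irregular + cycle * 2

ngram_counts = count_ngrams(spat_sequence, 6)
dominant_pattern, occurrences = ngram_counts.most_common(1)[0]
print(dominant_pattern, occurrences)
```

Note that rotations of the same cycle, such as the window starting at ’2,6’, are counted as distinct 6-grams in this sketch; whether and how such rotations are merged is not specified here.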
3.2 Deriving Detector Assignments
We consider cycles from time periods with moderate
traffic activity. This is done by filtering out periods of
very low (e.g., night time) and very high (e.g., rush
hour) traffic. The rationale is that if there is little
traffic, the detectors are rarely activated; conversely,
if there are many vehicles at the intersection, most
detectors are activated so frequently that the
differences in activation counts become less perceptible.
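As a rough illustration of this filtering step, the sketch below keeps only the time periods whose total detector activations fall between two bounds. The function name `moderate_periods`, the hourly grouping, and the `low` and `high` thresholds are all assumptions made for the example; the text does not state how "moderate" is quantified.

```python
def moderate_periods(activations_per_period, low, high):
    """Return the periods whose activation count lies within [low, high]."""
    return [
        period
        for period, count in activations_per_period.items()
        if low <= count <= high
    ]


# Synthetic hourly activation totals (assumed values).
activations = {
    "02:00": 12,    # night time: too few activations
    "08:00": 950,   # rush hour: detectors almost always active
    "11:00": 310,
    "14:00": 280,
    "17:00": 1020,  # rush hour
}

kept = moderate_periods(activations, low=100, high=600)
print(kept)
```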
Our approach is based on the fundamental observation
that more vehicle departures are reported on a lane
when the corresponding phase is green or yellow than
when that phase is red. Thus, for example,
by assigning a positive vote to each vehicle departure
when phase 2 is green and by assigning a negative
vote to each vehicle departure when phase 2 is red, it
is expected that by the end of the signal cycle, all de-
tectors which actually belong to phase 2 will be left
with an overall positive voting score.
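The voting idea can be sketched directly on individual departure events. In this minimal example the function name `vote_for_phase`, the event format `(detector_id, timestamp)`, and the sample data are hypothetical; the interval list is assumed to hold the times during which the phase is being served.

```python
from collections import defaultdict


def vote_for_phase(departures, served_intervals):
    """Add +1 per departure while the phase is served, -1 otherwise.

    departures: iterable of (detector_id, timestamp_in_seconds)
    served_intervals: list of (start, end) with start <= t < end
    """
    scores = defaultdict(int)
    for detector, t in departures:
        served = any(start <= t < end for start, end in served_intervals)
        scores[detector] += 1 if served else -1
    return dict(scores)


# Synthetic events for one cycle; phase 2 is assumed served in [15, 25).
departures = [
    ("d2", 16), ("d2", 18), ("d2", 22), ("d2", 31),
    ("d4", 3), ("d4", 7), ("d4", 21),
]
scores = vote_for_phase(departures, served_intervals=[(15, 25)])
candidates = [d for d, s in scores.items() if s > 0]
print(scores, candidates)
```

Here detector d2 ends the cycle with a positive score and remains a candidate for phase 2, while d4 ends with a negative score and is ruled out.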
This scheme may be unable to disambiguate two
phases that are non-conflicting and are served simul-
taneously. For example, if phases 2 and 6 are served
simultaneously at all times, it will be virtually impos-
sible to assign detectors specifically to 2 or 6. Instead,
we will have a set of detectors that may belong to ei-
ther 2 or 6. If the timing plan is such that during some
part of the day the non-conflicting phases are served
separately, then it is possible to disambiguate these
detector mappings by considering the number of de-
parture votes over these special signal timing patterns.
A visualization of the number of departures for mul-
tiple cycles is presented in Figure 4 where certain de-
tectors report departures only when phase 2 is green.
This can be seen as an indication that those detectors
may be assigned to phase 2.
For each cluster of cycles, we compute the union
of the results obtained from the cycles belonging to
the same cluster. This effectively gives all the phases
a detector can be mapped to based on cycles with sim-
ilar behavior. Next, we take an intersection of the re-
sults from different clusters, to arrive at the final as-
signment of detectors to phases. This is because each
cluster corresponds to a potentially different combi-
nation of phases.
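The union-then-intersection step can be sketched with plain sets. The function names `combine_cluster` and `combine_clusters` and the sample data are hypothetical. The sketch also assumes that a detector absent from a cluster's results places no constraint on the final assignment, which the text does not specify.

```python
def combine_cluster(cycle_results):
    """Union, per detector, of candidate phases over the cycles of one cluster."""
    combined = {}
    for result in cycle_results:
        for detector, phases in result.items():
            combined.setdefault(detector, set()).update(phases)
    return combined


def combine_clusters(cluster_results):
    """Intersection, per detector, of candidate phases across clusters."""
    final = {}
    for result in cluster_results:
        for detector, phases in result.items():
            if detector in final:
                final[detector] &= phases
            else:
                final[detector] = set(phases)  # copy, so inputs stay unchanged
    return final


# Cluster A: phases 2 and 6 are served together, so they cannot be separated.
cluster_a = combine_cluster([
    {"d2": {2, 6}, "d6": {2, 6}},
    {"d2": {2, 6}},
])
# Cluster B: phase 2 is served with 5, and phase 6 with 1.
cluster_b = combine_cluster([
    {"d2": {2, 5}, "d6": {1, 6}},
])

final = combine_clusters([cluster_a, cluster_b])
print(final)
```

In this synthetic case, neither cluster alone pins down the detectors, but intersecting the two leaves d2 with phase 2 and d6 with phase 6.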
The details of each of these steps are presented
below.
Inference using a Single Cycle: For each cycle, a basis
vector is created for each phase. The basis vector is an
array whose entries are +1 or -1, depending on whether
the phase was active or inactive, respectively, during
the corresponding time interval. Algorithm 1 presents
a method to infer partial detector-to-phase mappings
(or partial sparsification) using a single cycle. If we
aggregate the events to a resolution of 5 seconds and
analyze a signal cycle of length 35 seconds, then a
basis vector b1 for phase 2 would have 7 entries,
for example [-1,-1,-1,+1,+1,-1,-1].
This representation shows that phase 2 was active
between seconds 20 and 30 and inactive for the rest
of the cycle. Figure 5 is a visual representation of
how the scores for each detector are computed;
Algorithm 1 describes the same computation and is the
first step in the sparsification process. The inference
from one cycle is ambiguous when two phases are served
simultaneously. For example, if phases 2 and 6 are
served simultaneously and detectors d2 and d6 belong
to phases 2 and 6 respectively, both detectors are
identified as potentially belonging to either phase 2
or phase 6.
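One plausible reading of the single-cycle scoring is a dot product between a phase's basis vector and each detector's binned departure counts. The sketch below follows that reading and is not a transcription of Algorithm 1: the names `basis_from_activity` and `score_detectors` and the departure counts are hypothetical, and only the basis vector itself is taken from the example in the text.

```python
def basis_from_activity(active_flags):
    """Map per-bin activity flags to +1 (active) or -1 (inactive)."""
    return [1 if active else -1 for active in active_flags]


def score_detectors(basis, departures_by_detector):
    """Score each detector as the dot product of basis and binned departures."""
    scores = {}
    for detector, counts in departures_by_detector.items():
        if len(counts) != len(basis):
            raise ValueError(f"{detector}: expected {len(basis)} bins")
        scores[detector] = sum(b * c for b, c in zip(basis, counts))
    return scores


# Basis vector for phase 2 from the text: 7 bins of 5 seconds each.
phase2_basis = basis_from_activity(
    [False, False, False, True, True, False, False]
)

# Synthetic departure counts per 5-second bin (assumed values).
departures = {
    "d2": [0, 0, 0, 3, 2, 1, 0],
    "d4": [2, 1, 0, 0, 1, 0, 0],
    "d6": [0, 0, 0, 2, 3, 0, 0],  # phase 6 served together with phase 2
}

scores = score_detectors(phase2_basis, departures)
candidates = sorted(d for d, s in scores.items() if s > 0)
print(scores, candidates)
```

Because d6 reports departures in the same bins as d2, both receive positive scores against the phase 2 basis vector, which reproduces the ambiguity between simultaneously served phases.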
Inference from a Single Cluster: Any inference
from a single cycle is subject to errors. This is be-
cause traffic in a single cycle may be abnormally high
in a given phase or direction. Combining these re-
sults across multiple cycles has the benefit of reduc-
ing this error, especially if the results are consistent
across multiple cycles.
To combine results across multiple cycles, it is useful
to cluster cycles with a similar ordering of phases and
then combine the results within each cluster sepa-
VEHITS 2020 - 6th International Conference on Vehicle Technology and Intelligent Transport Systems