there are obviously times with no detector counts in
each section: red light phases. To compensate for this,
the flow per section is computed with regard to the 90s
duration of the control cycle:
$$\text{flow} = \frac{3600 \times \text{counts}}{90\,\text{s}}$$
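As a minimal sketch, the cycle-based scaling above could be written as follows. The function name `flow_per_hour` and the parameter `cycle_s` are illustrative assumptions and are not taken from the original implementation.

```python
def flow_per_hour(counts: int, cycle_s: float = 90.0) -> float:
    """Scale the vehicle counts of one control cycle to vehicles per hour.

    cycle_s is the duration of the control cycle in seconds (90 s in the text).
    """
    if cycle_s <= 0:
        raise ValueError("cycle_s must be positive")
    return 3600.0 * counts / cycle_s


# Example: 12 vehicles counted during one 90 s cycle.
print(flow_per_hour(12))  # 480.0
```

Averaging over the full cycle keeps red phases without counts from appearing as zero-flow samples.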
4.2 Clustering
Traffic state estimation using sensors results in
potentially n-dimensional time series data. Here, we rely
on single detector stations that report per-lane
occupancy. Based on that, feature extraction can
describe the time-dependent behaviour and derive
statistical statements. The basic assumption of our work
is that the behaviour of these detector-based time
series responds to different types of incidents.
Turning this into a window-based analysis means
comparing known flow patterns with those that do not
match the behaviour seen so far. Technically, we
approach this using clustering: the usual traffic
behaviour with its different demands is expected to form
recurring feature samples in the n-dimensional
state space, while incidents produce significantly
different samples. Detection is achieved by finding
structure in the data: regions that are covered
by denser observations are grouped together, and
newly occurring groups can indicate incidents.
We focus on the distance and density aspect. The
field of unsupervised learning comes with a
variety of different techniques (Xu and Wunsch, 2005;
Hastie et al., 2009), but for distance-
and density-based clustering, OPTICS and DBSCAN
are most commonly applied. They have
advantages for high dimensions, an unknown number of
clusters, uneven cluster shapes and unbalanced data, and
are almost parameter-free compared to other
solutions (Campello et al., 2020). We use both methods.
DBSCAN
In line with our previous work, we apply DBSCAN
– Density Based Spatial Clustering of Applications
with Noise (Ester et al., 1996) in conjunction with
the Euclidean measure to calculate the distance between
two points (time/flow values here). DBSCAN
uses these distances to calculate the density of data
points. The method requires only two parameters –
radius ε and minPts – to identify clusters based on
“density-reachability”: The ε-neighbourhood $N_p$ of a
point p contains all points within that radius. If $N_p$
holds at least minPts points, p is a core point and the
points in $N_p$ are directly density-reachable from it.
This yields three kinds of points:
core points. points whose ε-neighbourhood holds at least minPts points
density-reachable points. points at the edge of a cluster that
are reachable from a core point
noise points. residual points that belong to no cluster
A cluster then contains at least one core point and
all its directly or indirectly reachable points – at least
minPts points in total.
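The point categories above can be illustrated with a compact, standard-library-only sketch of DBSCAN. It is not the implementation used in this work (the evaluation relies on Scikit-learn). The names `region_query`, `dbscan`, `eps` and `min_pts`, as well as the toy (time, flow) samples, are assumptions chosen for illustration.

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE (-1) marks noise points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may be relabelled later as an edge point
            continue
        cluster += 1  # points[i] is a core point: start a new cluster
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # edge point reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point too: expand
    return labels


# Hypothetical (time, flow) samples: four similar flows and one outlier.
samples = [(0, 400), (1, 410), (2, 405), (3, 395), (10, 50)]
print(dbscan(samples, eps=20.0, min_pts=3))  # [0, 0, 0, 0, -1]
```

Because time and flow enter the Euclidean distance unscaled in this sketch, the flow values dominate. Whether and how the features are scaled in practice is not specified here.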
OPTICS
A method closely related to DBSCAN is OPTICS
– Ordering Points To Identify the Clustering
Structure (Ankerst et al., 1999). It requires the same
parameters, although the main effect of selecting ε is to
reduce the algorithm’s complexity.
In contrast to DBSCAN, it aims at processing high
density points first: For each point p, the “core
distance” $CD_p$ is calculated, which is the minimal radius
to encompass minPts − 1 other points. Furthermore,
the “reachability-distance” of p to another point q is
defined as $\max(CD_p, \mathrm{dist}(p, q))$, with dist being the
real distance between p and q. Starting with a
random point, OPTICS continuously adds points from
the ε-neighbourhood, based on the best
reachability-distance, and assembles a cluster. Using this ordering,
it is even possible to find nested clusters.
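The two distances defined above can be sketched in a few lines of Python. The sketch covers only the core distance and the reachability-distance, not the full OPTICS ordering. The function names and the convention of returning `None` when the core distance exceeds ε are assumptions made for illustration.

```python
from math import dist, inf


def core_distance(points, p, min_pts, eps=inf):
    """Minimal radius around points[p] that encompasses min_pts - 1 other points.

    Returns None if there are too few points or the radius would exceed eps.
    """
    others = sorted(dist(points[p], q) for j, q in enumerate(points) if j != p)
    if min_pts - 1 > len(others):
        return None
    if min_pts <= 1:
        return 0.0
    cd = others[min_pts - 2]  # distance to the (min_pts - 1)-th nearest other point
    return cd if cd <= eps else None


def reachability_distance(points, p, q, min_pts, eps=inf):
    """max(CD_p, dist(p, q)), or None if points[p] has no core distance."""
    cd = core_distance(points, p, min_pts, eps)
    if cd is None:
        return None
    return max(cd, dist(points[p], points[q]))


# Toy points on a line; with min_pts = 3 the core distance of point 0 is 3.0.
pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (10.0, 0.0)]
print(core_distance(pts, 0, min_pts=3))             # 3.0
print(reachability_distance(pts, 0, 1, min_pts=3))  # 3.0
print(reachability_distance(pts, 0, 3, min_pts=3))  # 10.0
```

The reachability-distance never falls below the core distance, so points inside a dense region all appear equally close to its core point.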
4.3 Detection
The simulation is assumed not to start with an
incident. A newly appearing cluster of flows is then
regarded as an incident candidate. As we pursue an
intersection-centric approach in this context, we only
have information about the adjacent sections.
Consequently, we cannot rely on the validation strategies
from Thomsen et al. (2021), as they require
information about incidents in sections further away.
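One simple way to operationalise "a newly appearing cluster" is to compare the number of clusters found in successive analysis windows. The text does not specify how new clusters are recognised, so the following is a sketch under that assumption. The function names and the noise label of -1 are illustrative.

```python
def count_clusters(labels, noise=-1):
    """Number of distinct cluster labels, ignoring the noise label."""
    return len({label for label in labels if label != noise})


def incident_candidate_windows(labels_per_window, noise=-1):
    """Indices of windows whose cluster count exceeds all earlier windows.

    The first window serves as the baseline, reflecting the assumption
    that the simulation does not start with an incident.
    """
    candidates = []
    known = None
    for idx, labels in enumerate(labels_per_window):
        n = count_clusters(labels, noise)
        if known is None:
            known = n  # baseline: incident-free start
        elif n > known:
            candidates.append(idx)  # a cluster appeared that was not there before
            known = n
    return candidates


# Hypothetical clustering results for three consecutive windows.
windows = [
    [0, 0, 1, 1, -1],    # two clusters plus one noise point
    [0, 0, 1, 1, 1],     # still two clusters
    [0, 0, 1, 1, 2, 2],  # a third cluster appears
]
print(incident_candidate_windows(windows))  # [2]
```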
5 EVALUATION
The detection approach in Section 4 would require
considerable effort to determine the clustering settings
manually. Therefore, an automated parameter grid
search was applied. The evaluation of the approach
was mainly conducted using the Scikit-learn
framework (Pedregosa et al., 2011). The detector values
from the simulations in Aimsun Next were gathered
using a plug-in which was written for that purpose.
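The idea of a parameter grid search can be sketched with the Python standard library alone. The grid values, the parameter names `eps` and `min_pts`, and the placeholder `toy_score` function are hypothetical. The scoring criterion actually used is not stated in this section.

```python
from itertools import product


def parameter_grid(grid):
    """Yield one dict per combination of the listed parameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def grid_search(grid, score):
    """Return (best_params, best_score) for a caller-supplied score function."""
    best_params, best_score = None, float("-inf")
    for params in parameter_grid(grid):
        current = score(**params)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


# Hypothetical grid over the two clustering parameters.
grid = {"eps": [5.0, 10.0, 20.0], "min_pts": [3, 5]}


def toy_score(eps, min_pts):
    """Placeholder score; a real run would cluster the data and rate the result."""
    return -abs(eps - 10.0) - abs(min_pts - 5)


best_params, best_score = grid_search(grid, toy_score)
print(best_params)  # {'eps': 10.0, 'min_pts': 5}
```

In a real setting the score function would run the clustering with the given parameters and compare the detected incident candidates against the known simulated incidents.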
5.1 Incidents
Three representative incidents – a section closure, a
lane closure (left lane closed off), and a partial lane
Intersection-centric Urban Traffic Flow Clustering for Incident Detection in Organic Traffic Control