had very few anomalies in all experiments except
Exp.1. The reason is that its flow values are very
low, generally below 20 vehicles every 15 minutes;
thus, compared with the other sensors, which have
higher variability in traffic flow, the difference between
its observations is small. However, when the
observations of sensor R002 S2 are considered on their own,
as in Exp.1, the algorithm can recognize anomalous
peaks. Exp.2 performs well in detecting sensor faults
or point anomalies in the majority of sensors, but
when the percentage of anomalies is very high
(above 5%) the detected anomalies also include
unusual traffic conditions. Thus, when the percentage
of anomalies for a sensor is low (less than 1%),
a good way to find unusual traffic conditions
is to perform anomaly detection for that single sensor,
as in Exp.1. The set of parameters of Exp.3 mainly
guarantees the identification of sensor faults.
In Exp.4, instead, both sensor faults and unusual
traffic conditions are identified.
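The per-sensor reasoning above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the released implementation: the 5% and 1% thresholds come from the discussion, while the function names, the input format (pairs of sensor identifier and anomaly flag), and the handling of sensors that fall between the two thresholds are assumptions made for the example.

```python
from collections import Counter

# Thresholds taken from the discussion above; everything else is illustrative.
HIGH_SHARE = 0.05  # above 5%: detections may also include unusual traffic
LOW_SHARE = 0.01   # below 1%: a dedicated single-sensor run may be useful


def anomaly_share(labels):
    """Return the fraction of observations flagged as anomalous per sensor.

    `labels` is an iterable of (sensor_id, is_anomaly) pairs (assumed format).
    """
    totals = Counter()
    flagged = Counter()
    for sensor_id, is_anomaly in labels:
        totals[sensor_id] += 1
        if is_anomaly:
            flagged[sensor_id] += 1
    return {s: flagged[s] / totals[s] for s in totals}


def triage(shares):
    """Map each sensor to a suggested follow-up based on its anomaly share."""
    plan = {}
    for sensor_id, share in shares.items():
        if share > HIGH_SHARE:
            plan[sensor_id] = "review: may include unusual traffic"
        elif share < LOW_SHARE:
            plan[sensor_id] = "rerun single-sensor detection"
        else:
            # The text does not cover this range; keeping the result is an assumption.
            plan[sensor_id] = "keep multi-sensor result"
    return plan
```

A sensor with a share below 1% would then be passed again to the detection step on its own, as done in Exp.1.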
6 CONCLUSIONS
This work describes the implementation of an
algorithm able to cluster spatio-temporal data and
recognize different types of anomalies: contextual
point anomalies and contextual collective anomalies.
The adopted algorithm combines ST-BOF and
ST-BDBCAN in cascade and has several parameters
that have to be tuned heuristically. Several
tests are needed to define the set of parameters
suitable for the application and the type of anomalies
that need to be detected (e.g. sensor faults or
unusual sensor behavior). We released a Python
implementation of the algorithm and tested it with
different configurations to find anomalies in traffic sensor
data from the city of Modena, Italy. The obtained
results are promising and show the potential of
considering the geographical features of the data in anomaly
detection. This work also highlights some unresolved
challenges: managing the spatio-temporal distance
is quite complicated and could benefit from more
sophisticated distance functions able to capture
the topology of the streets and the traffic correlations,
and to assign optimized weights to the features.
Still, this work could be a baseline for future
improvements. To reduce the execution time of the
algorithm, in the future we will work on the
implementation of Approx-ST-BDBCAN, the parallelized
version of the algorithm described in
(Duggimpudi et al., 2019).
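To make the idea of a weighted spatio-temporal distance concrete, the following Python sketch combines a spatial gap (great-circle distance), a temporal gap, and a flow gap into a single value. It is one simple possibility and not the distance used in the released implementation: the `Observation` fields, the scale constants, and the weights `w_space`, `w_time`, and `w_flow` are all hypothetical, and the sketch does not model street topology or traffic correlations.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    lat: float        # degrees
    lon: float        # degrees
    timestamp: float  # seconds since an arbitrary epoch
    flow: float       # vehicles per 15-minute interval


def haversine_m(a, b):
    """Great-circle distance in metres between two observations."""
    r = 6_371_000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlmb = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(h))


def st_distance(a, b, w_space=1.0, w_time=1.0, w_flow=1.0,
                space_scale=1000.0, time_scale=900.0, flow_scale=100.0):
    """Weighted Euclidean combination of scaled spatial, temporal and flow gaps.

    The scales (1 km, 15 minutes, 100 vehicles) are illustrative assumptions
    that bring the three components to comparable magnitudes.
    """
    ds = haversine_m(a, b) / space_scale
    dt = abs(a.timestamp - b.timestamp) / time_scale
    df = abs(a.flow - b.flow) / flow_scale
    return math.sqrt(w_space * ds ** 2 + w_time * dt ** 2 + w_flow * df ** 2)
```

In a sketch like this, the weights are the natural place to plug in optimized values, and `haversine_m` could be replaced by a road-network distance to account for street topology.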
REFERENCES
Bachechi, C. and Po, L. (2019). Implementing an urban
dynamic traffic model. In 2019 IEEE/WIC/ACM In-
ternational Conference on Web Intelligence, WI 2019,
Thessaloniki, Greece, October 14-17, 2019, pages
312–316. ACM.
Bachechi, C., Rollo, F., and Po, L. (2020). Real-time data
cleaning in traffic sensor networks. In 17th IEEE/ACS
International Conference on Computer Systems and
Applications, AICCSA 2020, Antalya, Turkey, Novem-
ber 2-5, 2020, pages 1–8. IEEE.
Bachechi, C., Rollo, F., and Po, L. (2021). Detection and
classification of sensor anomalies for simulating urban
traffic scenarios. Clust. Comput. to appear.
Breunig, M. M., Kriegel, H., Ng, R. T., and Sander, J.
(2000). LOF: identifying density-based local outliers.
In Chen, W., Naughton, J. F., and Bernstein, P. A., edi-
tors, Proceedings of the 2000 ACM SIGMOD Interna-
tional Conference on Management of Data, May 16-
18, 2000, Dallas, Texas, USA, pages 93–104. ACM.
Celik, M., Dadaser-Celik, F., and Dokuz, A. (2011).
Anomaly detection in temperature data using DBSCAN
algorithm.
Desimoni, F., Ilarri, S., Po, L., Rollo, F., and Trillo-Lado,
R. (2020). Semantic traffic sensor data: The TRAFAIR
experience. Applied Sciences, 10(17).
Duan, L., Xu, L., Guo, F., Lee, J., and Yan, B. (2007). A
local-density based spatial clustering algorithm with
noise. Inf. Syst., 32(7):978–986.
Duggimpudi, M. B., Abbady, S., Chen, J., and Raghavan,
V. (2019). Spatio-temporal outlier detection algo-
rithms based on computing behavioral outlierness fac-
tor. Data & Knowledge Engineering, 122:1–24.
Ester, M., Kriegel, H.-P., Sander, J., and Xu, X. (1996).
A density-based algorithm for discovering clusters in
large spatial databases with noise. In Proceedings of
the Second International Conference on Knowledge
Discovery and Data Mining, KDD’96, pages 226–231.
AAAI Press.
Gupta, M., Gao, J., Aggarwal, C. C., and Han, J. (2014).
Outlier detection for temporal data: A survey. IEEE
Trans. Knowl. Data Eng., 26(9):2250–2267.
Po, L., Rollo, F., Bachechi, C., and Corni, A. (2019a). From
sensors data to urban traffic flow analysis. In 2019
IEEE International Smart Cities Conference, ISC2
2019, Casablanca, Morocco, October 14-17, 2019,
pages 478–485. IEEE.
Po, L., Rollo, F., Viqueira, J. R. R., Lado, R. T., Bigi,
A., López, J. C., Paolucci, M., and Nesi, P. (2019b).
TRAFAIR: understanding traffic flow to improve air
quality. In 2019 IEEE International Smart Cities Con-
ference, ISC2 2019, Casablanca, Morocco, October
14-17, 2019, pages 36–43. IEEE.
Rollo, F., Sudharsan, B., Po, L., and Breslin, J. (2021). Air
quality sensor network data acquisition, cleaning,
visualization, and analytics: A real-world IoT use case.
In Proceedings of the 2021 ACM International Joint
Conference on Pervasive and Ubiquitous Comput-
ing and Proceedings of the 2021 ACM International
WEBIST 2021 - 17th International Conference on Web Information Systems and Technologies