The clustering experiment indicated that DBSCAN is more effective than
K-means because it detects clusters of arbitrary shape. In addition,
DBSCAN performed better than K-means on the Dunn Index ($I_D$; higher
values are better) and on the Davies-Bouldin Index ($I_{DB}$; values
closer to 0 are better) in most of the scenarios tested. Even so, the
values obtained were not very good, which might indicate that these
indexes are not the best suited to validating clusters with arbitrary
shapes. K-means outperformed DBSCAN on the Silhouette index ($I_S$;
values closer to 1 are better), the C-index (values closer to 0 are
better), and the Calinski-Harabasz index ($I_{CH}$; higher values are
better). The detailed analysis of the Silhouette index indicated that
the points with lower $I_S$ values correspond mostly to walls and
buildings, which are objects that do not have a spherical shape. The
DBCV index is therefore the more appropriate index for validating
clusters with arbitrary shapes. However, it is computationally very
demanding to apply to the complete dataset and is only suitable for
small data subsets.
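
To make the direction of these two indexes concrete, the following minimal sketch computes one common formulation of the Dunn index and of the Davies-Bouldin index on hypothetical 2D toy clusters. The data, the function names, and the specific formulation (closest-pair separation and largest diameter for Dunn; mean distance to the centroid as scatter for Davies-Bouldin) are assumptions made for illustration and are not necessarily those used in the experiments.

```python
from itertools import combinations
from math import dist


def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def dunn_index(clusters):
    """Smallest between-cluster distance over largest cluster diameter.

    Assumes at least two clusters, each with at least two points.
    """
    min_between = min(
        dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    )
    max_diameter = max(
        dist(p, q) for c in clusters for p, q in combinations(c, 2)
    )
    return min_between / max_diameter


def davies_bouldin_index(clusters):
    """Mean, over clusters, of the worst (scatter_i + scatter_j) / d(c_i, c_j)."""
    cents = [centroid(c) for c in clusters]
    scatter = [
        sum(dist(p, m) for p in c) / len(c) for c, m in zip(clusters, cents)
    ]
    k = len(clusters)
    worst = [
        max(
            (scatter[i] + scatter[j]) / dist(cents[i], cents[j])
            for j in range(k)
            if j != i
        )
        for i in range(k)
    ]
    return sum(worst) / k


# Hypothetical toy data: a unit square and a copy shifted along x.
square = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
far = [square, [(x + 10.0, y) for x, y in square]]   # well-separated clusters
near = [square, [(x + 2.0, y) for x, y in square]]   # clusters close together

dunn_far, dunn_near = dunn_index(far), dunn_index(near)
db_far, db_near = davies_bouldin_index(far), davies_bouldin_index(near)
print(f"Dunn:           far={dunn_far:.3f}  near={dunn_near:.3f}")
print(f"Davies-Bouldin: far={db_far:.3f}  near={db_near:.3f}")
```

Under these assumptions, the well-separated configuration yields the higher Dunn value and the lower Davies-Bouldin value. Both formulations rely on diameters or centroids, which is one plausible reason they fit compact clusters better than elongated ones.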
The experiment with five H3D test scenarios leads
to the conclusion that DBSCAN performs better when
the objects in the scene are far apart from one another.
When objects were very close to each other, DBSCAN
merged different objects into the same cluster.
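
The merging behaviour described above can be reproduced with a minimal, unoptimised DBSCAN sketch in pure Python. The toy scene (two rows of points standing in for two objects), the helper names, and the parameter values `eps=1.5` and `min_pts=3` are hypothetical and chosen only for illustration; they are not the settings used in the experiments.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise."""
    labels = [None] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue  # already assigned, do not expand again
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels


def count_clusters(labels):
    """Number of distinct clusters, ignoring noise."""
    return len(set(labels) - {-1})


def two_objects(gap):
    """Two rows of 5 points (spacing 1.0), separated by `gap` along x."""
    first = [(float(x), 0.0) for x in range(5)]
    second = [(4.0 + gap + x, 0.0) for x in range(5)]
    return first + second


far_labels = dbscan(two_objects(gap=10.0), eps=1.5, min_pts=3)
near_labels = dbscan(two_objects(gap=1.0), eps=1.5, min_pts=3)
print("clusters when far apart:", count_clusters(far_labels))
print("clusters when close:    ", count_clusters(near_labels))
```

When the gap between the two objects is no larger than `eps`, points of one object become density-reachable from the other, so the two objects end up in a single cluster.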
The ground removal procedure proved to be an essential step for
separating the objects of interest in all scenarios tested. The method
presented for constructing the reference bounding boxes for H3D objects
proved efficient and was essential for computing the external validation
indexes. On the basic external indexes, DBSCAN performed better than
K-means in most test scenarios. The Composite External Validation Index
(CEVI) evaluates the overall results of the Cluster, Box, and Label
indexes, and on it DBSCAN outperformed K-means in 3 of the scenarios
tested. The detailed analysis of the DBSCAN clusters indicated that
pedestrian clusters have a low within-cluster variance ($V_{WC}$), that
is, they are less dispersed than vehicle clusters. The analysis of the
H3D and BOSCH datasets led to the conclusion that DBSCAN and K-means are
more useful in urban scenarios than in highway scenarios, since there
are more objects to cluster.
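
As an illustration of the dispersion comparison, the sketch below computes a within-cluster variance for two hypothetical clusters. It assumes that the variance is the mean squared Euclidean distance of the points to the cluster centroid, which may differ from the exact definition used in the paper. The point coordinates (a compact pedestrian-like footprint and a larger vehicle-like footprint, in metres) are invented for the example.

```python
from math import dist


def within_cluster_variance(points):
    """Mean squared distance of the points to their centroid (assumed definition)."""
    n = len(points)
    centre = tuple(sum(coords) / n for coords in zip(*points))
    return sum(dist(p, centre) ** 2 for p in points) / n


# Hypothetical footprints in the ground plane, in metres.
pedestrian = [(0.0, 0.0), (0.4, 0.0), (0.0, 0.4), (0.4, 0.4)]
vehicle = [(0.0, 0.0), (4.0, 0.0), (0.0, 2.0), (4.0, 2.0)]

v_pedestrian = within_cluster_variance(pedestrian)
v_vehicle = within_cluster_variance(vehicle)
print(f"pedestrian: {v_pedestrian:.2f}  vehicle: {v_vehicle:.2f}")
```

Under this definition the compact cluster has the smaller variance, which matches the observation that pedestrian clusters are less dispersed than vehicle clusters.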
As a final remark, it can be concluded that clustering methods are a
useful technique for segmenting LiDAR data. Further work should include
research on internal validation indexes that are appropriate for
validating arbitrarily shaped clusters across all dataset sizes.
ACKNOWLEDGEMENTS
This work is supported by European Structural and Investment Funds in
the FEDER component, through the Operational Competitiveness and
Internationalisation Programme (COMPETE 2020) [Project nº 047264;
Funding Reference: POCI-01-0247-FEDER-047264].
Clustering LiDAR Data with K-means and DBSCAN