Authors: Tobias Bolten 1; Felix Lentzen 1; Regina Pohle-Fröhlich 1 and Klaus D. Tönnies 2
Affiliations: 1 Institute of Pattern Recognition, Niederrhein University of Applied Sciences, Reinarzstr. 49, Krefeld, Germany; 2 Department of Simulation and Graphics, University of Magdeburg, Universitätsplatz 2, Magdeburg, Germany
Keyword(s):
Semantic Segmentation, 3D Space-time Event Cloud, PointNet++, Dynamic Vision Sensor.
Abstract:
Dynamic Vision Sensors are neuromorphically inspired cameras whose pixels operate independently and asynchronously from each other, triggered by illumination changes within the scene. The output of these sensors is a stream of triggered events with a sparse spatial but high temporal resolution, occurring at a variable rate. Many prior approaches convert this stream into other representations, such as classic 2D frames, in order to apply known computer vision techniques. However, the sensor output is natively and directly interpretable as a 3D space-time event cloud, without this lossy conversion. Therefore, we propose processing the event stream with 3D point cloud approaches. We provide an evaluation of different deep neural network structures for the semantic segmentation of these 3D space-time point clouds, based on PointNet++ (Qi et al., 2017b) and three published successor variants. This evaluation on a publicly available dataset includes experiments on different data preprocessing steps, the optimization of network meta-parameters, and a comparison to the results obtained by a 2D frame-conversion-based CNN baseline. In summary, the 3D-based processing achieves better results in terms of quality, network size, and required runtime.
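To illustrate the core idea that the sensor output is directly interpretable as a 3D space-time point cloud, the following minimal sketch (not taken from the paper) maps raw DVS events (x, y, t, polarity) onto an (N, 3) coordinate array plus a per-point polarity feature, as consumed by PointNet++-style networks. The column layout, normalization, and time-scaling factor are illustrative assumptions; the preprocessing used in the paper may differ.

import numpy as np

def events_to_space_time_cloud(events, time_scale=1.0):
    """Interpret raw events [x, y, t, polarity] as a 3D space-time point cloud.

    events: array of shape (N, 4) with columns [x, y, t, polarity].
    time_scale: assumed factor mapping the shifted time axis onto a range
                comparable to the spatial axes (relevant for radius-based grouping).
    """
    events = np.asarray(events, dtype=np.float64)
    x, y, t, polarity = events[:, 0], events[:, 1], events[:, 2], events[:, 3]

    # Shift time to start at zero and scale it so all three axes share comparable ranges.
    t = (t - t.min()) * time_scale

    points = np.stack([x, y, t], axis=1)   # (N, 3) space-time coordinates
    features = polarity.reshape(-1, 1)     # (N, 1) ON/OFF polarity as point feature
    return points, features

# Usage with synthetic events: 1000 events on a 128x128 sensor over ~0.1 s.
rng = np.random.default_rng(0)
demo = np.stack([rng.integers(0, 128, 1000),
                 rng.integers(0, 128, 1000),
                 np.sort(rng.uniform(0.0, 0.1, 1000)),
                 rng.integers(0, 2, 1000)], axis=1)
pts, feats = events_to_space_time_cloud(demo, time_scale=1280.0)  # 0.1 s -> 128 units
print(pts.shape, feats.shape)  # (1000, 3) (1000, 1)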