
on richer event representation methods for the inputs
to improve the network. Our next goal is to study
other types of datasets, such as the road-scene DSEC
dataset (Gehrig et al., 2021), by annotating each
object as static or moving, in order to explore
broader applications of event cameras in motion
segmentation.
ACKNOWLEDGMENT
This work was carried out within SIVALab, a joint
laboratory between Renault and Heudiasyc UMR
UTC/CNRS.
REFERENCES
Alonso, I. and Murillo, A. C. (2019). Ev-segnet: Semantic
segmentation for event-based cameras. In IEEE
International Conference on Computer Vision and Pattern
Recognition Workshops (CVPRW).
Burner, L., Mitrokhin, A., Fermüller, C., and Aloimonos, Y.
(2022). Evimo2: An event camera dataset for motion
segmentation, optical flow, structure from motion, and
visual inertial odometry in indoor scenes with
monocular or stereo algorithms. ArXiv, abs/2205.03467.
Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., and
Adam, H. (2018). Encoder-decoder with atrous
separable convolution for semantic image segmentation. In
Proceedings of the European conference on computer
vision (ECCV).
Chollet, F. (2017). Xception: Deep learning with depthwise
separable convolutions. In Proceedings of the IEEE
conference on computer vision and pattern
recognition.
Gallego, G., Delbrück, T., Orchard, G., Bartolozzi, C.,
Taba, B., Censi, A., Leutenegger, S., Davison, A. J.,
Conradt, J., Daniilidis, K., and Scaramuzza, D.
(2022). Event-based vision: A survey. IEEE
Transactions on Pattern Analysis and Machine Intelligence.
Gehrig, M., Aarents, W., Gehrig, D., and Scaramuzza, D.
(2021). Dsec: A stereo event camera dataset for
driving scenarios. IEEE Robotics and Automation Letters.
Glover, A. and Bartolozzi, C. (2016). Event-driven ball
detection and gaze fixation in clutter. In IEEE/RSJ
International Conference on Intelligent Robots and
Systems (IROS).
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep
residual learning for image recognition. In IEEE
Conference on Computer Vision and Pattern Recognition
(CVPR).
Hutchinson, S., Hager, G., and Corke, P. (1996). A
tutorial on visual servo control. IEEE Transactions on
Robotics and Automation.
Lichtsteiner, P., Posch, C., and Delbruck, T. (2008). A
128×128 120 dB 15 µs latency asynchronous temporal
contrast vision sensor. IEEE Journal of Solid-State
Circuits.
Litzenberger, M., Posch, C., Bauer, D., Belbachir, A.,
Schon, P., Kohn, B., and Garn, H. (2006). Embedded
vision system for real-time object tracking using
an asynchronous transient vision sensor. In IEEE 12th
Digital Signal Processing Workshop & 4th IEEE Signal
Processing Education Workshop.
Mitrokhin, A., Fermüller, C., Parameshwara, C., and
Aloimonos, Y. (2018). Event-based moving object
detection and tracking. In IEEE/RSJ International
Conference on Intelligent Robots and Systems (IROS).
Mitrokhin, A., Hua, Z., Fermüller, C., and Aloimonos, Y.
(2020). Learning visual motion segmentation using
event surfaces. In IEEE/CVF Conference on Computer
Vision and Pattern Recognition (CVPR).
Mitrokhin, A., Ye, C., Fermüller, C., Aloimonos, Y., and
Delbruck, T. (2019). Ev-imo: Motion segmentation
dataset and learning pipeline for event cameras.
In IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS).
Parameshwara, C. M., Sanket, N. J., Singh, C. D.,
Fermüller, C., and Aloimonos, Y. (2021). 0-mms:
Zero-shot multi-motion segmentation with a monocular
event camera. In IEEE International Conference
on Robotics and Automation (ICRA). IEEE.
Piatkowska, E., Belbachir, A. N., Schraml, S., and Gelautz,
M. (2012). Spatiotemporal multiple persons tracking
using dynamic vision sensor. In IEEE Computer
Society Conference on Computer Vision and Pattern
Recognition Workshops.
Rebecq, H., Gehrig, D., and Scaramuzza, D. (2018). Esim:
an open event camera simulator. In Conference on
robot learning. PMLR.
Ronneberger, O., Fischer, P., and Brox, T. (2015). U-net:
Convolutional networks for biomedical image
segmentation. In International Conference on Medical
Image Computing and Computer-Assisted Intervention
(MICCAI). Springer.
Sanket, N. J., Parameshwara, C. M., Singh, C. D.,
Kuruttukulam, A. V., Fermüller, C., Scaramuzza, D., and
Aloimonos, Y. (2019). Evdodge: Embodied AI for
high-speed dodging on a quadrotor using event
cameras. CoRR, abs/1906.02919.
Stoffregen, T., Gallego, G., Drummond, T., Kleeman, L.,
and Scaramuzza, D. (2019). Event-based motion
segmentation by motion compensation. In Proceedings of
the IEEE/CVF International Conference on Computer
Vision.
Stoffregen, T. and Kleeman, L. (2018). Simultaneous
optical flow and segmentation (sofas) using dynamic
vision sensor. arXiv preprint arXiv:1805.12326.
Stoffregen, T., Scheerlinck, C., Scaramuzza, D.,
Drummond, T., Barnes, N., Kleeman, L., and Mahony,
R. (2020). Reducing the sim-to-real gap for event
cameras. In Vedaldi, A., Bischof, H., Brox, T., and
Frahm, J.-M., editors, Computer Vision – ECCV 2020.
Springer International Publishing.
Sun, Z., Messikommer, N., Gehrig, D., and Scaramuzza, D.
(2022). Ess: Learning event-based semantic segmen-
VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications