
ACKNOWLEDGEMENTS
This work was funded by the Carl Zeiss Stiftung,
Germany, under the Sustainable Embedded AI project
(P2021-02-009) and partially funded by the Federal
Ministry of Education and Research, Germany, under
the project DECODE (01IW21001).
REFERENCES
Amir, A., Taba, B., Berg, D., Melano, T., McKinstry, J.,
Di Nolfo, C., Nayak, T., Andreopoulos, A., Garreau,
G., Mendoza, M., Kusnitz, J., Debole, M., Esser, S.,
Delbruck, T., Flickner, M., and Modha, D. (2017). A
low power, fully event-based gesture recognition system.
In Conference on Computer Vision and Pattern
Recognition (CVPR).
Cordone, L., Miramond, B., and Thierion, P. (2022). Object
detection with spiking neural networks on automotive
event data. In International Joint Conference on Neural
Networks (IJCNN).
De Tournemire, P., Nitti, D., Perot, E., Migliore, D.,
and Sironi, A. (2020). A large scale event-based
detection dataset for automotive. arXiv preprint
arXiv:2001.08499.
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., and Fei-Fei,
L. (2009). ImageNet: A large-scale hierarchical
image database. In Conference on Computer Vision
and Pattern Recognition (CVPR).
DeVries, T. and Taylor, G. W. (2017). Improved regularization
of convolutional neural networks with cutout.
arXiv preprint arXiv:1708.04552.
Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., and
Tian, Y. (2021). Incorporating learnable membrane
time constant to enhance learning of spiking neural
networks. In International Conference on Computer
Vision (ICCV).
Fei-Fei, L., Fergus, R., and Perona, P. (2004). Learning generative
visual models from few training examples: An
incremental Bayesian approach tested on 101 object
categories. In Conference on Computer Vision and
Pattern Recognition Workshop (CVPRW).
Fong, R. and Vedaldi, A. (2019). Occlusions for effective
data augmentation in image classification. In International
Conference on Computer Vision Workshop (ICCVW).
Gehrig, M. and Scaramuzza, D. (2023). Recurrent vision
transformers for object detection with event cameras.
In Conference on Computer Vision and Pattern Recognition
(CVPR).
Gerstner, W. and Kistler, W. M. (2002). Spiking Neuron
Models: Single Neurons, Populations, Plasticity.
Cambridge University Press.
Gu, F., Sng, W., Hu, X., and Yu, F. (2021). EventDrop:
data augmentation for event-based learning. In International
Joint Conference on Artificial Intelligence
(IJCAI).
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Identity
mappings in deep residual networks. In European
Conference on Computer Vision (ECCV).
Hoffer, E., Ben-Nun, T., Hubara, I., Giladi, N., Hoefler, T.,
and Soudry, D. (2019). Augment your batch: better
training with larger batches. arXiv preprint.
Huang, G., Liu, Z., Maaten, L. V. D., and Weinberger, K. Q.
(2017). Densely connected convolutional networks. In
Conference on Computer Vision and Pattern Recognition
(CVPR).
Isobe, T., Han, J., Zhu, F., Li, Y., and Wang, S.
(2020). Intra-clip aggregation for video person re-identification.
In International Conference on Image
Processing (ICIP).
Krizhevsky, A. (2012). Learning multiple layers of features
from tiny images. University of Toronto.
Li, H., Liu, H., Ji, X., Li, G., and Shi, L. (2017). CIFAR10-DVS:
An event-stream dataset for object classification.
Frontiers in Neuroscience.
Li, J., Li, J., Zhu, L., Xiang, X., Huang, T., and Tian,
Y. (2022a). Asynchronous spatio-temporal memory
network for continuous event-based object detection.
Transactions on Image Processing.
Li, Y., Kim, Y., Park, H., Geller, T., and Panda, P. (2022b).
Neuromorphic data augmentation for training spiking
neural networks. In European Conference on Computer
Vision (ECCV).
Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Dollár, P.
(2020). Focal loss for dense object detection. Transactions
on Pattern Analysis and Machine Intelligence
(PAMI).
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S.,
Fu, C.-Y., and Berg, A. C. (2016). SSD: Single shot
MultiBox detector. In European Conference on Computer
Vision (ECCV).
Loshchilov, I. and Hutter, F. (2017). Decoupled weight
decay regularization. In International Conference on
Learning Representations (ICLR).
Orchard, G., Jayawant, A., Cohen, G. K., and Thakor, N.
(2015). Converting static image datasets to spiking
neuromorphic datasets using saccades. Frontiers in
Neuroscience.
Perot, E., de Tournemire, P., Nitti, D., Masci, J., and
Sironi, A. (2020). Learning to detect objects with a 1
megapixel event camera. In Neural Information Processing
Systems (NeurIPS).
Shen, G., Zhao, D., and Zeng, Y. (2023). EventMix: An
efficient data augmentation strategy for event-based
learning. Information Sciences.
Singh, K. K. and Lee, Y. J. (2017). Hide-and-seek: Forcing
a network to be meticulous for weakly-supervised
object and action localization. In International Conference
on Computer Vision (ICCV).
Sironi, A., Brambilla, M., Bourdis, N., Lagorce, X., and
Benosman, R. (2018). HATS: Histograms of averaged
time surfaces for robust event-based object classification.
In Conference on Computer Vision and Pattern
Recognition (CVPR).
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I.,
and Salakhutdinov, R. (2014). Dropout: A simple way
ICPRAM 2024 - 13th International Conference on Pattern Recognition Applications and Methods