though promising, this representation does not considerably boost the performance of the models. Further investigation of multi-modal methods is warranted. We also intend to focus on classifying switches observed from larger distances.
6 CONCLUSIONS
In this paper, we proposed an efficient approach for switch classification using different neural network architectures on images taken from the perspective of the train. The considered architectures, namely ResNet-18, VGG-11, and MobileNet-V2, achieved competitive results compared with two of the few existing approaches to this task on the considered dataset. Despite the high metric values obtained, switch classification remains a difficult task, and this paper represents a considerable step forward towards solving it.
ACKNOWLEDGEMENTS
This work was supported by two grants from Babeș-Bolyai University, project numbers 6851/2021 and 18/2022.
REFERENCES
Agarap, A. F. (2018). Deep learning using rectified linear
units (relu). arXiv preprint arXiv:1803.08375.
Alexandrescu, A.-R. and Manole, A. (2022). A dynamic
approach for railway semantic segmentation. Studia
Universitatis Babes-Bolyai, Informatica, 67(1):61–
76.
Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8(6):679–698.
Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler,
M., Benenson, R., Franke, U., Roth, S., and Schiele,
B. (2016). The cityscapes dataset for semantic urban
scene understanding. In Proceedings of the IEEE con-
ference on computer vision and pattern recognition,
pages 3213–3223.
Duda, R. O. and Hart, P. E. (1972). Use of the hough trans-
formation to detect lines and curves in pictures. Com-
munications of the ACM, 15(1):11–15.
Growitsch, C. and Wetzel, H. (2009). Testing for economies
of scope in european railways: an efficiency analysis.
Journal of Transport Economics and Policy (JTEP),
43(1):1–24.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep resid-
ual learning for image recognition. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 770–778.
Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D.,
Wang, W., Weyand, T., Andreetto, M., and Adam,
H. (2017). Mobilenets: Efficient convolutional neu-
ral networks for mobile vision applications. arXiv
preprint arXiv:1704.04861.
Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger,
K. Q. (2017). Densely connected convolutional net-
works. In Proceedings of the IEEE conference on
computer vision and pattern recognition, pages 4700–
4708.
Jahan, K., Niemeijer, J., Kornfeld, N., and Roth, M. (2021).
Deep neural networks for railway switch detection
and classification using onboard camera images. In
2021 IEEE Symposium Series on Computational In-
telligence (SSCI), pages 01–07. IEEE.
Karaköse, M., Akın, E., and Yaman, O. (2016). Detection of rail switch passages through image processing on railway line and use of condition-monitoring approach. International Conference on Advanced Technology & Sciences.
Kingma, D. P. and Ba, J. (2015). Adam: A method for
stochastic optimization. In Bengio, Y. and LeCun,
Y., editors, 3rd International Conference on Learning
Representations, Conference Track Proceedings.
Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems, 25:1097–1105.
Lin, T.-Y., Goyal, P., Girshick, R., He, K., and Dollár, P. (2017). Focal loss for dense object detection. In Proceedings of the IEEE international conference on computer vision, pages 2980–2988.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., and Zitnick, C. L. (2014). Microsoft coco: Common objects in context. In European conference on computer vision, pages 740–755. Springer.
Pohlen, T., Hermans, A., Mathias, M., and Leibe, B. (2017).
Full-resolution residual networks for semantic seg-
mentation in street scenes. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recogni-
tion, pages 4151–4160.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al. (2015). Imagenet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211–252.
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and
Chen, L.-C. (2018). Mobilenetv2: Inverted residu-
als and linear bottlenecks. In Proceedings of the IEEE
conference on computer vision and pattern recogni-
tion, pages 4510–4520.
Simonyan, K. and Zisserman, A. (2015). Very deep con-
volutional networks for large-scale image recognition.
In International Conference on Learning Representa-
tions.
Zendel, O., Murschitz, M., Zeilinger, M., Steininger, D.,
Abbasi, S., and Beleznai, C. (2019). Railsem19: A
dataset for semantic rail scene understanding. In Pro-
ceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition Workshops.
VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications