sMOTSA scores for these methods were 55.0 and
55.8 on the MOTS test set, about 78-79% of the score
of the current state of the art, which runs at 0.3 FPS.
The RReID system cuts ID switches by 45% while
computing ReID vectors for only about 7% of all
track instances, which helps it remain real-time
despite the added workload of the ReID network.
We have also experimented with and discussed
faster detectors. We hope that SORTS and
SORTS+RReID can serve as strong baselines for
real-time segmentation multi-target tracking in the future.
REFERENCES
Bewley, A., Ge, Z., Ott, L., Ramos, F., and Upcroft, B.
(2016). Simple online and realtime tracking. In 2016
IEEE International Conference on Image Processing
(ICIP), pages 3464–3468.
Bibby, C. and Reid, I. (2010). Real-time tracking of multiple
occluding objects using level sets. In 2010 IEEE
Computer Society Conference on Computer Vision
and Pattern Recognition, pages 1307–1314.
Bolya, D., Zhou, C., Xiao, F., and Lee, Y. J. (2019).
YOLACT: Real-time instance segmentation. In ICCV.
Gao, F. and Han, L. (2012). Implementing the Nelder-Mead
simplex algorithm with adaptive parameters.
Computational Optimization and Applications, 51:259–277.
He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017).
Mask R-CNN. In Proceedings of the IEEE International
Conference on Computer Vision, pages 2961–2969.
Koeferl, F., Link, J., and Eskofier, B. (2020). Application
of SORT on Multi-Object Tracking and Segmentation.
In 5th BMTT MOTChallenge Workshop: Multi-Object
Tracking and Segmentation.
Labbe, R. (2014). Kalman and Bayesian filters in Python.
https://github.com/rlabbe/Kalman-and-Bayesian-Filters-in-Python.
Lee, Y. and Park, J. (2020). CenterMask: Real-time
anchor-free instance segmentation.
Lin, T., Maire, M., Belongie, S. J., Bourdev, L. D., Girshick,
R. B., Hays, J., Perona, P., Ramanan, D., Dollár, P.,
and Zitnick, C. L. (2014). Microsoft COCO: common
objects in context. CoRR, abs/1405.0312.
Luo, H., Gu, Y., Liao, X., Lai, S., and Jiang, W. (2019).
Bag of tricks and a strong baseline for deep person
re-identification. In The IEEE Conference on Computer
Vision and Pattern Recognition (CVPR) Workshops.
Luo, H., Jiang, W., Gu, Y., Liu, F., Liao, X., Lai, S., and
Gu, J. (2019). A strong baseline and batch normalization
neck for deep person re-identification. IEEE
Transactions on Multimedia, pages 1–1.
Milan, A., Leal-Taixé, L., Schindler, K., and Reid, I. (2015).
Joint tracking and segmentation of multiple targets. In
Proceedings of the IEEE Conference on Computer
Vision and Pattern Recognition (CVPR).
Mohamed, E., Ewaisha, M., Siam, M., Rashed, H.,
Yogamani, S. K., and Sallab, A. E. (2020). InstanceMotSeg:
Real-time instance motion segmentation for
autonomous driving. CoRR, abs/2008.07008.
Nelder, J. A. and Mead, R. (1965). A simplex method
for function minimization. Computer Journal, 7:308–313.
MOT Challenge (2020). MOT Challenge website.
https://motchallenge.net/user account.php. Accessed 2020-09-02.
Ren, S., He, K., Girshick, R. B., and Sun, J. (2015). Faster
R-CNN: towards real-time object detection with
region proposal networks. CoRR, abs/1506.01497.
van der Walt, S., Colbert, S. C., and Varoquaux, G. (2011).
The NumPy Array: A structure for efficient numerical
computation. Computing in Science & Engineering,
13(2):22–30.
Voigtlaender, P., Krause, M., Osep, A., Luiten, J.,
Sekar, B. B. G., Geiger, A., and Leibe, B. (2019).
MOTS: Multi-Object Tracking and Segmentation.
arXiv:1902.03604 [cs].
Wang, Q., Zhang, L., Bertinetto, L., Hu, W., and Torr,
P. H. (2019). Fast online object tracking and
segmentation: A unifying approach. In Proceedings of the
IEEE/CVF Conference on Computer Vision and
Pattern Recognition (CVPR).
Wang, Z. (2019). SEG-YOLO: Real-Time Instance
Segmentation Using YOLOv3 and Fully Convolutional
Network. PhD thesis.
Wojke, N. and Bewley, A. (2018). Deep cosine metric
learning for person re-identification. In 2018 IEEE
Winter Conference on Applications of Computer Vision
(WACV), pages 748–756. IEEE.
Wojke, N., Bewley, A., and Paulus, D. (2017). Simple
online and realtime tracking with a deep association
metric. In 2017 IEEE International Conference on Image
Processing (ICIP), pages 3645–3649. IEEE.
Wu, Y., Kirillov, A., Massa, F., Lo, W.-Y., and
Girshick, R. (2019). Detectron2.
https://github.com/facebookresearch/detectron2.
Xu, Z., Zhang, W., Tan, X., Yang, W., Su, X., Yuan, Y.,
Zhang, H., Wen, S., Ding, E., and Huang, L. (2020).
PointTrack++ for effective online multi-object tracking
and segmentation. In CVPR Workshops.
Yang, F., Chang, X., Dang, C., Zheng, Z., Sakti, S.,
Nakamura, S., and Wu, Y. (2020). ReMOTS:
Self-supervised refining multi-object tracking and
segmentation.
Yeo, D., Son, J., Han, B., and Hee Han, J. (2017).
Superpixel-based tracking-by-segmentation using
Markov chains. In Proceedings of the IEEE
Conference on Computer Vision and Pattern Recognition
(CVPR).
Zhao, T., Nevatia, R., and Wu, B. (2008). Segmentation
and tracking of multiple humans in crowded
environments. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 30(7):1198–1211.
VISAPP 2021 - 16th International Conference on Computer Vision Theory and Applications