ference on Computer Vision and Pattern Recognition
(CVPR), pages 770–778.
Jégou, H., Douze, M., Schmid, C., and Pérez, P. (2010).
Aggregating local descriptors into a compact image
representation. In 2010 IEEE Computer Society Con-
ference on Computer Vision and Pattern Recognition,
pages 3304–3311.
Kalantidis, Y., Mellina, C., and Osindero, S. (2016). Cross-
Dimensional Weighting for Aggregated Deep Convo-
lutional Features. In Computer Vision – ECCV 2016
Workshops, pages 685–701. Springer, Cham, Switzer-
land.
Lowe, D. G. (2004). Distinctive Image Features from Scale-
Invariant Keypoints. Int. J. Comput. Vision, 60(2):91–
110.
Lowry, S., Sünderhauf, N., Newman, P., Leonard, J. J.,
Cox, D., Corke, P., and Milford, M. J. (2016). Vi-
sual place recognition: A survey. IEEE Transactions
on Robotics, 32(1):1–19.
Maddern, W., Pascoe, G., Linegar, C., and Newman, P.
(2017). 1 year, 1000 km: The Oxford RobotCar
dataset. The International Journal of Robotics Re-
search, 36(1):3–15.
Masone, C. and Caputo, B. (2021). A survey on deep visual
place recognition. IEEE Access, 9:19516–19547.
Milford, M. J. and Wyeth, G. F. (2012). SeqSLAM: Visual
route-based navigation for sunny summer days
and stormy winter nights. In 2012 IEEE International
Conference on Robotics and Automation, pages 1643–
1649.
Olid, D., Fácil, J. M., and Civera, J. (2018). Single-view
place recognition under seasonal changes. CoRR,
abs/1808.06516.
Pandey, G., McBride, J. R., and Eustice, R. M. (2011). Ford
campus vision and lidar data set. The International
Journal of Robotics Research, 30(13):1543–1552.
Pion, N., Humenberger, M., Csurka, G., Cabon, Y., and Sat-
tler, T. (2020). Benchmarking image retrieval for vi-
sual localization. In Int. Conf. on 3D Vision (3DV).
Pitropov, M., Garcia, D. E., Rebello, J., Smart, M., Wang,
C., Czarnecki, K., and Waslander, S. (2021). Canadian
adverse driving conditions dataset. The International
Journal of Robotics Research, 40(4-5):681–690.
Pronobis, A. and Caputo, B. (2009). COLD: The COsy
localization database. The International Journal of
Robotics Research, 28(5):588–594.
Radenović, F., Tolias, G., and Chum, O. (2019). Fine-tuning
CNN image retrieval with no human annotation. IEEE
Transactions on Pattern Analysis and Machine Intel-
ligence, 41(7):1655–1668.
Radenović, F., Iscen, A., Tolias, G., Avrithis, Y., and Chum,
O. (2018). Revisiting Oxford and Paris: Large-scale
image retrieval benchmarking. In 2018 IEEE/CVF
Conference on Computer Vision and Pattern Recog-
nition, pages 5706–5715.
Radenović, F., Tolias, G., and Chum, O. (2016). CNN
image retrieval learns from BoW: Unsupervised fine-
tuning with hard examples. In ECCV.
Radenović, F., Tolias, G., and Chum, O. (2018). Deep shape
matching. In ECCV.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh,
S., Ma, S., Huang, Z., Karpathy, A., Khosla, A.,
Bernstein, M., Berg, A. C., and Fei-Fei, L. (2015).
ImageNet Large Scale Visual Recognition Challenge.
International Journal of Computer Vision (IJCV),
115(3):211–252.
Sarlin, P., Cadena, C., Siegwart, R., and Dymczyk, M.
(2019). From coarse to fine: Robust hierarchical lo-
calization at large scale. In CVPR.
Sattler, T., Maddern, W., Toft, C., Torii, A., Hammarstrand,
L., Stenborg, E., Safari, D., Okutomi, M., Pollefeys,
M., Sivic, J., Kahl, F., and Pajdla, T. (2018). Benchmarking
6DOF outdoor visual localization in changing
conditions. In 2018 IEEE/CVF Conference on Com-
puter Vision and Pattern Recognition, pages 8601–
8610.
Sattler, T., Maddern, W., Toft, C., Torii, A., Hammarstrand,
L., Stenborg, E., Safari, D., Okutomi, M., Pollefeys,
M., Sivic, J., Kahl, F., and Pajdla, T. (2020). Bench-
marking 6DOF outdoor visual localization in chang-
ing conditions. In Int. Conf. on 3D Vision (3DV).
Sattler, T., Weyand, T., Leibe, B., and Kobbelt, L. (2012).
Image retrieval for image-based localization revisited.
In Proceedings of the British Machine Vision Confer-
ence, pages 76.1–76.12. BMVA Press.
Simonyan, K. and Zisserman, A. (2015). Very deep con-
volutional networks for large-scale image recognition.
In International Conference on Learning Representa-
tions.
Taira, H., Okutomi, M., Sattler, T., Cimpoi, M., Pollefeys,
M., Sivic, J., Pajdla, T., and Torii, A. (2021). InLoc: Indoor
visual localization with dense matching and view
synthesis. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 43(4):1293–1307.
Tolias, G., Sicre, R., and Jégou, H. (2016a). Particular
Object Retrieval With Integral Max-Pooling of CNN
Activations. In ICLR 2016 - International Conference
on Learning Representations, pages 1–12, San
Juan, Puerto Rico.
Tolias, G., Sicre, R., and Jégou, H. (2016b). Particular
object retrieval with integral max-pooling of CNN
activations.
Williams, B., Klein, G., and Reid, I. (2011). Automatic
relocalization and loop closing for real-time monocular
SLAM. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 33(9):1699–1712.
Xu, M., Sünderhauf, N., and Milford, M. (2002). Vision for
mobile robot navigation: A survey. TPAMI.
Yandex, A. B. and Lempitsky, V. (2015). Aggregating local
deep features for image retrieval. In 2015 IEEE In-
ternational Conference on Computer Vision (ICCV),
pages 1269–1277.
Zhang, X., Wang, L., and Su, Y. (2020). Visual place recog-
nition: A survey from deep learning perspective. Pat-
tern Recognition, page 107760.