
Kingma, D. P. and Ba, J. (2015). Adam: A Method for
Stochastic Optimization. In Proceedings of ICLR.
Krizhevsky, A. (2009). Learning multiple layers of fea-
tures from tiny images. Technical report, University
of Toronto.
Lee, S., Seong, H., Lee, S., and Kim, E. (2022). Correla-
tion verification for image retrieval. In Proceedings of
CVPR, pages 5374–5384. IEEE.
Liu, W., Wen, Y., Yu, Z., Li, M., Raj, B., and Song, L.
(2017). SphereFace: Deep Hypersphere Embedding
for Face Recognition. In Proceedings of CVPR, pages
6738–6746, Los Alamitos, CA, USA. IEEE.
Liu, W., Wen, Y., Yu, Z., and Yang, M. (2016). Large-
margin softmax loss for convolutional neural net-
works. In Proceedings of ICML, pages 507–516.
JMLR.org.
Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin,
S., and Guo, B. (2021). Swin Transformer: Hierar-
chical vision transformer using shifted windows. In
Proceedings of ICCV, pages 10012–10022. IEEE.
Min, W., Mei, S., Li, Z., and Jiang, S. (2020). A
two-stage triplet network training framework for im-
age retrieval. IEEE Transactions on Multimedia,
22(12):3128–3138.
Murphy, K. P. (2012). Machine Learning: A Probabilistic
Perspective. The MIT Press.
Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., and
Ng, A. Y. (2011). Reading digits in natural images
with unsupervised feature learning. In Proceedings of
NIPS 2011 Workshop on Deep Learning and Unsuper-
vised Feature Learning. Curran Associates, Inc.
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E.,
DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., and
Lerer, A. (2019). PyTorch: An Imperative Style,
High-Performance Deep Learning Library. In Pro-
ceedings of NeurIPS, pages 8024–8035. Curran As-
sociates, Inc.
Patel, Y., Tolias, G., and Matas, J. (2022). Recall@k surro-
gate loss with large batches and similarity mixup. In
Proceedings of CVPR, pages 7502–7511. IEEE.
Philbin, J., Chum, O., Isard, M., Sivic, J., and Zisserman, A.
(2007). Object retrieval with large vocabularies and
fast spatial matching. In Proceedings of CVPR, pages
1–8. IEEE.
Polley, S., Mondal, S., Mannam, V. S., Kumar, K., Patra,
S., and Nürnberger, A. (2022). X-vision: Explainable
image retrieval by re-ranking in semantic space. In
Proceedings of CIKM, pages 4955–4959, New York,
NY, USA. Association for Computing Machinery.
Radenović, F., Tolias, G., and Chum, O. (2019). Fine-Tuning
CNN Image Retrieval with No Human Annotation. IEEE
Transactions on Pattern Analysis and Machine Intelligence,
41(7):1655–1668.
Revaud, J., Almazán, J., Rezende, R. S., and Souza, C. R. d.
(2019). Learning with Average Precision: Training
Image Retrieval with a Listwise Loss. In Proceedings
of ICCV, pages 5107–5116. IEEE.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S.,
Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bern-
stein, M., et al. (2015). ImageNet Large Scale Vi-
sual Recognition Challenge. International Journal of
Computer Vision, 115:211–252.
Schroff, F., Kalenichenko, D., and Philbin, J. (2015).
FaceNet: A unified embedding for face recognition
and clustering. In Proceedings of CVPR, pages 815–
823. IEEE.
Sohn, K. (2016). Improved Deep Metric Learning with
Multi-class N-pair Loss Objective. In Proceedings of
NIPS, volume 29. Curran Associates, Inc.
Suh, Y., Han, B., Kim, W., and Lee, K. M. (2019). Stochas-
tic Class-Based Hard Example Mining for Deep Met-
ric Learning. In Proceedings of CVPR, pages 7244–
7252. IEEE.
Tang, Y., Bai, W., Li, G., Liu, X., and Zhang, Y. (2022).
CROLoss: Towards a Customizable Loss for Retrieval
Models in Recommender Systems. In Proceedings of
CIKM, pages 1916–1924, New York, NY, USA. As-
sociation for Computing Machinery.
Balntas, V., Riba, E., Ponsa, D., and Mikolajczyk,
K. (2016). Learning local feature descriptors with
triplets and shallow convolutional neural networks. In
Proceedings of BMVC, pages 119.1–119.11. BMVA
Press.
Wang, F., Cheng, J., Liu, W., and Liu, H. (2018a). Addi-
tive margin softmax for face verification. IEEE Signal
Processing Letters, 25(7):926–930.
Wang, H., Wang, Y., Zhou, Z., Ji, X., Li, Z., Gong, D.,
Zhou, J., and Liu, W. (2018b). CosFace: Large Mar-
gin Cosine Loss for Deep Face Recognition. In Pro-
ceedings of CVPR, pages 5265–5274. IEEE.
Wen, Y., Zhang, K., Li, Z., and Qiao, Y. (2016). A dis-
criminative feature learning approach for deep face
recognition. In Proceedings of ECCV, pages 499–515.
Springer.
Wu, C.-Y., Manmatha, R., Smola, A. J., and Krähenbühl,
P. (2017). Sampling matters in deep embedding learning.
In Proceedings of ICCV, pages 2859–2867. IEEE.
Wu, H., Wang, M., Zhou, W., and Li, H. (2021). Learn-
ing deep local features with multiple dynamic atten-
tions for large-scale image retrieval. In Proceedings
of ICCV, pages 11416–11425. IEEE.
Yadan, O. (2019). Hydra - a framework for elegantly
configuring complex applications. GitHub.
Yu, B. and Tao, D. (2019). Deep metric learning with tuplet
margin loss. In Proceedings of ICCV, pages 6489–
6498. IEEE.
Zhu, Q., Zhang, P., Wang, Z., and Ye, X. (2019). A New
Loss Function for CNN Classifier Based on Prede-
fined Evenly-Distributed Class Centroids. IEEE Ac-
cess, 8:10888–10895.
Class Anchor Margin Loss for Content-Based Image Retrieval