
data with deep generative models. Nature communi-
cations, 9(1):2002.
Ding, X., Chen, H., Zhang, X., Han, J., and Ding, G.
(2022). Repmlpnet: Hierarchical vision mlp with
re-parameterized locality. In Proceedings of the
IEEE/CVF conference on computer vision and pattern
recognition, pages 578–587.
El-Nouby, A., Neverova, N., Laptev, I., and Jégou, H.
(2021). Training vision transformers for image re-
trieval. arXiv preprint arXiv:2102.05644.
Espadoto, M., Hirata, N. S. T., and Telea, A. C. (2021).
Self-supervised dimensionality reduction with neural
networks and pseudo-labeling. In Proceedings.
Fu, Z., Li, Y., Mao, Z., Wang, Q., and Zhang, Y. (2021).
Deep metric learning with self-supervised ranking. In
Proceedings of the AAAI Conference on Artificial In-
telligence, volume 35, pages 1370–1378.
Gisbrecht, A., Schulz, A., and Hammer, B. (2015). Parametric
nonlinear dimensionality reduction using kernel t-sne.
Neurocomputing, 147:71–82. Special issue: Advances in
Self-Organizing Maps, Selected Papers from the Workshop
on Self-Organizing Maps 2012 (WSOM 2012).
Gkelios, S., Sophokleous, A., Plakias, S., Boutalis, Y., and
Chatzichristofis, S. A. (2021). Deep convolutional
features for image retrieval. Expert Systems with Ap-
plications, 177:114940.
Goodfellow, I. (2016). Deep learning.
Hadsell, R., Chopra, S., and LeCun, Y. (2006). Dimension-
ality reduction by learning an invariant mapping. In
2006 IEEE computer society conference on computer
vision and pattern recognition (CVPR’06), volume 2,
pages 1735–1742. IEEE.
Hornik, K., Stinchcombe, M., and White, H. (1989). Multi-
layer feedforward networks are universal approxima-
tors. Neural networks, 2(5):359–366.
Jain, A. K., Murty, M. N., and Flynn, P. J. (1999). Data
clustering: a review. ACM Comput. Surv., 31(3).
Kawai, V. A. S., Leticio, G. R., Valem, L. P., and Pedronette,
D. C. G. (2024a). Neighbor embedding projection and
rank-based manifold learning for image retrieval. In
2024 37th SIBGRAPI Conference on Graphics, Pat-
terns and Images (SIBGRAPI), pages 1–6.
Kawai, V. S., Valem, L. P., Baldassin, A., Borin, E.,
Pedronette, D. C. G., and Latecki, L. J. (2024b).
Rank-based hashing for effective and efficient nearest
neighbor search for image retrieval. ACM Trans.
Multimedia Comput. Commun. Appl., 20(10).
Krizhevsky, A. and Hinton, G. (2009). Learning multiple
layers of features from tiny images. Technical report,
University of Toronto, Toronto, Ontario.
LeCun, Y., Bottou, L., Bengio, Y., and Haffner, P. (1998).
Gradient-based learning applied to document recogni-
tion. Proceedings of the IEEE, 86(11):2278–2324.
Leticio, G. R., Kawai, V. S., Valem, L. P., Pedronette, D.
C. G., and da S. Torres, R. (2024). Manifold informa-
tion through neighbor embedding projection for image
retrieval. Pattern Recognition Letters, 183:17–25.
Li, X., Yang, J., and Ma, J. (2021). Recent developments of
content-based image retrieval (cbir). Neurocomputing,
452:675–689.
Liu, G.-H. and Yang, J.-Y. (2013). Content-based image
retrieval using color difference histogram. Pattern
recognition, 46(1):188–198.
Liu, Z., Mao, H., Wu, C.-Y., Feichtenhofer, C., Darrell, T.,
and Xie, S. (2022). A convnet for the 2020s. In Pro-
ceedings of the IEEE/CVF conference on computer vi-
sion and pattern recognition, pages 11976–11986.
Manning, C. D. (2008). Introduction to information retrieval.
Syngress Publishing.
McInnes, L., Healy, J., Saul, N., and Großberger, L. (2018).
Umap: Uniform manifold approximation and projec-
tion. Journal of Open Source Software, 3(29):861.
Oquab, M., Darcet, T., Moutakanni, T., Vo, H., Szafraniec,
M., Khalidov, V., Fernandez, P., Haziza, D., Massa, F.,
El-Nouby, A., et al. (2023). Dinov2: Learning robust
visual features without supervision. arXiv preprint
arXiv:2304.07193.
Roman-Rangel, E. and Marchand-Maillet, S. (2019). In-
ductive t-sne via deep learning to visualize multi-label
images. Engineering Applications of Artificial Intelli-
gence, 81:336–345.
Sainburg, T., McInnes, L., and Gentner, T. Q. (2021).
Parametric umap embeddings for representation and
semisupervised learning. Neural Computation,
33(11):2881–2907.
Szubert, B., Cole, J. E., Monaco, C., and Drozdov, I.
(2019). Structure-preserving visualisation of high
dimensional single-cell datasets. Scientific reports,
9(1):8914.
Tang, C., Zhao, Y., Wang, G., Luo, C., Xie, W., and Zeng,
W. (2022). Sparse mlp for image recognition: Is self-
attention really necessary? In Proceedings of the
AAAI conference on artificial intelligence, volume 36,
pages 2344–2351.
Tolstikhin, I. O., Houlsby, N., Kolesnikov, A., Beyer, L.,
Zhai, X., Unterthiner, T., Yung, J., Steiner, A., Key-
sers, D., Uszkoreit, J., et al. (2021). Mlp-mixer: An
all-mlp architecture for vision. Advances in neural in-
formation processing systems, 34:24261–24272.
Van der Maaten, L. and Hinton, G. (2008). Visualizing data
using t-sne. Journal of machine learning research,
9(11).
Wan, J., Wang, D., Hoi, S. C. H., Wu, P., Zhu, J., Zhang, Y.,
and Li, J. (2014). Deep learning for content-based im-
age retrieval: A comprehensive study. In Proceedings
of the 22nd ACM international conference on Multi-
media, pages 157–166.
Wang, X., Han, X., Huang, W., Dong, D., and Scott,
M. R. (2019a). Multi-similarity loss with general pair
weighting for deep metric learning. In Proceedings
of the IEEE/CVF conference on computer vision and
pattern recognition, pages 5022–5030.
Wang, X., Hua, Y., Kodirov, E., Hu, G., Garnier, R., and
Robertson, N. M. (2019b). Ranked list loss for deep
metric learning. In Proceedings of the IEEE/CVF con-
ference on computer vision and pattern recognition,
pages 5207–5216.
Xiao, H., Rasul, K., and Vollgraf, R. (2017). Fashion-
mnist: a novel image dataset for benchmarking ma-
chine learning algorithms. arXiv e-prints, pages
arXiv–1708.
Inductive Self-Supervised Dimensionality Reduction for Image Retrieval