train very simple yet accurate classifiers. Retail
products keep changing in appearance with new
packaging and offers. Fine-tuning a classifier every
time new products are added is a costly process.
An image representation that allows us to train just a
logistic regression classifier makes accommodating new
product additions very simple.
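The workflow described above can be illustrated with a minimal sketch. It assumes the image encoder is frozen and that only a logistic regression classifier is refit when a product is added. The encoder is not reproduced here: `fake_embedding` is a hypothetical stand-in that returns a noisy class centroid, and the dimensions, sample counts, and learning rate are illustrative choices, not values from the paper.

```python
import math
import random

DIM = 8  # toy embedding size (assumption, not the paper's dimensionality)


def fake_embedding(class_id, rng, noise=0.1):
    """Hypothetical stand-in for a frozen encoder: class centroid plus noise."""
    return [(1.0 if j == class_id else 0.0) + rng.gauss(0.0, noise)
            for j in range(DIM)]


def make_split(class_ids, per_class, rng):
    """Build (features, labels) lists for the given product classes."""
    feats, labels = [], []
    for c in class_ids:
        for _ in range(per_class):
            feats.append(fake_embedding(c, rng))
            labels.append(c)
    return feats, labels


def class_scores(weights, biases, x):
    """Linear score of embedding x for every class."""
    return [sum(w_j * x_j for w_j, x_j in zip(w, x)) + b
            for w, b in zip(weights, biases)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit_logistic_regression(feats, labels, num_classes, epochs=200, lr=0.5):
    """Multinomial logistic regression trained by full-batch gradient descent."""
    weights = [[0.0] * DIM for _ in range(num_classes)]
    biases = [0.0] * num_classes
    n = len(feats)
    for _ in range(epochs):
        grad_w = [[0.0] * DIM for _ in range(num_classes)]
        grad_b = [0.0] * num_classes
        for x, y in zip(feats, labels):
            probs = softmax(class_scores(weights, biases, x))
            for c in range(num_classes):
                err = probs[c] - (1.0 if c == y else 0.0)
                grad_b[c] += err
                for j in range(DIM):
                    grad_w[c][j] += err * x[j]
        for c in range(num_classes):
            biases[c] -= lr * grad_b[c] / n
            for j in range(DIM):
                weights[c][j] -= lr * grad_w[c][j] / n
    return weights, biases


def accuracy(weights, biases, feats, labels):
    correct = 0
    for x, y in zip(feats, labels):
        scores = class_scores(weights, biases, x)
        if max(range(len(scores)), key=scores.__getitem__) == y:
            correct += 1
    return correct / len(labels)


rng = random.Random(0)

# Initial catalogue: three products, classifier fit on frozen embeddings.
train_x, train_y = make_split([0, 1, 2], 20, rng)
test_x, test_y = make_split([0, 1, 2], 10, rng)
w, b = fit_logistic_regression(train_x, train_y, num_classes=3)
acc_before = accuracy(w, b, test_x, test_y)

# A new product arrives: embed its images with the same frozen encoder
# and refit only the logistic regression, now with four classes.
new_x, new_y = make_split([3], 20, rng)
new_test_x, new_test_y = make_split([3], 10, rng)
w, b = fit_logistic_regression(train_x + new_x, train_y + new_y, num_classes=4)
acc_after = accuracy(w, b, test_x + new_test_x, test_y + new_test_y)

print("accuracy with 3 products:", acc_before)
print("accuracy after adding a 4th product:", acc_after)
```

In this sketch the only step repeated when the catalogue grows is the refit of the linear classifier, which is cheap compared with fine-tuning the encoder.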
Using Contrastive Learning and Pseudolabels to Learn Representations for Retail Product Image Classification