Recall@1 compared to the standard approach by up
to 5%. We evaluate the performance of a trained
embedding function in the wild, e.g., on different
mixtures of known and unknown SKUs. We conclude that
the proposed approach, combined with the proposed
mining strategy, can distinguish grocery products in
the wild, even if they are unknown at training time.
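
To make the Recall@1 figure above concrete, the following is a minimal sketch of how such a score can be computed for an embedding function. It assumes Euclidean nearest-neighbour retrieval against a gallery of labelled reference embeddings; the function name `recall_at_1`, the SKU labels, and the toy vectors are hypothetical and are not taken from the evaluation code behind the reported results.

```python
import math


def recall_at_1(queries, gallery):
    """Fraction of queries whose nearest gallery embedding has the same label.

    queries, gallery: lists of (embedding, label) pairs, where each
    embedding is a sequence of floats. Euclidean distance is an
    assumption of this sketch.
    """
    if not queries or not gallery:
        raise ValueError("queries and gallery must be non-empty")
    hits = 0
    for q_vec, q_label in queries:
        # Retrieve the single nearest gallery entry in embedding space.
        _, nearest_label = min(gallery, key=lambda item: math.dist(q_vec, item[0]))
        hits += nearest_label == q_label
    return hits / len(queries)


# Toy gallery with one reference embedding per SKU. "sku_c" stands in for
# a product that was unknown at training time but is enrolled afterwards.
gallery = [
    ((0.0, 0.0), "sku_a"),
    ((1.0, 0.0), "sku_b"),
    ((0.0, 1.0), "sku_c"),
]
queries = [
    ((0.1, 0.0), "sku_a"),  # nearest is sku_a: hit
    ((0.9, 0.1), "sku_b"),  # nearest is sku_b: hit
    ((0.1, 0.9), "sku_c"),  # nearest is sku_c: hit
    ((0.6, 0.1), "sku_a"),  # nearest is sku_b: miss
]
score = recall_at_1(queries, gallery)  # 3 of 4 queries retrieved correctly
```

Because the score depends only on distances in the embedding space, a SKU that was unknown at training time can be evaluated by adding its reference embedding to the gallery, without retraining the embedding function.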
Grocery Recognition in the Wild: A New Mining Strategy for Metric Learning