Li, L.-J. and Fei-Fei, L. (2010). Optimol: automatic online
picture collection via incremental model learning.
International Journal of Computer Vision, 88(2):147–168.
Lin, T.-Y., RoyChowdhury, A., and Maji, S. (2015). Bilinear
CNN models for fine-grained visual recognition. In
IEEE International Conference on Computer Vision,
pages 1449–1457.
Luo, J. and Nascimento, M. A. (2003). Content based
sub-image retrieval via hierarchical tree matching.
In 1st ACM International Workshop on Multimedia
Databases, pages 63–69.
Nicholson, B., Zhang, J., Sheng, V. S., and Wang, Z. (2015).
Label noise correction methods. In IEEE International
Conference on Data Science and Advanced Analytics,
pages 1–9.
Ravi, S. and Larochelle, H. (2016). Optimization as a model
for few-shot learning.
Rodner, E., Simon, M., Brehm, G., Pietsch, S., Wägele,
J. W., and Denzler, J. (2015). Fine-grained recognition
datasets for biodiversity analysis. In CVPR Workshop
on Fine-grained Visual Classification.
Rolnick, D., Veit, A., Belongie, S., and Shavit, N. (2017).
Deep learning is robust to massive label noise. arXiv
preprint arXiv:1705.10694.
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S.,
Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein,
M., Berg, A. C., and Fei-Fei, L. (2015). ImageNet
Large Scale Visual Recognition Challenge.
International Journal of Computer Vision, 115(3):211–252.
Schroff, F., Criminisi, A., and Zisserman, A. (2010).
Harvesting image databases from the web. IEEE Transactions
on Pattern Analysis and Machine Intelligence,
33(4):754–766.
Simon, M., Gao, Y., Darrell, T., Denzler, J., and Rodner, E.
(2017). Generalized orderless pooling performs implicit
salient matching. In IEEE International Conference
on Computer Vision, pages 4970–4979.
Snell, J., Swersky, K., and Zemel, R. (2017). Prototypical
networks for few-shot learning. In Advances in neural
information processing systems, pages 4077–4087.
Sung, F., Yang, Y., Zhang, L., Xiang, T., Torr, P. H., and
Hospedales, T. M. (2018). Learning to compare: Relation
network for few-shot learning. In Proceedings
of the IEEE Conference on Computer Vision and Pattern
Recognition, pages 1199–1208.
Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., and Wojna,
Z. (2016). Rethinking the inception architecture for
computer vision. In IEEE Conference on Computer
Vision and Pattern Recognition, pages 2818–2826.
Van Horn, G., Mac Aodha, O., Song, Y., Cui, Y., Sun,
C., Shepard, A., Adam, H., Perona, P., and Belongie,
S. (2018). The iNaturalist species classification and
detection dataset. In Proceedings of the IEEE Conference
on Computer Vision and Pattern Recognition,
pages 8769–8778.
Wah, C., Branson, S., Welinder, P., Perona, P., and Belongie,
S. (2011). The Caltech-UCSD Birds-200-2011
dataset. Technical Report CNS-TR-2011-001,
California Institute of Technology.
Wang, B., Li, Z., Li, M., and Ma, W.-Y. (2006). Large-scale
duplicate detection for web image search. In IEEE
International Conference on Multimedia and Expo,
pages 353–356.
Wang, J., Song, Y., Leung, T., Rosenberg, C., Wang, J.,
Philbin, J., Chen, B., and Wu, Y. (2014). Learning
fine-grained image similarity with deep ranking.
In IEEE Conference on Computer Vision and Pattern
Recognition, pages 1386–1393.
Wang, Z., Bovik, A. C., Sheikh, H. R., and Simoncelli, E. P.
(2004). Image quality assessment: from error visibility
to structural similarity. IEEE Transactions on
Image Processing, 13(4):600–612.
Xiao, T., Xia, T., Yang, Y., Huang, C., and Wang, X. (2015).
Learning from massive noisy labeled data for image
classification. In IEEE Conference on Computer Vision
and Pattern Recognition, pages 2691–2699.
Xu, Z., Huang, S., Zhang, Y., and Tao, D. (2015). Augmenting
strong supervision using web data for fine-grained
categorization. In IEEE International Conference on
Computer Vision.
Zhang, C., Yao, Y., Zhang, J., Chen, J., Huang, P., Zhang,
J., and Tang, Z. (2020). Web-supervised network for
fine-grained visual classification. In IEEE International
Conference on Multimedia and Expo, pages 1–6.
Zhang, W. and Tan, X. (2019). Combining outlier detection
and reconstruction error minimization for label noise
reduction. In IEEE International Conference on Big
Data and Smart Computing, pages 1–4.
Zheng, H., Fu, J., Mei, T., and Luo, J. (2017). Learning
multi-attention convolutional neural network for
fine-grained image recognition. In IEEE International
Conference on Computer Vision, pages 5209–5217.
Zhuang, B., Liu, L., Li, Y., Shen, C., and Reid, I. (2017).
Attend in groups: a weakly-supervised deep learning
framework for learning from web data. In IEEE Conference
on Computer Vision and Pattern Recognition,
pages 1878–1887.
Lightweight Filtering of Noisy Web Data: Augmenting Fine-grained Datasets with Selected Internet Images