ing it a useful solution to segment a dataset without
requiring dense annotation or specific training. This is
also a promising improvement for weakly-supervised
segmentation frameworks.
In future work, we plan to extend our experiments in order to assess
more precisely the complementarity of our criteria and to optimize
their aggregation (for example, using voting or learning strategies)
on other datasets from the literature, where the semantic differences
between the classes may be larger. In this context, other criteria
might also be evaluated to take into account the specificities of each
dataset. Finally, we also plan to evaluate the impact of embedding our
criteria in weakly-supervised segmentation schemes, where they can be
easily integrated.
ACKNOWLEDGEMENTS
This work was carried out at the LIPADE and funded
by Magellium, with the support of the French Defense
Innovation Agency (AID).
REFERENCES
Ahn, J., Cho, S., and Kwak, S. (2019). Weakly supervised
learning of instance segmentation with inter-pixel re-
lations. In CVPR, pages 2209–2218.
Arbeláez, P., Pont-Tuset, J., Barron, J. T., Marques, F., and
Malik, J. (2014). Multiscale combinatorial grouping.
In CVPR, pages 328–335.
Chen, L.-C., Zhu, Y., Papandreou, G., Schroff, F., and
Adam, H. (2018). Encoder-decoder with atrous sep-
arable convolution for semantic image segmentation.
In ECCV, pages 801–818.
Chen, Y., Chan, A. B., and Wang, G. (2012). Adaptive
figure-ground classification. In CVPR, pages 654–
661.
Dai, J., He, K., and Sun, J. (2015). BoxSup: Exploiting
bounding boxes to supervise convolutional networks
for semantic segmentation. In ICCV, pages 1635–
1643.
Deléarde, R., Kurtz, C., Dejean, P., and Wendling, L.
(2021). Force banner for the recognition of spatial
relations. In ICPR 2020, pages XX–XX.
Everingham, M., Van Gool, L., Williams, C. K., Winn,
J., and Zisserman, A. (2010). The PASCAL visual
object classes (VOC) challenge. Int J Comput Vis,
88(2):303–338.
Guillaumin, M., Küttel, D., and Ferrari, V. (2014). ImageNet
auto-annotation with segmentation propagation. Int J
Comput Vis, 110(3):328–348.
He, K., Gkioxari, G., Dollár, P., and Girshick, R. (2017).
Mask R-CNN. In ICCV, pages 2961–2969.
Hong, S., Oh, J., Lee, H., and Han, B. (2016). Learn-
ing transferable knowledge for semantic segmentation
with deep convolutional neural network. In CVPR,
pages 3204–3212.
Hsu, C.-C., Hsu, K.-J., Tsai, C.-C., Lin, Y.-Y., and Chuang,
Y.-Y. (2019). Weakly supervised instance segmenta-
tion using the bounding box tightness prior. In NIPS,
pages 6586–6597.
Jurdi, R. E., Petitjean, C., Honeine, P., and Abdallah, F.
(2020). BB-UNet: U-Net with bounding box prior.
IEEE J Sel Top Signal Process.
Khoreva, A., Benenson, R., Hosang, J., Hein, M., and
Schiele, B. (2017). Simple does it: Weakly supervised
instance and semantic segmentation. In CVPR, pages
876–885.
Kolesnikov, A. and Lampert, C. H. (2016). Seed, expand
and constrain: Three principles for weakly-supervised
image segmentation. In ECCV, pages 695–711.
Lempitsky, V., Kohli, P., Rother, C., and Sharp, T. (2009).
Image segmentation with a bounding box prior. In
ICCV, pages 277–284.
Li, Q., Arnab, A., and Torr, P. H. (2018). Weakly- and
semi-supervised panoptic segmentation. In ECCV, pages
102–118.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P.,
Ramanan, D., Dollár, P., and Zitnick, C. L. (2014).
Microsoft COCO: Common objects in context. In ECCV,
pages 740–755.
Papandreou, G., Chen, L.-C., Murphy, K. P., and Yuille,
A. L. (2015). Weakly- and semi-supervised learning
of a deep convolutional network for semantic image
segmentation. In ICCV, pages 1742–1750.
Pascal, L., Bost, X., and Huet, B. (2019). Semantic and
visual similarities for efficient knowledge transfer in
CNN training. In CBMI, pages 1–6.
Rother, C., Kolmogorov, V., and Blake, A. (2004). "GrabCut"
interactive foreground extraction using iterated
graph cuts. ACM Trans Graph, 23(3):309–314.
Sun, R., Zhu, X., Wu, C., Huang, C., Shi, J., and Ma, L.
(2019). Not all areas are equal: Transfer learning for
semantic segmentation via hierarchical region selec-
tion. In CVPR, pages 4360–4369.
Wang, Y., Zhang, J., Kan, M., Shan, S., and Chen, X.
(2020). Self-supervised equivariant attention mech-
anism for weakly supervised semantic segmentation.
In CVPR, pages 12275–12284.
Wu, Z., Shen, C., and Van Den Hengel, A. (2019). Wider or
deeper: Revisiting the ResNet model for visual
recognition. Pattern Recognit, 90:119–133.
Yang, K., Russakovsky, O., and Deng, J. (2019). Spa-
tialSense: An adversarially crowdsourced benchmark
for spatial relation recognition. In ICCV.
Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Tor-
ralba, A. (2016). Learning deep features for discrimi-
native localization. In CVPR, pages 2921–2929.
Zhou, Y., Zhu, Y., Ye, Q., Qiu, Q., and Jiao, J. (2018).
Weakly supervised instance segmentation using class
peak response. In CVPR, pages 3791–3800.
Zhu, G. and Iglesias, C. A. (2016). Computing seman-
tic similarity of concepts in knowledge graphs. IEEE
Trans Knowl Data Eng, 29(1):72–85.
Segment My Object: A Pipeline to Extract Segmented Objects in Images based on Labels or Bounding Boxes