
Lee, J., Choi, J., Mok, J., and Yoon, S. (2021a). Reduc-
ing information bottleneck for weakly supervised se-
mantic segmentation. Advances in Neural Information
Processing Systems (NeurIPS), 34:27408–27421.
Lee, J., Kim, E., and Yoon, S. (2021b). Anti-
adversarially manipulated attributions for weakly
and semi-supervised semantic segmentation. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition (CVPR), pages 4071–4080.
Lee, M., Kim, D., and Shim, H. (2022). Threshold matters
in WSSS: Manipulating the activation for the robust and
accurate segmentation model against thresholds. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition (CVPR), pages 4330–4339.
Lee, S., Lee, M., Lee, J., and Shim, H. (2021c). Rail-
road is not a train: Saliency as pseudo-pixel supervi-
sion for weakly supervised semantic segmentation. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition (CVPR), pages 5495–5505.
Li, Y., Duan, Y., Kuang, Z., Chen, Y., Zhang, W., and
Li, X. (2022). Uncertainty estimation via response
scaling for pseudo-mask noise mitigation in weakly-
supervised semantic segmentation. In AAAI Con-
ference on Artificial Intelligence, volume 36, pages
1447–1455.
Li, Y., Kuang, Z., Liu, L., Chen, Y., and Zhang, W. (2021).
Pseudo-mask matters in weakly-supervised semantic
segmentation. In IEEE/CVF International Conference
on Computer Vision (ICCV), pages 6964–6973.
Lin, T.-Y., Maire, M., Belongie, S., Hays, J., Perona, P.,
Ramanan, D., Dollár, P., and Zitnick, C. L. (2014).
Microsoft COCO: Common objects in context. In European
Conference on Computer Vision (ECCV), pages
740–755. Springer.
Liu, S., Liu, K., Zhu, W., Shen, Y., and Fernandez-Granda,
C. (2022a). Adaptive early-learning correction for
segmentation from noisy annotations. In IEEE/CVF
Conference on Computer Vision and Pattern Recogni-
tion (CVPR), pages 2606–2616.
Liu, S., Zhi, S., Johns, E., and Davison, A. J. (2022b). Boot-
strapping semantic segmentation with regional con-
trast. In International Conference on Learning Rep-
resentations (ICLR).
Mo, Y., Wu, Y., Yang, X., Liu, F., and Liao, Y. (2022). Re-
view the state-of-the-art technologies of semantic seg-
mentation based on deep learning. Neurocomputing,
493:626–646.
Olsson, V., Tranheden, W., Pinto, J., and Svensson, L.
(2021). ClassMix: Segmentation-based data
augmentation for semi-supervised learning. In IEEE/CVF
Winter Conference on Applications of Computer Vision
(WACV), pages 1369–1378.
Oord, A. v. d., Li, Y., and Vinyals, O. (2018). Representa-
tion learning with contrastive predictive coding. arXiv
preprint arXiv:1807.03748.
Qin, J., Wu, J., Xiao, X., Li, L., and Wang, X. (2022).
Activation modulation and recalibration scheme for
weakly supervised semantic segmentation. In AAAI
Conference on Artificial Intelligence, volume 36,
pages 2117–2125.
Rong, S., Tu, B., Wang, Z., and Li, J. (2023). Boundary-
enhanced co-training for weakly supervised seman-
tic segmentation. In IEEE/CVF Conference on Com-
puter Vision and Pattern Recognition (CVPR), pages
19574–19584.
Rossetti, S., Zappia, D., Sanzari, M., Schaerf, M., and Pirri,
F. (2022). Max pooling with vision transformers rec-
onciles class and shape in weakly supervised semantic
segmentation. In European Conference on Computer
Vision (ECCV), pages 446–463. Springer.
Ru, L., Zheng, H., Zhan, Y., and Du, B. (2023). Token con-
trast for weakly-supervised semantic segmentation. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition (CVPR), pages 3093–3102.
Samek, W., Montavon, G., Lapuschkin, S., Anders, C. J.,
and Müller, K.-R. (2021). Explaining deep neural
networks and beyond: A review of methods and
applications. Proceedings of the IEEE, 109(3):247–278.
Selvaraju, R. R., Cogswell, M., Das, A., Vedantam, R.,
Parikh, D., and Batra, D. (2017). Grad-CAM: Visual
explanations from deep networks via gradient-based
localization. In IEEE/CVF International Conference
on Computer Vision (ICCV), pages 618–626.
Shen, W., Peng, Z., Wang, X., Wang, H., Cen, J., Jiang, D.,
Xie, L., Yang, X., and Tian, Q. (2023). A survey on
label-efficient deep image segmentation: Bridging the
gap between weak supervision and dense prediction.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 45(8):9284–9305.
Simonyan, K., Vedaldi, A., and Zisserman, A. (2013).
Deep inside convolutional networks: Visualising im-
age classification models and saliency maps. arXiv
preprint arXiv:1312.6034.
Wang, Y., Wang, H., Shen, Y., Fei, J., Li, W., Jin, G., Wu, L.,
Zhao, R., and Le, X. (2022). Semi-supervised seman-
tic segmentation using unreliable pseudo-labels. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition (CVPR), pages 4248–4257.
Wang, Y., Zhang, J., Kan, M., Shan, S., and Chen, X.
(2020). Self-supervised equivariant attention mecha-
nism for weakly supervised semantic segmentation. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition (CVPR), pages 12275–12284.
Xu, L., Ouyang, W., Bennamoun, M., Boussaid, F., and Xu,
D. (2022). Multi-class token transformer for weakly
supervised semantic segmentation. In IEEE/CVF
Conference on Computer Vision and Pattern Recog-
nition (CVPR), pages 4300–4309.
Yang, L., Qi, L., Feng, L., Zhang, W., and Shi,
Y. (2023). Revisiting weak-to-strong consistency
in semi-supervised semantic segmentation. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition (CVPR), pages 7236–7246.
Yang, Z., Fu, K., Duan, M., Qu, L., Wang, S., and
Song, Z. (2024). Separate and conquer: Decoupling
co-occurrence via decomposition and representation
for weakly supervised semantic segmentation. In
IEEE/CVF Conference on Computer Vision and Pat-
tern Recognition (CVPR), pages 3606–3615.
Yi, S., Ma, H., Wang, X., Hu, T., Li, X., and Wang, Y.
(2022). Weakly-supervised semantic segmentation
VISAPP 2025 - 20th International Conference on Computer Vision Theory and Applications