Beyond Data Augmentations: Generalization Abilities of Few-Shot Segmentation Models

Muhammad Ahsan; Guy Ben-Yosef; Gemma Roig

doi:10.5220/0013179200003912

2025

Abstract

Few-shot learning in semantic segmentation has gained significant attention recently for its adaptability in applications where only a few or no examples are available as support for training. Here we advocate for a new testing paradigm, which we term half-shot learning (HSL), that evaluates a model’s ability to generalize to new categories when support objects are partially viewed, significantly cropped, occluded, noised, or aggressively transformed. This paradigm introduces challenges that will spark advances in the field, allowing us to benchmark existing models and analyze their acquired sense of objectness. Humans are remarkably good at recognizing objects even when they are partially obstructed. HSL seeks to bridge the gap between human-like perception and machine learning models by forcing models to recognize objects from incomplete, fragmented, or noisy views, just as humans do. We propose a highly augmented image set for HSL, built by intentionally manipulating PASCAL-5i and COCO-20i to fit this paradigm. Our results reveal the shortcomings of state-of-the-art few-shot learning models and suggest improvements through data augmentation or the incorporation of additional attention-based modules to enhance the generalization capabilities of few-shot semantic segmentation (FSS) models. To improve the training method, we propose adding a channel and spatial attention module (Woo et al., 2018): an FSS model is retrained with the attention module and tested against the highly augmented support information. Our experiments demonstrate that an FSS model trained with the proposed method achieves significantly higher accuracy (approximately 5%) when exposed to limited or highly cropped support data.
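The kind of support degradation that HSL evaluates can be illustrated with a minimal sketch. The abstract does not specify the exact transformations or their parameters, so everything below is an assumption for illustration only: the function names (crop_support, occlude_support, add_noise), the NumPy array layout (H x W x C image in [0, 1], H x W binary mask), and the default values are hypothetical and are not the authors' pipeline.

```python
import numpy as np


def crop_support(image, mask, keep_fraction=0.5):
    """Keep only the left part of the object's bounding box (hypothetical helper).

    Everything to the right of the cut is zeroed in both image and mask,
    simulating a partially viewed support object.
    """
    ys, xs = np.nonzero(mask)
    out_img, out_mask = image.copy(), mask.copy()
    if xs.size == 0:  # no object present: nothing to crop
        return out_img, out_mask
    x_min, x_max = xs.min(), xs.max()
    width = x_max - x_min + 1
    cut = x_min + max(1, int(round(width * keep_fraction)))
    out_img[:, cut:] = 0
    out_mask[:, cut:] = 0
    return out_img, out_mask


def occlude_support(image, mask, size=4):
    """Zero a square patch around the object's centroid (hypothetical helper)."""
    ys, xs = np.nonzero(mask)
    out_img, out_mask = image.copy(), mask.copy()
    if xs.size == 0:
        return out_img, out_mask
    cy, cx = int(ys.mean()), int(xs.mean())
    half = size // 2
    h, w = mask.shape
    y0, y1 = max(0, cy - half), min(h, cy + half)
    x0, x1 = max(0, cx - half), min(w, cx + half)
    out_img[y0:y1, x0:x1] = 0
    out_mask[y0:y1, x0:x1] = 0
    return out_img, out_mask


def add_noise(image, sigma=0.1, rng=None):
    """Add Gaussian noise and clip back to [0, 1] (assumed value range)."""
    rng = np.random.default_rng(0) if rng is None else rng
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)
```

In a few-shot evaluation loop, such a perturbation would be applied only to the support image and its mask, while the query image is left untouched, so that any drop in accuracy can be attributed to the degraded support information.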

Paper Citation

in Harvard Style

Ahsan M., Ben-Yosef G. and Roig G. (2025). Beyond Data Augmentations: Generalization Abilities of Few-Shot Segmentation Models. In Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP; ISBN 978-989-758-728-3, SciTePress, pages 430-438. DOI: 10.5220/0013179200003912

in Bibtex Style

@conference{visapp25,
author={Muhammad Ahsan and Guy Ben-Yosef and Gemma Roig},
title={Beyond Data Augmentations: Generalization Abilities of Few-Shot Segmentation Models},
booktitle={Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP},
year={2025},
pages={430-438},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013179200003912},
isbn={978-989-758-728-3},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP
TI - Beyond Data Augmentations: Generalization Abilities of Few-Shot Segmentation Models
SN - 978-989-758-728-3
AU - Ahsan M.
AU - Ben-Yosef G.
AU - Roig G.
PY - 2025
SP - 430
EP - 438
DO - 10.5220/0013179200003912
PB - SciTePress