Beyond Data Augmentations: Generalization Abilities of Few-Shot Segmentation Models
Muhammad Ahsan, Guy Ben-Yosef, Gemma Roig
2025
Abstract
Few-shot learning in semantic segmentation has recently gained significant attention for its adaptability in applications where only a few or no examples are available as support for training. Here we advocate a new testing paradigm, which we call half-shot learning (HSL), that evaluates a model’s ability to generalize to new categories when support objects are partially viewed, significantly cropped, occluded, noised, or aggressively transformed. This paradigm introduces challenges that will spark advances in the field, allowing us to benchmark existing models and analyze their acquired sense of objectness. Humans are remarkably good at recognizing objects even when they are partially obstructed. HSL seeks to bridge the gap between human-like perception and machine learning models by forcing models to recognize objects from incomplete, fragmented, or noisy views, just as humans do. We propose a highly augmented image set for HSL, built by intentionally manipulating PASCAL-5i and COCO-20i to fit this paradigm. Our results reveal the shortcomings of state-of-the-art few-shot learning models and suggest improvements through data augmentation or the incorporation of additional attention-based modules to enhance the generalization capabilities of few-shot semantic segmentation (FSS). To improve the training method, we propose using a channel and spatial attention module (Woo et al., 2018): an FSS model is retrained with the attention module and tested against the highly augmented support information. Our experiments demonstrate that an FSS model trained with the proposed method achieves significantly higher accuracy (approximately 5%) when exposed to limited or highly cropped support data.
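To make the kind of support degradation described above concrete, the following is a minimal illustrative sketch, not the authors' pipeline. The abstract does not specify the exact transformations or their parameters, so the function name `half_shot_support` and the arguments `keep_frac`, `occlude_frac`, and `noise_std` are hypothetical. It assumes a support image stored as a float array in [0, 1] with shape (H, W, C) and a binary mask of shape (H, W), and applies a partial view, a random occluding patch, and additive Gaussian noise.

```python
import numpy as np


def half_shot_support(image, mask, keep_frac=0.5, occlude_frac=0.25,
                      noise_std=0.1, rng=None):
    """Return a degraded copy of a support image/mask pair.

    Hypothetical helper for illustration only; the parameters are
    assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    img = image.astype(np.float32).copy()
    msk = mask.copy()
    h, w = msk.shape

    # 1) Partial view: keep only the left `keep_frac` of the columns.
    cut = max(1, int(round(w * keep_frac)))
    img[:, cut:] = 0.0
    msk[:, cut:] = 0

    # 2) Occlusion: blank out one randomly placed square patch.
    side = max(1, int(round(min(h, w) * occlude_frac)))
    top = int(rng.integers(0, h - side + 1))
    left = int(rng.integers(0, w - side + 1))
    img[top:top + side, left:left + side] = 0.0
    msk[top:top + side, left:left + side] = 0

    # 3) Noise: add Gaussian noise and clip back to the valid range.
    noisy = img + rng.normal(0.0, noise_std, size=img.shape)
    img = np.clip(noisy, 0.0, 1.0).astype(np.float32)
    return img, msk
```

In an evaluation loop under these assumptions, only the support pair would be passed through such a function while the query image and its ground-truth mask stay unchanged, so that any drop in accuracy can be attributed to the impoverished support.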
Paper Citation
in Harvard Style
Ahsan M., Ben-Yosef G. and Roig G. (2025). Beyond Data Augmentations: Generalization Abilities of Few-Shot Segmentation Models. In Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP; ISBN 978-989-758-728-3, SciTePress, pages 430-438. DOI: 10.5220/0013179200003912
in BibTeX Style
@conference{visapp25,
author={Muhammad Ahsan and Guy Ben-Yosef and Gemma Roig},
title={Beyond Data Augmentations: Generalization Abilities of Few-Shot Segmentation Models},
booktitle={Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP},
year={2025},
pages={430-438},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013179200003912},
isbn={978-989-758-728-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP
TI - Beyond Data Augmentations: Generalization Abilities of Few-Shot Segmentation Models
SN - 978-989-758-728-3
AU - Ahsan M.
AU - Ben-Yosef G.
AU - Roig G.
PY - 2025
SP - 430
EP - 438
DO - 10.5220/0013179200003912
PB - SciTePress