Rescuing Easy Samples in Self-Supervised Pretraining
Qin Wang, Kai Krajsek, Hanno Scharr
2025
Abstract
Many recent self-supervised pretraining methods use augmented versions of the same image as samples for their learning schemes. We observe that 'easy' samples, i.e. samples that are too similar to each other after augmentation, have only limited value as a learning signal. We therefore propose to rescue easy samples by making them harder. To do so, we select the top-k easiest samples using cosine similarity, strongly augment them, forward-pass them through the model, calculate the cosine similarity of the outputs as a loss, and add it to the original loss in a weighted fashion. This method can be adapted to all contrastive or other augmented-pair-based learning methods, whether they involve negative pairs or not, as it only changes the handling of easy positives. This simple but effective approach introduces greater variability into such self-supervised pretraining processes, significantly increasing performance on various downstream tasks in our experiments. We pretrain models of different sizes, i.e. ResNet-50, ViT-S, ViT-B, or ViT-L, on ImageNet with the SimCLR, MoCo v3, or DINOv2 training schemes. For example, we consistently find improved ImageNet top-1 accuracy with a linear classifier, establishing a new SOTA for this task.
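The steps named in the abstract (select the top-k easiest pairs by cosine similarity, strongly augment them, forward-pass them, and add a weighted cosine-similarity term to the original loss) can be illustrated with a minimal, framework-free sketch. This is not the authors' implementation. The function names, the `encode` and `strong_augment` callables, the `weight` parameter, and in particular the choice of `1 - cosine similarity` as the extra loss term are all assumptions made for illustration; the abstract does not specify the exact form or sign of the loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)  # epsilon avoids division by zero


def top_k_easiest(emb_a, emb_b, k):
    """Indices of the k positive pairs with the highest cosine similarity.

    emb_a[i] and emb_b[i] are the embeddings of two augmented views of
    image i; a high similarity marks the pair as 'easy'.
    """
    sims = [cosine_similarity(a, b) for a, b in zip(emb_a, emb_b)]
    order = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    return order[:k]


def rescue_loss(encode, strong_augment, images, easy_idx):
    """Mean (1 - cosine similarity) over re-augmented easy samples.

    Assumed form: each easy image is strongly augmented twice, both views
    are encoded, and the loss falls as the two outputs align.
    """
    if not easy_idx:
        return 0.0
    total = 0.0
    for i in easy_idx:
        z1 = encode(strong_augment(images[i]))
        z2 = encode(strong_augment(images[i]))
        total += 1.0 - cosine_similarity(z1, z2)
    return total / len(easy_idx)


def combined_loss(base_loss, extra_loss, weight):
    """Original loss plus the weighted rescue term (weight is hypothetical)."""
    return base_loss + weight * extra_loss
```

In an actual training loop, `encode` would be the network being pretrained, `strong_augment` a stochastic image transform, and `base_loss` the unmodified SimCLR, MoCo v3, or DINOv2 objective; only the handling of easy positives is added on top.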
Paper Citation
in Harvard Style
Wang Q., Krajsek K. and Scharr H. (2025). Rescuing Easy Samples in Self-Supervised Pretraining. In Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP; ISBN 978-989-758-728-3, SciTePress, pages 400-409. DOI: 10.5220/0013167900003912
in Bibtex Style
@conference{visapp25,
author={Qin Wang and Kai Krajsek and Hanno Scharr},
title={Rescuing Easy Samples in Self-Supervised Pretraining},
booktitle={Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP},
year={2025},
pages={400-409},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013167900003912},
isbn={978-989-758-728-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP
TI - Rescuing Easy Samples in Self-Supervised Pretraining
SN - 978-989-758-728-3
AU - Wang Q.
AU - Krajsek K.
AU - Scharr H.
PY - 2025
SP - 400
EP - 409
DO - 10.5220/0013167900003912
PB - SciTePress