Adaptive Prompt Tuning: Vision Guided Prompt Tuning with Cross-Attention for Fine-Grained Few-Shot Learning

Eric Brouwer; Eric Brouwer; Jan van Woerden; Gertjan Burghouts; Matias Valdenegro-Toro; Marco Zullich

Research.Publish.Connect.

*Please fill out at least one Field. *Value must be an number!

Title:
ISBN:
Year:
Acronym:
Subject:

Advanced Search Proceedings Search

If you're looking for an exact phrase use quotation marks on text fields.

*Please fill out at least one Field.

Title:
Author:
Affiliation:
Subject:

Advanced Search Papers Search

If you're looking for an exact phrase use quotation marks on text fields.

*Please fill out at least one Field.

Name:
Affiliation:
Country:
Conference:
Subject:

Advanced Search Authors Search

If you're looking for an exact phrase use quotation marks on text fields.

*Please fill out at least one Field.

Name:
Country:
Subject:

Advanced Search Affiliations Search

If you're looking for an exact phrase use quotation marks on text fields.

Proceedings

Proceedings Search *Please fill out at least one Field. *Value must be an number!

Title:
ISBN:
Year:
Acronym:
Subject:

Advanced Search Proceedings Search

If you're looking for an exact phrase use quotation marks on text fields.

Papers

Papers Search *Please fill out at least one Field.

Title:
Author:
Affiliation:
Subject:

Advanced Search Papers Search

If you're looking for an exact phrase use quotation marks on text fields.

Authors

Authors Search *Please fill out at least one Field.

Name:
Affiliation:
Country:
Conference:
Subject:

Advanced Search Authors Search

If you're looking for an exact phrase use quotation marks on text fields.

Advanced Search

Paper

Adaptive Prompt Tuning: Vision Guided Prompt Tuning with Cross-Attention for Fine-Grained Few-Shot Learning

Topics: Deep Learning for Visual Understanding ; Features Extraction; Few-Shot Learning; Machine Learning Technologies for Vision

In Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2 VISAPP: VISAPP, 114-125, 2025 , Porto, Portugal

Authors: Eric Brouwer ^{1

;

2} ; Jan Erik van Woerden ² ; Gertjan Burghouts ² ; Matias Valdenegro-Toro ¹ and Marco Zullich ¹

Affiliations: ¹ Faculty of Science and Engineering, University of Groningen, Nijenborgh 9, 9747 AG, Groningen, The Netherlands ; ² TNO, Oude Waalsdorperweg 63, 2597 AK, Den Haag, The Netherlands

Keyword(s): CLIP, Visual Prompt Tuning, Few-Shot Learning, Fine-Grained Image Recognition, Adaptive Inference, Uncertainty Quantification, Monte-Carlo Dropout, Expected Calibration Error.

Abstract: Few-shot, fine-grained classification in computer vision poses significant challenges due to the need to differentiate subtle class distinctions with limited data. This paper presents a novel method that enhances the Contrastive Language-Image Pre-Training (CLIP) model through adaptive prompt tuning, guided by real-time visual inputs. Unlike existing techniques such as Context Optimization (CoOp) and Visual Prompt Tuning (VPT), which are constrained by static prompts or visual token reliance, the proposed approach leverages a cross-attention mechanism to dynamically refine text prompts for the image at hand. This enables an image-specific alignment of textual features with image patches extracted from the Vision Transformer, making the model more effective for datasets with high intra-class variance and low inter-class differences. The method is evaluated on several datasets, including CUBirds, Oxford Flowers, and FGVC Aircraft, showing significant performance gains over static promp t tuning approaches. To ensure these performance gains translate into trustworthy predictions, we integrate Monte-Carlo Dropout in our approach to improve the reliability of the model predictions and uncertainty estimates. This integration provides valuable insights into the model’s predictive confidence, helping to identify when predictions can be trusted and when additional verification is necessary. This dynamic approach offers a robust solution, advancing the state-of-the-art for few-shot fine-grained classification. (More)

CC BY-NC-ND 4.0

Guest: Register as new SciTePress user now for free.

SciTePress user: please login.

My Papers

You are not signed in, therefore limits apply to your IP address 3.17.156.160

In the current month:

Recent papers: 100 available of 100 total

2⁺ years older papers: 200 available of 200 total

Paper citation in several formats:

Brouwer, E., van Woerden, J. E., Burghouts, G., Valdenegro-Toro, M. and Zullich, M. (2025). Adaptive Prompt Tuning: Vision Guided Prompt Tuning with Cross-Attention for Fine-Grained Few-Shot Learning. In Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP; ISBN 978-989-758-728-3; ISSN 2184-4321, SciTePress, pages 114-125. DOI: 10.5220/0013163700003912

@conference{visapp25,
author={Eric Brouwer and Jan Erik {van Woerden} and Gertjan Burghouts and Matias Valdenegro{-}Toro and Marco Zullich},
title={Adaptive Prompt Tuning: Vision Guided Prompt Tuning with Cross-Attention for Fine-Grained Few-Shot Learning},
booktitle={Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP},
year={2025},
pages={114-125},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013163700003912},
isbn={978-989-758-728-3},
issn={2184-4321},
}

TY - CONF

JO - Proceedings of the 20th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP
TI - Adaptive Prompt Tuning: Vision Guided Prompt Tuning with Cross-Attention for Fine-Grained Few-Shot Learning
SN - 978-989-758-728-3
IS - 2184-4321
AU - Brouwer, E.
AU - van Woerden, J.
AU - Burghouts, G.
AU - Valdenegro-Toro, M.
AU - Zullich, M.
PY - 2025
SP - 114
EP - 125
DO - 10.5220/0013163700003912
PB - SciTePress