Applying Positional Encoding to Enhance Vision-Language Transformers
Topics: Deep Learning for Visual Understanding; Machine Learning Technologies for Vision
In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, 838-845, 2023, Lisbon, Portugal