
6.2 Limitations
This research highlights the promising potential of
Vision-Language Models (VLMs) for automating image
validation in e-commerce. However, several limitations
of the approach should be acknowledged.
One limitation is the accuracy of the validation itself,
which is not error-free. Consequently, the validator may
work well as a module that suggests quality improvements
and flags image defects for catalog administrators, but
using it to definitively reject defective images requires
detailed testing before implementation in a specific
catalog. Similarly, any update to the model version in a
production environment should be preceded by renewed
testing, as results may vary between versions.
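As a minimal sketch of such a pre-deployment check, the snippet below compares two model versions on the same manually labeled image set using sensitivity and specificity. The names `evaluate`, `safe_to_upgrade`, the `tolerance` value, and the toy data are illustrative assumptions, not part of the study; in practice the labels would come from the catalog's own annotated sample.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    sensitivity: float  # share of defective images that were flagged
    specificity: float  # share of compliant images that were passed


def evaluate(labels: list[bool], predictions: list[bool]) -> Metrics:
    """Compute sensitivity and specificity; True means 'image is defective'."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fn = sum(1 for y, p in pairs if y and not p)
    tn = sum(1 for y, p in pairs if not y and not p)
    fp = sum(1 for y, p in pairs if not y and p)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return Metrics(sensitivity, specificity)


def safe_to_upgrade(current: Metrics, candidate: Metrics,
                    tolerance: float = 0.01) -> bool:
    """Hypothetical acceptance rule: neither metric may drop by more than
    `tolerance` relative to the version currently in production."""
    return (candidate.sensitivity >= current.sensitivity - tolerance
            and candidate.specificity >= current.specificity - tolerance)


# Toy data (assumed for illustration): ground truth and two versions' outputs.
labels          = [True, True, True, False, False, False, False, True]
current_preds   = [True, True, False, False, False, True, False, True]
candidate_preds = [True, False, False, False, False, False, False, True]

current = evaluate(labels, current_preds)
candidate = evaluate(labels, candidate_preds)
print(current, candidate, safe_to_upgrade(current, candidate))
```

In this toy case the candidate version raises fewer false alarms but misses more defective images, so the assumed rule would block the upgrade. The appropriate tolerance, and whether sensitivity or specificity matters more, depends on the specific catalog.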
Additionally, VLMs have significant computational
requirements, which may pose a barrier for smaller
enterprises. Further extensions of these models to
accommodate industry-specific requirements could
negatively affect their performance unless they are
optimized for computational load and infrastructure
accessibility.
Vision-Language Models for E-commerce: Detecting Non-Compliant Product Images in Online Catalogs