0.93 3397.69 0.89*
SAM2 0.93 3524.69 0.85
3 CONCLUSIONS
Overall, our results indicate that AI encoder models deliver
reliable performance for ultrasound similarity search, which
opens up several possible uses in medical applications.
In this image similarity task, DINO features performed best
on the diagnosis label match statistics, while the DreamSim
encoder performed best on the organ label match statistics
for the fetal dataset. Our ResNet stood out on organ label
matches for early pregnancy. One likely explanation is that
this model was trained to separate the organs in early
pregnancy cases.
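To make the notion of a label match statistic concrete, the following is a minimal sketch of how such a score could be computed from encoder embeddings. It assumes cosine similarity and a leave-one-out top-k retrieval; the exact metric used in the evaluation may differ. The function names, the toy embeddings, and the organ labels are hypothetical and serve only as illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_label_match(embeddings, labels, k=1):
    """Mean fraction of the k nearest neighbours that share the query's label.

    Every item is used once as the query and is excluded from its own
    result list (leave-one-out retrieval). This is an assumed metric,
    not necessarily the one used in the evaluation.
    """
    n = len(embeddings)
    if n < 2:
        raise ValueError("need at least two items")
    k = min(k, n - 1)
    total = 0.0
    for q in range(n):
        scored = [
            (cosine_similarity(embeddings[q], embeddings[i]), i)
            for i in range(n)
            if i != q
        ]
        # Most similar neighbours first.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        hits = sum(1 for _, i in scored[:k] if labels[i] == labels[q])
        total += hits / k
    return total / n


# Hypothetical toy data: two well-separated clusters of 2-D embeddings.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = ["heart", "heart", "brain", "brain"]
match_rate = top_k_label_match(toy_embeddings, toy_labels, k=1)
```

Under this definition, an encoder whose embedding space groups images of the same organ (or diagnosis) together scores close to 1, which is one way to read the differences between the encoders discussed above.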
From this we hypothesize that training an encoder model on
the whole ultrasound dataset with a general objective (e.g.
reconstruction) could improve similarity search performance.
This could be realized either by training a model from
scratch or by fine-tuning one of the already evaluated
encoder models using techniques developed for adapting such
models to custom datasets.
ACKNOWLEDGEMENTS
Data collection and annotation activities were part of
the SUOG Project (www.suog.org), an EIT Health
Innovation supported project.
REFERENCES
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,
L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017).
Attention is all you need. Advances in Neural Information
Processing Systems, 30.
Gkelios, S., Boutalis, Y., & Chatzichristofis, S. A. (2021,
July). Investigating the vision transformer model for
image retrieval tasks. In 2021 17th International
Conference on Distributed Computing in Sensor
Systems (DCOSS) (pp. 367-373). IEEE.
Agrawal, S., Chowdhary, A., Agarwala, S., Mayya, V., &
Kamath S, S. (2022). Content-based medical image
retrieval system for lung diseases using deep
CNNs. International Journal of Information
Technology, 14(7), 3619-3627.
Qayyum, A., Anwar, S. M., Awais, M., & Majid, M. (2017).
Medical image retrieval using deep convolutional neural
network. Neurocomputing, 266, 8-20.
Fu, S., Tamir, N., Sundaram, S., Chai, L., Zhang, R., Dekel,
T., & Isola, P. (2023). DreamSim: Learning new
dimensions of human visual similarity using synthetic
data. arXiv preprint arXiv:2306.09344.
Jing, Z., Su, Y., Han, Y., Yuan, B., Liu, C., Xu, H., & Chen,
K. (2024). When Large Language Models Meet Vector
Databases: A Survey. arXiv preprint
arXiv:2402.01763.
Madugunki, M., Bormane, D. S., Bhadoria, S., & Dethe, C.
G. (2011, April). Comparison of different CBIR
techniques. In 2011 3rd International Conference on
Electronics Computer Technology (Vol. 4, pp. 372-
375). IEEE.
Kokare, M., Chatterji, B. N., & Biswas, P. K. (2003,
October). Comparison of similarity metrics for texture
image retrieval. In TENCON 2003. Conference on
convergent technologies for Asia-Pacific region (Vol.
2, pp. 571-575). IEEE.
Deselaers, T., Keysers, D., & Ney, H. (2008). Features for
image retrieval: an experimental
comparison. Information retrieval, 11, 77-107.
Yang, X., He, X., Zhang, H., Ma, Y., Bian, J., & Wu, Y.
(2020). Measurement of semantic textual similarity in
clinical texts: comparison of transformer-based
models. JMIR medical informatics, 8(11), e19735.
Caron, M., Touvron, H., Misra, I., Jégou, H., Mairal, J.,
Bojanowski, P., & Joulin, A. (2021). Emerging