Authors:
Ayato Takama; Satoshi Kamiya and Kazuhiro Hotta
Affiliation:
Meijo University, 1-501 Shiogamaguchi, Tempaku-ku, Nagoya 468-8502, Japan
Keyword(s):
Semantic Segmentation, TransUNet, Mix Transformer, Word Patches.
Abstract:
UNet is widely used in medical image segmentation, but it cannot sufficiently extract global information. TransUNet achieves better accuracy than the conventional UNet by combining a CNN, which is good at extracting local features, with a Transformer, which is good at extracting global features. However, TransUNet generally requires a large amount of training data, and the availability of training images in the medical domain is constrained. In addition, the encoder of TransUNet uses a model pre-trained on ImageNet, which consists of natural images, so the difference between medical images and natural images is a problem. In this paper, we propose a method that learns Word Patches from other medical datasets and uses them effectively for training TransUNet. Experiments on the ACDC dataset, which contains 4 classes of 3D MRI images, and the Synapse multi-organ segmentation dataset, which contains 9 classes of CT images, show that the proposed method improves accuracy even with a small amount of training data, and that the performance of TransUNet is greatly improved by using Word Patches created from different medical datasets.