HaloAE: A Local Transformer Auto-Encoder for Anomaly Detection and Localization Based on HaloNet
Emilie Mathian, Huidong Liu, Lynnette Fernandez-Cuesta, Dimitris Samaras, Matthieu Foll, Liming Chen
2023
Abstract
Unsupervised anomaly detection and localization is a crucial task in many applications, e.g., defect detection in industry or cancer localization in medicine, and requires both local and global information, as enabled by self-attention in Transformers. However, a brute-force adaptation of a Transformer, e.g., ViT, suffers from two issues: 1) high computational complexity, which makes it hard to handle high-resolution images; and 2) patch-based tokens, which are ill-suited to pixel-level dense prediction tasks such as anomaly localization and which ignore intra-patch interactions. We present HaloAE, the first auto-encoder based on a local 2D version of the Transformer with HaloNet, allowing intra-patch correlation computation with a receptive field covering 25% of the input image. HaloAE combines convolution and local 2D block-wise self-attention layers and performs anomaly detection and segmentation through a single model. Moreover, because the loss function is generally a weighted sum of several losses, we also introduce a novel dynamic weighting scheme to better optimize the learning of the model. The competitive results on the MVTec dataset suggest that vision models incorporating Transformers could benefit from a local computation of the self-attention operation, and that its very low computational cost paves the way for applications on very large images a
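As a rough illustration of the block-wise local self-attention mentioned in the abstract, the sketch below computes single-head attention in which each pixel of a non-overlapping block attends only to that block extended by a small "halo" of neighbouring pixels. It is not the authors' implementation. The function names, the identity query/key/value projections, the clipping of the halo at image borders, and the block and halo sizes are all assumptions made for this example.

```python
import math


def halo_attention(x, block=2, halo=1):
    """Single-head blocked local self-attention on a 2D grid (illustrative).

    x: list of H rows, each a list of W feature vectors (lists of floats).
    Queries come from each non-overlapping block x block tile; keys and
    values come from the same tile extended by `halo` pixels on every side.
    Assumptions: identity projections instead of learned ones, and the
    halo is clipped at the image border rather than padded.
    """
    H, W = len(x), len(x[0])
    d = len(x[0][0])
    assert H % block == 0 and W % block == 0, "grid must tile into blocks"
    out = [[None] * W for _ in range(H)]
    for bi in range(0, H, block):
        for bj in range(0, W, block):
            # Keys/values: the block plus its halo, clipped to the image.
            keys = [
                x[i][j]
                for i in range(max(0, bi - halo), min(H, bi + block + halo))
                for j in range(max(0, bj - halo), min(W, bj + block + halo))
            ]
            for i in range(bi, bi + block):
                for j in range(bj, bj + block):
                    q = x[i][j]
                    # Scaled dot-product scores against the local neighbourhood.
                    scores = [
                        sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                        for k in keys
                    ]
                    m = max(scores)  # subtract the max for numerical stability
                    w = [math.exp(s - m) for s in scores]
                    z = sum(w)
                    # Softmax-weighted average of the neighbourhood values.
                    out[i][j] = [
                        sum(wk * k[c] for wk, k in zip(w, keys)) / z
                        for c in range(d)
                    ]
    return out


def attention_pairs(H, W, block, halo):
    """Query-key pairs for global attention vs. an upper bound for the
    haloed local variant (every query sees at most (block + 2*halo)^2 keys)."""
    n = H * W
    return n * n, n * (block + 2 * halo) ** 2
```

With the illustrative values `H = W = 64`, `block = 8`, `halo = 3`, `attention_pairs` gives 16,777,216 query-key pairs for global attention against at most 802,816 for the local variant, which is the kind of saving that makes local attention attractive for high-resolution images.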
Paper Citation
in Harvard Style
Mathian E., Liu H., Fernandez-Cuesta L., Samaras D., Foll M. and Chen L. (2023). HaloAE: A Local Transformer Auto-Encoder for Anomaly Detection and Localization Based on HaloNet. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP; ISBN 978-989-758-634-7, SciTePress, pages 325-337. DOI: 10.5220/0011865900003417
in Bibtex Style
@conference{visapp23,
author={Emilie Mathian and Huidong Liu and Lynnette Fernandez-Cuesta and Dimitris Samaras and Matthieu Foll and Liming Chen},
title={HaloAE: A Local Transformer Auto-Encoder for Anomaly Detection and Localization Based on HaloNet},
booktitle={Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP},
year={2023},
pages={325-337},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011865900003417},
isbn={978-989-758-634-7},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP
TI - HaloAE: A Local Transformer Auto-Encoder for Anomaly Detection and Localization Based on HaloNet
SN - 978-989-758-634-7
AU - Mathian E.
AU - Liu H.
AU - Fernandez-Cuesta L.
AU - Samaras D.
AU - Foll M.
AU - Chen L.
PY - 2023
SP - 325
EP - 337
DO - 10.5220/0011865900003417
PB - SciTePress