6 CONCLUSION AND OUTLOOK
In this paper, a multi-head attention variational autoencoder (MA-VAE) for anomaly detection in automotive testing is proposed. It not only features an attention configuration that avoids the bypass phenomenon but also introduces a novel method of remapping windows to whole sequences. A number of experiments are conducted to demonstrate the anomaly detection performance of the model, as well as to underline the benefits of the key aspects introduced with it.
From the results obtained, MA-VAE clearly benefits from the MA mechanism, indicating that the bypass phenomenon is avoided. Moreover, the proposed approach requires only a small training/validation subset, although such a subset is not sufficient to obtain a suitable threshold, since increasing the subset size improves only the calibrated anomaly detection performance. Training with different seeds is also shown to have little impact on the anomaly detection metrics, provided the threshold is chosen suitably, further underlining the previous point. Furthermore, mean-type reverse windowing fails to significantly outperform its first-type and last-type counterparts, while introducing additional lag when applied to online anomaly detection; the three variants are sketched below. Lastly, the hyperparameter optimisation revealed that the MA-VAE variant with the largest latent dimension and attention key dimension achieves the best anomaly detection performance: when it flags an anomaly, it is wrong only 9% of the time (a precision of 91%), and it discovers 67% of the anomalies present in the test data set (a recall of 67%). It also outperforms all competing models it is compared with.
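Because the model scores fixed-length sliding windows, the overlapping per-time-step outputs must be remapped to a single sequence before thresholding. The following minimal NumPy sketch illustrates the three reverse-windowing variants compared above; the function name, the stride-1 assumption, and the (n_windows, window_len) array layout are illustrative assumptions rather than the paper's implementation.

    import numpy as np

    def reverse_window(windows: np.ndarray, kind: str = "mean") -> np.ndarray:
        """Remap stride-1 sliding windows back to one sequence.

        windows: array of shape (n_windows, window_len) holding per-time-step
        reconstructions or anomaly scores; the reconstructed sequence has
        length n_windows + window_len - 1.
        """
        n, w = windows.shape
        seq_len = n + w - 1
        if kind == "first":
            # Keep the value from the earliest window covering each time step.
            seq = np.empty(seq_len)
            seq[:w] = windows[0]        # first window covers steps 0..w-1
            seq[w:] = windows[1:, -1]   # later steps first appear as a window's last element
            return seq
        if kind == "last":
            # Keep the value from the latest window covering each time step.
            seq = np.empty(seq_len)
            seq[:n] = windows[:, 0]     # steps 0..n-1 last appear as a window's first element
            seq[n:] = windows[-1, 1:]   # the tail is covered only by the final window
            return seq
        # "mean": average every window value that covers a given time step.
        sums = np.zeros(seq_len)
        counts = np.zeros(seq_len)
        for i in range(n):
            sums[i : i + w] += windows[i]
            counts[i : i + w] += 1
        return sums / counts

The sketch also makes the lag argument concrete: under the mean variant, the score for a time step is only final once the last window covering it has arrived, i.e. up to window_len - 1 steps later, whereas the last variant finalises a score as soon as the newest window is processed.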
In the future, a method of threshold choice involving active learning will be investigated, which uses user feedback to home in on a better threshold. MA-VAE is also set to be tested in the context of online anomaly detection, i.e. during the driving cycle measurement.