Our results have shown that multivariate time series classification of vehicle development projects is feasible. Even with a small number of training samples and a comparatively high number of features, an F1 score of 85.7% (at 78% accuracy) was achieved. Considering the class distribution, this is a promising result. By dividing the time series into three periods, these results were improved considerably, reaching an F1 score of 98% (at 96.8% accuracy).
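To make this procedure more tangible, the following sketch illustrates on synthetic data how a multivariate project time series could be split into three periods, summarised into per-period features and scored with accuracy and F1. It assumes a scikit-learn toolchain and a data layout of (projects, time steps, features); it is an illustration only, not the implementation used in this work.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Assumed layout: (n_projects, n_timesteps, n_features); binary project-type labels.
X_series = rng.normal(size=(60, 90, 12))
y = rng.integers(0, 2, size=60)

def period_features(series, n_periods=3):
    # Split the time axis into equal periods; summarise each period by per-feature mean and std.
    chunks = np.array_split(series, n_periods, axis=0)
    return np.concatenate([np.r_[c.mean(axis=0), c.std(axis=0)] for c in chunks])

X = np.stack([period_features(s) for s in X_series])

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(f"accuracy={accuracy_score(y_test, y_pred):.3f}  F1={f1_score(y_test, y_pred):.3f}")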
Ensemble methods such as AdaBoost and Random Forest stood out in particular. Together with decision trees, these two methods not only proved highly applicable to the given problem but also outperformed the neural networks (likely due to a lack of training data). In addition, these white-box models offer the advantages of transparency, interpretability and lower computing time. Since the project type can thus be assigned reliably, we consider our hypothesis confirmed.
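As an illustration of such a comparison, the following sketch contrasts the tree-based and ensemble classifiers named above with a small neural-network baseline under cross-validation. The feature matrix is synthetic and the scikit-learn models are assumptions chosen for illustration; the sketch does not reproduce our experimental setup.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 72))    # placeholder feature matrix (e.g. per-period summaries)
y = rng.integers(0, 2, size=60)  # placeholder binary project-type labels

models = {
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    # Feature scaling mainly matters for the neural-network baseline.
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name:>12}: mean F1 over 5 folds = {scores.mean():.3f}")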
For similar problems, we therefore recommend the use of ensemble methods, taking into account the classification results, the implementation effort and the computing time. However, it can be assumed that the performance of the neural networks will improve as the number of training samples increases. Future work will therefore include adding further training samples to the dataset. Furthermore, to obtain a complete picture, the approach presented in this paper should be complemented by a comparative evaluation against other classification methods, focusing on optimised neural networks (e.g. FCN, CNN, LSTM) and ensemble methods (e.g. HIVE-COTE). We will also consider different fold sets in our training and testing.
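The use of different fold sets could, for instance, be realised with repeated stratified cross-validation, as outlined in the following sketch; the data and the scikit-learn setup are placeholders and do not stem from our experiments.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 72))    # placeholder feature matrix
y = rng.integers(0, 2, size=60)  # placeholder binary labels

# 5 folds, repeated 10 times with different shuffles -> 50 distinct train/test partitions.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv, scoring="f1")
print(f"F1 over {len(scores)} folds: {scores.mean():.3f} +/- {scores.std():.3f}")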
In our future work, we will also carry out a detailed analysis to better understand feature importance. In order to address the curse of dimensionality, the relevance of the individual features will be determined, compared and evaluated for each of the respective project phases. Finally, the implementation of prediction models is planned, enabling the project's progression to be predicted from any point in time within the project.
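A possible starting point for the phase-dependent feature analysis mentioned above is sketched below, using random-forest impurity importances per project phase; the per-phase matrices and the model choice are illustrative assumptions rather than part of the presented approach.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_projects, n_features = 60, 12
y = rng.integers(0, 2, size=n_projects)  # placeholder binary project-type labels

# Placeholder per-phase feature matrices: one (n_projects, n_features) block per phase.
phases = {f"phase_{i + 1}": rng.normal(size=(n_projects, n_features)) for i in range(3)}

for name, X_phase in phases.items():
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_phase, y)
    top = np.argsort(forest.feature_importances_)[::-1][:3]
    print(f"{name}: most relevant feature indices = {top.tolist()}")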
ACKNOWLEDGEMENTS
We would like to thank all reviewers for their
valuable comments.
REFERENCES
Atzori, M., Cognolato, M. and Müller, H. 2016. Deep
Learning with Convolutional Neural Networks Applied
to Electromyography Data: A Resource for the
Classification of Movements for Prosthetic Hands.
Frontiers in Neurorobotics, 10
Bagnall, A., Lines, J., Bostrom, A., Large, J. and Keogh, E.
2017. The great time series classification bake off: a
review and experimental evaluation of recent
algorithmic advances. Data Mining and Knowledge
Discovery 31(3), pp 606–660
Bagnall, A., Lines, J., Hills, J. and Bostrom, A. 2016. Time-
series classification with COTE: the collective of
transformation-based ensembles. In: International
conference on data engineering, pp 1548–1549
Bahdanau, D., Cho, K. and Bengio, Y. 2015. Neural
machine translation by jointly learning to align and
translate. In: International conference on learning
representations
Bai, S., Kolter, J.Z. and Koltun, V. 2018. An Empirical
Evaluation of Generic Convolutional and Recurrent
Networks for Sequence Modeling, arXiv,
https://arxiv.org/pdf/1803.01271.pdf
Baydogan, M.G., Runger, G. and Tuv, E. 2013. A bag-of-
features framework to classify time series. IEEE Trans
Pattern Anal Mach Intell 35(11), pp 2796–2802
Boehme, O. and Meisen, T. 2021. Predicting the Progress
of Vehicle Development Projects – an Approach for the
Identification of Input Features. In: 13th International
Conference on Agents and Artificial Intelligence
(ICAART 2021)
Bostrom, A. and Bagnall, A. 2015. Binary shapelet
transform for multiclass time series classification. In:
Big data analytics and knowledge discovery, pp 257–
269
Cui, Z., Chen, W. and Chen, Y. 2016. Multi-Scale
Convolutional Neural Networks for Time Series
Classification, arXiv
Esling, P. and Agon, C. 2012. Time-series data mining.
ACM Comput Surv 45(1), pp 12:1–12:34
Goldberg, Y. 2016. A primer on neural network models for
natural language processing. J Artif Intell Res 57(1), pp
345–420
Karim, F., Majumdar, S., Darabi, H. and Harford, S. 2019. Multivariate LSTM-FCNs for time series classification. Neural Networks, 116, pp 237–245
Kate, R.J. 2016. Using dynamic time warping distances as
features for improved time series classification. Data
Min Knowl Discov 30(2), pp 283–312
Khosla, R., Howlett, R.J. and Jain, L.C. 2005. Knowledge-Based Intelligent Information and Engineering Systems: 9th International Conference, KES 2005, Melbourne, Australia, September 2005, Proceedings, Part IV, p 3
Le, Q. and Mikolov, T. 2014. Distributed representations of
sentences and documents. In: International conference
on machine learning, vol 32, pp II–1188–II–1196
Lines, J., Taylor, S. and Bagnall, A. 2016. HIVE-COTE:
the hierarchical vote collective of transformation-based