Our results have shown that multivariate time series classification of vehicle development projects is feasible. Even with a small number of training samples and a comparatively high number of features, an F1 score of 85.7% (at 78% accuracy) was achieved. Considering the class distribution, this is a promising result. By dividing the time series into three periods, these results were improved considerably, reaching an F1 score of 98% (at 96.8% accuracy).
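To make this procedure more tangible, the following sketch illustrates on synthetic data how a multivariate project time series could be split into three periods, summarised into per-period features and scored with accuracy and F1. It assumes a scikit-learn toolchain and a data layout of (projects, time steps, features); it is an illustration only, not the implementation used in this work.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Assumed layout: (n_projects, n_timesteps, n_features); binary project-type labels.
X_series = rng.normal(size=(60, 90, 12))
y = rng.integers(0, 2, size=60)

def period_features(series, n_periods=3):
    # Split the time axis into equal periods; summarise each period by per-feature mean and std.
    chunks = np.array_split(series, n_periods, axis=0)
    return np.concatenate([np.r_[c.mean(axis=0), c.std(axis=0)] for c in chunks])

X = np.stack([period_features(s) for s in X_series])

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(f"accuracy={accuracy_score(y_test, y_pred):.3f}  F1={f1_score(y_test, y_pred):.3f}")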
Ensemble methods such as AdaBoost and Random Forest stood out in particular. Together with decision trees, these two methods not only proved highly applicable to the given problem but also outperformed the neural networks (likely due to a lack of training data). In addition, these white-box models offer the advantages of transparency, interpretability and lower computing time. Since the project type can thus be assigned reliably, we consider our hypothesis confirmed.
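As an illustration of such a comparison, the following sketch contrasts the tree-based and ensemble classifiers named above with a small neural-network baseline under cross-validation. The feature matrix is synthetic and the scikit-learn models are assumptions chosen for illustration; the sketch does not reproduce our experimental setup.

import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 72))    # placeholder feature matrix (e.g. per-period summaries)
y = rng.integers(0, 2, size=60)  # placeholder binary project-type labels

models = {
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "RandomForest": RandomForestClassifier(n_estimators=200, random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    # Feature scaling mainly matters for the neural-network baseline.
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name:>12}: mean F1 over 5 folds = {scores.mean():.3f}")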
For similar problems, we therefore recommend the use of ensemble methods, taking into account the classification results, the implementation effort and the computing time. However, it can be assumed that the performance of the neural networks will improve as the number of training samples increases. Future work will therefore include adding further training samples to the dataset. Furthermore, to obtain a complete picture, the approach presented in this paper should be complemented by a comparative evaluation against other classification methods, focusing on optimised neural networks (e.g. FCN, CNN, LSTM) and ensemble methods (e.g. HIVE-COTE). We will also consider different fold sets in our training and testing.
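The use of different fold sets could, for instance, be realised with repeated stratified cross-validation, as outlined in the following sketch; the data and the scikit-learn setup are placeholders and do not stem from our experiments.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 72))    # placeholder feature matrix
y = rng.integers(0, 2, size=60)  # placeholder binary labels

# 5 folds, repeated 10 times with different shuffles -> 50 distinct train/test partitions.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=cv, scoring="f1")
print(f"F1 over {len(scores)} folds: {scores.mean():.3f} +/- {scores.std():.3f}")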
In our future work, we will also carry out a detailed analysis to better understand feature importance. In order to address the curse of dimensionality, the relevance of the individual features will be determined, compared and evaluated for each of the respective project phases. Finally, the implementation of prediction models is planned, enabling the project's progression to be predicted from any point in time within the project.
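A possible starting point for the phase-dependent feature analysis mentioned above is sketched below, using random-forest impurity importances per project phase; the per-phase matrices and the model choice are illustrative assumptions rather than part of the presented approach.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_projects, n_features = 60, 12
y = rng.integers(0, 2, size=n_projects)  # placeholder binary project-type labels

# Placeholder per-phase feature matrices: one (n_projects, n_features) block per phase.
phases = {f"phase_{i + 1}": rng.normal(size=(n_projects, n_features)) for i in range(3)}

for name, X_phase in phases.items():
    forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_phase, y)
    top = np.argsort(forest.feature_importances_)[::-1][:3]
    print(f"{name}: most relevant feature indices = {top.tolist()}")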
ACKNOWLEDGEMENTS
We would like to thank all reviewers for their
valuable comments.
REFERENCES
Atzori, M., Cognolato, M. and Müller, H. 2016. Deep
Learning with Convolutional Neural Networks Applied
to Electromyography Data: A Resource for the
Classification of Movements for Prosthetic Hands.
Frontiers in Neurorobotics, 10
Bagnall, A., Lines, J., Bostrom, A., Large, J. and Keogh, E.
2017. The great time series classification bake off: a
review and experimental evaluation of recent
algorithmic advances. Data Mining and Knowledge
Discovery 31(3), pp 606–660
Bagnall, A., Lines, J., Hills, J. and Bostrom, A. 2016. Time-
series classification with COTE: the collective of
transformation-based ensembles. In: International
conference on data engineering, pp 1548–1549
Bahdanau, D., Cho, K. and Bengio, Y. 2015. Neural
machine translation by jointly learning to align and
translate. In: International conference on learning
representations
Bai, S., Kolter, J.Z. and Koltun, V. 2018. An Empirical
Evaluation of Generic Convolutional and Recurrent
Networks for Sequence Modeling, arXiv,
https://arxiv.org/pdf/1803.01271.pdf
Baydogan, M.G., Runger, G. and Tuv, E. 2013. A bag-of-
features framework to classify time series. IEEE Trans
Pattern Anal Mach Intell 35(11), pp 2796–2802
Boehme, O. and Meisen, T. 2021. Predicting the Progress
of Vehicle Development Projects – an Approach for the
Identification of Input Features. In: 13th International
Conference on Agents and Artificial Intelligence
(ICAART 2021)
Bostrom, A. and Bagnall, A. 2015. Binary shapelet
transform for multiclass time series classification. In:
Big data analytics and knowledge discovery, pp 257–
269
Cui, Z., Chen, W. and Chen, Y. 2016. Multi-Scale
Convolutional Neural Networks for Time Series
Classification, arXiv
Esling, P. and Agon, C. 2012. Time-series data mining.
ACM Comput Surv 45(1), pp 12:1–12:34
Goldberg, Y. 2016. A primer on neural network models for
natural language processing. J Artif Intell Res 57(1), pp
345–420
Karim, F., Majumdar, S., Darabi, H. and Harford, S. 2019. Multivariate LSTM-FCNs for time series classification. Neural Networks, 116, pp 237–245
Kate, R.J. 2016. Using dynamic time warping distances as
features for improved time series classification. Data
Min Knowl Discov 30(2), pp 283–312
Khosla, R., Howlett, R.J. and Jain, L.C. 2005. Knowledge-Based Intelligent Information and Engineering Systems: 9th International Conference, KES 2005, Melbourne, Australia, September 2005, Proceedings, Part IV, p 3
Le, Q. and Mikolov, T. 2014. Distributed representations of
sentences and documents. In: International conference
on machine learning, vol 32, pp II–1188–II–1196
Lines, J., Taylor, S. and Bagnall, A. 2016. HIVE-COTE:
the hierarchical vote collective of transformation-based