methodological approach to accomplish an ML-based
project. Section 4 explains the PdM techniques. Sec-
tion 5 discusses the applications of ML methods for
PdM. Section 6 concludes the paper and points out
the future research directions.
2 RELATED WORK
This section concentrates on previous work on
ML-based PdM for smart manufacturing. A
state-of-the-art review of ML techniques for PdM is
presented by (Carvalho et al., 2019). An end-to-end
ML-based predictive maintenance approach for
manufacturing is provided by
(Ayvaz and Alpay, 2021). The proposed system is
scalable and effective for high-dimensional stream-
ing data. The system is also implemented in a real
manufacturing factory with success. Further, (Ouadah
et al., 2022) described the process of selecting the
most suitable supervised ML methods for PdM. Sim-
ilarly, (Hosamo et al., 2022) used supervised ML
techniques to forecast the equipment’s state in order
to plan maintenance in advance. In addition, various
supervised ML algorithms such as logistic regression,
neural networks, support vector machines,
decision trees and k-nearest neighbors were applied
to predict costly production line disruptions (Iftikhar
et al., 2019). The accuracy of the proposed ML models
was tested on a real-world data set with promising
results. Furthermore, (Garan et al., 2022) mentioned
the benefits of a data-centric ML methodology
for predicting RUL. A supervised-learning-based
model that predicts failure within a fixed time
period (at least 4 hours in advance) is presented by
(Herrero and Zorrilla, 2022). PdM for aircraft engines
has been studied by (Azyus and Wijaya, 2022) using
both classification and regression techniques. Like-
wise, the work by (Schwendemann et al., 2021) pro-
vided an overview of the most important approaches
for bearing-fault analysis, first based on classification
to detect the unhealthy condition, position and sever-
ity of the fault, later based on regression to predict the
RUL.
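The classification and regression framings mentioned above differ mainly in how the training target is built from the same failure record. The following is a minimal sketch, assuming one known failure time per machine and timestamped sensor samples; the function name label_samples and the example timestamps are hypothetical, the 4-hour horizon simply mirrors the one quoted above, and the code is not taken from any of the cited works.

```python
from datetime import datetime, timedelta


def label_samples(sample_times, failure_time, horizon=timedelta(hours=4)):
    """Build two targets for each sample recorded before a known failure.

    Returns a list of (fails_within_horizon, rul_hours) tuples:
    - fails_within_horizon: binary classification target
    - rul_hours: regression target (remaining useful life in hours)
    """
    labels = []
    for t in sample_times:
        if t > failure_time:
            raise ValueError("sample recorded after the failure")
        rul = failure_time - t
        rul_hours = rul.total_seconds() / 3600.0
        labels.append((rul <= horizon, rul_hours))
    return labels


# Hypothetical example: samples taken 10, 5, 4 and 1 hours before a failure.
failure = datetime(2022, 1, 1, 12, 0)
samples = [failure - timedelta(hours=h) for h in (10, 5, 4, 1)]
labels = label_samples(samples, failure)
```

With the binary target a classifier answers "will the machine fail within the horizon?", whereas the hour-valued target lets a regressor estimate the RUL directly.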
Moreover, an ML-based PdM system for the manufacturing
industry is developed by (Arena et al., 2022)
to estimate the RUL based on ensemble models. A
feature selection strategy for unsupervised learning
is presented by (Yang et al., 2011). This work sug-
gested that fewer features could help to maximize the
performance of unsupervised learning models. (Kre-
mer et al., 2021) applied a deep learning method
for anomaly detection. Additionally, ensemble-based
prediction models are implemented using supervised
learning by (Rousopoulou et al., 2020) and unsupervised
learning by (Iftikhar et al., 2020). Finally,
a structured and comprehensive survey provided an
overview of the anomaly detection techniques (Chan-
dola et al., 2009). The work presented in this pa-
per considers a number of the recommendations pre-
sented in (Chandola et al., 2009).
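As a concrete illustration of anomaly detection on unlabeled data, the sketch below flags point anomalies in a single sensor signal with a modified z-score based on the median and the median absolute deviation (MAD). This is one simple, commonly used technique chosen here for illustration only; it is not the method of any work cited above, and the function name mad_anomalies, the threshold of 3.5 and the readings are assumptions.

```python
from statistics import median


def mad_anomalies(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation from the median.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # Degenerate case: at least half of the values are identical,
        # so the score is undefined; report nothing in this sketch.
        return []
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical temperature readings with one obvious outlier.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 35.0, 20.2]
anomalies = mad_anomalies(readings)
```

No labels are needed: the detector only assumes that normal readings cluster around the median, which is why this kind of approach suits the unlabeled case.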
The focus of the previous works is on various as-
pects and recent advancements of PdM using ML.
Most of these works focus on selecting ML models
for PdM and comparing their performance. On the
other hand, the work presented in this paper emphasises
the practical issues in relation to PdM. In addition,
it covers most of the scenarios with respect to
PdM based on both labeled and unlabeled data.
3 METHODOLOGY
The development methodology used in this paper
is based on the data science workflow CRoss Industry
Standard Process for Data Mining (CRISP-DM)¹.
CRISP-DM is a robust, proven and widely
used methodology for planning, organizing and
implementing ML projects. CRISP-DM consists of the
following six phases: business understanding, data
understanding, data preparation, modeling, evalua-
tion, and deployment.
• Business understanding: One of the major flaws
in ML-based PdM projects is starting with
data gathering and model building rather than
business understanding. Different areas of interest
have different concerns and expectations. First,
the business objectives/goals should be defined,
followed by the use cases that accomplish the defined
goals, along with the tools/technologies that are
required to fulfill these objectives.
• Data understanding: Once the business cases are
developed, the next step is to collect and understand
data. At this stage there are two common
scenarios: either the data can be/has been collected
by using existing sensors, or there is a need
to set up new/additional sensors to collect the data
that is required to fulfill the requirements of the
use case(s). In the first scenario, an ML model
is selected to best suit the data at hand,
whereas in the second scenario the right data needs to
be collected based on a pre-planned ML model.
The most important question to answer at this
stage is “can already/potentially available data be
used to achieve the defined business goals?”. To
gain insight into the acquired data, exploratory
¹ https://thinkinsights.net/digital/crisp-dm
IN4PL 2022 - 3rd International Conference on Innovative Intelligent Industrial Production and Logistics