Slag Removal Path Estimation by Slag Distribution and Deep

Learning

Junesuk Lee

, Geon-Tae Ahn

, Byoung-Ju Yun

and Soon-Yong Park

School of Electronics Engineering, Kyungpook National University, Daegu, South Korea

Research Institute of Industrial Science and Technology, Pohang, South Korea

Keywords: Path Estimation, Deep Learning, Intelligent Robots, Industrial Robots.

Abstract: In the steel manufacturing process, de-slagging machine is used to remove slag floating on molten metal in a

ladle. In general, temperature of floating slag on the surface of the molten metal is above 1,500℃. The process

of removing such slag at high temperatures is dangerous and is only performed by trained human operators.

In this paper, we propose a deep learning method for estimating the slag removal path to automate slag

removal task. We propose an idea of developing a slag distribution image structure(SDIS); combined with a

deep learning model to estimate the removal path in an environment in which the flow of molten metal cannot

be controlled. The SDIS is given as the input into to the proposed deep learning model, which we train by

imitating the removal task of experienced operators. We use both quantitative and qualitative analyses to

evaluate the accuracy of the proposed method with the experienced operators.

1 INTRODUCTION

Recently, in the field of intelligent robotics, artificial

intelligence technology has significant research

topics (Kim. J. et al., 2018), (Cauli et al., 2018). An

intelligent robot is a robot that can recognize the

external environment and judge the situation by itself

to operate autonomously. Intelligent robots are

mainly divided into industrial robots and service

robots. Industrial robots perform dangerous,

hazardous, or simple repetitive tasks on behalf of

humans. These robots offer significant benefits in the

manufacturing industry: labor cost, productivity, and

quality. In particular, industrial robots improve the

working environment for humans. In industrial

environments, people are always threatened by

polluted air, dangerous materials, and so on.

Introducing robots into industrial environments can

consequently reduce the severity of most dangerous

effects to humans.

In steel company, various by-products/residues,

such as slag, dust, and sludge are produced when steel

is produced. Molten iron is located in the lower part

of the furnace, and by-products drift on it. To produce

https://orcid.org/0000-0002-1702-6720

https://orcid.org/0000-0002-9898-2262

https://orcid.org/0000-0001-5090-9667

Figure 1: System overview.

high quality steel, such residue must be removed from

the furnace. In general steel company, slag removal is

performed using a skimmer as shown in Figure 1.

Since steel production has dangerous working

environment, there is a high probability of accidents

involving human operators. This is because the

molten metal has a temperature of about 1,500℃ or

higher and the view of the metal is covered by some

dust. To protect operators getting exposed to such

dangerous conditions, automated robots capable of

performing human tasks are required.

In this paper, we propose an accurate slag

removal path estimation method which can easily be

embedded into any automated system. The system

overview of the proposed method is briefly shown in

Ladle Skimmer

Slag

Slag removal

Path estimation

Input image Slag distribution image Removal path estimation

246

Lee, J., Ahn, G., Yun, B. and Park, S.

Slag Removal Path Estimation by Slag Distribution and Deep Learning.

DOI: 10.5220/0008944602460252

In Proceedings of the 15th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2020) - Volume 4: VISAPP, pages

246-252

ISBN: 978-989-758-402-2; ISSN: 2184-4321

Figure 1, where the skimmer moves from top to

bottom along the ladle. The overview of the proposed

system has two distinctive stages: an image

transformation stage which contains slag distribution

information, and a slag removal path estimation stage

based on deep learning. The main contributions of

this paper are three-fold:

1. propose a learning model that mimics the slag

removal path of a skilled human operators to

estimate its removal path.

2. create a slag distribution image structure that

contains slag distribution information from color

images.

3. estimate the slag removal path in real-time, and

the experimental results verify the performance of

the proposed method.

The structure of this paper is organized as

follows: First, the related work is discussed in Section

2. Section 3 describes the creation of the proposed

slag distribution image and how training data can be

extracted. Section 4 describes the structure and post-

processing of the deep learning model used to predict

the slag removal path. Section 5 discusses about the

working environment and quantitative and qualitative

accuracy analyses through a few experimental results.

Finally, we conclude in Section 6.

2 RELATED WORK

In this section, we briefly discuss previous studies

which are related to automated robots of desk

cleaning, rock excavation, slag removal, and route

prediction. (Kim. J. et al., 2018) proposed a desk

cleaning technique using the iCub humanoid Robot

for cleaning graffiti and lentils from a desk. For a

robot to clean the top of a desk automatically, it must

recognize the material on the desk and estimate the

path to clean it. In this study, a human instructor

teaches the robot how to perform cleaning tasks. Task

Parametrized Gaussian Mixture Model (TP-GMM) is

used to encode the demo variables and to properly

generalize the features. However, the

parameterization of TP-GMM is very difficult

because it requires partitioning and extracting

complex images of small tables. Therefore, while the

instructor demonstrates the cleaning task, a trained

deep neural network is used to extract parameters

from the robot camera image.

(Fukui et al., 2015) discussed about an Automated

Ore Excavator. To carry out autonomous excavation

of rocks, it is necessary to recognize the state of the

fragmented rock piles and plan the appropriate

excavation operation accordingly. They proposed an

imitation-based motion planning method and

developed a rock pile condition recognizer with an

excavation motion planner. To verify the proposed

method, they developed a 1/10 scale excavation

model and conducted excavation experiments.

Experimental results showed that rock piles could be

distinguished according to surface shape and particle

distribution, where the number and the variety of

training data proved important for realizing high

productivity excavation.

(Kim. J. S. et al., 2018) conducted a study to

remove slag using a de-slagging machine. In general,

de-slagging machines can only be controlled by

trained professionals. In their research, they proposed

a method for estimating the slag removal path

automatically using CNN. They trained their network

by extracting block regions based on the actions of an

experienced specialist. They performed backtracking

and curve fitting to properly estimate the removal

path and compared with the path of the experienced

expert.

(Minoura et al., 2018) proposed a path prediction

method that takes target object attributes and physical

environment information into account. Previous path

prediction methods using deep learning architecture

took into account the physical environment of a single

target, such as a pedestrian. However, they proposed

a route prediction method that could consider

multiple target types. The method represents the

attributes as one-hot vectors and encodes the physical

attributes through convolutional layers. Furthermore,

we used relative coordinates as the past motion

history of prediction targets. They verified the

proposed method using the Stanford drone dataset.

TRAINING DATA ACQUISITION

3.1 Slag Distribution Image Generation

In slag removal task, an area with high slag

distribution is removed first. The reason is that high

slag distribution means that the slag is concentrated

in the area. By removing dense slag areas, it results in

efficient removal task.

In this section, we discuss the design architecture

of the slag distribution image which is proposed to

train our deep learning network (Figure 2). It consists

of 3 channels: grayscale, morphology, and distance

transform images. In addition, it also contains

information that can be used to distinguish the inside

and outside of the ladle. In our method, we determine

that the slag exists only inside the inner part of the

Slag Removal Path Estimation by Slag Distribution and Deep Learning

247

ladle and the skimmer is moved only within this area.

We also generate a binary ladle image, which consists

of zero and one as shown in Figure 3. The inside of

the ladle is labeled as 1 and the outside is 0.

Figure 2: Slag distribution image.

Figure 3: binary ladle image.

Figure 4 shows how the slag distribution image is

created and how it is converted from an input color

image. First, we convert the input RGB image into

grayscale (Figure 4(a)) and perform the Hadamard

multiplication using the binary ladle image. This

operation results in zero value outside the ladle area

in the converted grayscale image (Figure 4(b)). The

operator ‘◦’ in Figure 4 represents the Hadamard

multiplication operation.

After getting the inverse binary image from the

converted grayscale image, we apply the opening

morphology operation (erosion and dilation) as

shown in Figure 4(d). We used a 3x3 kernel for

erosion and a 9x9 kernel for dilation, respectively.

This morphology operation not only removes small

slag chunks but also fills holes in large chunks. Next,

we perform the Hadamard multiplication on Figure

4(d) with the binary ladle image. Next, we multiply

the figure 4(e) by 0.2 scale and add the output of the

function 𝐺(ladle binary image). The final result of the

morphology operation is shown in Figure 4(f).

Function 𝐺



𝑥



is the same as Equation 1. Operation

symbol ⊕ means bit XOR operation.

𝐺



𝑥



=𝑥⊕𝑥

 ∗ 255

(1)

To generate the distance transform image, we first

apply the distance transform algorithm (Felzenszwalb

and Huttenlocher, 2012) on Figure 4(d). The

Hadamard multiplication is applied on Figure 4(g) to

distinguish between inside and outside of the ladle.

Finally, we merge the grayscale, morphology, and

distance transform images to create the slag

distribution image as shown in Figure 4(i). The three

Figure 4: Pipeline for constructing the proposed slag distribution image.

Ori

inal ima

Gra

ima

Mor

holo

ima

Distance transform image

Sla

∘ Binary ladle image

Merge

∘ Binary ladle image

Original image

Inverse binarization Morphology operation

Distance transform

∘ Binary ladle image

Morphology image

Distance transform image

Slag distribution image

×0.2𝐺



Binary ladle image



Convert grayscale image

Grayscale image

(a) (b)

(g) (h)

(i)

Operation symbol ∘ : Hadamard multiplication

VISAPP 2020 - 15th International Conference on Computer Vision Theory and Applications

248

separate channels in this slag distribution image

resemble the BGR channels in a general color image.

3.2 Proposed Training Data Format

and Data Acquisition Method

In this paper, we use a learning-based algorithm to

solve the slag removal path estimation problem. In the

proposed method, we equally divide the height (y-

coordinate) of the slag distribution image into 24

fixed intervals as shown in Figure 5. We perform the

learning and estimating processes only along the x-

coordinate. Fixing the y-coordinate reduced the total

computational cost and increased the efficiency of the

path estimation problem.

Figure 5: Proposed training data format.

To train our network, we collect training data by

imitating the removal path of a skilled human

operator. An example of a collected dataset is shown

in Figure 6. The left side shows a recorded removal

path, the middle is the slag distribution image, and the

right is a 2D slag removal path-vector that stores the

coordinates of the recorded control points. We use the

slag distribution image and the 2D path vector to train

our deep learning model. After training our network,

we use only the slag distribution image in the test

phase. Some examples of manual path recording are

shown in Figure 7, which will be used in later

sections.

(a) (b) (c)

Figure 6: Training data, (a) Removal path visualization

image (b) Training image (c) Removal path vector.

Figure 7: Acquisition of human operator’s removal path.

The slag removal path-vector 𝑃



𝑥,𝑦



can be

defined mathematically as in Equation (2). In this

equation, 𝑤 and ℎ represent the width and height of

the training image. 𝑥, 𝑦, and 𝑖 are of an integer data

type. If slag removal is not need in the 𝑖-th vector

position, we set its x-coordinate to -1. The green

points in Figure 7 are displayed only if this condition

is satisfied.

𝑃



𝑥,𝑦



=



𝑥,𝑦





0≤𝑥<𝑤,

𝑦=

ℎ

∗𝑖,

0≤𝑖<24



(2)

4 SLAG REMOVAL PATH

ESTIMATION

4.1 Deep Network Architecture

The network structure proposed in this research is

shown in Figure 8. We use the ResNet50 model's Skip

connection (He et al., 2016) to minimize gradient

vanishing problem. When training the network, we

use both the slag distribution image and x-coordinates

of the 2D vector as the inputs. Our network model

outputs the respective x-coordinate value

corresponding to each y-coordinate. The input image

size is 288x192x3, and the output of the trained

network model is 24x1x1, which means the 24 x-

coordinate values. We use a total of 25 layers to create

the proposed network. All convolution layers in the

proposed network use elu-activation (Clevert et al.,

2015) and zero-padding. A key point of the proposed

network is that the filter size of all convolution is 5x5.

In general, the object recognition problem using deep

learning uses a convolution filter size of 3x3.

Slag Removal Path Estimation by Slag Distribution and Deep Learning

249

However, since it is important to recognize the

distribution of slag, we use a large filter size to

observe a larger area.

Figure 8: Proposed network model for slag removal path

estimation.

4.2 Loss Function

In this paper, we use the loss function as shown in

Equation (3) to train our network. This loss function

consists of two loss-terms. 𝜆 is a constant parameter

for fine-tuning the weight for each term. In this paper,

𝜆



is set to 10 and 𝜆



is set to 1.

𝑙𝑜𝑠𝑠





=𝜆



𝑙𝑜𝑠𝑠



𝜆



𝑙𝑜𝑠𝑠



(3)

𝑙𝑜𝑠𝑠



=

‖

𝑣





𝑡



𝑥



‖







(4)

𝑣





1, 𝑖𝑓 𝑡



1

0, 𝑒𝑙𝑠𝑒

(5)

𝑙𝑜𝑠𝑠



is an expression for finding weight values

optimized for training data. In 𝑙𝑜𝑠𝑠



, 𝑀 means the

number of output elements. The proposed network

outputs 24 x-coordinate values, so we set 𝑀 to 23. 𝑡



means the answer label. 𝑥



means the output of the

network. 𝑣



is a variable that excludes the control

point whose x coordinate point is -1 on the answer

label. 𝑣



is defined as in Equation (5).

𝑙𝑜𝑠𝑠



=

‖

𝑥



2∗𝑥



𝑥



‖







(6)

𝑙𝑜𝑠𝑠



is defined by Equation (6).

𝑙𝑜𝑠𝑠



is an expression that minimizes the

difference in magnitude between the values of listed

𝑥-coordinates. In other words, it is a formula that

minimizes the curvature of the slag removal path.

This equation is necessary because the curvature of

the slag removal path is large when a human operator

removes slag. That is, the formula is to make an

output value 𝑥



that minimizes the difference

between 𝑥



and 𝑥



and the difference between

𝑥



and 𝑥



4.3 Path Estimation

The proposed slag removal path estimation model

yields 24 values for the x-coordinate. These output

values correspond to the 24-equally spaced y-

coordinate values, as described in Section 3.2. As

shown in Figure 7, the number of control points are

sometimes 24 or less depending on the distribution of

slag. To decide the removal path along only in the

image areas covered by slag, output points in the de-

slag area should be removed. In this paper, such

inefficient slag removal path coordinates are

excluded as shown in Figure 9. The exclusion steps

of inefficient removal path coordinates are as follows:

1) In the grayscale channel of the slag distribution

image, we search the intensity values of control

points in the order from the top (𝑘=0) to bottom

( 𝑘=23 ).

2) If the intensity of a control point is less than 𝛾, this

area is considered to be the area where slags exist.

𝛾 is the intensity threshold value for the slag area.

We define ℎ as the lowest index number among

the control points where the slag exists.

3) We extract the ℎ

to 𝑘

as the final slag removal

path.

Figure 9: Exclusion of unnecessary control points.

Removal inefficient control-points

Slag removal path

estimation model

Input image

Initial removal path

Final slag removal path

VISAPP 2020 - 15th International Conference on Computer Vision Theory and Applications

250

5 EXPRIMENTAL RESULTS

We have done experiments using many ladle images

to verify the performance of the proposed slag

removal path estimation method. When performing

these experiments, we used Windows-10 (64 bit) and

an NVIDIA Titan XP graphics card. Also, our

implementation uses the Keras-GPU library. To train

the proposed network model for slag removal path

estimation, we use a large number of training data as

shown in Table 1. We assign a high weight to the

training data count as there is not enough dataset. We

use the Adadelta optimizer method (Zeiler, 2012) to

train the proposed network model. We set the batch

size to 100 and the learning rate to 0.8 in the learning

options.

Table 1: Number of Training, Validation and Test set.

Training set Validation set Test set

1,110 139 139

Figure 10 is a comparison between the estimated

removal paths from the proposed method with those

of experienced operators, which we define as the GT

(Ground Truth).

The experimental results are analysed by both

qualitative and quantitative ways. First, qualitative

analysis is as follows. In Figure 10, (a), (b), and (c)

estimated slag removal paths look similar to GT.

However, the result in Figure 10(d) differs from the

GT, it estimates the best efficient path. Figure 10(e)

shows bad performance compared with GT.

As quantitative analyses, we evaluate the amount

of slag removal in the image. We compare the

estimated removal path with the amount of slag

removed from the GT. Slag removal is measured by

the amount of slag inside the 3x3 kernel at the all the

control points. The measurement method counts the

intensity value of less than 𝛾 in the grayscale channel

of the slag distribution image. We measure the slag

removal of 89 images and compare the proposed

method using Equation (7). The average % of slag

removal of the proposed method is about 77.33%

compared with GT.

𝑆



   

   

× 100

(7)

The processing time of the proposed method is

shown in Table 2. The conversion time from the input

image to the generation of the proposed slag

distribution

image takes 28ms. Slag removal path

(a) (b) (c) (d) (e)

Figure 10: Slag removal path estimation results of various ladle images: Green dots are the result of an experienced

professional (Ground Truth), RED dots are the estimated result of the proposed method.

Slag Removal Path Estimation by Slag Distribution and Deep Learning

251

estimation takes 9ms to process. Therefore, the total

average processing time is 37ms.

Table 2: Processing time.

Convert to Slag

distribution ima

Slag removal path

estimation

28ms

(

35.71 f

)

9ms

(

111.11 f

)

6 CONCLUSIONS

In this paper, we propose an efficient slag removal

path estimation method based on a deep learning

network. We introduce a slag distribution image

structure, which includes 3-channel slag distribution

information. This image is used as the input of the

network, which was trained by slag removal path

information by experienced human operators. We

obtain optimal slag removal path information through

the outputs of the network, and apply post-processing

techniques to remove invalid control points from the

output of the trained model. In experiments, we

visualize that the estimated slag removal path has an

average of 77.33% performance enhancement

compared to the manually recorded path. The total

average processing time is about 37ms, ensuring its

real-time capabilities.

ACKNOWLEDGEMENTS

This work was supported by 'The Cross-Ministry

Giga KOREA Project' grant funded by the Korea

government(MSIT) (No.GK17P0300, Real-time 4D

reconstruction of dynamic objects for ultra-realistic

service)

This research was supported by the National Research

Foundation of Korea (NRF) funded by the Korea

government (MSIT: Ministry of Science and ICT)

((No. 2018M2A8A5083266))

REFERENCES

Kim, J., Cauli, N., Vicente, P., Damas, B., Cavallo, F. and

Santos-Victor, J., 2018, April. “iCub, clean the table!”

A robot learning from demonstration approach using

Deep Neural Networks. In 2018 IEEE International

Conference on Autonomous Robot Systems and

Competitions (ICARSC), pages. 3-9. IEEE.

Cauli, N., Vicente, P., Kim, J., Damas, B., Bernardino, A.,

Cavallo, F. and Santos-Victor, J., 2018. Autonomous

table-cleaning from kinesthetic demonstrations using

Deep Learning. In Joint IEEE International Conference

on Development and Learning (ICDL) and Epigenetic

Robotics (EpiRob). IEEE.

Fukui, R., Niho, T., Nakao, M. and Uetake, M., 2015.

Imitation-based control of automated ore excavator to

utilize human operator knowledge of bedrock condition

estimation and excavating motion selection. In 2015

IEEE/RSJ International Conference on Intelligent

Robots and Systems (IROS), pages. 5910-5916. IEEE.

Kim, J. S., Ahn, G. T., Park. S. Y., 2018. A Block-based

Path Recognition of Slag Removal Using

Convolutional Neural Network. In 2018 International

Conference on Pattern Recognition and Artifical

Intelligence (ICPRAI).

Minoura, H., Hirakawa, T., Yamashita, T. and Fujiyoshi, H.,

2018. Path Predictions using Object Attributes and

Semantic Environment. In 14th International

Conference on Computer Vision Theory and

Applications (VISIGRAPP), pages. 19-26.

Felzenszwalb, P.F. and Huttenlocher, D.P., 2012. Distance

transforms of sampled functions. Theory of computing,

volume 8.1, pages.415-428.

He, K., Zhang, X., Ren, S. and Sun, J., 2016. Deep residual

learning for image recognition. In Proceedings of the

IEEE conference on computer vision and pattern

recognition, pages. 770-778.

Clevert, D.A., Unterthiner, T. and Hochreiter, S., 2015. Fast

and accurate deep network learning by exponential

linear units (elus). arXiv preprint arXiv:1511.07289.

Zeiler, M.D., 2012. ADADELTA: an adaptive learning rate

method. arXiv preprint arXiv:1212.5701.

VISAPP 2020 - 15th International Conference on Computer Vision Theory and Applications

252