An Innovative Automated Robotic System based on Deep Learning
Approach for Recycling Objects
Jaeseok Kim, Olivia Nocentini, Marco Scafuro, Raffaele Limosani, Alessandro Manzi, Paolo Dario
and Filippo Cavallo
The Biorobotics Institute, Sant’Anna School of Advanced Studies, Viale Rinaldo Piaggio, Pontedera, Pisa, Italy
Keywords:
Image Processing, Manipulation, Grasping, Deep Learning, Classification of Materials, Recycling System.
Abstract:
In this paper, an industrial robotic recycling system that is able to grasp objects and sort them according to
their materials is presented. The system architecture is composed of a robot manipulator with a multifunctional
grasping tool, one platform, a depth and an RGB camera. The innovation of this work consists of integrating
image processing, grasping, motion planning and object material classification to create a new automated
recycling system framework. An efficient object recognition approach is presented that uses segmentation
and finds grasping points to properly manipulate objects. A deep learning approach was also used with a
modified LeNet model for waste objects classification, sorting them into two main classes: carton and plastic.
Image processing and classification were integrated with motion planning that is used to move the robot with
optimized trajectories. To evaluate the system, the success rate and the execution time for grasping and object
classification were computed. In addition, the accuracy of the network model was evaluated. A total success
rate of 86.09% and 90% was obtained for carton and plastic samples grasped using suction, while 86.67% and
78.57% using gripper. In addition, a classification accuracy of 96% was reached on the test samples.
1 INTRODUCTION
Nowadays, robotics has been widely developed in various fields (Bostelman et al., 2016). In particular, industrial robotics is rapidly growing to increase productivity, and a new industrial paradigm, Industry 4.0, is emerging (Lu, 2017). The goal of Industry 4.0 is to efficiently reach a higher level of automation. Many industrial robotic applications (Lu, 2017) have been developed, such as the smart city, smart transportation and the smart factory. In particular, in order to build a smart city, several key technologies should be enhanced to improve the quality of human life. One of them is the development of an automated waste recycling system that could improve the quality of human life and protect the environment (Gundupalli et al., 2017).
An automated waste recycling system requires several high-level functionalities, such as object detection, object classification and motion planning. In addition, a complex manipulation scenario is necessary to develop the system. In order to develop a recycling system, all these functionalities should be integrated, and each of them should communicate with the others. However, their development is still an open issue, particularly in industrial environments where the interaction between workers and the robotic system must also be considered.
Figure 1: Robotic recycling system using a multifunctional grasping tool.
In this paper, a system for automatically recycling waste objects was developed. Four major problem domains were considered: 1) object perception, 2) object classification, 3) motion planning with a multifunctional grasping tool and 4) integration of all components to obtain a reliable and acceptable industrial robotic system. The main contributions of this paper are the following:
• An efficient object recognition system is presented that applies clustering and segmentation
of waste objects, using a depth camera. It also
supports grasping point detection and grasp affor-
dance estimation for distinguishing objects using
a multifunctional grasping tool.
• A deep learning approach is introduced with a modified LeNet model to classify the materials of the waste. This model can classify the objects into two main categories: plastic and carton. A dataset for training the model was collected and augmented by applying rotation and illumination changes to the original images.
• Motion planning was applied to generate optimized trajectories for the robot arm's movements and to configure a proper pose of the multifunctional grasping tool. Moreover, motion planning was used to pick and place objects so that waste is recycled in a specific area.
• Object perception, manipulation and object classification were integrated as the main functionalities of a new automated recycling system framework.
In summary, the originality of the proposed sys-
tem is to create an automated recycling system that
not only picks the objects but also classifies them with a multifunctional grasping tool using a deep learning
approach.
2 RELATED WORK
In this section, works related to recycling systems are reviewed, covering three areas of interest: image processing, grasping, and classification of materials.
2.1 Image Processing using Depth
Camera
Depth sensors are widely used to perceive a variety of environments: they measure the distance from the sensor to an object and visualize it using point cloud data (Masuta et al., 2016).
2.1.1 Point Cloud Data and Acquisition
Point cloud data is a collection of data points defined by a given 2D or 3D coordinate system and colour information. Point clouds have become a common technique for image processing because they are easy to visualize and more accurate than traditional image processing techniques (Nurunnabi et al., 2012). Moreover, this method is often the only possible primitive for exploring shapes in higher dimensions (Donoho and Grimes, 2003), (Tenenbaum et al., 2000). Another benefit of this technique is the reduction of computational time; in (Lei et al., 2017), Lei et al. used point cloud data acquired from a 3D camera, justifying their choice as the fastest grasping approach.
Point cloud uses segmentation to process data; this
technique is defined as the process of classifying point
clouds into multiple homogeneous regions and this is
helpful for analyzing the scene in various aspects such
as locating and recognizing objects, classification,
and feature extraction (Nguyen and Le, 2013), (Thi-
lagamani and Moorthi, 2011). In (Vo et al., 2015),
Vo et al. proposed an octree-based region growing
algorithm for fast and accurate segmentation of ter-
restrial and aerial LiDAR point clouds. In (Ni et al., 2017), Ni et al. used a segmentation method to process the acquired images. Based on the state of the art, we propose an approach that uses segmentation to decompose 3D data into functionally meaningful regions.
2.2 Grasping Strategy
The goal of this system is to robustly grasp objects without relying on their identities or poses. Concerning the grasping part, Principal Component Analysis (PCA) was used to choose the grasping tool (suction or gripper) to pick the objects, according to their dimensions. PCA is a standard tool in modern data analysis and represents a simple method for extracting relevant information from confusing data sets
(Xiao et al., 2013). PCA was used, in (Cruz et al.,
2012) to accelerate the grasping process of unknown
objects: a single-view partial point cloud was con-
structed and grasp candidates were allocated along the
principal axis. In (Adnan and Mahzan, 2015), Adnan
et al. used PCA in the grasping process to reduce the
dimensional dataset of hand motion as well as mea-
suring the capacity of the fingers movement. Another
use of PCA is shown in (Dai et al., 2013), where the
authors introduced a new PCA grasping motion anal-
ysis approach that captured correlations among hand
joints and represented dynamic features of grasping
motion with a low number of variables. The use of PCA in different grasping situations led us to adopt this technique in the grasping part of our work.
2.3 Object Classification for Recycling
Environmental health worldwide is badly influenced by improper waste recycling management (Chu et al., 2018). To solve this problem, automated systems for sorting and recycling waste materials have been broadly investigated (Gundupalli et al., 2017).
In particular, the classification of industrial wastes is one of the core functions needed to develop such a system. Recent research has proposed recycling systems based on computer vision and deep learning algorithms for classifying waste materials. In (Simonyan and Zisserman, 2014), it is highlighted and experimentally demonstrated how
very deep convolutional networks can reach a high
classification accuracy for large-scale image classi-
fication and can generalize well to a wide range of
tasks and datasets. Awe et al. (Awe et al., 2017)
used a faster R-CNN model, an object detection net-
work with Region Proposal Networks (RPNs), in or-
der to classify waste into three categories: paper, re-
cycling and landfill. Rad et al. (Rad et al., 2017)
have developed a computer vision based system for
classification and localization of waste on the streets
using GoogLeNet. In particular, they obtained a significant improvement in classification accuracy by splitting one class into two similar classes: a leaves class and a piles-of-leaves class. This kind of approach allowed leaves grouped together to be perceived and guaranteed a better generalization in classifying leaves. Mittal et al. (Mittal
et al., 2016) introduced an android app, SpotGarbage,
which uses an AlexNet model to detect and localize
garbage in images. Based on these works, a CNN was designed with a modified LeNet model and a training set that does not contain complex images. It can classify waste objects into two main categories: carton and plastic.
3 SYSTEM ARCHITECTURE
The goal of this work is to develop a robotic recycling
system that will be able to grasp objects in the ac-
tual environment and will be able to sort them accord-
ing to their materials (carton or plastic). A Microsoft Kinect was attached under the platform, and this sensor is used to acquire the point clouds. After the image processing, objects are grasped using the UR5 robot arm with the grasping tools (a Robotiq gripper and one big suction cup). Then, the objects are brought in front of an RGB camera that classifies them according to their material. Lastly, the objects that have to be recycled are collected in a box placed near the manipulator (see
Figure 2). To develop the whole process described
above, a platform was created to support the manipu-
lator and to delimit its movements for safety reasons.
Figure 2: System structure composed of three phases: 1) image processing, 2) grasping and manipulation, and 3) classification of materials.
4 SYSTEM DESCRIPTION
This system can be divided into the following pro-
cesses: a) image processing, b) grasping point detec-
tion, c) object material classification, d) motion plan-
ning.
In the proposed scenario, the Kinect camera acquires data as a point cloud and processes it in order to obtain clusters representing each object inside the box. Then, the manipulator plans the optimal path to reach each object and chooses the grasping strategy, gripper or suction, according to the dimensions of the grasped object. After the grasping process, a modified LeNet model, trained to recognize the material of an object, classifies it as carton or plastic by extracting its features from the RGB camera image.
After material recognition, the object is moved to
a delivery box by the arm (UR5 Manipulator) placed
outside the structure. In detail, two different delivery boxes were used: one that collects carton and the other for plastic (see Figure 3).
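The overall control flow summarized in Figure 3 can be sketched as a simple loop. The following Python fragment is only an illustrative sketch: the five callables and the iteration limit are hypothetical placeholders standing in for the components detailed in Sections 4.1-4.4, not the authors' implementation.

```python
def recycle_loop(detect_clusters, choose_tool, plan_and_grasp,
                 classify_material, deliver, max_iterations=30):
    """Illustrative control loop of the recycling pipeline (Figure 3).

    The five callables are placeholders for the components described in
    Sections 4.1-4.4; max_iterations is an assumed upper bound on attempts.
    """
    for _ in range(max_iterations):
        clusters = detect_clusters()          # 4.1: segmentation of the Kinect point cloud
        if not clusters:
            break                             # workspace empty: end of the process
        tool = choose_tool(clusters[0])       # 4.2: PCA-based suction/gripper selection
        if not plan_and_grasp(clusters[0], tool):
            continue                          # failed grasp: start a new perception cycle
        material = classify_material()        # 4.3: CNN on the RGB image of the held object
        deliver(material)                     # place into the carton or plastic delivery box
```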
4.1 Object Segmentation
In this work, the data acquired from the Kinect as
point clouds were processed using the Point Cloud
Library (PCL). PCL is an open-source project for 2D/3D image and point cloud processing that contains numerous state-of-the-art algorithms, including filtering, feature estimation, surface reconstruction, model fitting, and segmentation (Rusu and
Cousins, 2011). The data collected as point clouds are processed in order to extract the shapes of the objects that have to be recycled.
Figure 3: Flow chart of the robotic recycling system; it includes image processing, grasping strategy, object classification and motion planning.
In detail, the image processing of these data is divided into the following steps: a) acquisition of the dataset as point clouds, b) workspace filtering, c) clustering of objects, d) plane model segmentation, e) extraction of the highest point of the object and the normal of the plane. At first, 3D point
cloud data were acquired from the Kinect, attached to
a support linked to the structure. After data acqui-
sition, the workspace was set and three pass-through
filters were implemented along the camera axes. The
first one was applied along the z-axis of the camera
frame in order not to detect the mobile platform as an
object that had to be grasped. Then two other filters
were applied, which set the x and y axes of the Kinect
workspace to avoid the detection of the edges of the
structure.
After the filtering process, clustering and segmentation were employed to isolate each object in the scene. Different methodolo-
gies have been suggested for 3D point cloud seg-
mentation. They can be categorized into five classes:
edge-based methods, region-based methods, attribute-based methods, model-based methods, and graph-based methods (Nguyen and Le, 2013).
Figure 4: Image processing applied to an object. The clustering and the normal are visualized in red; the PCA points are shown in blue.
In this work,
a model-based method, which makes use of geometric primitive shapes for grouping points, was employed; in detail, a plane-based model was chosen for the picking part. The main reason behind this choice is that this model proved to be the most appropriate to extract a good surface for grasping. Moreover, a plane-based model is a very suitable choice because planes are among the most important primitives, since man-made structures mainly consist of planes (Feng et al., 2014), (Xiao et al., 2013). Then, for each plane, the z coordinate of the highest point and the x and y coordinates of the centroid were sent as a goal state for the manipulation planning of the UR5, and the orientation of the plane was used to adapt the gripper/suction to the object shape. The z coordinate of the highest point was used instead of that of the centroid of the plane to prevent the end-effector from crushing the sample during grasping. If the centroid had been used, the robot would have pressed the sample too much and both would have been damaged. Then, in order to work in the world frame, the points were converted from the camera frame (see Figure 4).
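A compact Python sketch of this perception step is given below. It uses Open3D instead of the PCL/C++ pipeline actually employed in this work, and DBSCAN clustering in place of PCL's Euclidean cluster extraction; the workspace bounds, the clustering parameters and the assumption that the highest point corresponds to the smallest camera-frame depth are illustrative only.

```python
import numpy as np
import open3d as o3d  # illustrative stand-in for the PCL/C++ pipeline used in this work


def process_cloud(points_xyz):
    """Workspace filtering, clustering and plane extraction (sketch of Section 4.1)."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(np.asarray(points_xyz, dtype=np.float64))

    # Pass-through analogue: crop to the workspace box (bounds are assumed values).
    workspace = o3d.geometry.AxisAlignedBoundingBox(
        min_bound=np.array([-0.4, -0.4, 0.5]), max_bound=np.array([0.4, 0.4, 1.2]))
    pcd = pcd.crop(workspace)

    # Cluster the remaining points (DBSCAN instead of PCL's Euclidean clustering).
    labels = np.array(pcd.cluster_dbscan(eps=0.02, min_points=50))
    if labels.size == 0:
        return []

    goals = []
    for label in range(labels.max() + 1):
        cluster = pcd.select_by_index(np.where(labels == label)[0])
        # Fit a plane to the cluster; its normal orients the gripper/suction tool.
        (a, b, c, d), _ = cluster.segment_plane(distance_threshold=0.005,
                                                ransac_n=3, num_iterations=200)
        pts = np.asarray(cluster.points)
        centroid = pts.mean(axis=0)
        # The camera looks down on the workspace, so the smallest depth is assumed
        # to correspond to the highest point of the object.
        z_highest = pts[:, 2].min()
        goals.append({"x": centroid[0], "y": centroid[1],
                      "z_highest": z_highest, "normal": (a, b, c)})
    return goals
```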
4.2 Grasping Objects using Principal
Component Analysis
The goal of the system is to perform robust grasping operations without predefined grasping pose estimation. To better achieve this goal, a multifunctional end effector was created that can use both suction and gripper tools. A similar end effector, with a retractable mechanism that enables quick and automatic switching between the suction and gripper modalities, was also used in (Chu et al., 2017) for recognizing and grasping objects. PCA was used in order to select the most suitable grasping tool, based on the size of the object. This technique finds the dimensions of the objects and compares them with the opening of the gripper: if the dimensions of the object are bigger than the width of the gripper, suction is used; otherwise, the gripper is used (see Figure 4).
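The tool-selection rule can be sketched in a few lines of Python using PCA on the object cluster. This is only a sketch: the gripper opening value and the use of the first principal axis of the object footprint are assumptions, not the exact criterion used by the system.

```python
import numpy as np


def choose_tool(cluster_points, gripper_opening=0.085):
    """Select suction or gripper from the object size via PCA (sketch of Section 4.2).

    cluster_points: (N, 3) array of a segmented object; gripper_opening is an
    assumed maximum stroke in metres (illustrative value).
    """
    xy = cluster_points[:, :2] - cluster_points[:, :2].mean(axis=0)
    # Principal axes of the object footprint.
    _, _, vt = np.linalg.svd(xy, full_matrices=False)
    projected = xy @ vt.T
    # Extent of the object along its first principal axis.
    major_extent = projected[:, 0].max() - projected[:, 0].min()
    # Objects wider than the gripper opening are picked with suction.
    return "suction" if major_extent > gripper_opening else "gripper"
```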
4.3 Classification System
Figure 5: Convolutional neural network architecture for the classification of materials. The two outputs are the probabilities of carton and plastic.
The goal of the proposed classification system is to separate waste objects into two different classes after they have been picked up and positioned in front of an RGB camera. The chosen approach consists of recognizing and classifying each object separately in order to improve classification accuracy.
Based on (LeCun et al., 1998), a modified LeNet-5 model was developed that works with RGB images of 150x150 pixels as inputs (substantially bigger than the ones usually used for character recognition by standard LeNet models), and the output was modified to two classes. The first two convolutional layers learn 32 filters each and the last convolutional layer learns 64 filters, where each filter has size 3 x 3. Each convolutional layer is followed by the ReLU activation function and by a 2 x 2 max-pooling in both the x and y directions with a stride of 1. In order to avoid overfitting, regularization was applied by choosing a dropout term of 0.5 after the first dense layer of 64 units. After another ReLU activation function, there is the last dense layer with 2 units, corresponding to the number of class labels into which the waste is classified. The proposed model was trained over a manually labeled dataset of normalized RGB images of the waste objects, with pixel values ranging from 0 to 1. The optimization algorithm used was stochastic gradient descent with a learning rate of 0.01.
Categorical cross-entropy, also called softmax loss, was selected as the loss function (1):

$L = -\frac{1}{N}\sum_{i=1}^{N}\left[\, y_i \log \hat{y}_i + (1 - y_i)\log(1 - \hat{y}_i)\,\right]$   (1)
The output of the trained CNN represents a probability distribution over the 2 classes for an input image.
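A possible Keras definition of the modified LeNet described above is sketched below. It follows the layer sizes reported in this section, but some details are assumptions: the pooling stride is left at the Keras default of 2 (the text mentions a stride of 1), and the exact placement of the activations around the dense layers is a plausible reading of the description.

```python
from tensorflow.keras import layers, models, optimizers


def build_modified_lenet(input_shape=(150, 150, 3), num_classes=2):
    """Sketch of the modified LeNet of Section 4.3 (some layer details assumed)."""
    model = models.Sequential([
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape),
        layers.MaxPooling2D((2, 2)),   # default stride of 2 assumed here
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),           # regularization after the first dense layer
        layers.Dense(num_classes, activation="softmax"),  # carton / plastic probabilities
    ])
    model.compile(optimizer=optimizers.SGD(learning_rate=0.01),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    return model
```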
4.4 Motion Planning for Grasping
Objects
For decades, motion planning has been developed for discovering optimal robot movements. In particular, the Kinematics and Dynamics Library (KDL) and the Open Motion Planning Library (OMPL) are broadly used to search for the movements of a robot arm. In ROS, the Motion Planning Framework (MoveIt!) (Chitta et al., 2012) was integrated with these libraries as plugins in the system architecture, so that it can support self-collision avoidance with inverse kinematics to determine the feasibility of a grasp. MoveIt! can also generate several possible paths to reach the goal with sampling-based planning. In this work, before using MoveIt!, collision areas were configured in URDF (the standard robot description format in ROS) to avoid crashes between obstacles and the robot arm. Moreover, specific positions, such as the delivery places and the initial robot arm position, were defined in the Semantic Robot Description Format (SRDF). During motion planning, re-planning was allowed because it supports searching for a more optimal path than the one previously generated. Furthermore, a trajectory-following method was used, which generates waypoints between the arm and the goal that avoid collisions and keep the defined end-effector pose constant.
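The planning step can be illustrated with the MoveIt! Python interface (moveit_commander). The sketch below is not the authors' code: the planning group name "manipulator", the planning time and the way the goal pose is built from the segmentation output are assumptions.

```python
import sys
import rospy
import moveit_commander
from geometry_msgs.msg import Pose

moveit_commander.roscpp_initialize(sys.argv)
rospy.init_node("recycling_motion_planning")

# "manipulator" is the assumed MoveIt! planning group name for the UR5.
arm = moveit_commander.MoveGroupCommander("manipulator")
arm.set_planning_time(5.0)
arm.allow_replanning(True)  # re-plan if the first trajectory is not satisfactory


def move_to_grasp(x, y, z_highest, orientation):
    """Plan and execute a motion to the grasping pose computed in Section 4.1.

    orientation is a geometry_msgs Quaternion assumed to be derived from the plane normal.
    """
    pose = Pose()
    pose.position.x, pose.position.y, pose.position.z = x, y, z_highest
    pose.orientation = orientation
    arm.set_pose_target(pose)
    success = arm.go(wait=True)   # plan and execute
    arm.stop()
    arm.clear_pose_targets()
    return success
```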
Figure 6: A total of 30 samples (cartons and plastics) were prepared to test the recycling system.
5 EXPERIMENTAL SETUP
In this section, the experimental setup that is needed
to demonstrate our system will be explained. A new robotic platform is introduced, the procedure used to collect the dataset is described, and the system initialization is explained. For the experiments, evaluation scenarios were organized.
Figure 7: Experimental setup for the recycling system composed of five elements: UR5 manipulator, RGB and Kinect cameras, grasping tools, workspace, and trash can.
5.1 Hardware System Description
The proposed robotic platform was developed to build a robotic recycling system during the CENTAURO Regional Project - iSort (2016 - 2018). It consists of one cage with four steel bars, a robot arm (UR5), a depth camera (Microsoft Kinect v2), a Robotiq two-finger gripper, two suction cups (small and big) and a Logitech mono camera. The system was organized to allow the implementation of the major functionalities that recognize waste objects and classify them according to their material (see Figure 7).
The experimental setup was designed to resemble an actual industrial environment. However, due to safety issues and limited space, we could not replicate such an environment exactly. Moreover, the operation speed was reduced to protect both the robot and the human operators.
5.2 Collecting the Dataset
Before collecting images for the training part, recordings of the cartons and plastics were acquired by a webcam from different angles. Image frames were extracted automatically using the ROS bag functionality. The collected dataset is composed of a total of 105 sample images, of which 51 are carton and 54 are plastic samples. However, the dataset had too few samples to exploit the real power of CNNs. In order to overcome this limitation, data augmentation was applied to the training examples with a number of random transformations. As a result, the number of training samples was increased to a total of 3002, and the network never sees exactly the same image twice. This method helps to prevent overfitting and supports the CNN model in generalizing to the situations that can be found in the actual environment. The augmented dataset will be publicly available¹.
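The rotation and illumination augmentation can be reproduced, for example, with Keras' ImageDataGenerator; the transformation ranges and the directory layout below are assumptions and not the exact settings used to build the augmented dataset.

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rotation and illumination (brightness) augmentation of the original images.
# The ranges are illustrative assumptions.
augmenter = ImageDataGenerator(
    rescale=1.0 / 255,            # normalize pixel values to [0, 1]
    rotation_range=40,            # random rotations
    brightness_range=(0.5, 1.5),  # random illumination changes
    fill_mode="nearest",
)

# Hypothetical directory with carton/ and plastic/ subfolders of original images.
train_generator = augmenter.flow_from_directory(
    "dataset/train",
    target_size=(150, 150),
    batch_size=16,
    class_mode="categorical",
)
```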
Figure 8: Object configurations for testing were selected based on material and height (z-axis): (a) a short carton (SC), (b) a tall carton (TC), (c) a voluminous plastic (VP), (d) a non-voluminous plastic (NVP).
5.3 System Initialization
The robot arm, installed on a steel plate, was placed outside the workspace so that the workspace was totally visible to the depth camera system (Figure 1). The workspace was set in the middle of the platform with the four steel bars and was placed approximately 40 cm from the bottom; this position was considered an easy one for grasping objects. For object classification, an RGB camera was installed in front of the robot arm's initial position. Two boxes were prepared to collect the waste objects delivered by the arm. The arm's initial position, the camera position for classification and the positions of the trash boxes were predefined during the recycling system operation. For the experiment, a set of 30 objects in total (Figure 6), composed of 50% carton and 50% plastic, was prepared to perform the following tests: grasping an object using the multifunctional grasping tool (suction and gripper) and classifying object materials. In addition, if one carton was tested with different object configurations (Figure 8), it was counted as two samples. Moreover, if it was clear that a grasping tool could not pick an object, the object was excluded from the tests with that specific tool (e.g. if the width of a carton or plastic sample was bigger than the gripper's opening or thinner than the suction cup's diameter).
The network was trained using 60% and 20% of the dataset as the training and validation sets, respectively, and the remaining 20% as the test set. The training process was stopped after 250 epochs. Stochastic gradient descent (SGD) was used as the optimizer to minimize the loss function. The batch size during training was 16 and the learning rate was 0.01. The networks were implemented using the Keras and TensorFlow frameworks (Abadi et al., 2016).
¹The dataset will be released on GitHub: https://github.com/Alchemist77/Centauro Project
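The split and training configuration described above could look as follows; the array names and the placeholder data are illustrative, and build_modified_lenet refers to the sketch in Section 4.3.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for the augmented dataset (3002 images in this work):
# X holds 150x150x3 images scaled to [0, 1], y holds one-hot carton/plastic labels.
X = np.random.rand(200, 150, 150, 3).astype("float32")
y = np.eye(2)[np.random.randint(0, 2, 200)]

# 60% training, 20% validation, 20% test.
X_trainval, X_test, y_trainval, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y.argmax(axis=1))
X_train, X_val, y_train, y_val = train_test_split(
    X_trainval, y_trainval, test_size=0.25, stratify=y_trainval.argmax(axis=1))

model = build_modified_lenet()  # sketch from Section 4.3
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=250, batch_size=16)
test_loss, test_acc = model.evaluate(X_test, y_test)
```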
5.4 Evaluation Scenarios
To evaluate the system, we measured the success rate
and the execution time for grasping objects and ob-
ject classification. In order to obtain the success rate
for grasping objects, a waste object (carton or plastic) was first placed on the workspace and the arm attempted to grasp it for 5 repeated trials without human intervention. The success rate was then calculated as the percentage of successful trials:
$G_s = \frac{G(r)}{5} \times 100\,(\%), \quad r = 1, \dots, N_r,$   (2)

where $G_s$ represents the success rate, $G(r)$ is defined as the number of successes at repetition $r$, and $N_r = 5$ is the number of repetitions during the grasping process. The
success rate of object classification has the same cal-
culation procedure.
To measure the execution time of the grasping and object classification processes, different initial states were defined. For the grasping process, object segmentation and clustering together with motion planning were considered. In contrast, the object classification process started from the grasping of the object, and the measurement ended when the CNN output the object material. In addition, different configurations of the object shape were considered during the evaluation of the two processes above.
6 EXPERIMENTAL RESULTS
In this section, the system results are shown, analyzing first the success rate obtained for grasping and classification and then the execution time for each task.
Furthermore, network performances in terms of loss
and accuracy are discussed.
6.1 Success Rate
Tables 1 and 2 show the results obtained from grasp-
ing using suction (Figure 9a) and gripper (Figure 9b)
respectively. Both tables are divided into 4 subgroups
according to object material and physical properties.
Figure 9: Experiments for grasping objects using the multifunctional tool: (a) grasping an object using suction, (b) grasping an object using the gripper.
Figure 10: Experiments for classification of objects with a text sign: (a) carton classification, (b) plastic classification.
First, a grasping trial with cartons was done using suction with 12 short carton (SC) and 9 tall carton (TC) samples (Figure 8 (a and b)), and each object was tested five times. The same approach was used for the plastic samples (4 voluminous plastic (VP) and 6 non-voluminous plastic (NVP) samples, Figure 8 (c and d)). The results show that suction has a better performance with plastic (90% success rate) than with carton (86.09%). The main problem of grasping carton was related to the dimensions of the object: if the object was too thin, the suction could not reach the grasping point because the segmentation could not find the object; when the object was too tall, the suction tool crashed because too much pressure was exerted on it. Other issues happened due to mechanical problems of the manipulator, the presence of holes and discontinuities on the surface of the objects, and the presence of scotch tape on some parts of the carton samples. As regards the plastic samples, segmentation problems occurred due to the flexible surface of the plastic and to some breakage of the plastic during the suction operation.
Figure 11: Examples of failed classification of objects: (a) carton classification, (b) plastic classification.
Table 1: Success rate of grasping objects using suction.
Category | Object configuration | Number of objects | Number of attempts | Number of successes | Total success rate
Carton   | SC  | 12 | 70 | 62/70 (88.57%) | 99/115 (86.09%)
Carton   | TC  | 9  | 45 | 37/45 (82.22%) |
Plastic  | VP  | 4  | 20 | 18/20 (90%)    | 45/50 (90%)
Plastic  | NVP | 6  | 30 | 27/30 (90%)    |
Table 2: Success rate of grasping objects using gripper.
Category | Object configuration | Number of objects | Number of attempts | Number of successes | Total success rate
Carton   | SC  | 5  | 25 | 21/25 (84%)    | 65/75 (86.67%)
Carton   | TC  | 10 | 50 | 44/50 (88%)    |
Plastic  | VP  | 5  | 25 | 19/25 (76%)    | 55/80 (68.75%)
Plastic  | NVP | 11 | 55 | 36/55 (65.45%) |
Table 3: Success rate of object classifications.
Category | Object configuration | Number of objects | Number of attempts | Number of successes | Total success rate
Carton   | SC  | 6  | 30 | 30/30 (100%)   | 85/100 (85%)
Carton   | TC  | 14 | 70 | 55/70 (78.57%) |
Plastic  | VP  | 7  | 35 | 30/35 (85.71%) | 86/100 (86%)
Plastic  | NVP | 13 | 65 | 56/65 (86.15%) |
As regards the grasping part using the gripper, the results present the opposite situation: carton samples have a better performance (86.67%) than plastic samples (68.75%). The limitation of the workspace is the main problem for grasping. In addition, rotational joint mechanical limitations blocked the robot or caused the manipulator to hit the object. Two other minor problems encountered were the wrong segmentation of an object and the height of a sample (if the object was too tall, the gripper crashed).
Summarizing, from the experimental stage it was understood that the best situation happened when plastic was picked using suction; on the contrary, plastic samples were found to be the worst samples to grasp using the gripper.
Concerning the classification part, Table 3 shows
the results obtained from classification experiments
(Figure 10a and 10b). The success rates obtained for carton and plastic samples are quite similar: plastic samples have a slightly higher success rate (86%) than carton samples (85%). Both percentages are quite high and are similar to the success rates of the grasping experiments. The problems met during the classification experiments were due to the colour of the samples. White and light blue cartons were detected as plastic due to the similarity of these colours with plastic colours (Figure 11a). A plastic bag was instead seen as carton; this wrong classification happened because this object was grasped even if it was not part of the original dataset. The presence of sunlight also affected the classification success, as occurred for the packaging material (Figure 11b).
Figure 12: Visualization of execution times for grasping and
classifying objects.
6.2 Execution Time
Figure 12 shows the execution times during the classification and grasping tasks using the gripper and suction respectively. Firstly, the time for grasping an object differs depending on the grasping tool used: the mean value using suction (26.50 s) is less than the one obtained with the gripper (33.19 s), therefore the suction process is faster. Also, the standard deviation (SD) with suction (1.72 s) is lower than the SD with the gripper (7.33 s). Suction grasping is simpler and its performance is more easily obtained compared to the gripper. On the contrary, when the arm uses the gripper, the variability of the execution time increases because extra processing is carried out to detect the object's highest point: the robot adapts the end-effector orientation in order to grasp the object. With regard to the classification part, the mean values using the two tools (suction: 17.57 s, gripper: 17.76 s) are comparable. Moreover, the SDs using the gripper (1.05 s) and the suction (1.06 s) are not substantially different because the trajectories to the camera for classification were already predefined.
Figure 13: Accuracy value evaluation during 250 epochs for
training and test (blue and orange solid lines).
6.3 Train and Test Set Results,
Accuracy and Loss
Figure 13 shows the performance of the network in terms of accuracy on the training and test sets during 250 epochs. In blue and orange, respectively, the accuracy of the network in the training and test phases is represented. The accuracy reaches a value of 0.99 during training and 0.96 in the test phase. In parallel, the optimizer leads the loss to converge to 0.01 in the training phase and 0.08 in the test phase. Therefore, the model performs and generalizes well on new data and is not affected by overfitting.
7 DISCUSSION AND
CONCLUSION
In this work, the development of an autonomous robotic system was presented. This system is able to grasp objects and sort them according to their material composition (carton or plastic), in order to foster recycling practices in industries. The main novelties of this work are two: building a preliminary framework for benchmarking industrial applications in sorting management, and integrating functionalities such as image processing, motion planning, grasping and classification in a unique robotic structure. Another challenging aspect of this work is the use of a multifunctional end effector equipped with both gripper and suction tools; this multifunctionality increased the success rate during the grasping process, reducing the probability of error. During the experiments, only the bigger suction cup was used. For future work, both types of suction cups will be applied, selecting the right one according to the object dimensions. A limitation of the proposed work is that grasping in cluttered environments was not considered. For this reason, this grasping part could be a challenging field to analyse in the future (ten Pas et al., 2017). Another issue of this work concerns the dataset: the original dataset should be extended in order to have a large-scale dataset, allowing the classification system to generalize better on new objects; furthermore, a new approach based on learning more features should be found to obtain a better classification of materials (Simonyan and Zisserman, 2014). Finally, another challenging idea could be handling a greater variety of objects; adding new groups of materials, like glass and organics, would make the automated recycling system completely general (Zeng et al., 2017).
ACKNOWLEDGEMENT
This work was supported by the Tuscany’s regional
research project CENTAURO: Colavoro, Efficienza,
preveNzione nell’industria dei motoveicoli mediante
Tecnologie di AUtomazione e RObotica. bando FAR-
FAS 2014.
REFERENCES
Abadi, M., Barham, P., Chen, J., Chen, Z., Davis, A.,
Dean, J., Devin, M., Ghemawat, S., Irving, G., Isard,
M., et al. (2016). Tensorflow: A system for large-
scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), pages 265–283.
Adnan, N. H. and Mahzan, T. (2015). Gesture recognition
based on human grasping activities using pca-bmu.
Gesture, 6(11).
Awe, O., Mengistu, R., and Sreedhar, V. (2017). Smart
trash net: waste localization and classification. arXiv
preprint.
Bostelman, R., Hong, T., and Marvel, J. (2016). Survey
of research for performance measurement of mobile
manipulators. Journal of Research of the National In-
stitute of Standards and Technology, 121:342–366.
Chitta, S., Sucan, I., and Cousins, S. (2012). MoveIt! [ROS Topics]. IEEE Robotics & Automation Magazine, 19(1):18–19.
Chu, K., Zhang, Q., Han, H., Xu, C., Pang, W., Ma, Y.,
Sun, N., and Li, W. (2017). A systematic review and
meta-analysis of nonpharmacological adjuvant inter-
ventions for patients undergoing assisted reproductive
technology treatment. International Journal of Gyne-
cology & Obstetrics, 139(3):268–277.
Chu, Y., Huang, C., Xie, X., Tan, B., Kamal, S., and Xiong,
X. (2018). Multilayer hybrid deep-learning method
for waste classification and recycling. Computational
Intelligence and Neuroscience, 2018.
Cruz, L., Lucio, D., and Velho, L. (2012). Kinect and rgbd
images: Challenges and applications. In 2012 25th
SIBGRAPI Conference on Graphics, Patterns and Im-
ages Tutorials, pages 36–49. IEEE.
Dai, W., Sun, Y., and Qian, X. (2013). Functional analysis
of grasping motion. In 2013 IEEE/RSJ International
Conference on Intelligent Robots and Systems, pages
3507–3513. IEEE.
Donoho, D. L. and Grimes, C. (2003). Hessian eigen-
maps: Locally linear embedding techniques for high-
dimensional data. Proceedings of the National
Academy of Sciences, 100(10):5591–5596.
Feng, C., Taguchi, Y., and Kamat, V. R. (2014). Fast plane
extraction in organized point clouds using agglomera-
tive hierarchical clustering. In Robotics and Automa-
tion (ICRA), 2014 IEEE International Conference on,
pages 6218–6225. IEEE.
Gundupalli, S. P., Hait, S., and Thakur, A. (2017). A review
on automated sorting of source-separated municipal
solid waste for recycling. Waste management, 60:56–
74.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P., et al. (1998).
Gradient-based learning applied to document recogni-
tion. Proceedings of the IEEE, 86(11):2278–2324.
Lei, Q., Chen, G., and Wisse, M. (2017). Fast grasping of
unknown objects using principal component analysis.
AIP Advances, 7(9):095126.
Lu, Y. (2017). Industry 4.0: A survey on technologies, ap-
plications and open research issues. Journal of Indus-
trial Information Integration, 6:1–10.
Masuta, H., Motoyoshi, T., Sawai, K. K. K., and Oshima,
T. (2016). Plane extraction using point cloud data for
service robot. In Computational Intelligence (SSCI),
2016 IEEE Symposium Series on, pages 1–6. IEEE.
Mittal, G., Yagnik, K. B., Garg, M., and Krishnan, N. C.
(2016). Spotgarbage: smartphone app to detect
garbage using deep learning. In Proceedings of the
2016 ACM International Joint Conference on Per-
vasive and Ubiquitous Computing, pages 940–945.
ACM.
Nguyen, A. and Le, B. (2013). 3d point cloud segmentation:
A survey. In RAM, pages 225–230.
Ni, H., Lin, X., and Zhang, J. (2017). Classification of als
point cloud with improved point cloud segmentation
and random forests. Remote Sensing, 9(3):288.
Nurunnabi, A., Belton, D., and West, G. (2012). Robust
segmentation in laser scanning 3d point cloud data.
In Digital Image Computing Techniques and Appli-
cations (DICTA), 2012 International Conference on,
pages 1–8. IEEE.
Rad, M. S., von Kaenel, A., Droux, A., Tieche, F., Ouer-
hani, N., Ekenel, H. K., and Thiran, J.-P. (2017). A
computer vision system to localize and classify wastes
on the streets. In International Conference on Com-
puter Vision Systems, pages 195–204. Springer.
Rusu, R. B. and Cousins, S. (2011). 3d is here: Point cloud
library (pcl). In Robotics and automation (ICRA),
2011 IEEE International Conference on, pages 1–4.
IEEE.
Simonyan, K. and Zisserman, A. (2014). Very deep con-
volutional networks for large-scale image recognition.
arXiv preprint arXiv:1409.1556.
ten Pas, A., Gualtieri, M., Saenko, K., and Platt, R. (2017).
Grasp pose detection in point clouds. The Interna-
tional Journal of Robotics Research, 36(13-14):1455–
1473.
Tenenbaum, J. B., De Silva, V., and Langford, J. C. (2000).
A global geometric framework for nonlinear dimen-
sionality reduction. science, 290(5500):2319–2323.
Thilagamani, S. and Moorthi, S. (2011). A survey on image
segmentation through clustering. International Jour-
nal of Research and Reviews in Information Sciences,
1.
Vo, A.-V., Truong-Hong, L., Laefer, D. F., and Bertolotto,
M. (2015). Octree-based region growing for point
cloud segmentation. ISPRS Journal of Photogramme-
try and Remote Sensing, 104:88–100.
Xiao, J., Zhang, J., Adler, B., Zhang, H., and Zhang, J.
(2013). Three-dimensional point cloud plane seg-
mentation in both structured and unstructured en-
vironments. Robotics and Autonomous Systems,
61(12):1641–1652.
Zeng, A., Song, S., Yu, K., Donlon, E., Hogan, F. R., Bauzá, M., Ma, D., Taylor, O., Liu, M., Romo, E., Fazeli, N., Alet, F., Dafle, N. C., Holladay, R., Morona, I., Nair, P. Q., Green, D., Taylor, I., Liu, W., Funkhouser, T. A., and Rodriguez, A. (2017). Robotic pick-and-place of novel objects in clutter with multi-affordance grasping and cross-domain image matching. CoRR, abs/1710.01330.