2 RELATED WORK
2.1 Traditional Techniques
Traditional techniques for cardboard package detec-
tion and pose estimation have laid the groundwork
for modern advancements in automated depalletis-
ing systems. These methods, while foundational, of-
ten struggle with limitations in complex and dynamic
industrial environments. One of the primary tradi-
tional methods is RFID-based detection. RFID tags
are attached to packages to facilitate identification
and tracking throughout the logistics process. For
instance, (Bouzakis and Overmeyer, 2010) demon-
strated the use of RFID tags to describe the geome-
try of cardboard packages, enabling automated ma-
nipulation by industrial robots. Furthermore, RFID
systems can detect package tampering and openings
by analyzing changes in the radiation profile caused
by the movement of RFID-based antennas, as high-
lighted by (Wang et al., 2020).
Another technique involves terahertz imaging,
which utilizes terahertz waves to screen folded card-
board boxes for inserts or anomalies. This method
offers high-speed and unambiguous detection
capabilities, as noted by (Brinkmann et al., 2017).
Visual monitoring and machine vision systems also play
a crucial role. (Castaño-Amoros et al., 2022)
explored the use of low-cost sensors and deep
learning techniques to detect and recognize different types
of cardboard packaging on pallets, optimizing ware-
house logistics. Electrostatic techniques, as described
by (Hearn and Ballard, 2005), leverage electrostatic
charges to identify and sort waste packaging mate-
rials, differentiating between plastics and cardboard.
Additionally, nonlinear ultrasonic methods, investi-
gated by (Ha and Jhang, 2005), are employed to de-
tect micro-delaminations in packaging by analyzing
harmonic frequencies generated by ultrasonic waves.
While these traditional methods provide valuable
insights and capabilities, they often face limitations
in accuracy, speed, and cost, as well as susceptibility
to environmental interference. These limitations have
driven the development and adoption of more advanced
techniques, particularly those based on deep learning.
2.2 Deep Learning Techniques
Deep learning techniques have substantially advanced
cardboard package detection and pose estimation,
offering significant improvements in accuracy,
robustness, and efficiency. Convolutional Neural
Networks (CNNs) (Figure 1) form the backbone of these
advancements, enabling models that can handle complex
environments. Models like YOLO (You Only Look Once)
and SSD (Single Shot MultiBox Detector) have set
new benchmarks for real-time object detection. These
models balance speed and accuracy, making them
well suited to industrial applications where quick
and precise detection is crucial.
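As a rough illustration of the post-processing that single-shot detectors such as YOLO and SSD typically rely on, the sketch below implements intersection over union (IoU) and greedy non-maximum suppression in plain Python. The box format (x1, y1, x2, y2), the function names, and the 0.5 threshold are assumptions made for illustration and are not taken from the cited works.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression.

    Returns the indices of the kept boxes, highest score first.
    The threshold value is an illustrative assumption.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```

Given several overlapping candidate boxes for the same package, `nms` keeps only the highest-scoring one, which is why a detector can report one box per package even when many candidates fire.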
Our 2023 study (Yesudasu et al., 2023) explored
the application of YOLOv3 for object detection
in automated depalletising systems. YOLOv3
is known for its speed and accuracy, making it
well suited to real-time detection of cardboard
packages on a pallet. In that study, detection is
integrated with a pose estimation algorithm, enabling
the system to determine the orientation and position
of each package and thereby improving the efficiency
and precision of the depalletising task. However, the
previous system primarily handled free cardboard boxes
without addressing the complexities of varied box
locations and orientations. Additionally, it had
limitations in detecting gaps between packages, a critical
factor for optimizing the depalletising process. More
generally, by learning hierarchical feature
representations directly from data, CNN-based detectors
excel at identifying and localizing objects in diverse
and challenging scenarios.
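To make the idea of deriving a package's position and orientation from a detection more concrete, the following minimal sketch computes a planar pose from the four corners of a box's top face. It is not the pose estimation algorithm of (Yesudasu et al., 2023); the function name `planar_pose`, the corner ordering, and the choice of the longer edge as the heading are all hypothetical assumptions for illustration.

```python
import math


def planar_pose(corners):
    """Return (cx, cy, yaw_deg) for a rectangular face.

    `corners` is a list of four (x, y) points listed in order
    around the face (hypothetical input format).
    """
    if len(corners) != 4:
        raise ValueError("expected four corner points")
    # Position: centroid of the four corners.
    cx = sum(x for x, _ in corners) / 4.0
    cy = sum(y for _, y in corners) / 4.0
    # Two adjacent edges of the face; the longer one defines the heading.
    (x0, y0), (x1, y1), (x2, y2) = corners[0], corners[1], corners[2]
    e1 = (x1 - x0, y1 - y0)
    e2 = (x2 - x1, y2 - y1)
    ex, ey = e1 if math.hypot(*e1) >= math.hypot(*e2) else e2
    # A rectangle looks the same after a 180 degree turn,
    # so the yaw angle is reduced modulo 180 degrees.
    yaw = math.degrees(math.atan2(ey, ex)) % 180.0
    return cx, cy, yaw
```

For example, `planar_pose([(0, 0), (4, 0), (4, 2), (0, 2)])` describes a 4 by 2 face centred at (2, 1) whose long side lies along the x axis.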
Deep learning extends beyond CNNs to include
architectures such as Deep Boltzmann Machines (DBM),
Deep Belief Networks (DBN), and Stacked Denois-
ing Autoencoders. These models have been success-
fully applied to various tasks, including face recogni-
tion, activity recognition, and human pose estimation.
The versatility of deep learning in handling different
computer vision challenges underscores its potential
in cardboard package detection and pose estimation.
Figure 1: Architecture of a typical Convolutional Neural
Network (Monica et al., 2020).
Significant strides have been made in object detec-
tion with models like Faster R-CNN, YOLOv3, and
SSD. These models use region proposal networks,
grid-based prediction, and multi-scale feature extrac-
tion to achieve high accuracy and efficiency. For ex-
ample, Faster R-CNN integrates a region proposal
network for efficient object detection, while YOLOv3