
figures 8c and 8d), thus almost guaranteeing that it will be visible when images are taken of that towel type.
As can be seen in the dataset overview in table 2, as well as in the confusion matrices in section 5 (figures 9 and 10), the dataset is imbalanced, with some classes containing more images than others. This is caused by the availability of towels during data collection (more towels were available for some towel types than others) combined with the automatic nature of the data collection process. The effects of this imbalance have been mitigated by matching the class distribution of the full dataset when splitting it into training, validation and test sets, as illustrated in the sketch below.
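This distribution-matching split is a standard stratified split. A minimal sketch follows, assuming the dataset is available as parallel lists `image_paths` and `labels` and using scikit-learn's `train_test_split` with its `stratify` argument; the variable names and split fractions are illustrative assumptions, since the paper does not publish its splitting code.

```python
from sklearn.model_selection import train_test_split

def stratified_split(image_paths, labels, val_frac=0.15, test_frac=0.15):
    # Carve off the test set first, preserving the class distribution.
    # NOTE: split fractions are assumed for illustration only.
    trainval_x, test_x, trainval_y, test_y = train_test_split(
        image_paths, labels, test_size=test_frac, stratify=labels,
        random_state=42)
    # Split the remainder into training and validation sets, again
    # stratifying so every split mirrors the full dataset.
    rel_val = val_frac / (1.0 - test_frac)
    train_x, val_x, train_y, val_y = train_test_split(
        trainval_x, trainval_y, test_size=rel_val, stratify=trainval_y,
        random_state=42)
    return (train_x, train_y), (val_x, val_y), (test_x, test_y)
```

Stratifying twice, rather than once, ensures that the validation set also mirrors the full dataset rather than only the training/test boundary.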
As seen in the results of the network trained on the full dataset to solve the type classification problem (table 3), the imbalance does not appear to be a problem, since classes with a relatively large amount of data ('VDK') achieve metrics comparable to classes with relatively small amounts of data ('GreyStriped', 'YellowStriped').
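The per-class comparison behind this observation can be made concrete by deriving per-class precision and recall directly from a confusion matrix whose rows are true classes and columns are predicted classes. The sketch below shows one way to do this with NumPy; the matrix values are placeholders, not the paper's actual results.

```python
import numpy as np

def per_class_metrics(cm: np.ndarray):
    """Per-class precision and recall from a confusion matrix
    (rows = true classes, columns = predicted classes)."""
    tp = np.diag(cm).astype(float)    # correct predictions per class
    precision = tp / cm.sum(axis=0)   # column sum: all predicted as the class
    recall = tp / cm.sum(axis=1)      # row sum: all true members of the class
    return precision, recall

# Toy example with one large and two small classes (placeholder values).
cm = np.array([[980, 10, 10],
               [  3, 95,  2],
               [  4,  1, 95]])
print(per_class_metrics(cm))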
7 CONCLUSION
In conclusion, this paper presents a proof of concept capable of capturing and processing images of towels handled by Inwatec's BLIZZ using two mounted depth cameras. Furthermore, a CNN has been developed which classifies both the type and the face of towels. This proof of concept can be used by BLIZZ to improve its functionality, enabling it to deliver towels to folding machines more consistently while also improving its versatility.
A dataset consisting of six different towel types, of which three have non-identical faces, and totalling 22,152 images has been collected and labelled. The developed image classification network has been trained and tested on this dataset, achieving an accuracy of 99.10% when trained to solve only the type classification problem. Likewise, the proposed network trained to solve only the face classification problem achieves accuracies of 94.48%, 97.71% and 98.52% on three datasets consisting of images of just the 'Rentex', 'BathTowel' and 'Nedlin' towel types, respectively. By comparison, when the proposed network is trained to solve both classification problems simultaneously, it achieves an accuracy of 96.96%.