a domain that is well suited for downstream tasks
based on object geometry. Applying our approach
to different tasks, we showed that the image-to-line
transformation can be decoupled from these downstream tasks, and we presented various methods to facilitate the transformation. Our experiments showed that our method can serve as a drop-in addition that improves object detection quality, even when training on semi-realistic synthetic data. The intermediate line representation
also enables novel augmentation methods, further im-
proving network generalization to real-world data. Fi-
nally, we demonstrated how our approach could be
used in a real-world use case by training a network to
identify objects in a bin-picking scenario without any
real training images.
Despite the success of our approach, there are ar-
eas for further exploration and optimization. The pro-
jection of images to their abstract representation is an
additional step that requires computation time, and
optimizing the runtime should be a focus in follow-
up work. One promising idea is to use knowledge
distillation with a student-teacher approach producing
smaller image-to-image networks based on the presented ones. Additionally, we believe the downstream networks can be trimmed down, since the abstract input contains more condensed and meaningful information than raw color images.
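The distillation idea above can be illustrated with a toy sketch. The teacher and student below are hypothetical scalar stand-ins (not the paper's image-to-image networks): a large, fixed teacher provides targets, and a much smaller student is fitted to mimic it on unlabeled inputs via gradient descent on a mean-squared mimicry loss.

```python
# Toy knowledge distillation: fit a small "student" to mimic a fixed "teacher".
# Both models are hypothetical placeholders, not the networks from this work.

def teacher(x):
    # Stand-in for the large image-to-line network: a fixed linear map.
    return 0.8 * x + 0.1

def student(x, w, b):
    # Much smaller student with only two parameters.
    return w * x + b

inputs = [i / 10 for i in range(10)]  # unlabeled inputs; teacher gives targets

w, b = 0.0, 0.0
lr = 0.1
for _ in range(2000):
    # Analytic gradients of the mean-squared distillation loss w.r.t. w and b.
    gw = sum(2 * (student(x, w, b) - teacher(x)) * x for x in inputs) / len(inputs)
    gb = sum(2 * (student(x, w, b) - teacher(x)) for x in inputs) / len(inputs)
    w, b = w - lr * gw, b - lr * gb

# After training, the student reproduces the teacher's behavior (w ~ 0.8, b ~ 0.1)
# with a fraction of the parameters -- the goal of the proposed follow-up work.
```

The same recipe scales to the real setting by replacing the scalar models with the presented image-to-image network (teacher) and a pruned copy (student), and the loss with a pixel-wise distance between their line drawings.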
ACKNOWLEDGEMENTS
This work has been supported by the Schaeffler
Hub for Advanced Research at Friedrich-Alexander-Universität Erlangen-Nürnberg (SHARE at FAU).
REFERENCES
Bochkovskiy, A., Wang, C.-Y., and Liao, H.-Y. M. (2020).
YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint arXiv:2004.10934.
Brachmann, E. (2020). 6D Object Pose Estimation using
3D Object Coordinates [Data].
Canny, J. (1986). A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 679–698.
Chan, C., Durand, F., and Isola, P. (2022). Learning to
generate line drawings that convey geometry and se-
mantics. In Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition, pages
7915–7925.
Chang, A. X., Funkhouser, T., Guibas, L., Hanrahan,
P., Huang, Q., Li, Z., Savarese, S., Savva, M.,
Song, S., Su, H., et al. (2015). ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012.
Ciresan, D. C., Meier, U., Masci, J., Gambardella, L. M.,
and Schmidhuber, J. (2011). Flexible, high perfor-
mance convolutional neural networks for image clas-
sification. In Twenty-second international joint con-
ference on artificial intelligence. Citeseer.
Croitoru, F.-A., Hondru, V., Ionescu, R. T., and Shah, M.
(2023). Diffusion models in vision: A survey. IEEE
Transactions on Pattern Analysis and Machine Intel-
ligence.
DeCarlo, D., Finkelstein, A., Rusinkiewicz, S., and San-
tella, A. (2003). Suggestive contours for conveying
shape. In ACM SIGGRAPH 2003 Papers, pages 848–
855. ACM New York, NY, USA.
Denninger, M., Sundermeyer, M., Winkelbauer, D., Zi-
dan, Y., Olefir, D., Elbadrawy, M., Lodhi, A., and
Katam, H. (2019). Blenderproc. arXiv preprint
arXiv:1911.01911.
Dosovitskiy, A., Fischer, P., Ilg, E., Hausser, P., Hazirbas,
C., Golkov, V., Van Der Smagt, P., Cremers, D., and
Brox, T. (2015). FlowNet: Learning optical flow with
convolutional networks. In Proceedings of the IEEE
international conference on computer vision, pages
2758–2766.
Drost, B., Ulrich, M., Bergmann, P., Hartinger, P., and Steger, C. (2017). Introducing MVTec ITODD: a dataset for 3D object recognition in industry. In Proceedings of
the IEEE international conference on computer vision
workshops, pages 2200–2208.
Blender Foundation (2022). Blender.
Girshick, R. (2015). Fast R-CNN. In Proceedings of the IEEE
international conference on computer vision, pages
1440–1448.
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B.,
Warde-Farley, D., Ozair, S., Courville, A., and Ben-
gio, Y. (2020). Generative adversarial networks. Com-
munications of the ACM, 63(11):139–144.
Goodman, N. (2022). Languages of art. In Lexikon Schriften über Musik, pages 293–376. Springer.
Harary, S., Schwartz, E., Arbelle, A., Staar, P., Abu-
Hussein, S., Amrani, E., Herzig, R., Alfassy, A.,
Giryes, R., Kuehne, H., et al. (2022). Unsupervised
domain generalization by learning a bridge across do-
mains. In Proceedings of the IEEE/CVF Conference
on Computer Vision and Pattern Recognition, pages
5280–5290.
Hertzmann, A. (2021a). The role of edges in line drawing
perception. Perception, 50(3):266–275.
Hertzmann, A. (2021b). Why do line drawings work? A realism hypothesis. Journal of Vision, 21(9):2029–2029.
Hodan, T., Haluza, P., Obdržálek, Š., Matas, J., Lourakis,
M., and Zabulis, X. (2017). T-LESS: An RGB-D dataset for 6D pose estimation of texture-less objects. In 2017
IEEE Winter Conference on Applications of Computer
Vision (WACV), pages 880–888. IEEE.
Isola, P., Zhu, J.-Y., Zhou, T., and Efros, A. A. (2017).
Image-to-image translation with conditional adversar-
ial networks. In Proceedings of the IEEE conference
on computer vision and pattern recognition, pages
1125–1134.
VISAPP 2024 - 19th International Conference on Computer Vision Theory and Applications