Paper

Authors: Aidana Nurakhmetova, Jean Lahoud and Hisham Cholakkal

Affiliation: Department of Computer Vision, Mohamed bin Zayed University of Artificial Intelligence, Masdar City, Abu Dhabi, U.A.E.

Keyword(s): 3D Point Clouds, Data-Efficient Transformer, 3D Object Detection.

Abstract: Recent 3D detection models rely on the Transformer architecture because of its natural ability to abstract global context features. One such model is 3DETR, a pure transformer-based network designed to generate 3D boxes on indoor dataset scans. Transformers are generally known to be data-hungry, yet data collection and annotation are more challenging in 3D than in 2D. Our goal is therefore to study the data-hungriness of the 3DETR-m model and to propose a solution that improves its data efficiency. Our methodology is based on the observation that PointNet++ provides more locally aggregated features, which can support 3DETR-m predictions when training data is scarce. We suggest three methods of backbone fusion, based on addition (Fusion I), concatenation (Fusion II), and replacement (Fusion III). We utilize pre-trained weights from the Group-free model trained on the SUN RGB-D dataset. The proposed 3DETR-m variant outperforms the original model at all data proportions (10%, 25%, 50%, 75%, and 100%). On the full dataset, we improve on the results reported in the 3DETR-m paper by 1.46% and 2.46% in mAP@25 and mAP@50, respectively. We believe our research can provide new insights into the data-hungriness of 3D transformer detectors and inspire the use of pre-trained models in 3D as one way towards data efficiency.
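
The three fusion operations named in the abstract can be illustrated with a minimal sketch. The abstract only names the operations (addition, concatenation, replacement); where in the network they are applied, the feature shapes, and any projection layers are not stated, so everything below is an assumption made for illustration. The function names `fuse_add`, `fuse_concat`, and `fuse_replace` are hypothetical, and plain Python lists stand in for per-point feature tensors.

```python
from typing import List

# One row of channel values per point; a stand-in for a (num_points, channels) tensor.
Features = List[List[float]]


def _check_same_points(a: Features, b: Features) -> None:
    # Both backbones are assumed to describe the same set of points.
    if len(a) != len(b):
        raise ValueError("feature sets must cover the same number of points")


def fuse_add(detr_feats: Features, pointnet_feats: Features) -> Features:
    """Addition-style fusion (assumed form of Fusion I): element-wise sum.

    Requires both feature sets to have the same channel width.
    """
    _check_same_points(detr_feats, pointnet_feats)
    if any(len(ra) != len(rb) for ra, rb in zip(detr_feats, pointnet_feats)):
        raise ValueError("addition needs matching channel widths")
    return [[x + y for x, y in zip(ra, rb)]
            for ra, rb in zip(detr_feats, pointnet_feats)]


def fuse_concat(detr_feats: Features, pointnet_feats: Features) -> Features:
    """Concatenation-style fusion (assumed form of Fusion II): stack channels."""
    _check_same_points(detr_feats, pointnet_feats)
    return [ra + rb for ra, rb in zip(detr_feats, pointnet_feats)]


def fuse_replace(detr_feats: Features, pointnet_feats: Features) -> Features:
    """Replacement-style fusion (assumed form of Fusion III): use only the
    PointNet++-style features and discard the original backbone features."""
    _check_same_points(detr_feats, pointnet_feats)
    return [list(rb) for rb in pointnet_feats]


# Toy features for two points with two channels each (made-up values).
detr = [[1.0, 2.0], [3.0, 4.0]]
pointnet = [[0.5, 0.5], [1.0, 1.0]]

added = fuse_add(detr, pointnet)
concatenated = fuse_concat(detr, pointnet)
replaced = fuse_replace(detr, pointnet)
```

Under these assumptions, addition keeps the channel width unchanged, concatenation doubles it (so a downstream layer would need a wider input or a projection), and replacement leaves the downstream transformer to consume only the locally aggregated features.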

CC BY-NC-ND 4.0

Paper citation in several formats:
Nurakhmetova, A., Lahoud, J. and Cholakkal, H. (2023). Data-Efficient Transformer-Based 3D Object Detection. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 4: VISAPP; ISBN 978-989-758-634-7; ISSN 2184-4321, SciTePress, pages 615-623. DOI: 10.5220/0011673200003417

@conference{visapp23,
author={Aidana Nurakhmetova and Jean Lahoud and Hisham Cholakkal},
title={Data-Efficient Transformer-Based 3D Object Detection},
booktitle={Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 4: VISAPP},
year={2023},
pages={615-623},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011673200003417},
isbn={978-989-758-634-7},
issn={2184-4321},
}

TY - CONF

JO - Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 4: VISAPP
TI - Data-Efficient Transformer-Based 3D Object Detection
SN - 978-989-758-634-7
IS - 2184-4321
AU - Nurakhmetova, A.
AU - Lahoud, J.
AU - Cholakkal, H.
PY - 2023
SP - 615
EP - 623
DO - 10.5220/0011673200003417
PB - SciTePress