How to Train an Accurate and Efficient Object Detection Model on any Dataset

Galina Zalesskaya, Bogna Bylicka, Eugene Liu

2023

Abstract

A rapidly evolving industry demands highly accurate models without the time-consuming and computationally expensive experiments that fine-tuning requires. Moreover, a model and training pipeline carefully optimized for one dataset rarely generalizes well to training on another. This makes it unrealistic to have carefully fine-tuned models for each use case. To solve this, we propose an alternative approach that also forms the backbone of the Intel® Geti™ platform: a dataset-agnostic template for object detection training, consisting of carefully chosen, pre-trained models together with a robust training pipeline for further training. Our solution works out-of-the-box and provides a strong baseline on a wide range of datasets. It can be used on its own or as a starting point for further fine-tuning for specific use cases when needed. We obtained the dataset-agnostic templates by training in parallel on a corpus of datasets and optimizing the choice of architectures and training tricks with respect to the average results over the whole corpus. We examined a number of architectures, taking into account the performance-accuracy trade-off. Consequently, we propose 3 finalists, VFNet, ATSS, and SSD, that can be deployed on CPU using the OpenVINO™ toolkit. The source code is available as part of the OpenVINO™ Training Extensions.
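The selection criterion described in the abstract, choosing a configuration by its average result over the whole corpus rather than by its result on any single dataset, can be illustrated with a minimal sketch. This is not the authors' code: the function name `rank_candidates`, the candidate and dataset names, and all scores are hypothetical placeholders, and the metric is assumed to be a single number per dataset where higher is better.

```python
from statistics import mean


def rank_candidates(scores):
    """Rank candidates by their mean score over all datasets, best first.

    `scores` maps a candidate name to a {dataset name: score} dict.
    Hypothetical helper for illustration only.
    """
    reference_datasets = None
    averages = {}
    for name, per_dataset in scores.items():
        # Averages are only comparable if every candidate was
        # evaluated on exactly the same set of datasets.
        if reference_datasets is None:
            reference_datasets = set(per_dataset)
        elif set(per_dataset) != reference_datasets:
            raise ValueError(f"{name} was not evaluated on the same datasets")
        averages[name] = mean(per_dataset.values())
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Placeholder numbers, NOT results from the paper.
scores = {
    "candidate_a": {"dataset_1": 0.50, "dataset_2": 0.70},
    "candidate_b": {"dataset_1": 0.90, "dataset_2": 0.20},
}

ranking = rank_candidates(scores)
best_name, best_average = ranking[0]
print(best_name, round(best_average, 2))
```

With these placeholder scores, candidate_b is the strongest on dataset_1 but candidate_a has the higher average, so a corpus-level criterion prefers candidate_a. A fuller selection would also weigh inference cost, which this sketch leaves out.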

Paper Citation


in Harvard Style

Zalesskaya G., Bylicka B. and Liu E. (2023). How to Train an Accurate and Efficient Object Detection Model on any Dataset. In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP; ISBN 978-989-758-634-7, SciTePress, pages 770-778. DOI: 10.5220/0011781400003417


in Bibtex Style

@conference{visapp23,
author={Galina Zalesskaya and Bogna Bylicka and Eugene Liu},
title={How to Train an Accurate and Efficient Object Detection Model on any Dataset},
booktitle={Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP},
year={2023},
pages={770-778},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011781400003417},
isbn={978-989-758-634-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 5: VISAPP
TI - How to Train an Accurate and Efficient Object Detection Model on any Dataset
SN - 978-989-758-634-7
AU - Zalesskaya G.
AU - Bylicka B.
AU - Liu E.
PY - 2023
SP - 770
EP - 778
DO - 10.5220/0011781400003417
PB - SciTePress