Depth Map Estimation of Focus Objects Using Vision Transformer

Chae-rim Park, Kwang-il Lee, Seok-je Cho

2022

Abstract

Estimating a depth map from an image is critical in a variety of tasks such as 3D object detection and extraction. In particular, it is an essential task for robots, AR/VR, drones, and autonomous vehicles, and it plays an important role in computer vision. In general, a stereo technique is used to extract the depth map: it matches two images of the same scene taken at different locations and outputs the distance according to the size of the relative motion between them. This paper proposes a method for extracting a depth map using a Vision Transformer (ViT) from input images taken in various environments. After the ViT automatically focuses on the object in the image, semantic segmentation is performed, and fine-tuning on images with fewer resources yields a better depth map.
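The stereo baseline mentioned in the abstract can be illustrated with the standard pinhole relation for a rectified camera pair, Z = f * B / d, where d is the disparity (the relative motion of a point between the two views). The sketch below covers only that conventional baseline, not the ViT-based method of the paper. The function name, parameter names, and example values are assumptions chosen for illustration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of a point seen by a rectified stereo pair.

    disparity_px    : horizontal shift of the point between the two images (pixels)
    focal_length_px : camera focal length expressed in pixels (assumed known)
    baseline_m      : distance between the two camera centres (metres)

    Returns None when the disparity is not positive, since the relation
    Z = f * B / d is undefined there (no match, or a point at infinity).
    """
    if disparity_px <= 0:
        return None
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical camera: 800 px focal length, 0.5 m baseline.
    for d in (40.0, 20.0, 10.0):
        print(d, depth_from_disparity(d, 800.0, 0.5))
```

A larger disparity corresponds to a closer point, which is why the size of the relative motion determines the distance.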

Paper Citation


in Harvard Style

Park C., Lee K. and Cho S. (2022). Depth Map Estimation of Focus Objects Using Vision Transformer. In Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC; ISBN 978-989-758-622-4, SciTePress, pages 150-155. DOI: 10.5220/0011908700003612


in Bibtex Style

@conference{isaic22,
author={Chae-rim Park and Kwang-il Lee and Seok-je Cho},
title={Depth Map Estimation of Focus Objects Using Vision Transformer},
booktitle={Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC},
year={2022},
pages={150-155},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011908700003612},
isbn={978-989-758-622-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC
TI - Depth Map Estimation of Focus Objects Using Vision Transformer
SN - 978-989-758-622-4
AU - Park C.
AU - Lee K.
AU - Cho S.
PY - 2022
SP - 150
EP - 155
DO - 10.5220/0011908700003612
PB - SciTePress