Authors:
Shuichi Akizuki¹ and Manabu Hashimoto²
Affiliations:
¹ Department of Electrical Engineering, Keio University, Hiyoshi, Yokohama, Japan and Department of Engineering, Chukyo University, Nagoya, Aichi, Japan
² Department of Engineering, Chukyo University, Nagoya, Aichi, Japan
Keyword(s):
Dataset Generation, Semantic Segmentation, Affordance.
Related Ontology Subjects/Areas/Topics:
Computer Vision, Visualization and Computer Graphics; Features Extraction; Image and Video Analysis; Segmentation and Grouping
Abstract:
In this research, we propose a low-cost method for generating large volumes of real images as training data for semantic segmentation. The method first estimates the six-degree-of-freedom (6DoF) pose of objects in images obtained with an RGB-D sensor, and then maps labels that have been pre-assigned to 3D models onto those images. It also captures additional input images while the camera is moving, and maps labels onto these images based on the relative motion of the viewpoint. This makes it possible to obtain large volumes of ground-truth data for real images. The proposed method has been used to create a new publicly available dataset for affordance segmentation, the NEDO Part-Affordance Dataset v1, which has been used to benchmark several typical semantic segmentation algorithms.
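The core label-mapping step described in the abstract can be sketched as follows: given a 6DoF pose (rotation and translation) and pinhole camera intrinsics, pre-assigned per-point labels on a 3D model are projected into the image to produce a ground-truth label map. This is a minimal illustrative sketch, not the paper's implementation; the function name and parameters are assumptions, and occlusion handling (e.g. depth buffering against the RGB-D depth image) is omitted for brevity.

```python
import numpy as np

def project_labels(points, labels, R, t, K, image_shape):
    """Project labeled 3D model points into an image using a 6DoF pose.

    points: (N, 3) model points; labels: (N,) integer part labels;
    R: (3, 3) rotation; t: (3,) translation; K: (3, 3) intrinsics.
    Returns an (H, W) integer label map (0 = background).
    NOTE: illustrative sketch only -- no occlusion/visibility test.
    """
    h, w = image_shape
    label_map = np.zeros((h, w), dtype=np.int32)

    # Transform model points into the camera frame.
    cam = points @ R.T + t                      # (N, 3)
    in_front = cam[:, 2] > 0                    # keep points with positive depth
    cam, lbl = cam[in_front], labels[in_front]

    # Perspective projection with the pinhole intrinsics.
    uvw = cam @ K.T
    uv = uvw[:, :2] / uvw[:, 2:3]
    u = np.round(uv[:, 0]).astype(int)
    v = np.round(uv[:, 1]).astype(int)

    # Keep only projections that land inside the image bounds.
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    label_map[v[ok], u[ok]] = lbl[ok]
    return label_map
```

For additional frames captured while the camera moves, the same routine applies after composing the estimated object pose with the relative camera motion, so each new view receives labels without further annotation effort.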