2 RELATED WORK
Generating data for training convolutional neural
network models is a highly relevant topic, and
various research teams are therefore developing
algorithms for creating synthetic datasets. The
creation of synthetic data is discussed in
several resources:
a) A paper that considers the benefits of
synthetic data generation for CNN training (The
Ultimate Guide to Synthetic Data in 2020);
b) Research on using ray tracers to create
training databases (John B. McCormac, 2018).
There are several tools able to produce
synthetic data for CNN training.
a) A simple GUI-based COCO-style JSON
polygon mask annotation tool that facilitates quick
and efficient crowd-sourced generation of annotation
masks and bounding boxes. Optionally, a pre-trained
Mask R-CNN model can be used to produce initial
segmentations. This tool can be used for manual
annotation of existing images
(Hans Krupakar, 2018).
b) A continuation of the project mentioned in
the previous paragraph, maintained by a team of
programmers interested in this field. The original
functionality has been preserved and refined
(Hans Krupakar, 2018).
However, a tool that can create high-quality
annotated sets of multiple overlapping objects has
not been implemented yet.
c) Nvidia Deep Learning Dataset Synthesizer
(NDDS) is a UE4 plugin from Nvidia (J. Tremblay, T.
To, A. Molchanov, S. Tyree, J. Kautz, S. Birchfield,
2018) (J. Tremblay, T. To, S. Birchfield, 2018) that
empowers computer vision researchers to export high-
quality synthetic images with metadata. NDDS
supports images, segmentation, depth, object pose,
bounding box, key points, and custom stencils. In
addition to the exporter, the plugin includes different
components for generating highly randomized
images. This randomization includes lighting,
objects, camera position, poses, textures, and
distractors, as well as camera path following, etc.
Together, these components make it possible for
researchers to easily create randomized scenes for
training deep neural networks.
The strong features of the Nvidia tool are:
a) the ability to use a physics engine;
b) flexible GUI-based basic scene configuration;
c) support for colored meshes and RGB-D point
clouds.
The main weak features of the Nvidia tool are as
follows:
a) dependence on UE4 (CUDA and a graphics
card are required);
b) batch mode is problematic;
c) external scene configuration is complicated
to implement.
3 PROPOSED METHOD
We would like to present an approach and a tool
that can generate a synthetic dataset for a batch of
mesh-defined objects in an automatic mode based on
ray tracing.
Ray tracing models the real physical process of
light reflection and absorption. This approach
allows us to generate realistic images and therefore
can provide high-quality training datasets based on
artificial images only.
This tool is based on the POV-Ray physical core
(POV-Ray – The Persistence of Vision Raytracer).
The main goal of the current project is to develop a
Python-based tool for producing artificial images
from mesh models that can easily be integrated into
a training process. All instances should be easily
configured via text-based config files, and the tool
can be used without heavy package dependencies.
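As a rough illustration of this design, the sketch below drives POV-Ray from Python using a text-based config file. The config keys, file names, and function are illustrative assumptions, not the tool's actual interface; only the POV-Ray command-line flags (`+W`, `+H`, `+I`, `+O`, `-D`) are standard.

```python
# Hypothetical sketch of a config-driven POV-Ray batch renderer.
# Config format and function name are assumptions for illustration.
import configparser


def render_commands(config_path: str, scene_path: str) -> list:
    """Read a text config and build one POV-Ray command line per image."""
    cfg = configparser.ConfigParser()
    cfg.read(config_path)
    width = cfg.getint("render", "width", fallback=640)
    height = cfg.getint("render", "height", fallback=480)
    count = cfg.getint("render", "images", fallback=1)
    commands = []
    for i in range(count):
        out_name = "image_{:04d}.png".format(i)
        # Standard POV-Ray CLI flags: +W/+H set the image size,
        # +I the input scene, +O the output file, -D disables display.
        commands.append([
            "povray", "+W{}".format(width), "+H{}".format(height),
            "+I{}".format(scene_path), "+O{}".format(out_name), "-D",
        ])
    return commands
```

In a batch run, each command list could then be executed with `subprocess.run(cmd, check=True)`, which makes the tool usable from a headless training pipeline.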
3.1 Process of Image Creation
Images are generated using the ray tracer
POV-Ray. The Persistence of Vision Raytracer
(POV-Ray: Download) is a high-quality free-software
tool for creating stunning three-dimensional
graphics (POV-Ray: Hall of Fame). The source code
is available for those wanting to do their own
research.
The realism of the created images depends
significantly on the configuration of the light
sources.
Image generation uses one primary point white
light source for general lighting, with RGB
intensity (1.0, 1.0, 1.0) and a common brightness
factor of 1.0. The primary light is placed at a
distance equal to the camera's distance from the
scene, and its rotation angle relative to the camera
can be set. Four additional fixed spot light sources
of low intensity (0.4, 0.4, 0.4) are rotated away
from the Z-axis by angles (75, 0, 0), (-75, 0, 0),
(0, 75, 0), and (0, -75, 0); their positions cannot
be changed from the configuration file. The primary
light illuminates the geometry of the part,
highlighting its features, while the additional
light sources compensate for "rigidity" and provide
backlight for shaded areas of the parts.
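The lighting setup described above can be sketched as POV-Ray scene-description declarations emitted from Python. The intensities and rotation angles come from the text; the generator function itself and the placement of the lights on the negative Z-axis are illustrative assumptions.

```python
# Sketch of the described lighting: one primary white light plus four
# fixed low-intensity fill lights, emitted as POV-Ray SDL declarations.
# Function name and light placement convention are assumptions.

def light_sources(camera_distance, primary_angle=(0.0, 0.0, 0.0)):
    """Return POV-Ray light_source declarations as a single string."""
    lines = []
    # Primary point light: white (1.0, 1.0, 1.0), brightness factor 1.0,
    # placed at the camera distance, rotatable relative to the camera.
    lines.append(
        "light_source {{ <0, 0, -{d}> color rgb <1.0, 1.0, 1.0> "
        "rotate <{rx}, {ry}, {rz}> }}".format(
            d=camera_distance, rx=primary_angle[0],
            ry=primary_angle[1], rz=primary_angle[2]))
    # Four fixed fill lights of intensity (0.4, 0.4, 0.4), rotated away
    # from the Z-axis by the angles given in the text; not configurable.
    for rot in [(75, 0, 0), (-75, 0, 0), (0, 75, 0), (0, -75, 0)]:
        lines.append(
            "light_source {{ <0, 0, -{d}> color rgb <0.4, 0.4, 0.4> "
            "rotate <{rx}, {ry}, {rz}> }}".format(
                d=camera_distance, rx=rot[0], ry=rot[1], rz=rot[2]))
    return "\n".join(lines)
```

The resulting string can be written directly into the generated `.pov` scene file before rendering.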
KMIS 2020 - 12th International Conference on Knowledge Management and Information Systems