We used our custom neural network implementation, which runs on the GPU. Training a DRN on the synthetic dataset took less than 8 minutes using an NVIDIA GTX-770 graphics card.
2.2 Input Data
The training vectors are labeled positive if the QR code coverage ratio of the corresponding block exceeds a selected threshold T_c. Typically, the F-score peaks at T_c ≈ 0.5, while T_c ≈ 0.1 leads to better recall (hit rate). However, the number of these partially covered blocks is an order of magnitude smaller than the number of empty and fully covered blocks, so T_c is not a decisive parameter of the training. Furthermore, even if the DRN misses partially covered blocks, it only misses the perimeter of the code object, and expanding the positively classified cell groups of the feature matrix overcomes this issue.
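A minimal sketch of this labeling rule, assuming a binary QR-code mask that is aligned with the 8×8 block grid, could look as follows (the function and variable names, and the choice of NumPy, are illustrative and not part of our implementation):

```python
import numpy as np

def label_blocks(qr_mask: np.ndarray, t_c: float = 0.5) -> np.ndarray:
    """Label each 8x8 block positive (1) if the fraction of QR-code pixels
    inside it exceeds the coverage threshold t_c, otherwise negative (0).

    qr_mask -- binary mask (H x W), 1 where a pixel belongs to the QR code;
               H and W are assumed to be multiples of 8.
    """
    h, w = qr_mask.shape
    # Average the mask over non-overlapping 8x8 blocks -> coverage ratio per block
    coverage = qr_mask.reshape(h // 8, 8, w // 8, 8).mean(axis=(1, 3))
    return (coverage > t_c).astype(np.uint8)

# Toy example: a mask where the QR code covers the top-left quadrant
mask = np.zeros((64, 64), dtype=np.uint8)
mask[:32, :32] = 1
labels = label_blocks(mask, t_c=0.5)   # 8x8 grid of block labels
```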
The input vectors of the DRN are one-dimensional vectors formed by the quantized DCT coefficients of an 8×8 px block. During decoding, the multiplication with the quantization table can be omitted for two reasons. On one hand, due to their non-linear nature, neural networks can learn on a vector set and on the same set multiplied element-wise by another fixed vector with similar efficiency. On the other hand, the components of the input vectors were normalized as described in [4] to have zero mean and unit variance. This normalization improved the numerical conditioning of the optimization problem during training, ensuring faster convergence.
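For illustration, this normalization step can be sketched as the standard per-component standardization below, assuming the quantized coefficients of N blocks are stacked into an N×64 array (the names and the random placeholder data are ours, not taken from our pipeline):

```python
import numpy as np

def normalize_features(train_vectors: np.ndarray, eps: float = 1e-8):
    """Per-component zero-mean, unit-variance normalization of the
    64-dimensional quantized-DCT feature vectors.

    train_vectors -- array of shape (N, 64), one quantized DCT block per row.
    Returns the normalized vectors and the (mean, std) statistics, so the
    same transform can be reapplied to validation and test data.
    """
    mean = train_vectors.mean(axis=0)
    std = train_vectors.std(axis=0) + eps   # eps guards against zero variance
    return (train_vectors - mean) / std, (mean, std)

# Placeholder data standing in for the actual quantized coefficients
dct_blocks = np.random.randint(-64, 64, size=(1000, 64)).astype(np.float32)
normalized, stats = normalize_features(dct_blocks)
```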
With this setup, the DRN has to be trained on images of the same compression level as the images of the end-user application, since we do not use the original DCT matrix, which would require a multiplication by the quantization table. Without de-quantization, different levels of compression applied to the same image content lead to different vectors, which can be far from the training samples of a specific compression level. To overcome this, DRNs can be trained on the de-quantized coefficient vectors, which are roughly the same at similar compression levels. A detailed evaluation of this concept is given in the Results section (Fig. 4).
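A minimal sketch of the de-quantization step, assuming the quantization table of the image is available from the JPEG header, could look as follows (the table values shown are placeholders, not a real JPEG table):

```python
import numpy as np

def dequantize_block(quantized: np.ndarray, quant_table: np.ndarray) -> np.ndarray:
    """De-quantize one 8x8 block of JPEG DCT coefficients.

    quantized   -- 8x8 integer coefficients taken from the compressed stream.
    quant_table -- quantization table of the image's compression level.
    The element-wise product approximately recovers the original DCT values,
    which is why de-quantized vectors from similarly compressed images of the
    same content end up close to each other.
    """
    return quantized.astype(np.float32) * quant_table.astype(np.float32)

# Illustrative values only
q_table = np.full((8, 8), 16, dtype=np.int32)       # placeholder table
coeffs = np.random.randint(-8, 8, size=(8, 8))      # placeholder coefficients
dct_approx = dequantize_block(coeffs, q_table)
```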
3 Evaluation and Results
The test database consists of 10 000 synthetic and 100 arbitrarily acquired images containing QR codes. The synthetic examples are built with a computer-generated QR code containing all of the lower- and uppercase letters of the alphabet in random order. This QR code was placed on a random negative image with a perspective transformation. After that, Gaussian smoothing and noise were gradually added to the images, with the σ of the Gaussian kernel ranging in [0, 3]. For the noise, a noise image I_n was generated with intensities in the range [−127, 127] following a normal distribution, and added gradually to the original 8-bit image I_o as I = αI_n + (1 − α)I_o, with α ranging in [0, 0.5].
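For illustration, the noise-blending step can be sketched as follows; the standard deviation of the noise and the clipping of the samples to [−127, 127] are assumptions made for this sketch rather than the exact parameters of our generator:

```python
import numpy as np

def add_gradual_noise(i_o: np.ndarray, alpha: float, rng=None) -> np.ndarray:
    """Blend a normally distributed noise image I_n into the original 8-bit
    image I_o following I = alpha * I_n + (1 - alpha) * I_o, alpha in [0, 0.5].
    """
    if rng is None:
        rng = np.random.default_rng()
    # Assumed noise model: zero-mean normal samples clipped to [-127, 127]
    i_n = np.clip(rng.normal(0.0, 42.0, size=i_o.shape), -127, 127)
    blended = alpha * i_n + (1.0 - alpha) * i_o.astype(np.float32)
    return np.clip(blended, 0, 255).astype(np.uint8)

# Example with a moderate noise level
noisy = add_gradual_noise(np.full((64, 64), 128, dtype=np.uint8), alpha=0.3)
```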
Some samples with parameters in the discussed ranges are shown in Fig. 2. A total of 42 million vectors were extracted from those images, about 10 million of which are labeled as positive. Real images were taken with a 3.2 Mpx hand-held Huawei phone camera. Significant smoothing is present in those images due to the