sparse, such that point clouds are converted to voxels
and then subsampled with certain chosen thresholds.
Next, random samples are selected and converted
to point-wise inputs for feature learning. Figure 4
shows the layers and the final output of the VoxelNet
architecture.
Figure 4: VoxelNet architecture (Zhou and Tuzel, 2017).
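The voxel grouping and random sampling described above can be sketched as follows; the voxel size and per-voxel point budget used here are illustrative assumptions, not VoxelNet's exact hyper-parameters.

```python
import numpy as np

def voxelize(points, voxel_size=(0.2, 0.2, 0.4), max_points_per_voxel=35):
    """Group a point cloud into sparse voxels, then randomly subsample
    each occupied voxel down to a fixed point budget (illustrative values)."""
    rng = np.random.default_rng(0)
    # Quantize each point's (x, y, z) coordinates to an integer voxel index.
    idx = np.floor(points[:, :3] / np.asarray(voxel_size)).astype(np.int64)
    voxels = {}
    for i, key in enumerate(map(tuple, idx)):
        voxels.setdefault(key, []).append(i)
    # Cap each occupied voxel at max_points_per_voxel by random sampling.
    sampled = {}
    for key, members in voxels.items():
        members = np.asarray(members)
        if len(members) > max_points_per_voxel:
            members = rng.choice(members, max_points_per_voxel, replace=False)
        sampled[key] = points[members]
    return sampled

# Example: 1000 random points spread over a 10 m cube.
pts = np.random.default_rng(1).uniform(0.0, 10.0, size=(1000, 3))
vox = voxelize(pts)
```

Only occupied voxels are stored, which is what makes the representation sparse; the per-voxel point sets are then passed to the point-wise feature-learning layers.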
2.5 3D Object Detection Networks
Ever since the introduction of depth data from sensors
such as Microsoft Kinect, researchers have been trying
to incorporate depth data into computer vision problems.
Lin et al. (Lin et al., 2013) proposed a method for 3D
object detection using 2D segmentation and 3D geometry.
Song et al. (Song et al., 2015) proposed deep sliding
models for amodal 3D object detection. Schwarz et
al. (Schwarz et al., 2018) used multi-view RGB-D
cameras to perform heuristics on depth data for their
object-picking robot. Gupta et al. (Gupta et al., 2014)
introduced HHA encoding to train a Faster R-CNN
network on depth data. Gupta and Hoffman proposed
several transfer-learning techniques to transfer weights
from a 2D network to depth data (Hoffman et al.,
2016; Gupta et al., 2016).
2.6 Shortcomings of Prior Art
Most of the previously described methods use only
conventional 2D image datasets. NYUD provides
RGB-D data, but unfortunately the dataset exhibits
a very high density of cluttered objects, many of
which exhibit no depth features (curtains, windows,
etc.). With such a cluttered RGB-D dataset, when
training uses both RGB data and depth data, the depth
data contributes little to learning. For example, when
we previously tried to use the depth data of NYUD
for training, there were few features to learn from
depth, and preventing overfitting was challenging
because of the small number of images.
Regarding VoxelNet, the network is trained on a fixed
set of object classes (Car, Bike, etc.), so retraining
it for a different set of objects is impractical. For
both NYUD and VoxelNet, a large amount of 3D depth
data is required for conventional neural-network
training. Unfortunately, no 3D depth dataset currently
exists that is large enough to rival conventional
2D image datasets (ImageNet, etc.).
Ever since the publication of RGB-D datasets such as
NYUD and SUN RGB-D (Song et al., 2015), researchers
have been trying to incorporate depth data into the
learning and detection process. Methods range from
heuristics to encoding techniques such as HHA.
Researchers have also tried transfer-learning
techniques that train depth weights using ImageNet
datasets. All these techniques are computationally
intensive and difficult to move closer to hardware
such as an SoC, and they also make the resulting
systems more complex.
3 PROPOSED WORK
In this paper, we propose a DNN Object Detection
system that dissociates the depth-data from the RGB
data. In this way, our proposed system does not
need to require a large training dataset for the depth
data but still simultaneously extracts meaningful in-
formation from the depth sensor. Furthermore, our
system combines this depth data with conventional
2D-trained image data to generate a practical, low-
complexity object detection and classification system.
This paper is organized as follows. First, we give
an overview of our system, followed by test results
and a comparison with current state-of-the-art
systems, and we conclude with our future research
directions.
3.1 System Overview
Figure 5 shows an overview of the system architec-
ture. We designed our system with the understanding
that there currently exists no large dataset of depth
data for training. First, our system generates bound-
ing boxes from the depth sensor output by perform-
ing clustering on the 3D point cloud. After denois-
ing and cleaning of the clustered depth objects, we
use the clustered objects to split the 2D image into
sub-images, which are fed to a 2D-image deep neural
classification network. For Threshold Filtering, a
threshold on the classification top-1 score is set to
judge whether a classification result is acceptable:
if the top-1 score is greater than the threshold, the
output is accepted. The detailed operation of each
sub-system is explained below.
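As a minimal sketch of this flow, the following puts the stages in order; the function names, the 2D box format, and the threshold value are our illustrative assumptions, not the exact implementation.

```python
def threshold_filter(results, threshold=0.6):
    """Keep (label, score, box) results whose top-1 score clears the bar."""
    return [r for r in results if r[1] >= threshold]

def detect(image, point_cloud, cluster_fn, classify_fn, threshold=0.6):
    """Depth clustering proposes boxes; a 2D classifier labels each crop."""
    boxes = cluster_fn(point_cloud)                  # 3D clustering -> 2D boxes
    results = []
    for (x0, y0, x1, y1) in boxes:
        crop = [row[x0:x1] for row in image[y0:y1]]  # sub-image of the RGB frame
        label, score = classify_fn(crop)             # 2D DNN top-1 result
        results.append((label, score, (x0, y0, x1, y1)))
    return threshold_filter(results, threshold)
```

Here `cluster_fn` would wrap the point-cloud clustering, denoising, and cleaning steps, and `classify_fn` the 2D classification network; only crops whose top-1 score passes the threshold reach the output.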
FotonNet: A Hardware-efficient Object Detection System using 3D-depth Segmentation and 2D-deep Neural Network Classifier