Figure 2.1.6: Azure Kinect 1024x1024 K4aviewer results
visualizing the scenes from RGB, Infrared & Depth
cameras corresponding to 720p resolution.
Figure 2.1.7: Azure Kinect 1024x1024 K4aviewer results
visualizing the Point Cloud scene corresponding to 720p
resolution.
Figures 2.1.4 to 2.1.7 show the RGB, infrared,
depth and point cloud results obtained from the
Azure Kinect DK.
2.2 NVIDIA Jetson Tx2
The NVIDIA Jetson Tx2 is designed explicitly for
high-performance AI on-the-edge applications. It
houses both the GPU and the CPU on the same
System-on-Chip (SoC), which speeds up processing
for emerging deep neural networks (NVIDIA, 2021).
The Jetson Tx2 leverages the Pascal GPU architecture
to increase the performance of each Streaming
Multiprocessor (SM). The Graphics Processing Cluster
(GPC) includes multiple SM units and a Raster Engine
for computing, rasterization, shading and texturing.
The Jetson Tx2 operates between 5 watts at maximum
efficiency and 15 watts at maximum performance
(NVIDIA, 2021). This power efficiency makes it well
suited to AI on-the-edge applications such as
autonomous vehicles, drones, virtual reality and
mixed reality.
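To make the 5 watt to 15 watt range concrete, the sketch below works through
the arithmetic for a battery-powered deployment. The 50 Wh battery capacity
and the 30 frames-per-second rate are hypothetical values chosen only for
illustration; they are not measurements from this work, and the calculation
ignores every other component that would draw power.

```python
def runtime_hours(battery_wh: float, power_w: float) -> float:
    """Ideal runtime in hours at a constant power draw (no conversion losses)."""
    if power_w <= 0:
        raise ValueError("power_w must be positive")
    return battery_wh / power_w


def energy_per_frame_joules(power_w: float, frames_per_second: float) -> float:
    """Energy spent per processed frame: watts divided by frames per second."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return power_w / frames_per_second


BATTERY_WH = 50.0   # hypothetical battery capacity, not from the source
FRAME_RATE = 30.0   # hypothetical processing rate, not from the source

# The two power levels are the ones quoted in the text above.
for label, power_w in (("max efficiency", 5.0), ("max performance", 15.0)):
    hours = runtime_hours(BATTERY_WH, power_w)
    joules = energy_per_frame_joules(power_w, FRAME_RATE)
    print(f"{label}: {hours:.2f} h runtime, {joules:.3f} J per frame")
```

Under these assumptions the threefold difference in power draw translates
directly into a threefold difference in runtime, which is why the operating
mode matters for battery-constrained applications.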
Figure 2.2.1 shows the NVIDIA Jetson Tx2 connected
to the Azure Kinect DK, applying the GPU's parallel
computing capabilities to the data obtained from
the camera.
Figure 2.2.1: Edge AI Setup Leveraging Azure Kinect DK
Accelerated with NVIDIA Jetson Tx2.
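As an illustration of why camera data suits parallel processing, the sketch
below back-projects a depth image into a point cloud with a simple pinhole
camera model. It is a minimal sketch and not the pipeline used in this work:
the function name and the intrinsic parameters (fx, fy, cx, cy) are
hypothetical, lens distortion is ignored, and depth is assumed to be stored
in millimetres with 0 marking an invalid pixel.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(depth_mm: List[List[int]],
                    fx: float, fy: float,
                    cx: float, cy: float) -> List[Point]:
    """Back-project a depth image (millimetres) to 3D points (metres).

    Assumes an ideal pinhole camera: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point. These are illustrative parameters, not
    calibration values from the source.
    """
    points: List[Point] = []
    for v, row in enumerate(depth_mm):        # v: pixel row
        for u, d in enumerate(row):           # u: pixel column
            if d == 0:                        # assumed marker for "no depth"
                continue
            z = d / 1000.0                    # millimetres to metres
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 example with made-up depth values and intrinsics.
cloud = depth_to_points([[0, 1000], [2000, 0]], fx=2.0, fy=2.0, cx=0.5, cy=0.5)
print(cloud)
```

Each output point depends only on its own pixel, so the two loops can be
mapped to one GPU thread per pixel; the pure-Python loops here are only meant
to make the per-pixel computation explicit.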
2.3 Emergence of AI-on-the-Edge
With the rise of Industry 5.0, we are at the cusp of
a technology explosion that is driving innovation.
The convergence of technologies such as Big Data,
Artificial Intelligence and Deep Learning is
enabling AI on-the-edge applications.
With rapidly evolving hardware such as Intel
Movidius VPUs, NVIDIA GPUs, Intel Nervana Neural
Network Processors (NNP), Google Tensor Processing
Units (TPUs), and Intel and Xilinx FPGAs, deep
learning algorithms are emerging with advanced
architectures that deliver better performance, lower
latency and higher throughput, opening up new
possibilities for disruptive technologies.
Combining state-of-the-art technologies with deep
learning architectures opens new product markets and
enables immersive business solutions. Approaches to
computer vision include CNN, DNN, cuDNN, Mask R-CNN,
Mesh R-CNN, LSTM, GoogLeNet, ResNet, SegNet and
YOLO, which are trained on 2D data.
With the focus of Industry 5.0 (Atwell, 2017) on
real-time applications, point-cloud-based 3D deep
learning is rapidly gaining pace, as it involves an
end-to-end deep learning network that acquires
features directly from the point clouds.
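One way a network can acquire features directly from an unordered point
cloud is to apply the same small layer to every point and then pool with a
symmetric function such as max, as PointNet-style networks do. The sketch
below illustrates that idea only; the source does not name a specific
architecture, and the function names and fixed weights are hypothetical
stand-ins for learned parameters.

```python
from typing import List, Sequence


def point_feature(point: Sequence[float],
                  weights: Sequence[Sequence[float]],
                  biases: Sequence[float]) -> List[float]:
    """Shared per-point layer: ReLU(W * p + b), applied to one (x, y, z) point."""
    return [
        max(0.0, sum(w * c for w, c in zip(row, point)) + b)
        for row, b in zip(weights, biases)
    ]


def global_feature(points: Sequence[Sequence[float]],
                   weights: Sequence[Sequence[float]],
                   biases: Sequence[float]) -> List[float]:
    """Max-pool the per-point features channel by channel.

    Max is symmetric, so the result does not depend on point order.
    """
    per_point = [point_feature(p, weights, biases) for p in points]
    return [max(channel) for channel in zip(*per_point)]


# Hypothetical weights (4 output channels); a real network would learn these.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 1.0, 1.0]]
B = [0.0, 0.0, 0.0, 0.0]

cloud = [(0.0, 1.0, 2.0), (3.0, -1.0, 0.5)]
print(global_feature(cloud, W, B))
print(global_feature(list(reversed(cloud)), W, B))  # same result, reordered input
```

Because the pooled feature is identical for any ordering of the input
points, no voxelisation or projection to 2D images is needed before the
network sees the data.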
3D object recognition, 3D object segmentation
and point-wise semantic segmentation tasks are
crucial components of applications that are tightly
constrained by hardware resources and battery life.
Therefore, it is important to design efficient and fast
3D deep learning models for real-time applications on
the edge such as virtual reality, mixed reality and
autonomous driving.