Table 4: Example calculation of cloud processing costs for 5000 users for the computing mechanisms (hardware and energy costs of the edge devices are not taken into account).

                     Cloud only   Edge only   Split5      Split10     Split13
#users/instance      10           228         29          32          143
#instances total     500          22          173         157         35
total costs/month    €3763.5      €165.594    €1302.171   €1181.739   €263.445
monthly costs/user   €0.7526      €0.0331     €0.2604     €0.2363     €0.0526
We also calculated the monthly cost of running t2.micro instances for these 5000 users. The results are listed in table 4.
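The instance counts and totals in table 4 can be reproduced with a simple ceiling division. A minimal sketch follows; note that the per-instance monthly price of €7.527 is back-derived from the cloud-only column of table 4 (3763.5 / 500), not an official AWS quote:

```python
import math

USERS = 5000
INSTANCE_EUR_MONTH = 7.527  # assumption: back-derived from table 4 (3763.5 / 500)

# Maximum number of users a single t2.micro can serve, per mechanism (table 4).
capacity = {"Cloud only": 10, "Edge only": 228,
            "Split5": 29, "Split10": 32, "Split13": 143}

for mechanism, users_per_instance in capacity.items():
    instances = math.ceil(USERS / users_per_instance)
    total = instances * INSTANCE_EUR_MONTH
    print(f"{mechanism:10s} {instances:4d} instances, "
          f"EUR {total:.3f}/month, EUR {total / USERS:.4f}/user")
```

Running this reproduces the instance counts (500, 22, 173, 157, 35) and the total monthly costs of table 4.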
As we can see in table 4, pricing can change drastically depending on the computing mechanism used. The costs depend on the number of required instances, and thus on the maximum number of users each instance can serve. When we combine the results of
table 3 and table 4, we can calculate the total costs
per month for the global pipeline. Results of these
costs can be found in table 5 for the Jetson Nano and
table 6 for the Raspberry Pi. We assume that the devices remain functional for one year, so the purchase cost of the edge hardware is amortised over 12 months. As previously mentioned, we
can see a huge difference between the various computing mechanisms. Another noticeable fact, already apparent in table 3, is the large difference in expenses depending on the edge device used. These differences stem from differing energy costs and from the total cost of purchasing 5000 devices.
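The per-user totals thus combine three components: the cloud share, the energy cost, and the amortised hardware price. A hedged sketch of that combination (the device price and energy cost below are illustrative placeholders, not the values from table 3):

```python
def monthly_cost_per_user(cloud_eur, energy_eur, device_price_eur,
                          lifetime_months=12):
    """Total monthly cost per user: cloud share + energy + amortised hardware.

    The device purchase price is spread over its assumed one-year lifetime.
    """
    return cloud_eur + energy_eur + device_price_eur / lifetime_months

# Illustrative only: EUR 99 device price and EUR 0.30/month energy are
# placeholders, combined with the edge-only cloud cost from table 4.
print(monthly_cost_per_user(cloud_eur=0.0331, energy_eur=0.30,
                            device_price_eur=99.0))
```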
We can conclude that a balance needs to be found between the cost of the edge device and the mechanism used, depending on the requirements of the application. For example, with an edge-only method the cloud costs are minimal, but as stated earlier a split model may offer better speed performance. This benefit comes at the disadvantage of a higher monthly cost due to the higher CPU usage in the cloud. This trade-off between time performance and cost is visualised in fig. 10, which shows that different distribution mechanisms are indeed optimal in different situations. We observe that, for smaller edge devices, edge-only computation is the cheapest and fastest solution for this network, while for larger devices a middle split between edge and cloud is preferable.
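This trade-off decision can be sketched as picking the cheapest mechanism that still fits a latency budget. In the sketch below, only the per-user costs come from table 4; the latency figures are illustrative placeholders, not measured values:

```python
# (latency_ms, EUR/user/month) per mechanism: latencies are hypothetical,
# costs are the per-user figures from table 4.
options = {
    "cloud-only": (250.0, 0.7526),
    "edge-only":  (120.0, 0.0331),
    "split13":    (90.0, 0.0526),
}

def cheapest_within(budget_ms, options):
    """Return the cheapest mechanism whose latency fits the budget, else None."""
    feasible = {k: v for k, v in options.items() if v[0] <= budget_ms}
    if not feasible:
        return None
    return min(feasible, key=lambda k: feasible[k][1])

print(cheapest_within(100.0, options))  # only split13 meets the budget
print(cheapest_within(150.0, options))  # edge-only fits and is cheaper
```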
6 CONCLUSION
In this paper, we studied which parameters determine
the optimal distribution of CNN computations be-
tween edge and cloud computing to maximise speed
and performance benefits. To answer this question, we studied a practical use case: the detection of litter by a fleet of cloud-connected embedded devices. We trained a MobilenetV2 CNN model and performed experiments on a self-designed pipeline in which we split this CNN model over the edge and cloud instances.
We first conducted research on the three different instances separately: two possible edge devices, a Jetson Nano and a Raspberry Pi, and the cloud. Each has distinctive specifications and hence different performance values.
In our experiments, we split the MobilenetV2 architecture between the convolution layers. We compared these partitioned models with the ordinary computing mechanisms (edge-only and cloud-only) by timing each relevant process of the pipeline. Analysing the tests on the 160 × 160 model, we see that a partitioned model yields greater benefits. This behaviour can be linked to the amount of data sent over the network: edge computing can reduce latency by sending less data, but this is counterbalanced by the greater computing power of the cloud.
For a larger input image size, 224 × 224, the model gave different results. The most obvious difference is that the prediction times on both the edge and the cloud increased. With the change in input size of the neural network, the intermediate activation tensor sizes change as well. Because of this, split13 becomes the fastest choice for the Jetson Nano, as opposed to the 160 × 160 model. Because of the weaker computational performance of the Raspberry Pi and the strong computing power of the cloud, the fastest mechanism there is offloading this larger model to the cloud and sending over the resized input image.
As sending intermediate results over the network consumes a substantial fraction of the total latency, we investigated the influence of the network speed. We conclude that networks with limited specifications (such as 4G) indeed impact the performance of the pipeline. The most beneficial mechanism in this situation is running the entire model on the edge device, which sends the least amount of data.
Besides optimising for maximal speed performance, we also studied the economic cost of each mechanism, an optimisation criterion that is orthogonal to performance. When speed is important, we conclude that a high bandwidth together with a split mechanism and a powerful but expensive edge device is recommended. Economically, the choice of mechanism makes a huge difference in expenses. From our detailed cost estimate, we can state that when the CPU usage in the cloud increases, costs increase simultaneously since more instances are needed to divide the workload. This