delay is inserted into the shadow chain to permit
duplicate request removal.
The resynchronisation recovery process has been
evaluated. In the link disconnect case the time of
failure is 10 seconds, recovery proceeds at 95 ms
per request, and the overall throughput is 13.1
requests per second over a one-minute test duration.
The maximum request latency is 1.7 s in this case.
For the node reboot failure case, recovery is
marginally faster than for the network disconnection;
fewer requests need to be buffered, but the
corresponding request replay time is longer at 235 ms.
This is due to the node taking some time to return to
its normal operating state. The maximum latency is
also 1.7 s in this case.
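The buffering, replay and duplicate removal behaviour
evaluated above can be illustrated with a minimal
sketch. It is not the evaluated implementation: it
assumes that every update request carries a unique
identifier, and the names ReplayBuffer, request_id and
send are hypothetical.

```python
from collections import OrderedDict
from typing import Callable


class ReplayBuffer:
    """Buffers shadow requests while the replica is unreachable and
    replays them in arrival order once it is reachable again.

    Illustrative only: assumes each request has a unique request_id.
    """

    def __init__(self, send: Callable[[str, dict], bool]) -> None:
        # send is a hypothetical delivery function that returns True
        # when the replica accepted the request and False otherwise.
        self._send = send
        self._pending: "OrderedDict[str, dict]" = OrderedDict()
        # Unbounded here for brevity; a real deployment would expire entries.
        self._delivered: set = set()

    def submit(self, request_id: str, payload: dict) -> None:
        # Duplicate removal: ignore requests already delivered or queued.
        if request_id in self._delivered or request_id in self._pending:
            return
        # Preserve ordering: once anything is queued, new requests queue too.
        if self._pending or not self._send(request_id, payload):
            self._pending[request_id] = payload
        else:
            self._delivered.add(request_id)

    def replay(self) -> int:
        """Replay queued requests in order and stop at the first failure.

        Returns the number of requests replayed successfully.
        """
        replayed = 0
        while self._pending:
            request_id, payload = next(iter(self._pending.items()))
            if not self._send(request_id, payload):
                break
            del self._pending[request_id]
            self._delivered.add(request_id)
            replayed += 1
        return replayed
```

In this sketch the replay cost grows with the number of
requests buffered during the failure, which is
consistent with recovery being reported per request.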
4 CONCLUSIONS
In hybrid edge/cloud microservice deployments it is
necessary to consider the resilience of stateful
services carefully in order to fully support continuous
operation without performance degradation. The
proposed approach, using Envoy-based proxies, is a
promising method of permitting replication and
recovery of service state, by using shadowing,
without requiring performance-limiting distributed
file system storage or database replication solutions
across the edge/cloud boundary. The approach
proposed and evaluated in this paper has the
advantages that it does not require services or
underlying storage solutions to be modified and does
not significantly affect throughput or latency
performance. For instance, a 17% reduction in
throughput was observed for large update requests
and negligible additional latency is introduced in the
primary request paths. However, some limitations
arise, most of which have been overcome or avoided.
In particular, independence between different client
application sessions is assumed. In that case, sticky
session load balancing, such as Maglev or hash
tables, avoids the need for cluster-wide shared PV
storage, which can be difficult or undesirable to
support. The use of cross-session and service state
should also be avoided, as it would create conflicts
between parallel sessions and lead to complex state
synchronisation. In addition, updates must not
contain time-dependent logic (or the original time
of the request needs to be emulated in the service).
This approach has been
evaluated in the context of retail services which
support transaction handling, inventory management
and promotion services. These require service state
data to be replicated consistently and without loss
across the edge/cloud, so that operation can continue
with low overhead even under node failure and cloud
network disconnect events.
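The session independence assumption can be made
concrete with a minimal sketch of hash-based sticky
session routing. Rendezvous (highest random weight)
hashing is used here as a simpler stand-in for the
Maglev lookup table and is not the load balancer used
in the evaluation; the function names and the replica
and session identifiers are hypothetical.

```python
import hashlib
from typing import Sequence


def _score(session_id: str, replica: str) -> int:
    """Deterministic pseudo-random weight for a (session, replica) pair."""
    digest = hashlib.sha256(f"{session_id}|{replica}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_replica(session_id: str, replicas: Sequence[str]) -> str:
    """Return the replica with the highest weight for this session.

    Every request of a session maps to the same replica, so that
    replica can keep the session state on local storage.
    """
    if not replicas:
        raise ValueError("at least one replica is required")
    return max(replicas, key=lambda replica: _score(session_id, replica))


# Hypothetical replica names, for illustration only.
replicas = ["edge-0", "edge-1", "edge-2"]
chosen = pick_replica("session-42", replicas)
```

With this scheme a session only moves to another
replica when its own replica is removed, so the state
of unaffected sessions does not need to be shared
across the cluster.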
Further work is needed to evaluate the
improvements that are possible by integrating the
duplicate removal and caching functionality into the
Envoy proxies, rather than keeping them separate.
REFERENCES
Sampaio, A., Rubin, J., Beschastnikh, I. and Rosa, N.
(2019) Improving microservice-based applications with
runtime placement adaptation. J Internet Serv Appl 10,
4 (2019). https://doi.org/10.1186/s13174-019-0104-0
Vayghan, A. L., Saied, M., Toeroe, M. and Khendek, F.
(2019) Microservice Based Architecture: Towards
High-Availability for Stateful Applications with
Kubernetes, 2019 IEEE 19th International Conference
on Software Quality, Reliability and Security (QRS),
2019, pp. 176-185, doi: 10.1109/QRS.2019.00034
Mendonca, N., Aderaldo, C., Camara, J. and Garlan, D.
(2020) Model-Based Analysis of Microservice
Resiliency Patterns, In 2020 IEEE International
Conference on Software Architecture (ICSA), 2020, pp.
114-124, doi: 10.1109/ICSA47634.2020.00019.
Wang, Y., Xia, Y., Zhang, Y., Melissourgos, D., Odegbile,
J., Chen, S. (2021) A Full Mirror Computation Model
for Edge-Cloud Computing, In IC3 '21: 2021
Thirteenth International Conference on Contemporary
Computing (IC3-2021) August 2021 Pages 132–139
https://doi.org/10.1145/3474124.3474142
Baboi, M., Iftene, A., Gîfu, D. (2019). Dynamic
Microservices to Create Scalable and Fault Tolerance
Architecture. In Procedia Computer Science, Volume
159, 2019, Pages 1035-1044, ISSN 1877-0509
Dhamane, R., Patino, M., Valerio, M., Peris, R. (2014)
Performance Evaluation of Database Replication
Systems. In Proceedings of the 18th International
Database Engineering & Applications Symposium
IDEAS, July 2014, Pages 288–293,
https://doi.org/10.1145/2628194.2628214
Kang, Z., An, K., Gokhale, A., Pazandak, P. (2021). A
Comprehensive Performance Evaluation of Different
Kubernetes CNI Plugins for Edge-based and
Containerized Publish/Subscribe Applications.
In 9th IEEE International Conference on Cloud
Engineering, doi: 10.1109/IC2E52221.2021.00017.
Johansson, A. (2022). HTTP Load Balancing Performance
Evaluation of HAProxy, NGINX, Traefik and Envoy
with the Round-Robin Algorithm (Dissertation).
Retrieved from:
http://urn.kb.se/resolve?urn=urn:nbn:se:his:diva-21475
Eisenbud, D. E., Yi, C., Contavalli, C., Smith, C., Kononov,
R., Mann-Hielscher, E., Cilingiroglu, A., Cheyney, B.,
Shang, W., Hosein, J. D. (2016) Maglev: A Fast and
Reliable Software Network Load Balancer, In 13th
USENIX Symposium on Networked Systems Design and
Implementation (NSDI 16), USENIX Association, Santa
Clara, CA (2016), pp. 523-535.