IEEE Transactions on Pattern Analysis and Machine
Intelligence.
He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep resid-
ual learning for image recognition. In Proceedings of
the IEEE conference on computer vision and pattern
recognition, pages 770–778.
He, Z., Zhang, T., and Lee, R. B. (2019). Model inversion
attacks against collaborative inference. In Proceed-
ings of the 35th Annual Computer Security Applica-
tions Conference, pages 148–162.
Herrmann, C., Bowen, R. S., and Zabih, R. (2020). Channel
selection using Gumbel softmax. In European Conference
on Computer Vision, pages 241–257. Springer.
Jang, E., Gu, S., and Poole, B. (2016). Categorical
reparameterization with Gumbel-softmax. arXiv preprint
arXiv:1611.01144.
Jeong, H.-J., Lee, H.-J., Shin, C. H., and Moon, S.-M.
(2018). IONN: Incremental offloading of neural
network computations from mobile devices to edge
servers. In Proceedings of the ACM Symposium on
Cloud Computing, pages 401–411.
Kang, Y., Hauswald, J., Gao, C., Rovinski, A., Mudge, T.,
Mars, J., and Tang, L. (2017). Neurosurgeon: Col-
laborative intelligence between the cloud and mobile
edge. ACM SIGARCH Computer Architecture News,
45(1):615–629.
Kingma, D. P. and Welling, M. (2013). Auto-encoding
variational Bayes. arXiv preprint arXiv:1312.6114.
Laskaridis, S., Venieris, S. I., Almeida, M., Leontiadis,
I., and Lane, N. D. (2020). SPINN: Synergistic progressive
inference of neural networks over device and
cloud. In Proceedings of the 26th annual interna-
tional conference on mobile computing and network-
ing, pages 1–15.
Li, E., Zeng, L., Zhou, Z., and Chen, X. (2019). Edge
AI: On-demand accelerating deep neural network inference
via edge computing. IEEE Transactions on
Wireless Communications, 19(1):447–457.
Matsubara, Y., Baidya, S., Callegaro, D., Levorato, M., and
Singh, S. (2019). Distilled split deep neural networks
for edge-assisted real-time systems. In Proceedings of
the 2019 Workshop on Hot Topics in Video Analytics
and Intelligent Edges, pages 21–26.
Matsubara, Y., Callegaro, D., Singh, S., Levorato, M.,
and Restuccia, F. (2022a). BottleFit: Learning compressed
representations in deep neural networks for
effective and efficient split computing. arXiv preprint
arXiv:2201.02693.
Matsubara, Y., Levorato, M., and Restuccia, F. (2021). Split
computing and early exiting for deep learning applica-
tions: Survey and research challenges. ACM Comput-
ing Surveys (CSUR).
Matsubara, Y., Yang, R., Levorato, M., and Mandt,
S. (2022b). Supervised compression for resource-
constrained edge computing systems. In Proceedings
of the IEEE/CVF Winter Conference on Applications
of Computer Vision, pages 2685–2695.
Mentzer, F., Agustsson, E., Tschannen, M., Timofte, R., and
Van Gool, L. (2018). Conditional probability mod-
els for deep image compression. In Proceedings of
the IEEE Conference on Computer Vision and Pattern
Recognition, pages 4394–4402.
Oh, H. and Lee, Y. (2019). Exploring image reconstruc-
tion attack in deep learning computation offloading.
In the 3rd International Workshop on Deep Learning for
Mobile Systems and Applications (EMDL ’19).
Pacheco, R. G., Couto, R. S., and Simeone, O. (2021).
Calibration-aided edge inference offloading via adap-
tive model partitioning of deep neural networks. In
ICC 2021-IEEE International Conference on Commu-
nications, pages 1–6. IEEE.
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and
Chen, L.-C. (2018). MobileNetV2: Inverted residuals
and linear bottlenecks. In Proceedings of the IEEE
conference on computer vision and pattern recogni-
tion, pages 4510–4520.
Shao, J. and Zhang, J. (2020). BottleNet++: An end-to-end
approach for feature compression in device-edge co-
inference systems. In 2020 IEEE International Con-
ference on Communications Workshops (ICC Work-
shops), pages 1–6. IEEE.
Teerapittayanon, S., McDanel, B., and Kung, H.-T. (2016).
BranchyNet: Fast inference via early exiting from
deep neural networks. In 2016 23rd International
Conference on Pattern Recognition (ICPR), pages
2464–2469. IEEE.
Teerapittayanon, S., McDanel, B., and Kung, H.-T. (2017).
Distributed deep neural networks over the cloud, the
edge and end devices. In 2017 IEEE 37th interna-
tional conference on distributed computing systems
(ICDCS), pages 328–339. IEEE.
Veit, A. and Belongie, S. (2018). Convolutional networks
with adaptive inference graphs. In Proceedings of the
European Conference on Computer Vision (ECCV),
pages 3–18.
Veit, A., Wilber, M. J., and Belongie, S. (2016). Residual
networks behave like ensembles of relatively shallow
networks. Advances in neural information processing
systems, 29.
Venugopal, S., Gazzetti, M., Gkoufas, Y., and Katrinis,
K. (2018). Shadow puppets: Cloud-level accurate
AI inference at the speed and economy of edge. In
USENIX Workshop on Hot Topics in Edge Computing
(HotEdge 18).
Wu, M., Ye, D., Zhang, C., and Yu, R. (2021). Spears
and shields: attacking and defending deep model
co-inference in vehicular crowdsensing networks.
EURASIP Journal on Advances in Signal Processing,
2021(1):1–21.
Xia, W., Yin, H., Dai, X., and Jha, N. K. (2021). Fully
dynamic inference with deep neural networks. IEEE
Transactions on Emerging Topics in Computing.
Yao, S., Li, J., Liu, D., Wang, T., Liu, S., Shao, H., and
Abdelzaher, T. (2020). Deep compressive offloading:
Speeding up neural network inference by trading edge
computation for network latency. In Proceedings of
the 18th Conference on Embedded Networked Sensor
Systems, pages 476–488.
Zhou, Z., Chen, X., Li, E., Zeng, L., Luo, K., and Zhang, J.
(2019). Edge intelligence: Paving the last mile of arti-
ficial intelligence with edge computing. Proceedings
of the IEEE, 107(8):1738–1762.
Adaptive and Collaborative Inference: Towards a No-compromise Framework for Distributed Intelligent Systems