Layered Batch Inference Optimization Method for Convolutional Neural Networks Based on CPU

Hongzhi Zhao, Xun Liu, Jingzhen Zheng, Jingjing He

2023

Abstract

In recent years, the CPU has remained the most widely used computing platform, and in CNN inference applications batching is an essential technique on many systems. Because the arrival times and sample counts of convolutional neural network inference requests are unpredictable, inference with a small batch size cannot fully utilize the multi-threaded computation resources of a CPU. In this paper, we propose a layered batch inference optimization method for CNNs on CPU (LBCI). Within a single batch, and under the constraint of the user-preferred delay, the method performs "layer-to-layer" optimal scheduling between being-processed and to-be-processed CNN inference tasks, conducting dynamic batch inference during processing. The experimental results show that, compared with the traditional method, LBCI reduces the inference time by 10.43%-52.43% for requests with a single-sample inference task and by 4.32%-22.76% for requests with a multi-sample inference task.
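The abstract describes batching inference requests under a user-preferred delay constraint. The paper's "layer-to-layer" scheduling is not detailed here, but the general idea of delay-bounded dynamic batching can be sketched as follows; this is an illustrative sketch, not the authors' LBCI algorithm, and `collect_batch`, `max_batch`, and `max_wait_s` are hypothetical names chosen for this example:

```python
import time
from queue import Queue, Empty

def collect_batch(request_queue, max_batch, max_wait_s):
    """Illustrative delay-bounded batcher (not the paper's LBCI method):
    gather up to max_batch requests, but wait no longer than max_wait_s
    for stragglers so no request exceeds its preferred delay."""
    batch = [request_queue.get()]  # block until the first request arrives
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # delay budget spent: run the batch as-is
        try:
            batch.append(request_queue.get(timeout=remaining))
        except Empty:
            break  # no more pending requests within the budget
    return batch
```

A batch formed this way trades a small, bounded queueing delay for better utilization of the CPU's threads; LBCI goes further by letting newly arrived tasks join a batch that is already mid-inference, layer by layer.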



Paper Citation


in Harvard Style

Zhao H., Liu X., Zheng J. and He J. (2023). Layered Batch Inference Optimization Method for Convolutional Neural Networks Based on CPU. In Proceedings of the 2nd International Seminar on Artificial Intelligence, Networking and Information Technology - Volume 1: ANIT; ISBN 978-989-758-677-4, SciTePress, pages 182-189. DOI: 10.5220/0012277100003807


in Bibtex Style

@conference{anit23,
author={Hongzhi Zhao and Xun Liu and Jingzhen Zheng and Jingjing He},
title={Layered Batch Inference Optimization Method for Convolutional Neural Networks Based on CPU},
booktitle={Proceedings of the 2nd International Seminar on Artificial Intelligence, Networking and Information Technology - Volume 1: ANIT},
year={2023},
pages={182-189},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012277100003807},
isbn={978-989-758-677-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 2nd International Seminar on Artificial Intelligence, Networking and Information Technology - Volume 1: ANIT
TI - Layered Batch Inference Optimization Method for Convolutional Neural Networks Based on CPU
SN - 978-989-758-677-4
AU - Zhao H.
AU - Liu X.
AU - Zheng J.
AU - He J.
PY - 2023
SP - 182
EP - 189
DO - 10.5220/0012277100003807
PB - SciTePress