On the Convergence of Stochastic Gradient Descent in Low-Precision Number Formats
Matteo Cacciola, Antonio Frangioni, Masoud Asgharian, Alireza Ghaffari, Vahid Partovi Nia
2023
Abstract
Deep learning models dominate almost all artificial intelligence tasks, such as vision, text, and speech processing. Stochastic Gradient Descent (SGD) is the main tool for training such models, and its computations are usually performed in single-precision floating-point number format. The convergence of single-precision SGD normally agrees with the theoretical results derived for real numbers, since single-precision computations exhibit negligible error. However, the numerical error increases when the computations are performed in low-precision number formats. This provides compelling reasons to study SGD convergence adapted to low-precision computations. We present both deterministic and stochastic analyses of the SGD algorithm, obtaining bounds that show the effect of the number format. Such bounds can provide guidelines on how SGD convergence is affected when constraints make high-precision computations impractical.
Paper Citation
in Harvard Style
Cacciola M., Frangioni A., Asgharian M., Ghaffari A. and Partovi Nia V. (2023). On the Convergence of Stochastic Gradient Descent in Low-Precision Number Formats. In Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM, ISBN 978-989-758-626-2, pages 542-549. DOI: 10.5220/0011795500003411
in Bibtex Style
@conference{icpram23,
author={Matteo Cacciola and Antonio Frangioni and Masoud Asgharian and Alireza Ghaffari and Vahid Partovi Nia},
title={On the Convergence of Stochastic Gradient Descent in Low-Precision Number Formats},
booktitle={Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM},
year={2023},
pages={542-549},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011795500003411},
isbn={978-989-758-626-2},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods - Volume 1: ICPRAM
TI - On the Convergence of Stochastic Gradient Descent in Low-Precision Number Formats
SN - 978-989-758-626-2
AU - Cacciola M.
AU - Frangioni A.
AU - Asgharian M.
AU - Ghaffari A.
AU - Partovi Nia V.
PY - 2023
SP - 542
EP - 549
DO - 10.5220/0011795500003411