Transition of Model Performance in Dependence of the Amount of Data Corruption with Respect to Network Sizes

Thomas Seidler, Markus Abel

2024

Abstract

An important question for machine learning models concerns the quality or performance a model can achieve on given data. In other words, we want to know how robust a model is to perturbations of the data. In statistical mechanics, a standard way to "corrupt" input data is to perturb it with additive noise. This, in turn, corresponds to the typical situation of measurement noise when processing data from any sensor. Larger models often perform better because they can capture more of the variance in the data. However, if the information content cannot be retrieved because the data corruption is too large, a large network cannot compensate for the noise, and no performance is gained by scaling the network. Here we study this effect systematically: we add diffusive noise of increasing strength, on a logarithmic scale, to some well-known classification datasets. As a result, we observe a sharp transition in training and test accuracy as a function of the noise strength. In addition, we study whether the size of a network can counterbalance the described noise. The observed transition resembles a phase transition as described in the framework of statistical mechanics. We draw an analogy between systems in statistical mechanics and machine learning systems that suggests general upper bounds for certain types of problems, each described as a tuple (data, model). This is a fundamental result that may have a large impact on practical applications.
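
The corruption procedure described in the abstract (additive noise whose strength increases on a logarithmic scale) can be illustrated with a minimal sketch. This is not the authors' code: the choice of zero-mean Gaussian noise, the range of noise strengths, the random stand-in data, and the helper names `corrupt` and `noise_sweep` are all assumptions made for illustration.

```python
import numpy as np

def corrupt(x, sigma, rng):
    """Return x plus zero-mean Gaussian noise with standard deviation sigma.

    Gaussian noise is an assumption; the abstract only says "additive" and "diffusive".
    """
    return x + rng.normal(loc=0.0, scale=sigma, size=x.shape)

def noise_sweep(x, sigmas, seed=0):
    """Map each noise strength to a corrupted copy of x."""
    rng = np.random.default_rng(seed)
    return {float(s): corrupt(x, s, rng) for s in sigmas}

# Hypothetical stand-in for a dataset: 1000 samples with 64 features in [0, 1].
data_rng = np.random.default_rng(42)
clean = data_rng.random((1000, 64))

# Noise strengths spaced evenly on a logarithmic scale (the range is an assumption).
sigmas = np.logspace(-3, 1, num=9)

corrupted = noise_sweep(clean, sigmas)
for sigma, noisy in corrupted.items():
    # Empirical signal-to-noise ratio: variance of the data over variance of the noise.
    snr = clean.var() / (noisy - clean).var()
    print(f"sigma={sigma:.3e}  SNR={snr:.3e}")
```

In a study of the kind the abstract describes, each corrupted copy would presumably be used to train and evaluate networks of several sizes, with accuracy then plotted against the noise strength; that step is omitted here because the abstract does not specify the models or datasets.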


Paper Citation


in Harvard Style

Seidler T. and Abel M. (2024). Transition of Model Performance in Dependence of the Amount of Data Corruption with Respect to Network Sizes. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 1084-1091. DOI: 10.5220/0012435000003636


in Bibtex Style

@conference{icaart24,
author={Thomas Seidler and Markus Abel},
title={Transition of Model Performance in Dependence of the Amount of Data Corruption with Respect to Network Sizes},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2024},
pages={1084-1091},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012435000003636},
isbn={978-989-758-680-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Transition of Model Performance in Dependence of the Amount of Data Corruption with Respect to Network Sizes
SN - 978-989-758-680-4
AU - Seidler T.
AU - Abel M.
PY - 2024
SP - 1084
EP - 1091
DO - 10.5220/0012435000003636
PB - SciTePress