Authors:
Gauri Vaidya
1
;
Luise Ilg
2
;
Meghana Kshirsagar
1
;
Enrique Naredo
1
and
Conor Ryan
1
Affiliations:
1
Biocomputing and Developmental Systems Group, Lero, The Science Foundation Ireland Research Centre for Software, Computer Science and Information System Department, University of Limerick, Limerick, Ireland
;
2
Science and Engineering, University of Limerick, Limerick, Ireland
Keyword(s):
Convolutional Neural Networks, Grammatical Evolution, Machine Learning, GPU, Business Modelling, Hyperparameters, Smart City.
Abstract:
Deep learning (DL) networks have the dual benefits due to over parameterization and regularization rendering them more accurate than conventional Machine Learning (ML) models. However, they consume massive amounts of resources in training and thus are computationally expensive. A single experimental run consumes a lot of computational resources, in such a way that it could cost millions of dollars thereby dramatically leading to massive project costs. Some of the factors for vast expenses for DL models can be attributed to the computational costs incurred during training, massive storage requirements, along with specialized hardware such as Graphical Processing Unit (GPUs). This research seeks to address some of the challenges mentioned above. Our approach, HyperEstimator, estimates the optimal values of hyperparameters for a given Convolutional Neural Networks (CNN) model and dataset using a suite of Machine Learning algorithms. Our approach consists of three stages: (i) obtaining c
andidate values for hyperparameters with Grammatical Evolution; (ii) prediction of optimal values of hyperparameters with supervised ML techniques; (iii) training CNN model for object detection. As a case study, the CNN models are validated by using a real-time video dataset representing road traffic captured in some Indian cities. The results are also compared against CIFAR10 and CIFAR100 benchmark datasets.
(More)