Optimizing Musical Genre Classification Using Genetic Algorithms

Caio Grasso¹ᵃ, Thiago Carvalho¹,²ᵇ, José Franco Amaral¹ᶜ, Pedro Coelho¹ᵈ, Robert Oliveira¹ᵉ and Giomar Olivera¹ᶠ

¹FEN/UERJ, Rio de Janeiro State University, Rio de Janeiro, Brazil
²Electrical Engineering Department, Pontifical Catholic University of Rio de Janeiro, Rio de Janeiro, Brazil

ᵃ https://orcid.org/0009-0008-3316-1026
ᵇ https://orcid.org/0000-0001-8689-1438
ᶜ https://orcid.org/0000-0003-4951-8532
ᵈ https://orcid.org/0000-0003-3623-1313
ᵉ https://orcid.org/0000-0003-0000-3001
ᶠ https://orcid.org/0000-0002-7172-6525
Keywords: Genetic Algorithm, Music Classification, Signal Processing, Deep Learning.
Abstract: Classifying music into genres is a challenging yet fascinating task in audio analysis. By leveraging deep learning techniques, we can automatically categorize music based on its acoustic characteristics, opening up new possibilities for organizing and understanding large music collections. The main objective of this study is to develop and evaluate deep learning models for the classification of different musical styles. To optimize the models, we utilized Genetic Algorithms (GA) to automatically determine the optimal hyperparameters and select the model architecture, including Convolutional Neural Networks and Transformers. The results demonstrated the effectiveness of GAs in exploring the hyperparameter space, leading to improved performance across multiple architectures, with EfficientNet models standing out for their consistent and robust results. This work highlights the potential of automated optimization techniques in enhancing audio analysis tasks and emphasizes the importance of integrating deep learning and evolutionary algorithms for tackling complex music classification problems.
1 INTRODUCTION
The classification of musical genres is essential for in-
dexing and recommending songs, directly impacting
user experience on streaming platforms and the orga-
nization of the digital music industry. Musical genres group works that share common traits, such as rhythm, instrumentation, and structure, enabling the creation of personalized recommendation systems.
Given the growth and diversity of musical gen-
res, advanced classification methods are necessary to
overcome challenges related to musical heterogene-
ity and provide more refined recommendations. This
study aims to contribute to the advancement of au-
tomated classification solutions, enabling the devel-
opment of more efficient music curation and retrieval
systems.
This study focuses on the classification of musical
genres using Deep Learning (DL), particularly Con-
volutional Neural Networks (CNNs), which analyze
audio signals represented visually through spectro-
grams. To enhance the performance of these models,
Genetic Algorithms (GAs) are employed for hyperpa-
rameter optimization, automating the search for opti-
mal configurations. The research compares different
CNN architectures, such as EfficientNet and ResNet,
to identify the most effective models for genre clas-
sification. Additionally, the study evaluates the ad-
vantages of hyperparameter optimization in improv-
ing classification outcomes and verifies the efficiency
of transformer-based models for signal classification.
The work contributes to more efficient music cura-
tion and retrieval systems, offering insights into the
integration of DL and evolutionary algorithms for im-
proved accuracy and enhanced user experiences in
music streaming platforms.
The primary objective of this study is to optimize
both the model selection and hyperparameter tuning
with a singular focus: maximizing classification ac-
curacy. The use of Genetic Algorithms is centered
on identifying the best-performing configurations that
lead to the most accurate genre predictions. By sys-
tematically refining architectural choices and training
parameters, this approach ensures that improvements
are solely driven by their impact on classification ac-
curacy, reinforcing the effectiveness of hyperparam-
eter optimization in deep learning-based music genre
classification.
The following section presents a brief review of the literature on genetic algorithms for optimizing DL models and on DL models for the music genre classification task.
Section 3 will detail the proposed approach, includ-
ing the data preprocessing steps, model selection, and
the experimental protocol adopted. In Section 4, we
discuss the experiments conducted, outlining the con-
figurations and methodology employed. In Section
5, we present the discussion and results, emphasizing
how the genetic algorithm influenced hyperparame-
ter selection and model performance. Finally, Sec-
tion 6 concludes the study and highlights potential
directions for future research on applying genetic al-
gorithms to optimize hyperparameters for automatic
music genre classification.
2 LITERATURE REVIEW
The study of CNNs for audio processing and audio
signal classification has gained attention due to its im-
portance in music retrieval and recommendation tasks on digital platforms. Early studies fo-
cused on manual feature extraction of acoustic prop-
erties, such as Mel-Frequency Cepstral Coefficients
(MFCCs), timbre, and rhythm, combined with clas-
sical machine learning algorithms, including Support
Vector Machines (SVM) and K-Nearest Neighbors
(KNN) (Tzanetakis and Cook, 2002). While these
methods showed some success, their effectiveness
was limited by the challenge of capturing the more
complex nuances of musical genres.
The representation of audio signals as images has
become a widely adopted approach in audio analy-
sis, particularly in the context of musical genre clas-
sification (Müller, 2015). One of the most common
ways to achieve this representation is through spectro-
grams, which are visualizations that display the vari-
ation of sound frequencies over time. This type of
representation transforms the audio signal into a two-
dimensional format, enabling a more intuitive analy-
sis of acoustic characteristics (Müller, 2015).
With advancements in Deep Learning techniques,
CNNs have emerged as powerful tools for audio anal-
ysis, utilizing visual representations like Mel spec-
trograms to identify patterns that distinguish gen-
res (Choi et al., 2017a). For instance, Choi et al.
(2017) demonstrated the effectiveness of CNNs in
genre classification, outperforming traditional meth-
ods by leveraging the networks’ ability to learn hier-
archical features directly from raw data. Furthermore,
Dieleman and Schrauwen (Dieleman et al., 2011) ex-
plored end-to-end approaches, enabling CNNs to op-
erate directly on audio representations without requir-
ing manual feature extraction.
Another critical aspect of musical genre classi-
fication is model optimization, where hyperparame-
ter selection, such as network architecture, learning
rate, and the number of convolutional filters, plays
a pivotal role. Traditional tuning methods, such as
grid search or random search, often prove inefficient
for deep networks due to high computational costs
(Bergstra and Bengio, 2012). In this context, GAs
have been explored as a promising alternative, en-
abling automated selection of optimal hyperparame-
ter configurations. For example, Young et al. (Young
et al., 2015) demonstrated the application of GAs for
neural network architecture optimization, while Real
et al. (Real et al., 2019) showcased advancements in
using these techniques to achieve competitive perfor-
mance on deep learning benchmarks.
These studies highlight the evolution of the field,
from manual feature-based methods to the application
of deep learning and evolutionary algorithms. How-
ever, significant gaps remain in effectively integrating
these technologies to capture the diversity and com-
plexity of musical genres, motivating the proposal of
this study.
3 PROPOSED APPROACH
This section outlines the methodology adopted for the
task of musical genre classification. A step-by-step
flowchart is presented in Figure 1, detailing the key
processes, starting from the visual representation of
audio signals to model optimization. First, audio data
is converted into spectrograms to enable visual anal-
ysis. In this case, CNNs are leveraged for feature ex-
traction and classification. Finally, a GA is applied to
optimize the model’s architecture and hyperparame-
ters, ensuring the best configuration for the task.
3.1 Visual Representation of Audio
Signals
By representing audio as an image, details such as
rhythmic patterns, timbres, and harmonic transitions,
often associated with different musical genres, can be
captured (McFee et al., 2015). Additionally, many
genres share similar characteristics, which can be di-
rectly observed in their visual representations. For ex-
ample, genres like rock and blues may exhibit compa-
rable frequency patterns due to the use of similar in-
struments, whereas electronic genres may stand out
through frequency peaks generated by synthesizers
(Pons et al., 2017). In Figures 2, 3, 4 and 5, we can
Figure 1: General development flowchart.
Figure 2: Example of a spectrogram from a Pop sample (1).
Figure 3: Example of a spectrogram from a Pop sample (2).
observe the visual similarity between the representations of different musical genres.
Another relevant aspect of this approach is the
flexibility it provides in data manipulation. By adjust-
ing the frequency scale, such as using the Mel scale,
the visual representation can be aligned with human
auditory perception, facilitating the identification of
relevant patterns (McFee et al., 2015). Additionally,
techniques like data augmentation, when applied di-
rectly to spectrograms, create variations in visual rep-
resentations, increasing data diversity and enhancing
the robustness of subsequent analyses. Representing
audio as images not only offers an efficient analysis
method but also provides a unique perspective on the
similarities and differences between musical genres,
enabling a deeper understanding of the characteristics
that define each style (McFee et al., 2017).
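As a concrete illustration of this step, the sketch below converts a GTZAN-style 30-second clip into a Mel-scaled spectrogram image with librosa; the file path and parameter values (FFT size, hop length, number of Mel bands) are illustrative assumptions rather than the paper's exact settings.

import librosa
import librosa.display
import matplotlib.pyplot as plt
import numpy as np

# Load a 30-second clip at GTZAN's native 22,050 Hz sampling rate.
# The path is hypothetical, not taken from the paper.
y, sr = librosa.load("genres/pop/pop.00000.wav", sr=22050)

# Compute a Mel spectrogram; n_fft, hop_length, and n_mels are assumed values.
S = librosa.feature.melspectrogram(y=y, sr=sr, n_fft=2048,
                                   hop_length=512, n_mels=128)

# Convert power to decibels, the scale usually rendered for human
# inspection and fed to CNNs.
S_db = librosa.power_to_db(S, ref=np.max)

# Render the spectrogram as an axis-free image for the CNN pipeline.
fig, ax = plt.subplots(figsize=(4, 4))
librosa.display.specshow(S_db, sr=sr, hop_length=512,
                         x_axis="time", y_axis="mel", ax=ax)
ax.set_axis_off()
fig.savefig("pop.00000.png", bbox_inches="tight", pad_inches=0)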
Furthermore, the use of Mel scales and other
spectrogram-based representations has emerged in the
literature as an efficient way to capture relevant sound
Figure 4: Example of a spectrogram from a Metal sample (1).
Figure 5: Example of a spectrogram from a Metal sample (2).
signal features for musical genre classification (Choi
et al., 2017a). These representations are widely em-
ployed because they approximate the audio signal
to human perception, making the classification task
more intuitive for deep learning models. Recent
studies have explored the combination of multiple
computer vision approaches to improve representa-
tion quality. For example, techniques such as data
augmentation and hybrid neural networks combining
CNNs with RNNs (Recurrent Neural Networks) have
been investigated to explore the temporal and spa-
tial relationships in spectrograms, thereby enhancing
model accuracy (Choi et al., 2017b).
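As one minimal sketch of augmentation applied directly to spectrograms (the paper does not specify its method; SpecAugment-style masking with torchaudio is assumed here purely for illustration), random frequency and time masks can be applied to a spectrogram tensor before training:

import torch
import torchaudio.transforms as T

# A dummy Mel spectrogram tensor with shape (channels, mel bands, frames).
spec = torch.randn(1, 128, 1292)

# Mask random frequency bands and time frames; the mask widths
# (16 bands, 64 frames) are illustrative assumptions.
augment = torch.nn.Sequential(
    T.FrequencyMasking(freq_mask_param=16),
    T.TimeMasking(time_mask_param=64),
)
spec_aug = augment(spec)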
3.2 CNN Models
In this study, we used the visual representations
of the audio to classify music genres using CNNs.
The adopted methodology relied on converting audio
signals into spectrograms, which were subsequently
used as inputs for image classification models. This
approach has gained prominence in the literature due
to the ability of CNNs to extract complex and hierar-
chical features, significantly improving classification
performance (Hershey et al., 2017).
Traditional model architectures, such as ResNet
(He et al., 2016) and EfficientNet (Tan and Le, 2019),
played a crucial role in this study. These models
demonstrated a strong ability to generalize to data
represented as images, even when derived from non-
visual sources such as audio (Choi et al., 2017a). The
application of CNNs for spectrogram classification
proved particularly effective, capturing temporal and
frequency patterns in sound signals that are vital for
identifying musical genres. Additionally, transfer learning enables the reuse of models pre-trained on image datasets such as ImageNet, reducing computational cost and increasing accuracy on the specific datasets used in this study.
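A minimal sketch of this transfer-learning setup, assuming torchvision's ImageNet-pretrained ResNet-50 as a stand-in for the checkpoints named in Table 2, replaces the classifier head so the network predicts the 10 GTZAN genres:

import torch.nn as nn
from torchvision import models

# Load a ResNet-50 pretrained on ImageNet (an assumed stand-in for
# the "microsoft/resnet-50" checkpoint reported in Table 2).
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Swap the 1,000-class ImageNet head for a 10-class genre head;
# the pretrained convolutional features are reused as-is.
model.fc = nn.Linear(model.fc.in_features, 10)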
3.3 Optimizing Model Architecture
Using Genetic Algorithm
Unlike the grid search method, which performs an
exhaustive and predefined search across all hyperpa-
rameter combinations within a set range, GAs employ
a stochastic approach inspired by natural evolution
(Darwin, 2023). These algorithms operate based on a
population of individuals, each representing a poten-
tial solution to the problem. Through operators such
as selection, crossover, and mutation, GAs efficiently
explore the search space, dynamically adapting to
find near-optimal solutions even in high-dimensional
spaces, such as hyperparameter tuning for a CNN
(Aszemi and Dominic, 2019). This approach allows
for the discovery of hyperparameter configurations
that may not be easily identifiable through traditional
methods.
GAs rely on a binary or numerical representa-
tion of individuals—each individual being a sequence
of encoded parameters—which simplifies the imple-
mentation of mutation and crossover operations that
generate potential solutions. The selection process fa-
vors individuals with higher fitness values, ensuring
that well-performing configurations are propagated
through generations. Figure 6 illustrates a flowchart
of how a GA is implemented. The genes chosen for optimization cover essential parameters for configuring the model. The following paragraphs detail the four primary genes optimized: the model architecture, learning rate, weight decay, and training batch size.
The first gene to be optimized refers to the selec-
tion of the neural network model to be used. For the
task of music genre classification from spectrograms,
three candidate CNN models were chosen: ResNet50, EfficientNetB0, and EfficientNetB1, the backbones later reported in Table 2. The op-
timization of this gene aims to determine which of
these architectures is most suitable for the specific
task, based on their performance during training and
validation. The choice of model directly impacts the
network’s ability to capture relevant features from the
spectrograms and, therefore, is crucial for the effec-
tiveness of the classification.
The second gene to be optimized is the learning
rate, a critical hyperparameter that governs the speed
of convergence during training. Proper adjustments to
the learning rate can significantly enhance the model’s
Figure 6: Flowchart of the Genetic Algorithm Used.
convergence, preventing both premature convergence,
which may result in suboptimal solutions, and exces-
sive error oscillation, which hinders stable training.
The third gene to be optimized is the weight de-
cay, a regularization technique that prevents overfit-
ting by penalizing large weights in the network. By
finding an optimal balance, weight decay helps main-
tain a model that generalizes well to unseen data while
avoiding excessive complexity that could lead to poor
performance on new inputs.
The fourth gene to be optimized is the training
batch size, a parameter that defines the number of
samples processed by the model before updating its
weights. Smaller batch sizes can improve generaliza-
tion but may lead to noisier updates, whereas larger
batch sizes offer more stable updates but require care-
ful tuning to avoid memory limitations and training
inefficiencies. By optimizing this parameter, the ge-
netic algorithm seeks to find a balance between com-
putational efficiency and model performance.
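To make the role of the four genes concrete, the sketch below decodes a chromosome into training settings; the candidate list mirrors Table 2 and the example values come from Table 3, but the gene ordering and decoding scheme are assumptions, not the paper's exact encoding.

# Candidate backbones, matching the models reported in Table 2.
MODELS = ["google/efficientnet-b0", "google/efficientnet-b1",
          "microsoft/resnet-50"]

def decode(solution):
    # Assumed gene order: model index, learning rate,
    # weight decay, training batch size.
    return (MODELS[int(solution[0]) % len(MODELS)],
            float(solution[1]),   # learning rate for the optimizer
            float(solution[2]),   # weight decay (L2 regularization)
            int(solution[3]))     # mini-batch size for the data loader

# Example chromosome using the values the GA found in Experiment 1
# (Table 3): EfficientNet-B0, lr 0.0033, weight decay 0.0072, batch 16.
print(decode([0, 0.0033, 0.0072, 16]))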
4 EXPERIMENTS
In this section, the experiments conducted for the task
of music genre classification are described, including
the dataset used, the division of the data sets, and the
execution environment. To ensure transparency and
facilitate further research, all code, scripts, and con-
figurations used in this study have been made publicly
available.¹
4.1 Dataset
The dataset used in the experiments was GTZAN,
widely recognized in the literature as a standard
dataset for music genre classification. The dataset
contains 1,000 audio files, evenly distributed across
10 music genres: blues, classical, country, disco, hip-
hop, jazz, metal, pop, reggae, and rock. Each file is
30 seconds long and is in WAV format, with a sam-
pling rate of 22,050 Hz. The choice of GTZAN is due
to its relevance and the diversity of the music genres,
which allows for a comprehensive evaluation of the
model across different styles. The dataset was split
into 70% for training, 20% for validation, and 10%
for testing.
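A minimal sketch of this 70/20/10 split, assuming the spectrogram images are stored one folder per genre and loaded with torchvision's ImageFolder (the paper does not state its loading code):

import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Spectrogram images organized as images/<genre>/<file>.png
# (a hypothetical layout, not specified in the paper).
dataset = datasets.ImageFolder("images", transform=transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
]))

# 70% training, 20% validation, 10% testing, with a fixed seed
# so the split stays identical across experiments.
n = len(dataset)
n_train, n_val = int(0.7 * n), int(0.2 * n)
train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n - n_train - n_val],
    generator=torch.Generator().manual_seed(42))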
4.2 Experiment Environment
The entire experimental procedure was conducted in
the Google Colab environment, which provides free
GPU support, facilitating the training of deep learn-
ing models. Google Colab was chosen for its acces-
sibility, integration with popular machine learning li-
braries such as PyTorch and TensorFlow, and support
for cloud collaboration.
The experiments were segmented to optimize the
use of available computational resources, including
the pre-processing of audio spectrograms, configur-
ing the GA with the PyGAD library, and training the
CNN. Additionally, intermediate results were period-
ically saved to Google Drive to ensure data integrity
and facilitate later analysis.
¹https://github.com/ciaograsso06/Music-Genre-Classification.git
Table 1: Number of training epochs used in each experiment during the optimization of CNN models.
Experiment Epochs
Experiment 1 7
Experiment 2 10
Experiment 3 12
Experiment 4 15
Experiment 5 20
To evaluate the performance of the music genre
classification model and the effectiveness of the GA
in hyperparameter optimization, five distinct experiments were conducted. Each experiment varied the
number of training epochs (as shown in Table 1) and
produced specific configurations of optimized hyper-
parameters, such as the model, learning rate, and
weight decay. Therefore, the training epochs were not
considered as a gene for the optimization step.
For each experiment, the GA aimed to optimize the defined hyperparameters, and the model was evaluated based on the validation loss.
4.3 Experimental Protocol
In this study’s methodology, the PyGAD library (Gad,
2023) was utilized to implement the GA, enabling
the efficient optimization of hyperparameters for the
CNN model. This library provides a robust and flexi-
ble interface for defining genes, genetic operators, and
evaluation criteria, facilitating experimentation with
various configurations.
The genetic operators used in PyGAD were the
default configurations, requiring no additional setup.
These include tournament selection with a size of 2,
single-point crossover with a crossover probability of
0.8, and random mutation with a mutation probability
of 0.1. The GA was executed for a total of 5 gen-
erations, ensuring sufficient exploration of the hyper-
parameter search space while maintaining computa-
tional efficiency.
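Under these settings, a self-contained sketch of the optimization loop with PyGAD (version 3.x fitness signature) could look as follows; the gene ranges, population size, and the stub fitness are assumptions, since the real fitness trains a CNN and scores it on validation loss.

import random
import pygad

# Search space for the four genes: model index, learning rate,
# weight decay, batch size (ranges are illustrative assumptions).
gene_space = [
    [0, 1, 2],                          # index into candidate backbones
    {"low": 1e-4, "high": 1e-2},        # learning rate
    {"low": 1e-6, "high": 1e-2},        # weight decay
    {"low": 8, "high": 32, "step": 1},  # batch size
]

def fitness_func(ga_instance, solution, solution_idx):
    # Placeholder fitness: the actual pipeline trains the decoded CNN
    # and derives fitness from the validation loss. A random stub keeps
    # this sketch runnable end to end.
    val_loss = random.uniform(0.5, 2.0)
    return 1.0 / (1.0 + val_loss)  # higher fitness = lower loss

ga = pygad.GA(
    num_generations=5,                   # 5 generations, as in the text
    num_parents_mating=2,
    sol_per_pop=8,                       # population size: an assumption
    num_genes=4,
    gene_space=gene_space,
    fitness_func=fitness_func,
    parent_selection_type="tournament",  # tournament selection, size 2
    K_tournament=2,
    crossover_type="single_point",
    crossover_probability=0.8,
    mutation_type="random",
    mutation_probability=0.1,
)
ga.run()
best_solution, best_fitness, _ = ga.best_solution()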
5 DISCUSSION AND RESULTS
The detailed results of each experiment are presented
in this section, including tables and comparative anal-
yses. These tables contain metrics such as accuracy
and validation loss for the optimized solution, en-
abling a detailed analysis of the impact of each exper-
iment on model performance. Table 2 presents the backbone chosen as the model to be fine-tuned in each experiment. In addition, the best fitness of each experiment can be observed in Table 3 and analyzed in more detail in Figure 7.
Table 2: Chosen models in the optimization process.
Experiment Model
1 google/efficientnet-b0
2 google/efficientnet-b1
3 google/efficientnet-b0
4 microsoft/resnet-50
5 google/efficientnet-b0
Table 3: Parameters selected by the GA per experiment and the corresponding fitness.

Exp.  LR      W. Decay  Batch Size  Fitness
1     0.0033  0.0072    16          0.8096
2     0.0014  1e-06     22          0.8098
3     0.0012  1e-06     28          0.8159
4     0.0020  1e-06     25          0.7407
5     0.0003  0.0003    16          0.7370
Figure 7: Fitness per experiment.
Once the GA finished the optimization process,
training was carried out with these parameters, and
the model was subsequently tested on the test set, with
the results shown in Table 4.
Table 4: Test Accuracy for each experiment.
Exp. Run Time Test Accuracy
1 32 min 25 sec 66.0000%
2 39 min 14 sec 67.8201%
3 42 min 70.0000%
4 50 min 45 sec 72.0000%
5 1h 10 min 79.4126%
Although the importance of proper training with
an adequate number of epochs is evident, it is worth
noting that the use of the GA for selecting the best so-
lutions had a significant positive impact. Even with a
reduced number of training epochs, the GA was able
to identify model configurations that achieved accept-
able accuracy results, demonstrating the algorithm’s
effectiveness in optimizing the training process and
obtaining robust models in less time.
By exploring a wider range of hyperparameter
configurations, the GA was able to identify high-
performing models that would have otherwise been
overlooked. This not only reduced the overall training
time but also ensured that the models achieved com-
petitive accuracy with fewer iterations, validating the
GA's potential as a powerful tool for optimizing DL
workflows.
6 CONCLUSION
In this work, we explored the use of GA for hyper-
parameter optimization of convolutional neural net-
works in the music classification task. The ex-
periments conducted demonstrated the efficiency of
GA in identifying hyperparameter configurations that
minimize the validation loss function across different
models, such as the EfficientNet and ResNet architec-
tures.
The results indicated that GA successfully identi-
fied combinations of learning rate, weight decay, and
batch size that significantly improved model perfor-
mance, as evidenced by reduced validation loss val-
ues. The EfficientNet-b0 model exhibited consistent
performance across multiple experiments, highlight-
ing its robustness for tasks with different configura-
tions. More complex architectures, such as ResNet-
50, also benefited from optimization, underscoring
the applicability of GA to various models. This study
reaffirms that the GA is a powerful strategy for hyperparameter optimization. The approach is particularly
useful in scenarios where traditional methods, such as
grid search and random search, would be computa-
tionally expensive or less effective.
For future work, this approach will be evaluated
on additional datasets, comparing the performance
of the GA-based hyperparameter optimization with
other methods presented in the literature, including
simpler techniques. This will provide further insights
into the generalizability and robustness of the method
across different music genre classification tasks. Ad-
ditionally, Neural Architecture Search (NAS) will be
explored to optimize the model architecture within
a multi-objective framework, considering not only
classification accuracy but also hardware constraints
such as model size and inference time. Evaluating
these techniques in various domains will further clar-
ify their comparative advantages in terms of compu-
tational efficiency and model performance.
ACKNOWLEDGEMENTS
This work was supported in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) - Finance Code 001, Conselho Nacional de Desenvolvimento e Pesquisa (CNPq) under Grant 140254/2021-8, and Fundação de Amparo à Pesquisa do Rio de Janeiro (FAPERJ).
REFERENCES
Aszemi, N. M. and Dominic, P. (2019). Hyperparameter optimization in convolutional neural network using genetic algorithms. International Journal of Advanced Computer Science and Applications, 10(6).

Bergstra, J. and Bengio, Y. (2012). Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13(2).

Choi, K., Fazekas, G., Sandler, M., and Cho, K. (2017a). Convolutional recurrent neural networks for music classification. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2392–2396. IEEE.

Choi, K., Fazekas, G., Sandler, M., and McFee, M. B. (2017b). A tutorial on deep learning for music information retrieval. ACM Computing Surveys, 51(1):1–34.

Darwin, C. (2023). Origin of the species. In British Politics and the Environment in the Long Nineteenth Century, pages 47–55. Routledge.

Dieleman, S., Brakel, P., and Schrauwen, B. (2011). Audio-based music classification with a pretrained convolutional network. In 12th International Society for Music Information Retrieval Conference (ISMIR 2011), pages 669–674. University of Miami.

Gad, A. F. (2023). PyGAD: An intuitive genetic algorithm Python library. Multimedia Tools and Applications, pages 1–14.

He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778.

Hershey, S., Chaudhuri, S., Ellis, D. P., Gemmeke, J. F., Jansen, A., Moore, R. C., Plakal, M., Platt, D., Saurous, R. A., Seybold, B., and Slaney, M. (2017). CNN architectures for large-scale audio classification. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 131–135.

McFee, B., Jülke, L., Salamon, J., and Ellis, D. P. (2017). Learning multi-scale temporal features for music information retrieval. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR), pages 252–258.

McFee, B., Raffel, C., Liang, D., Ellis, D. P., McVicar, M., Battenberg, E., and Nieto, O. (2015). librosa: Audio and music signal analysis in Python. In Proceedings of the 14th Python in Science Conference, 8(1):18–25.

Müller, M. (2015). Fundamentals of Music Processing: Audio, Analysis, Algorithms, Applications, volume 5. Springer.

Pons, J., Slizovskaia, O., Gong, R., Gómez, E., and Serra, X. (2017). Timbre analysis of music audio signals with convolutional neural networks. In 2017 25th European Signal Processing Conference (EUSIPCO), pages 2744–2748. IEEE.

Real, E., Aggarwal, A., Huang, Y., and Le, Q. V. (2019). Regularized evolution for image classifier architecture search. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 4780–4789.

Tan, M. and Le, Q. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning, pages 6105–6114. PMLR.

Tzanetakis, G. and Cook, P. (2002). Musical genre classification of audio signals. IEEE Transactions on Speech and Audio Processing, 10(5):293–302.

Young, S. R., Rose, D. C., Karnowski, T. P., Lim, S.-H., and Patton, R. M. (2015). Optimizing deep learning hyper-parameters through an evolutionary algorithm. In Proceedings of the Workshop on Machine Learning in High-Performance Computing Environments, pages 1–5.