crucial to analyze music tastes. The music taste of the
audience differs between countries and changes over
time. Acoustic features, historical events, seasons,
holidays, and social influence are factors that may
affect a song’s popularity. Naturally, the effects of
these factors are dynamic and culture-specific. In
different countries, song popularity may be affected
by these factors differently over time. Geographical
and cultural distances between countries affect the
popularity of the songs (Buda and Jarnowski, 2015).
In this paper, we used a time-varying penalized
regression model to predict the number of song
streams from acoustic features gathered through the
Spotify API, and we monitored the regression
parameters over time to discover the effects of the acoustic
features on music taste. Our approach allows us to
smooth the extreme changes of the regression
parameters over time so that the shifts in the musical
preferences are reflected realistically. We also
compared the results for two countries, Turkey and
the U.S., to observe the cross-cultural differences in
the music tastes.
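The idea of smoothing regression parameters over time can be illustrated with a minimal sketch. It assumes a quadratic penalty on the week-to-week differences of the coefficients; the penalty form, the smoothing weight `mu`, and the function name `fit_time_varying` are assumptions made for illustration only and are not necessarily the exact model used in this paper.

```python
import numpy as np

def fit_time_varying(X, y, mu):
    """Fit one coefficient vector per week with a smoothness penalty.

    X  : list of (n_t, p) arrays, acoustic features of the songs in week t
    y  : list of (n_t,) arrays, stream counts of the songs in week t
    mu : non-negative smoothing weight (hypothetical parameter)

    Minimizes  sum_t ||y_t - X_t b_t||^2 + mu * sum_t ||b_{t+1} - b_t||^2
    by solving the normal equations of the stacked problem.
    """
    T = len(X)
    p = X[0].shape[1]
    A = np.zeros((T * p, T * p))
    c = np.zeros(T * p)
    I = np.eye(p)
    for t in range(T):
        cur = slice(t * p, (t + 1) * p)
        # Least-squares part for week t.
        A[cur, cur] += X[t].T @ X[t]
        c[cur] += X[t].T @ y[t]
        if t + 1 < T:
            nxt = slice((t + 1) * p, (t + 2) * p)
            # Penalty mu * ||b_{t+1} - b_t||^2 couples neighbouring weeks.
            A[cur, cur] += mu * I
            A[nxt, nxt] += mu * I
            A[cur, nxt] -= mu * I
            A[nxt, cur] -= mu * I
    beta = np.linalg.solve(A, c)
    return beta.reshape(T, p)  # row t holds the coefficients of week t
```

With `mu = 0` the sketch reduces to a separate least-squares fit for each week, while a very large `mu` forces the coefficients to be nearly constant over time. Intermediate values trade goodness of fit against the smoothness of the coefficient paths.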
The rest of the paper is organized as follows.
Section 2 provides a review of related literature.
Section 3 describes the problem that we have worked
on and explains the methodology that we have used.
Section 4 explains the data scraping and preparation
steps and the results of the regression model. Finally,
Section 5 concludes the paper and discusses future
work.
2 RELATED WORK
Our motivation is to examine cross-cultural music
preferences and their shifts over time by using the
charts data and the acoustic features of songs.
Previous studies have addressed similar problems
with varying methodologies. These include
predictive algorithms and survey-based approaches
for understanding the spatio-temporal music
preferences of different populations.
A regression-based approach is used by (Suh,
2019) with Spotify’s acoustic features to predict
songs’ success on the charts. They use OLS
regression for prediction and analyse variable
significance for 6 different countries. (Pınarbaşı,
2019) analyses the music popularity characteristics of
Turkey over a 6-month period using a decision tree
algorithm. They also cluster the acoustic features
gathered from Spotify and obtain 3
different clusters with similar acoustic features.
(Yadati et al., 2017) focus on the change in
musical preferences when the mood/activity changes.
Their findings show that the acoustic features of the
songs and the genre/instrument information are not
sufficient for predicting the mood/activity change.
Classification models are also applied to predict
whether a song is a hit or not. A similar work by
(Al-Beitawi et al., 2020) shows that the musical
attributes from Spotify may help in clustering songs
and discovering the acoustic features that
influence song popularity, such as high
danceability and low instrumentalness. (Herremans
et al., 2014) compare classification methods such as
SVM, Naïve Bayes, logistic regression, and decision
tree for hit song prediction. Their dataset includes
Billboard’s Hot 100 charts with the acoustic features
from The Echo Nest, which is owned by Spotify.
The same acoustic features are used to analyze music
popularity by (Sciandra and Spera, 2020). They
applied a Beta GLMM to detect the features that
affect song popularity and found that the
energy, valence, and duration features affect song
popularity positively. (Ni et al., 2011) found in their
binary classification study that hit songs have a
higher tempo and have been getting louder over time.
However, their findings showed that, over a 50-year
period, harmonically simple songs are more
likely to be hits.
There are also works that are not data-driven.
(LeBlanc et al., 2000) tested the music listening
preferences by surveying young listeners around 5
countries. They found that the tempo of the song and
the listener’s age and country affect music preference.
(Rentfrow and Gosling, 2003) collected over 3500
samples from different geographical regions and
discovered 4 music preference dimensions:
Reflective and Complex, Intense and Rebellious,
Upbeat and Conventional, and Energetic and
Rhythmic. They related these music
preferences to personal characteristics, political
views, and cognitive abilities.
Time-varying coefficient models such as Kalman
filters, smoothing spline methods and time-varying
coefficient regressions are widely used to analyse
longitudinal data in different domains.
Ordinary Least Squares (OLS) is used to estimate
continuous values by using several independent
variables. An alternative to OLS is Flexible Least
Squares (FLS), which was proposed by (Kalaba and
Tesfatsion, 1989) to solve time-varying linear
regressions. The method minimizes the difference
between coefficients of consecutive weeks in addition
to the sum of squared regression errors. FLS smooths
the regression coefficient changes over time. FLS is
used in different domains. (He, 2001) used the FLS to