0.5787, and the MAPE is 42.85%. The R-squared
value of the Neural Network Regression Model with
3 Dense Layers and 400 Epochs is 0.5995, and the
MAPE is 95.24%. The R-squared value of the Neural
Network Regression Model with 3 Dense Layers and
800 Epochs is 0.5461, and the MAPE is 111.81%.
4.2 Discussion
It is observed that the Neural Network Regression
Model with 2 Dense Layers and 400 Epochs yields
the highest R-squared value of 0.6776, meaning that
67.76% of the variance in the dependent variable (i.e.,
the actual MVP win share) can be explained by the
independent variables. Such a comparatively high
R-squared value indicates that the model fits the
dataset well. However, the model also yields a MAPE
of 86.61%, meaning that its predictions deviate from
the actual MVP win share by 86.61% on average.
Such a high MAPE indicates that the predictions are
far from accurate (Akoglu 2018). One typical reason
for a simultaneously high R-squared value and high
MAPE is overfitting: the model fits the training data
exceptionally well but does not generalize to new,
unseen data (Dietterich 1995).
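To make the two metrics concrete, the following minimal sketch computes the R-squared value and the MAPE from their standard definitions. The function names and the five actual/predicted values are made up for illustration and are not taken from the dataset used in this research. The sketch only shows that the two metrics can disagree: R-squared compares the squared errors with the overall variance, while MAPE divides each error by the actual value, so errors on small actual values weigh heavily.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when an actual value is zero, so such rows must be
    handled separately before calling this function.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical win shares (illustrative only, not real data).
actual = [0.01, 0.02, 0.05, 0.50, 0.90]
predicted = [0.03, 0.04, 0.08, 0.48, 0.88]

r2_value = r_squared(actual, predicted)
mape_value = mape(actual, predicted)
print(f"R-squared: {r2_value:.4f}, MAPE: {mape_value:.2f}%")
```

In this toy case every prediction is within 0.03 of the actual value, so the R-squared value is close to 1, yet the MAPE is large because the errors on the three smallest actual values are big in relative terms.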
Conversely, the Neural Network Regression Model
with 2 Dense Layers and 200 Epochs yields the
lowest MAPE of 15.38%, meaning that its predictions
deviate from the actual MVP win share by only
15.38% on average. Such a low MAPE indicates that
the predictions are accurate. However, the model
yields an R-squared value of 0.4584, meaning that
only 45.84% of the variance in the dependent variable
(i.e., the actual MVP win share) can be explained by
the independent variables. Such an R-squared value
indicates that the model is only a passable fit for the
dataset.
Therefore, because it combines a good R-squared
value with a good MAPE, the Extreme Gradient
Boosting Regression Model is selected as the best
model for predicting the MVP win share of an
arbitrary player in an arbitrary season. The
R-squared value of the Extreme Gradient Boosting
Regression Model is 0.6399, meaning that 63.99% of
the variance in the dependent variable (i.e., the
actual MVP win share) can be explained by the
independent variables. The model also yields a MAPE
of 22.90%, meaning that its predictions deviate from
the actual MVP win share by only 22.90% on average.
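The selection above weighs both metrics together. One possible way to make such a comparison explicit is sketched below: discard every model that another model beats on both metrics, then rank the remaining ones by a simple combined score. The combined score (R-squared minus the MAPE expressed as a fraction) is a hypothetical choice for illustration and is not the rule used in this research. The metric values are the ones reported in this section for the models that are named explicitly.

```python
# (name, R-squared, MAPE in percent) as reported in this section.
models = [
    ("NN, 2 dense layers, 200 epochs", 0.4584, 15.38),
    ("NN, 2 dense layers, 400 epochs", 0.6776, 86.61),
    ("NN, 3 dense layers, 400 epochs", 0.5995, 95.24),
    ("NN, 3 dense layers, 800 epochs", 0.5461, 111.81),
    ("Extreme Gradient Boosting", 0.6399, 22.90),
]


def dominates(a, b):
    """True if a is at least as good as b on both metrics and better on one."""
    _, r2_a, mape_a = a
    _, r2_b, mape_b = b
    no_worse = r2_a >= r2_b and mape_a <= mape_b
    strictly_better = r2_a > r2_b or mape_a < mape_b
    return no_worse and strictly_better


# Keep only the models that no other model beats on both metrics.
front = [m for m in models if not any(dominates(o, m) for o in models)]


def combined_score(model):
    """Hypothetical score: higher R-squared and lower MAPE are both rewarded."""
    _, r2, mape_percent = model
    return r2 - mape_percent / 100.0


best = max(front, key=combined_score)
print("Non-dominated models:", [name for name, _, _ in front])
print("Selected:", best[0])
```

Under these assumptions, both 3-layer networks drop out because the Extreme Gradient Boosting model is better on both metrics, and the combined score then favors the Extreme Gradient Boosting model over the two 2-layer networks. A different weighting of the two metrics could change the ranking among the remaining models.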
5 CONCLUSION
To help audiences better predict the NBA regular
season MVP under different scenarios,
this research applies different Machine
Learning Regression Models to predict the MVP win
share of an arbitrary player in an arbitrary NBA
season. More specifically, every single NBA player’s
statistics and MVP win share in the past 40 years are
collected, preprocessed, and used to train and test the
machine learning models. After comparing each
model’s R-squared value and MAPE, it is concluded
that the Extreme Gradient Boosting Regression
Model is the best model for predicting the MVP win
share of an arbitrary player in an arbitrary season,
with an R-squared value of 0.6399 and a MAPE of
22.90%.
While this research can provide useful information
for future MVP predictions, the final result is not
optimal because of the limited computing resources
available. For instance, the hyperparameters of the
neural network models are not fully tuned, as only a
few combinations of epochs and dense layers were
tested. Future research could extend the existing
machine learning models by exploring a wider range
of hyperparameters, such as the number of epochs
and dense layers. Researchers could also plot the
performance of these models across different
hyperparameter settings to reveal the trends in each
model’s performance. It would then become feasible
to pinpoint the configuration that maximizes the
model’s performance.
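The hyperparameter exploration described above could be organized as a simple grid sweep. The sketch below is an assumption-laden outline, not the procedure used in this research: `sweep` and `toy_scorer` are hypothetical names, and `toy_scorer` is a placeholder that returns made-up numbers so the loop can run without training anything. In practice it would be replaced by a function that trains a network with the given number of dense layers and epochs and returns its R-squared value and MAPE on the test set.

```python
from itertools import product


def sweep(train_and_score, layer_options, epoch_options):
    """Evaluate every (layers, epochs) pair and collect one row per pair."""
    rows = []
    for layers, epochs in product(layer_options, epoch_options):
        r2, mape_percent = train_and_score(layers, epochs)
        rows.append(
            {"layers": layers, "epochs": epochs, "r2": r2, "mape": mape_percent}
        )
    return rows


def toy_scorer(layers, epochs):
    """Placeholder, NOT a real model: returns made-up metric values."""
    r2 = 0.7 - 0.05 * abs(layers - 2) - abs(epochs - 400) / 4000
    mape_percent = 20.0 + 10.0 * abs(layers - 2) + abs(epochs - 400) / 20
    return r2, mape_percent


rows = sweep(
    toy_scorer,
    layer_options=[1, 2, 3, 4],
    epoch_options=[100, 200, 400, 800, 1600],
)

# Pick the configuration with the lowest MAPE among all rows.
best = min(rows, key=lambda row: row["mape"])
print(best["layers"], best["epochs"])
```

The collected rows can then be plotted (for example, MAPE against epochs with one line per layer count) to show the trends mentioned above.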
REFERENCES
F. Thabtah, L. Zhang, and N. Abdelhamid, “NBA Game Result
Prediction Using Feature Analysis and Machine
Learning”, Annals of Data Science, vol. 6, no. 1, pp.
103-116, 2019.
Y. Chen, J. Dai, and C. Zhang, “A Neural Network Model
of the NBA Most Valued Player Selection Prediction”,
In Proceedings of the 2019 the International Conference
on Pattern Recognition and Artificial Intelligence
(PRAI '19), Association for Computing Machinery,
New York, NY, USA, pp. 16–20, 2019.
J. Hu, H. Zhang, and J. Qiu, “Prediction of MVP Attribution
in NBA Regular Match Based on BP Neural Network
Model”, In Proceedings of the 2019 International
Conference on Artificial Intelligence and Advanced
Manufacturing (AIAM 2019), Article 43, pp. 1–5, 2019.
A. L. Chapman, “The Application of Machine Learning to
Predict the NBA Regular Season MVP”, Doctoral
Dissertation, Utica University, 2023.