Stock Price Prediction Based on Deep Learning
Xuejiao Chen
School of Mathematics, The University of Sydney, Sydney 2006, Australia
Keywords: Deep Learning, Time Series Forecasting, Stock.
Abstract: Financial time series forecasting stands as a cornerstone in investment decision-making and risk management.
Nonetheless, traditional statistical models often grapple with capturing intricate nonlinear patterns and
enduring dependencies within data. To enhance prediction accuracy, this study delves into the feasibility of
employing deep learning technology in financial time series forecasting. Five deep learning models are
constructed, namely the deep multi-layer perceptron (DMLP), convolutional neural network (CNN), long
short-term memory network (LSTM), recurrent neural network (RNN), and auto-encoder (AE), using
real transaction market data to forecast log returns. An empirical comparison shows that the CNN
model makes the best use of the data features and achieves the highest prediction accuracy of the five models,
whereas the AE model performs worst in this task, which is attributed to its inability to model time
dependencies. Overall, this study supports the usefulness of deep learning for predicting financial time series data
and offers insights for future research.
1 INTRODUCTION
With the rapid development of economic
globalization, the dynamics and complexity of
financial markets are increasing, making the
prediction of financial product prices and their
volatility a key and challenging issue. This study
focuses on constructing effective financial time series
forecasting models. Financial time series differ from
typical time series in that they often exhibit complex
characteristics such as non-linearity, non-stationarity,
and high auto-correlation. These characteristics limit
the effectiveness of traditional forecasting models
like ARMA and GARCH in practice, as these models
often rely on assumptions of linearity and stationarity,
which do not capture the true dynamics of financial
markets.
With the development of artificial intelligence
technology, machine learning methods such as
Support Vector Machines (SVM), Random Forests,
and Gradient Boosting Trees have been applied to
financial time series prediction, compared with
traditional methods in terms of accuracy, and shown to
handle bias and variance in time series data efficiently
(Jiang, 2021). Nevertheless, these early
machine learning techniques also show certain
limitations, especially in handling high-dimensional
data and over-fitting. Moreover, these methods
often overlook the auto-correlation of
time series.
To overcome these limitations, this study
introduces deep learning methods, which are
extensions of machine learning algorithms inspired
by the human brain and utilize multi-layer neural
networks to simulate decision-making processes. In
this experiment, we leverage their strong non-linear
modeling capabilities to address the complexities of
financial time series. Deep learning models, such as
DMLP, CNN, LSTM, RNN and AE have proven
effective in various domains. These models are better
at capturing the auto-correlation of time series and
addressing non-stationarity issues (Neagoe et al.,
2018).
The main contributions of this paper are as
follows: First, we systematically compare various
deep learning models in the prediction of financial
time series; second, we discuss these models'
effectiveness in handling high-dimensional data and
preventing over-fitting. The structure of the paper is
organized as follows: we begin with an introduction
to the research background and problem statement,
then detail the experimental methods and the deep
learning models used, followed by reporting and
analyzing the experimental results, and conclude with
a summary of findings and future research directions.
Chen, X.
Stock Price Prediction Based on Deep Learning.
DOI: 10.5220/0013004900004601
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Innovations in Applied Mathematics, Physics and Astronomy (IAMPA 2024), pages 151-159
ISBN: 978-989-758-722-1
Proceedings Copyright © 2024 by SCITEPRESS Science and Technology Publications, Lda.
2 RELATED WORK
Initially, stock price prediction relied primarily on
mathematical models and statistical methods. Ding and
Meade (2010) studied the volatility of financial
products using SV, GARCH, and EWMA models.
Their findings indicated that the EWMA model
excelled in volatility prediction, whereas the SV
model outperformed GARCH when volatility was
highly variable and random (Ding and Meade, 2010).
Wahyudi (2017) found that the ARIMA model
has strong short-term predictive capabilities and
competes effectively with existing stock price
prediction techniques (Wahyudi, 2017). Rouf et al.
(2021) compared the application of SVM and neural
networks in forecasting financial time series. The
study found that although SVM is a commonly
used prediction method, it is slow in forecasting,
whereas simple deep learning methods are unable to
make accurate predictions due to randomness issues
(Rouf et al., 2021).
Lu and Xu (2024) addressed the limitations of RNN and
LSTM by developing an efficient Time-Series
Recurrent Neural Network (TRNN), which
compresses time series data to enhance the accuracy
of stock price predictions (Lu and Xu, 2024). Zaheer
et al. (2023) experimented with the Shanghai
Composite Index (000001) and combined existing deep
learning methods, comparing hybrid models such as
CNN-RNN and CNN-LSTM. The results showed that
CNN-RNN performed best among these hybrid
models (Zaheer et al., 2023). Fang et al. (2023)
proposed an improved LSTM-based model,
incorporating a cross-entropy loss function and an
adaptive network mechanism, capable of making
precise predictions in time series with significant
long-term volatility (Fang et al., 2023). By using a
genetic algorithm to optimize the parameters of an RNN,
Al Haromainy et al. (2023) improved the prediction
performance of the model and found the most suitable
parameter configuration (Al Haromainy et al., 2023).
Masini et al. (2023) investigate the application of
supervised machine learning techniques to stock price
prediction, analyse linear models in depth,
especially regularized models such as
ridge regression, and also explore nonlinear models
and ensemble methods including random forests
(Masini et al., 2023).
3 METHOD
3.1 DMLP
MLP is the most traditional form of DNN (Wang,
Yan and Oates, 2017), and DMLP is an
MLP with more hidden layers, which can
handle both linearly separable and nonlinearly
separable data.
Figure 1: DMLP structure (Sutskever et al., 2013).
The hidden part of the network, the most important part of the
model, comprises four hidden layers, each with 64
neurons and the ReLU activation function.
The output layer produces the final
prediction; for regression problems it has a single node
that generates a continuous value
such as Log_return (Figure 1). The DMLP model
learns by back-propagation, propagating
errors from the output layer neurons back through the
hidden layers to iteratively optimize the parameters of
the model, specifically the weights of the connections
between layers (Sutskever et al., 2013).
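The layer layout described above can be illustrated with a minimal forward-pass sketch in plain Python. It is not the implementation used in the experiment: the input width of 10 lagged log returns, the He-style weight scaling, and the helper names `build_dmlp` and `dmlp_forward` are assumptions made for illustration, and training by back-propagation is omitted.

```python
import random

def relu(values):
    """Apply ReLU element-wise."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Random weights with He-style scaling (an assumption) and zero biases."""
    scale = (2.0 / n_in) ** 0.5
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_dmlp(n_features, hidden_units=64, n_hidden=4, seed=0):
    """Four hidden layers of 64 units followed by a single output node."""
    rng = random.Random(seed)
    sizes = [n_features] + [hidden_units] * n_hidden + [1]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]

def dmlp_forward(layers, features):
    """ReLU on every hidden layer, linear activation on the output node."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = relu(dense(activations, weights, biases))
    out_weights, out_biases = layers[-1]
    return dense(activations, out_weights, out_biases)[0]

# Hypothetical input: a window of 10 lagged log returns.
layers = build_dmlp(n_features=10)
prediction = dmlp_forward(layers, [0.001] * 10)
```

The single linear output node is what makes the network a regressor: it returns one unbounded real value per input window.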
3.2 RNN
RNN is designed to handle sequential data
effectively. Unlike DMLP, RNN is capable of
processing sequences of arbitrary length, with hidden
layers passing information between different points in
time (Freeborough and van Zyl, 2022). The input layer
reshapes the log_return feature values into sequence data in
(samples, time_steps, features) format for processing
by the RNN. The hidden part consists of two
recurrent layers, each with 64 SimpleRNN units
(Figure 2). When receiving new data at each time
step, the current log_return is combined in a weighted sum with the
previous hidden state, and the hidden state is
updated through the ReLU activation function. The
output layer integrates information from all points in
time to produce the final predicted value.
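As an illustration of the three-dimensional input layout (samples, time steps, features per step), the following sketch turns a one-dimensional log_return series into overlapping windows with the next value as the target. The window length of 3 and the helper name `make_windows` are hypothetical; the paper does not state the window length it used.

```python
def make_windows(series, time_steps):
    """Return (X, y): X is laid out as (samples, time_steps, 1) and
    y holds the value that follows each window."""
    X, y = [], []
    for start in range(len(series) - time_steps):
        window = series[start:start + time_steps]
        X.append([[value] for value in window])  # one feature per time step
        y.append(series[start + time_steps])
    return X, y

log_returns = [0.001, -0.002, 0.0005, 0.003, -0.001, 0.002]
X, y = make_windows(log_returns, time_steps=3)
```

With six observations and a window of three steps, this yields three samples, each of shape (3, 1).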
At every time step $t$:
Obtain the input at the present time step, $x_t$, and the
latent representation at the previous time step, $h_{t-1}$.
Compute the updated latent representation at
the present time step, $h_t$:
$h_t = f(W \cdot x_t + U \cdot h_{t-1} + b)$ (1)
- W and U represent the parametric transformation
matrices that map the input and the previous latent state,
respectively, onto the current latent state.
- b represents a bias.
- f represents the activation function, such as ReLU,
tanh, etc.
Generate the output at the present time step, $y_t$:
$y_t = V h_t + c$ (2)
- V represents the parametric transformation matrix
from the current latent state to the output.
- c represents a bias (Yan & Ouyang, 2018).
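Equations (1) and (2) can be traced by hand on a toy example. The sketch below is a plain-Python illustration only: the sizes (one input feature, two hidden units, one output) and all weight values are made up, and ReLU is chosen as the activation f to match the description of the hidden layers.

```python
import math

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def relu(value):
    return max(0.0, value)

def rnn_step(x_t, h_prev, W, U, b, activation=math.tanh):
    """Equation (1): h_t = f(W x_t + U h_{t-1} + b)."""
    pre = [wx + uh + bias
           for wx, uh, bias in zip(matvec(W, x_t), matvec(U, h_prev), b)]
    return [activation(p) for p in pre]

def rnn_output(h_t, V, c):
    """Equation (2): y_t = V h_t + c."""
    return [vh + bias for vh, bias in zip(matvec(V, h_t), c)]

# Toy parameters (illustrative values only).
W = [[0.5], [-0.3]]
U = [[0.1, 0.0], [0.0, 0.1]]
b = [0.0, 0.0]
V = [[1.0, 1.0]]
c = [0.0]

h = [0.0, 0.0]
for x in [1.0, 2.0]:
    h = rnn_step([x], h, W, U, b, activation=relu)
y = rnn_output(h, V, c)
```

After the first step the hidden state is [0.5, 0.0]; after the second it is [1.05, 0.0], so the output is 1.05. The second hidden unit stays at zero because ReLU clips its negative pre-activation.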
3.3 LSTM
A plain RNN is prone to vanishing or
exploding gradients when dealing with long
sequences, which limits its ability to learn
long-distance dependencies. To solve this problem, we
introduce the LSTM (Sherstinsky, 2020), which
features an internal structure composed of
four primary interacting elements: a cell state
(responsible for long-term memory storage) and three
gates that regulate the flow of information (input gate,
forget gate, and output gate). This design enables the
LSTM to selectively introduce or discard information
from the cell state as necessary, thereby effectively
preserving and updating long-term memory
representations (Figure 3).
Hidden layer:
The following experiment uses 4 hidden layers, and
the structure of each layer is as follows:
Cell state: The most important part of LSTM,
allowing the network to pass and maintain long-term
information.
$C_t = f_t \cdot C_{t-1} + i_t \cdot \tilde{c}_t$ (3)
Input gate: regulates which aspects of the newly
introduced log_return data should be incorporated
into the cell state. Its functionality is governed by
sigmoid activation functions.
Figure 2: RNN structure (Introduction to recurrent neural network).
Figure 3. LSTM structure (LSTM Recurrent Neural Networks)
$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$ (4)
$\tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c)$ (5)
$W_i$ and $W_c$ represent the parametric
transformation matrices that map the input and the
previous hidden state onto the input gate and candidate
cell state representations, respectively. $b_i$ and $b_c$ are
the corresponding biases.
Forget gate: A sigmoid activation function is used to
determine which information in the cell state
needs to be forgotten or discarded.
$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$ (6)
σ represents the sigmoid activation function. $W_f$
represents the parametric transformation matrix of the
forget gate. $b_f$ represents a bias (Yan and
Ouyang, 2018).
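The cell state update in Equations (3) to (6) can be sketched for a single LSTM unit as follows. This is an illustration under simplifying assumptions: the hidden state and input are scalars, the parameter dictionary and the function name `lstm_cell_state` are hypothetical, and the output gate, for which no equation is given here, is left out.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dot(weights, values):
    return sum(w * v for w, v in zip(weights, values))

def lstm_cell_state(x_t, h_prev, c_prev, params):
    """Update a single-unit cell state following Equations (3)-(6)."""
    concat = [h_prev, x_t]  # the concatenation [h_{t-1}, x_t]
    f_t = sigmoid(dot(params["W_f"], concat) + params["b_f"])        # Eq. (6)
    i_t = sigmoid(dot(params["W_i"], concat) + params["b_i"])        # Eq. (4)
    c_tilde = math.tanh(dot(params["W_c"], concat) + params["b_c"])  # Eq. (5)
    c_t = f_t * c_prev + i_t * c_tilde                               # Eq. (3)
    return c_t, f_t, i_t, c_tilde

# With all-zero parameters both gates sit at 0.5 and the candidate is 0,
# so the cell state is simply halved.
zero_params = {"W_f": [0.0, 0.0], "b_f": 0.0,
               "W_i": [0.0, 0.0], "b_i": 0.0,
               "W_c": [0.0, 0.0], "b_c": 0.0}
c_t, f_t, i_t, c_tilde = lstm_cell_state(0.01, 0.0, 2.0, zero_params)

# A large forget-gate bias keeps almost all of the previous cell state.
keep_params = dict(zero_params, b_f=10.0)
c_keep, f_keep, _, _ = lstm_cell_state(0.01, 0.0, 2.0, keep_params)
```

The second call shows how the forget gate preserves long-term memory: with f_t close to 1, the previous cell state passes through almost unchanged.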
3.4 CNN
CNNs are distinctive in their layer types and structure and
are designed to process data with clearly defined grid
patterns, such as images. The combination
of convolutional layers and pooling layers within the
CNN enables the efficient extraction of local
patterns and global information from the raw time
series data, which supports the final prediction
task (Mehtab and Sen, 2021).
The convolution layer learns multiple convolution
kernels from the input log_return data through convolution
operations, and uses 64 filters to extract local feature
patterns (Figure 4).
The activation layer employs the ReLU activation
function, thereby introducing non-linear
transformations to the data.
The pooling layer down-samples the features
output by the convolutional layer, reducing the spatial
dimension and the number of parameters while
preserving the most significant local features. Then,
before the fully connected layers, the high-
dimensional data obtained from the convolution layer
and the pooling layer are flattened into a one-
dimensional vector, all features are integrated, and the
regression predicted value log_return is finally
output.
$\mathrm{output} = \mathrm{ReLU}(W \cdot \mathrm{input} + b)$ (8)
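The convolution, activation and pooling steps of this section can be followed on a short series. The sketch below uses a single hand-picked kernel instead of 64 learned filters, valid padding, and a pool size of 2; these choices and the function names are assumptions for illustration, not the configuration of the experiment.

```python
def conv1d(inputs, kernel, bias=0.0):
    """Equation (7): output_i = sum_k kernel_k * input_{i+k} + bias,
    evaluated only where the kernel fits entirely (valid padding)."""
    n_out = len(inputs) - len(kernel) + 1
    return [sum(kernel[k] * inputs[i + k] for k in range(len(kernel))) + bias
            for i in range(n_out)]

def relu(values):
    """Activation layer: negative responses are set to zero."""
    return [max(0.0, v) for v in values]

def max_pool1d(values, pool_size=2):
    """Keep the largest value in each non-overlapping window."""
    return [max(values[i:i + pool_size])
            for i in range(0, len(values) - pool_size + 1, pool_size)]

signal = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
kernel = [1.0, -1.0]  # responds to a drop between neighbouring points

convolved = conv1d(signal, kernel)  # [-2.0, 1.0, -3.0, 1.0, -2.0]
activated = relu(convolved)         # [0.0, 1.0, 0.0, 1.0, 0.0]
pooled = max_pool1d(activated)      # [1.0, 1.0]
```

The kernel slides over neighbouring points, ReLU discards the negative responses, and pooling halves the length while keeping the strongest local response, which is the down-sampling behaviour described for the pooling layer.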
3.5 Auto-Encoder (AE)
The goal of the AE is to reconstruct its
input, that is, it attempts to map the input data to
itself, and the prediction is obtained through this process (Figure
5). An AE usually consists of two parts, an encoder and a
decoder.
Figure 5. AE structure (Applied Deep Learning - Part 3).
𝑜𝑢𝑡𝑝𝑢𝑡𝑖 =
𝑘𝑒𝑟𝑛𝑒𝑙𝑘 𝑖𝑛𝑝𝑢𝑡𝑖 + 𝑘) + 𝑏𝑖𝑎𝑠
(7)
Figure 4. CNN structure (Zhang et al., 2024)
The primary function of an encoder is to compress
high-dimensional input features into a lower-
dimensional encoded representation through multiple
hidden layers. In this particular model, the encoder
employs a fully connected layer with a ReLU
activation function to compress and encode input
features into a 32-dimensional representation.
Conversely, the decoder reconstructs
the original high-dimensional input
features from the low-dimensional encoded
representation. It shares a similar structure to the
encoder but with the arrangement of hidden layers
and neurons reversed. In this model, the decoder
utilizes a linearly activated fully connected layer that
generates vectors with the same feature dimension as
the original input, with the objective of reconstructing
the input data (Sezer et al., 2020).
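A minimal forward pass through such an encoder and decoder pair might look as follows. The 32-dimensional code and the ReLU and linear activations follow the description above; the input width of 50 features, the random initialization, and the function names are assumptions, and the training step that minimizes the reconstruction error is omitted.

```python
import random

def dense(inputs, weights, biases):
    """Fully connected layer: one weight row and one bias per output unit."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def init_dense(n_in, n_out, rng):
    """Small random weights and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def autoencoder_forward(features, encoder, decoder):
    """ReLU encoder to the code, linear decoder back to the input width."""
    code = [max(0.0, v) for v in dense(features, *encoder)]
    reconstruction = dense(code, *decoder)
    return code, reconstruction

def reconstruction_mse(original, reconstruction):
    """Mean squared difference between the input and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstruction)) / len(original)

rng = random.Random(0)
n_features, code_size = 50, 32  # n_features is a hypothetical input width
encoder = init_dense(n_features, code_size, rng)
decoder = init_dense(code_size, n_features, rng)

window = [rng.gauss(0.0, 0.001) for _ in range(n_features)]
code, reconstruction = autoencoder_forward(window, encoder, decoder)
loss = reconstruction_mse(window, reconstruction)
```

Note that the training target of this structure is the input itself, not a future value, so nothing in the objective rewards the network for modelling the ordering of observations in time.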
4 EXPERIMENT
Real-world financial market data tends to be highly
uncertain and complex. Traditional statistical models,
while widely used in finance, may have limitations
when dealing with these properties. The aim of the
experiment is to evaluate the five models (DMLP, RNN,
LSTM, CNN, AE) on the logarithmic return of
financial derivatives in the actual trading market.
Data set
The data set used in this experiment comes from
Kaggle and contains stock market data on
actual trade executions in financial markets. The data set is
recorded at one-second resolution, closely reflecting the rapid
changes in real financial markets. It is used
to predict future log_returns, which reflect stock
price movements.
First, the logarithmic return between
successive time points is calculated for each time_id
in the data set, so that the market dynamics of each
time period can be examined in further time series analysis.
The logarithmic return for each interval is calculated
using:
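$\mathrm{log\_return}_t = \log\left(\frac{P_t}{P_{t-1}}\right)$ (9)
where $P_t$ denotes the price at time point $t$ and $P_{t-1}$ the price at the preceding time point.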
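The per-time_id calculation of Equation (9) can be sketched as below. The row layout (time_id, seconds, price) and the example prices are hypothetical; the column names of the Kaggle data set are not given in this paper.

```python
import math
from collections import defaultdict

def log_returns_by_time_id(rows):
    """rows: iterable of (time_id, seconds, price) tuples.
    Returns {time_id: [log(P_t / P_{t-1}), ...]} with prices taken
    in chronological order inside each time_id."""
    prices = defaultdict(list)
    for time_id, seconds, price in sorted(rows):
        prices[time_id].append(price)
    return {
        time_id: [math.log(curr / prev)
                  for prev, curr in zip(series[:-1], series[1:])]
        for time_id, series in prices.items()
    }

rows = [
    (5, 0, 100.0), (5, 1, 101.0), (5, 2, 100.0),
    (11, 0, 50.0), (11, 1, 50.0),
]
returns = log_returns_by_time_id(rows)
```

Returns are never computed across two different time_id groups, and log returns add up over time: the two returns of time_id 5 sum to zero because the price ends where it started.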
4.1 Evaluation Indicators
To assess the model's predictive accuracy, this study
employed three widely adopted regression
performance metrics: MSE, RMSE, and MAE. These
measures facilitated a quantitative comparison
between the model's predicted values and the actual
observed values obtained from the experimental data
(Figures 6, 7 and 8).
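For reference, the three metrics can be written out in a few lines. The sketch below uses their standard definitions and made-up example values; it is not the evaluation script of this study.

```python
import math

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    return math.sqrt(mse(actual, predicted))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [0.0, 1.0, 2.0, 3.0]
predicted = [0.0, 2.0, 2.0, 1.0]

example_mse = mse(actual, predicted)    # (0 + 1 + 0 + 4) / 4 = 1.25
example_rmse = rmse(actual, predicted)  # sqrt(1.25)
example_mae = mae(actual, predicted)    # (0 + 1 + 0 + 2) / 4 = 0.75
```

Because RMSE is the square root of MSE and is never smaller than MAE, the three reported values for a model can be checked against each other; for instance, the square root of 2.416e-07 is about 0.000492.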
Figure 6. MSE Comparison among models (Picture credit: Original).
Figure 7. RMSE Comparison among models (Picture credit: Original).
Figure 8. MAE Comparison among models (Picture credit: Original).
Table 1. Indicators comparison.
Method   MSE         RMSE       MAE
DMLP     2.995e-07   0.000547   0.000544
RNN      2.897e-07   0.000538   0.000537
LSTM     2.534e-07   0.000503   0.000503
CNN      2.416e-07   0.000492   0.000489
AE       0.941       0.969998   0.413337
Figure 9. Prediction vs Actual values plots for DMLP (Picture credit: Original).
According to the experimental results
(Table 1), AE has the worst performance and CNN
achieves the best value on all three indicators, although the
differences among DMLP, RNN, LSTM
and CNN are small. Overall, the
CNN model performs best and the
AE model performs
worst. This suggests that the CNN model fits and
predicts the log_return more accurately and
captures the characteristics and patterns of the data, while
the AE model performs relatively poorly.
4.2 Prediction Vs Actual Values Plots
In addition to the above indicators,
prediction vs actual
value plots help show whether
Log_return fluctuations have been captured.
Figure 10. Prediction vs Actual values plots for RNN (Picture credit: Original).
Figure 11. Prediction vs Actual values plots for LSTM (Picture credit: Original).
Figure 12. Prediction vs Actual values plots for CNN (Picture credit: Original).
Figure 13. Prediction vs Actual values plots for AE (Picture credit: Original).
These five deep learning models exhibit distinct
characteristics in forecasting financial time series.
The DMLP model shows significant prediction
discrepancies due to its inability to effectively model
long-term dependencies (Figure 9). The RNN model
improves upon the DMLP but still exhibits large
deviations, constrained by vanishing or
exploding gradients (Figure 10). In contrast, the LSTM
model (Figure 11), by incorporating gating
mechanisms and memory units, captures the
long-range dependencies inherent in time-series data,
resulting in more precise predictions. The
CNN model can efficiently capture local features
and mitigate noise, outperforming LSTM (Figure 12).
Conversely, the auto-encoder (AE) model
performs poorly in time series prediction tasks as it is
designed only to learn a compressed representation of the data
(Figure 13). In conclusion, for financial time
series prediction tasks, CNN and LSTM models can
more effectively capture features and dependencies,
yet further exploration and improvements are needed
to enhance accuracy and stability in deep learning
models.
Therefore, different deep learning models show
clear differences in financial time
series prediction owing to their differences in structure
and principle. Compared with the other models, the CNN
and LSTM models capture the features and
dependencies of time series data more effectively, so
they achieve better performance in this task.
However, deep learning models still need
further exploration and refinement to improve the
accuracy and stability of financial time series
predictions.
4.3 Discussion
This section analyses the reasons for the above results,
focusing on the differences between the CNN and AE models, and
reviews the other models.
CNN is good at capturing local features and can
effectively extract important features from input data.
This is very helpful for problems that require analysis
and prediction of time series data. By sliding over the
input data through the convolution kernel, the CNN
model can identify and extract important features to
maximize the use of the structural information of the
input data. The CNN model has strong generalization
ability and can better fit the complex data distribution,
resulting in more accurate predictions. The AE
model aims to uncover the latent feature representations
inherent within the input data to enable compression
and reconstruction of the data. However, in this task
of time series prediction, simply learning the feature
representation of the data may not adequately capture
important information such as time dependence in the
data.
The DMLP model is inadequate at
capturing persistent dependencies in time series
data. In contrast, the RNN model introduces recurrent
connections to model sequence data, improving predictive
performance. However, vanishing or exploding
gradients constrain its full potential.
The LSTM model effectively addresses the gradient
problem of RNN for long series, yielding more
accurate predictions with consistent trends.
Consequently, AE and the other models may
underperform compared to CNN in this task. The CNN's
structure and characteristics make it well-suited
to time series forecasting,
leveraging spatio-temporal features for strong
performance in this domain.
5 CONCLUSION
This study investigates the performance comparison
of various deep learning models by analyzing real
trading market data. The log_return was employed as
the target feature, and multiple deep neural network
algorithms, including DMLP, CNN, LSTM, RNN,
and AE, were constructed to predict the log_return.
The obtained results were comprehensively
discussed, and experimental evaluations were
conducted. Through rigorous experimental
comparisons, it was discovered that the CNN model
structure and features are better suited for processing
time series prediction tasks. This is because the CNN
model can effectively capture and utilize the inherent
characteristics of the data, enabling it to outperform
other models in this specific application. However,
AE and other models exhibited relatively poor
performance due to their limited capability in modeling
time dependencies present in the data. Moving
forward, future research endeavors could focus on
further exploring the potential application of
improved AE models in the realm of time series
prediction. Alternatively, efforts could be directed
towards developing more sophisticated and efficient
models to enhance the precision and efficiency of
predicting financial time series. Such advancements
would contribute to unlocking the full potential of
deep learning techniques in this crucial domain.
REFERENCES
Jiang, W. Applications of deep learning in stock market
prediction: recent progress. Expert Systems with
Applications, 2021, 184, 115537.
Neagoe, V. E., Ciotec, A. D., & Cucu, G. S. Deep
convolutional Neural Networks versus multilayer
perceptron for financial prediction. In 2018 International
Conf. on Communications, 2018, pp. 201-206.
Ding, J. and Meade, N. ‘Forecasting accuracy of stochastic
volatility, GARCH and EWMA models under different
volatility scenarios’, Applied Financial Economics,
2010, 20(10), pp. 771–783.
Wahyudi, S. T. The ARIMA Model for the Indonesia Stock
Price. International Journal of Economics &
Management, 2017, 11.
Rouf, N., Malik, M. B., Arif, T., Sharma, S., Singh, S.,
Aich, S., & Kim, H. C. Stock market prediction using
machine learning techniques: a decade survey on
methodologies, recent developments, and future
directions. Electronics, 2021, 10(21), 2717.
Lu, M., & Xu, X. TRNN: An efficient time-series recurrent
neural network for stock price prediction. Information
Sciences, 2024, 657, 119951.
Zaheer, S., Anjum, N., Hussain, S., Algarni, A. D., Iqbal,
J., Bourouis, S., & Ullah, S. S. A multi parameter
forecasting for stock time series data using LSTM and
deep learning model. Mathematics, 2023, 11(3), 590.
Fang, Z., Ma, X., Pan, H., Yang, G., & Arce, G. R.
Movement forecasting of financial time series based on
adaptive LSTM-BN network. Expert Systems with
Applications, 2023, 213, 119207.
Al Haromainy, M. M., Prasetya, D. A., & Sari, A. P.
Improving Performance of RNN-Based Models With
Genetic Algorithm Optimization For Time Series
Data. TIERS Information Technology Journal,
2023, 4(1), 16-24.
Masini, R. P., Medeiros, M. C., & Mendes, E. F. Machine
learning advances for time series forecasting. Journal of
economic surveys, 2023, 37(1), 76-111.
Wang, Z., Yan, W., & Oates, T. Time series classification
from scratch with deep neural networks: A strong
baseline. In 2017 International joint conference on
neural networks, 2017, pp. 1578-1585.
Sutskever, I., Martens, J., Dahl, G., & Hinton, G. On the
importance of initialization and momentum in deep
learning. In International conference on machine
learning. 2013, pp.1139-1147.
Freeborough, W., & van Zyl, T. Investigating explainability
methods in recurrent neural network architectures for
financial time series data. Applied Sciences, 2022,
12(3), 1427.
Introduction to recurrent neural network, https://www.Geeksforgeeks.org/introduction-to-recurrent-neural-network/
Yan, H., & Ouyang, H. Financial time series prediction based
on deep learning. Wireless Personal Communications,
2018, 102, 683-700.
Sherstinsky, A. Fundamentals of recurrent neural network
(RNN) and long short-term memory (LSTM)
network. Physica D, 2020, 404, 132306.
LSTM Recurrent Neural Networks — How to Teach a
Network to Remember the Past, https://towardsdatascience.com/lstm-recurrent-neural-networks-how-to-teach-a-network-to-remember-the-past-55e54c2ff22e
Yan, H., & Ouyang, H. Financial time series prediction
based on deep learning. Wireless Personal
Communications, 2020, 102, 683-700.
Mehtab, S., & Sen, J. Analysis and forecasting of financial
time series using CNN and LSTM-based deep
learning models. In Advances in Distributed
Computing and Machine Learning 2021. pp. 405-423.
Zhang, C., Sjarif, N. N. A., & Ibrahim, R. Deep learning
models for price forecasting of financial time series: A
review of recent advancements: 2020–2022. Wiley
Interdisciplinary Reviews: Data Mining and
Knowledge Discovery, 2024, 14(1), e1519.
Applied Deep Learning - Part 3: Autoencoders,
https://towardsdatascience.com/applied-deep-learning-part-3-autoencoders-1c083af4d798
Sezer, O. B., Gudelek, M. U., & Ozbayoglu, A. M.
Financial time series forecasting with deep learning:
A systematic literature review: 2005–2019. Applied Soft
Computing, 2020, 90, 106181.