Stock Price Prediction Based on CNN, LSTM and CNN- LSTM Model

Yufei Wang

Stony Brook Institute, Anhui University, Longhe Campus, Hefei, 230000, China

Keywords: Stocks Prediction, CNN Model, LSTM Model, CNN-LSTM Hybrid Model.

Abstract: Stock investments are perennially recognized for their high potential returns and commensurate risk levels.

While CNN and LSTM models individually demonstrate proficiency in data prediction, each has its inherent

limitations. In pursuit of overcoming these limitations, this research proposes a composite CNN-LSTM

model. Initially, this paper selects two disparate stocks for evaluation, employing the individual CNN and

LSTM models to predict their prices. Subsequently, the construction of the hybrid model involves utilizing

the CNN layers to extract spatial features, which are then transformed into a one-dimensional vector. This

vector is subsequently fed into the LSTM layer to capitalize on its sequence data handling capabilities,

culminating in the model's predictive execution. The final phase of this study entails a comparative analysis

of the predictive performance. The results show that the CNN-LSTM model inherits the advantages of the

two individual models and is both highly stable and extremely efficient in making predictions on big stock

data. This enhancement is particularly notable when compared to the single CNN and LSTM models,

underscoring the efficacy of integrating these two distinct computational approaches into a unified predictive

framework.

1 INTRODUCTION

With high risk and high returns, stocks have always

been popular and have become a favorite form of

investment. The stock market is a significant and

volatile component of the financial market, which is

a very complicated and uncertain field. However, the

high volatility of stocks in the stock exchange market

poses a challenge to traditional quantitative trading

strategies (Kong et al., 2024). When forecasting

stocks, investors should not only build models to

forecast stock data, but also pay attention to

macroeconomic factors, company fundamentals,

market sentiment and other multi-dimensional

information to comprehensively analyze the

dynamics of the stock market. With the advent of the

era of big data analytics, machine learning and deep

learning have shown great potential to surpass

traditional machine learning in data prediction (Hu et

al., 2019) When dealing with enormous amounts of

stock data, some academics have used different ways.

Nowadays, neural network model, random forest

model, deep learning model and time series model are

widely used by academics. These approaches have

produced stock trend forecasting models that have a

higher forecasting accuracy than individual forecasts

(Zhang 2023). However, scholars have found that

many times models do not predict stock prices.

Therefore, it is particularly important to construct a

model that can be used scientifically with high

accuracy and stability. In the following, this paper

will introduce two widely used models, the CNN

model and the LSTM model. In this paper, these two

models will be fused to obtain the CNN-LSTM model

and compare the performance of them.

Convolutional Neural Networks (CNNs) have a

great advantage in extracting features because CNNs

have a convolution layer and the convolution layer

can extract features very well. Since CNNs can share

the convolution Kernel, CNNs can also handle high

dimensional data very well. In addition, the CNN

model has an efficient operating speed, which is a

great advantage when dealing with large datasets.

However, the disadvantages of CNN are also obvious.

For example, the correlation between the local and the

overall could be ignored by the Pooling layer, which

could lose a lot of important data. In conclusion, when

predicting stocks, CNN models have the advantages

of strong feature extraction capability, the ability to

handle time series data, and higher prediction

accuracy in some cases. However, it also suffers from

high requirements on the size and diversity of the

Wang, Y.

Stock Price Prediction Based on CNN, LSTM and CNN- LSTM Model.

DOI: 10.5220/0012982600004601

Paper published under CC license (CC BY-NC-ND 4.0)

In Proceedings of the 1st International Conference on Innovations in Applied Mathematics, Physics and Astronomy (IAMPA 2024), pages 22-30

ISBN: 978-989-758-722-1

dataset, model complexity needs to be balanced, and

there may be overfitting and other shortcomings.

Long Short Term Memory (LSTM) model is a

variant of RNN. LSTM has forgetting gate and

memory gate so that he can remember and use past

information. This is different from the traditional

RNN model where the original information fades

away over time. Therefore, LSTM is very suitable for

processing long sequential data and has high practical

value for predicting the results of stock data.

Although LSTM is well received by many scholars,

in fact, it still has some defects. First of all, the LSTM

model is complex and takes a long time to run. Its four

main parts-memory cells, input gates, forgetting

gates, and memory gates-have complex calculations.

Secondly, LSTM is also likely to end up with

overfitting results when processing data, as it over-

memorizes and uses past information. In conclusion,

when predicting stocks, the LSTM model has the

advantages of the ability to handle long-term

dependencies, flexibility, and high prediction

accuracy. However, it also suffers from long

computation time and high model complexity.

This paper believes that CNN model and LSTM

model each has its own advantages and disadvantages

in stock prediction. CNN are usually computationally

more efficient, which gives CNNs an advantage when

dealing with large-scale datasets. However, the

convolutional kernel of CNNs usually has a fixed

size, which limits the scope of its observations. As a

result, CNNs may not be as effective as LSTMs in

capturing long-term dependencies. LSTM are able to

capture and model long-term dependencies in data

through their unique gating mechanism. This makes

LSTM perform well in handling sequential data with

long-term dependencies. However, LSTM needs to

maintain internal state at each time step, which causes

it to be computationally more complex and time-

consuming than CNN.

The motivation behind this paper is to address the

limitations and performance differences between

CNNs and LSTMs in stock prediction by exploring

and comparing their capabilities. Both models have

their own shortcomings when it comes to predicting

stocks, necessitating a deeper comparative analysis to

uncover their nuanced performance characteristics.

Additionally, this paper not only compares the

traditional CNN and LSTM models individually but

also examines the CNN-LSTM fusion model to better

understand the performance differences among these

methods. Through rigorous comparative analysis, the

paper seeks to provide valuable insights into the

potential of hybrid approaches for enhancing the

accuracy and efficiency of stock prediction models.

In conclusion, both CNN and LSTM models have

their own shortcomings when it comes to predicting

stocks. Their performance differences need to be

further explored from the comparative analysis, in

addition, this paper adds CNN-LSTM fusion model

to increase the comparative relationship to compare

the performance differences of these methods.

2 METHOD

First, this paper will construct CNN model, LSTM

model and CNN-LSTM model and predict two

different stock data of different sizes and visualize the

prediction results. Second, this paper will introduce

several evaluation metrics to evaluate each model.

Finally, we will compare the models, analyze the

reasons for the different results, and give appropriate

advice to investors.

2.1 Model Construction

 CNN Model

In the fields of text classification, natural language

processing, and imagine recognition, CNN is

recognized as a well-developed technology with

strong feature extraction capabilities in both data

space and time series (Lecun et al., 1989). The

structure of CNN usually consists of Input layer,

Convolution layer, ReLU, Pooling layer,

Rasterization, Fully Connected layer, ReLU and

Output layer (Figure 1). Moreover, Convolution

layer, ReLU and Pooling layer are stackable and

reusable, which is the core structure of CNN. The

CNN network receives multi-dimensional data as

input from the input layer, and to extract features

from the data, convolution operations are performed

on the input data by a convolution layer made up of

several convolution kernels. After the convolution

process, the pooling layer is used to minimize the

feature dimension. Furthermore, because of its local

invariance, the pooling layer improves network

stability and lessens model overfitting (Qi, 2024). The

pooling layer also helps the model capture key

features in the stock data and ignore unimportant

details. The fully connected layer located at the end

of the CNN can integrate the features extracted from

the previous layers very well. It spreads the feature

maps output from the convolutional and pooling

layers and performs a linear transformation through

the weight matrix and bias terms. Finally, the

activation function generates the final forecast

(Auntie and Chen, 2023).

Stock Price Prediction Based on CNN, LSTM and CNN- LSTM Model

Figure 1: CNN model structure (Picture credit: Original).

 LSTM

Due to the gradient explosion and gradient vanishing

problems of traditional recurrent neural networks,

LSTM was created to solve this problem (Luo et al.,

2024). LSTM is a Long Short-Term Memory

network, a special kind of RNN (Recurrent Neural

Network). Compared with traditional RNNs, LSTM

is able to solve long sequence problems effectively by

introducing the concepts of memory cells, input gates,

output gates and forgetting gates. As a result, LSTM

is more appropriate for processing and forecasting

significant events with long intervals in a time series

(Liu et al., 2024).

Forget gates and memory gates are structures

unique to the LSTM and also its strengths. Some

"information" about the previous state of the LSTM

may become "outdated" with the passage of time. The

"Forge gate" is responsible for selectively forgetting

specific elements of the previous cell state in order to

prevent excessive memory from impairing the neural

network's processing of the current input. The

following equation can be used to describe the

computation of the forgetting gate and is the output

vector of the sigmoid neural layer

𝑓



𝜎



𝑊







ℎ



,𝑥







𝑏



(1)

Whenever a new input value is fed, the LSTM will

first decide which memories to forget based on the

new input value and the output of the previous

moment. The Memory Gate can judge itself and

decide whether or not to incorporate the current data

into

the control unit of the unit state. The memory

gate also performs two more crucial functions. The

first is to retrieve the valid information from the

current input, which is determined by the formula:

𝐶





tanh 𝑊







ℎ



,𝑥





𝑏



 (2)

The other important operation is to filter and rate the

extracted valid information, which is calculated as:

𝑖



𝜎𝑊







ℎ



,𝑥





𝑏



 (3)

In practical calculations, the LSTM model has

some drawbacks. First, the computational complexity

of the LSTM model is high. Because LSTM uses a

gating mechanism and a long-term memory

mechanism compared to traditional RNN, the

computation of the LSTM model is greatly increased,

so it takes a lot of time when running large data.

Secondly, when using the LSTM model to predict the

data, this is a time series prediction, so the results

obtained will have a certain lag

 CNN-LSTM

CNN, as a network feature extractor, has excellent

feature extraction capability, in this model, CNN

model will be used for feature extraction. The idea of

extraction of Hua’s research (2021) is to first input

the data from the input layer (Hua , 2021); then use

multiple convolutional layers to extract the spatial

features of the input data; add pooling layers (e.g.,

max-pooling) between the convolutional layers to

reduce the spatial dimensions of the data while

retaining the important features; and finally use an

activation function (ReLU) after each convolutional

layer to increase the nonlinearity of the model. At this

point, the obtained reshape the output of the CNN

layer into a shape suitable for the LSTM input before

passing it to the LSTM layer for processing (Zhou

2022). The LSTM, in turn, introduces gating units to

preserve and forget the time series features, thus

realizing improved prediction accuracy (Figure 2).

Finally, a fully connected layer is added after the

LSTM layer as an output layer to output the

prediction results. The CNN-LSTM model, which

combines the advantages of CNN and LSTM

networks, is able to extract the time domain features

while taking into account the effects of multiple

influencing factors on the network, so that the

prediction results are accurate and at the same time

fast and efficient (Liu et al., 2024 & Hua, 2021)

Table 1: Stock data.

Date Open High Low Close Adj Close Volume

2010/2/1 6.870357 7 6.832143 6.954643 5.887797 7.5E+08

2010/2/2 6.996786 7.01 6.906429 6.995 5.921964 6.98E+08

2010/2/3 6.970357 7.15 6.943571 7.115357 6.023856 6.15E+08

2010/2/4 7.026071 7.084643 6.841786 6.858929 5.806765 7.58E+08

IAMPA 2024 - International Conference on Innovations in Applied Mathematics, Physics and Astronomy

Figure 2: CNN-LSTM model structure (Picture credit:

Original).

3 RESULTS

3.1 Data Choosing

Accurate data is essential before research beginning.

In this paper, two datasets are used. Apple's stock

price from January 8, 2010 to April 5, 2024 and

Nike's price from January 8, 2022 to April 5, 2024,

respectively. They both from Yahoo platform, in

addition, it contains not only the closing price, but

also the opening price, sales volume, the highest price

of the day, and the lowest price of the day. The table

1 shows part of Apple's data

3.2 Evaluation Indicators

In this section, four indicators, Root Mean Squared

Error, Mean Relative Error, symmetric Mean

Absolute Percentage Error and R-Square are chosen

to evaluate the prediction accuracy of the stock price

prediction model. The following are their formulas:

𝑅𝑀𝑆𝐸 



∑



















(4)

𝑀𝐴𝐸 

∑

|













(5)

𝑅



1

∑



















∑

















(6)

𝑀𝐴𝑃𝐸 





∑





















(7)

Both RMSE and MAE represent the degree of

deviation between the predicted and true values, so the

smaller the RMSE and MAE, the better the model.

Generally, R² ranges from 0 to 1. The better the

variables in the equation explain y and the more closely

the model fits the data, the closer R2 is to 1; conversely,

the closer it is to 0, the less well the model fits the data.

MAPE is expressed as a percentage. It can be used to

indicate the magnitude of the model error, so the closer

the MAPE is to 0 the more accurate the model is done

by Zhang in 2021 (Zhang, 2021)

In addition to these four metrics, this paper will

also calculate the running time of each model for both

datasets. (The time here is measured in seconds).

3.3 Apple’s Stock Data

Apple's stock data contains 3584 sets of data, which

is a larger dataset. When building models, there is

need to set look_back=n. It means to predict

tomorrow's closing price using data from the previous

n days at the current point in time. In making

predictions on Apple Inc.'s corporate data, all of three

models set look_bask=60. (When debugging the

model, after changing the value of look_back several

times, it was finally found that the model could fit the

data well when look_back=60).

3.3.1 CNN

CNN demonstrates its strength in handling long time

series (Figure 3). As can be seen from the prediction

curve, the true and predicted values are very close to

each other. There is only a small error at the local

extreme points

3.3.2 LSTM

According to the prediction curve in the figure 4, it is

LSTM demonstrates its strength in handling long time

series. But LSTM model still has a lag (i.e., the true

value lags behind the predicted value).

Stock Price Prediction Based on CNN, LSTM and CNN- LSTM Model

Figure 3: CNN predict result of APPLE’s (Picture credit: Original).

Figure 4: LSTM predict result of APPLE's (Picture credit: Original).

3.3.3 CNN-LSTM

The figure 5 shows the prediction results of the CNN-

LSTM model. From the prediction curve, it can be

seen that the CNN-LSTM composite model can

demonstrate better results than both the CNN model

and the LSTM model alone in predicting larger time

series. This paper argues that the CNN model

effectively reduces noise when extracting features.

Compared to the LSTM model which will extract all

the data when processing the data, the CNN-LSTM

model takes advantage of the CNN in extracting the

features and greatly reduces the error of the results

and it solves the problem of lagging prediction

results.

3.4 Nike’s Stock Data

Compared to Apple's stock data, Nike's stock data is

a smaller data set with only 562 data sets. In making

predictions on Nike’s stock data, all of three models

set look_bask=10. (When debugging the model, after

changing the value of look_back several times, it was

finally found that the model could fit the data well

when look_back=10.)

3.4.1 CNN

As can be seen from the figure 6, the CNN model

performs equally well when dealing with small

datasets. Although it is not as good as when dealing

with APPLE's dataset, the true and predicted values

are very close to each other and fit well.

IAMPA 2024 - International Conference on Innovations in Applied Mathematics, Physics and Astronomy

Figure 5: CNN-LSTM predict result of APPLE's (Picture credit: Original).

Figure 6: CNN predict result of NIKE's (Picture credit: Original).

Table 2: Summary of evaluation indicators.

Apple’s stock

data

Method RMSE MAE R^2 SMAPE Time

CNN 6.6710 1.7869 0.9979 4.3039% 7.97s

LSTM 43.4755 34.3995 0.7315 11.1565% 92.87s

CNN-LSTM 45.4162 35.7618 0.7174 11.5060% 7.91s

Nike’s stock

data

Method RMSE MAE R^2 SMAPE Time

CNN 11.6657 2.5883 0.9244 2.3619% 2.44s

LSTM 9.5071 7.3367 0.2957 5.0222% 6.35s

CNN-LSTM 9.0999 7.0301 0.6696 2.8232% 1.72s

Stock Price Prediction Based on CNN, LSTM and CNN- LSTM Model

Figure 7: LSTM predict result of NIKE's (Picture credit: Original).

Figure 8: CNN-LSTM predict result of NIKE's (Picture credit: Original).

3.4.2 LSTM

The figure 7 shows the prediction results of the LSTM

model. From the prediction curves, it can be found

that the LSTM model is still stable in its prediction

for small datasets. But the shortcoming is that the lag

of LSTM model is still obvious.

3.4.3 CNN-LSTM

From the prediction results (Figure 8), it can be seen

that the CNN-LSTM model is not very effective in

handling small datasets. the CNN model extracts the

features and then puts the features into the LSTM

layer for processing, which solves the problem of

lagging prediction results of the LSTM model.

However, at the same time, the CNN model extracts

too few features, which leads to insufficient training

of the LSTM model, and obvious errors can be seen

in the final prediction results.

3.5 Summary of Evaluation Indicators

By comparing the evaluation metrics of these three

models (Table 2), this paper draws the following

conclusions:

(1) In overall, the CNN model outperforms the

LSTM model in stock prediction. First of all, the

CNN model fits better in terms of prediction

results, both when predicting large stock data

and small stock data, and there is no lag in the

prediction results, as is the case with the LSTM

model. Secondly, CNN model takes very less

time in predicting stocks. Especially when

predicting large stock data, the CNN model

shows great efficiency and accuracy. As can be

IAMPA 2024 - International Conference on Innovations in Applied Mathematics, Physics and Astronomy

seen in previous experiments, when predicting

APPLES' stock data, the CNN model took only

7.97 seconds, but the LSTM model took 92.87

seconds.

(2) The CNN-LSTM model has an excellent fit. It

not only solves the defect of lagging prediction

results of LSTM model, but also obtains

extremely high efficiency. As can be seen from

the runtime result data in the chart, the CNN-

LSTM model took only 7.91 seconds to process

APPLE's stock data. The CNN-LSTM model

does not perform very well when predicting

small stock data. From the prediction curve, the

curve fitting is not very good and there is a

significant error. Although the CNN-LSTM

model can still achieve short time consumption

and no lag when predicting small stock data, the

prediction error is relatively large and the curve

is not fitted very well. Through the analysis, this

paper finds that since the CNN-LSTM model

extracts features from the convolutional layer in

the CNN model and then uses the features as

inputs to the LSTM layer, this can easily lead to

the output of the CNN that may lose some

important information of the original data that is

important for the LSTM, and this can lead to a

degradation of the fusion model's performance.

(3) From the results, CNN model has better

accuracy, stability and short time consuming

when predicting stock prices. CNN-LSTM

model performs well when dealing with large

stock data, not only efficient but also accurate.

However, when dealing with small stock data, it

can lead to poor fitting due to too little feature

data. Thus, in this paper, CNN-LSTM model is

considered to be poor in stability. LSTM model

will show deviation between the predicted curve

and the real curve when predicting the stock

price, and high time consuming is also its defect.

Nevertheless, this paper does not consider the

LSTM model and CNN-LSTM model as

unsuitable for predicting stock prices compared

to the CNN model because in machine learning

and financial forecasting, simply performing

well on a training set is not enough to show that

the model works just as well in real-world

applications. When predict the stock data, both

LSTM and CNN-LSTM models consider the

effect of feature values on the target value (stock

closing price) and are accompanied by long term

memory, so the predict results of them have

more real-world applications value, while CNN

models do not have these properties. Therefore,

in this paper, we believe that the prediction

results of LSTM and CNN-LSTM models have

the higher reference value for investors

4 CONCLUSION

Stock price prediction is a multifaceted and intricate

endeavor influenced by a myriad of factors. While

both CNN and LSTM models have demonstrated

efficacy in this domain, each exhibits inherent

limitations. To address this challenge, this study

introduces a novel stock price prediction

methodology leveraging a hybrid CNN-LSTM

model. By amalgamating the feature extraction

prowess of CNN with LSTM's adeptness in sequence

data processing, the resultant model achieves

heightened accuracy and enhanced stability. In the

experimental setup, we initially curate two distinct

stock datasets varying significantly in size,

accompanied by four evaluation metrics.

Subsequently, standalone CNN and LSTM models

are trained and evaluated on these datasets

individually. Prediction outcomes are obtained and

corresponding evaluation indices computed.

Thereafter, the hybrid CNN-LSTM model is

formulated, trained on the same datasets, and

evaluated using the established metrics. Comparative

analysis of the prediction outcomes across the three

models and the four evaluation metrics ensues.

The analysis shows that the CNN model exhibits

short time-consumption, stability, and accuracy when

dealing with both large and small stock data. The

LSTM model, on the other hand, possesses the

drawbacks of long time-consumption and the

generation of deviations between the prediction

curves and the true-value curves. The CNN-LSTM

model solves the two problems existing in the LSTM

model, however, since the features extracted by the

CNN model lose some important information about

the original data, this makes the data processed by the

LSTM layer incomplete, and this can lead to a

degradation in the performance of the fusion model.

Although both the LSTM model and CNN-LSTM

model are not as suitable as the CNN model for stock

price prediction in terms of the results, their

prediction results have higher practical application

value because they put the feature values into the

model to join the training and introduce the memory

gate and forgetting gate.

Finally, this paper argues that the CNN-LSTM

fusion model is suitable for predicting large stock

data because this model is characterized by high

efficiency, accuracy, and realistic applications. For

investors, this paper holds that both CNN model and

Stock Price Prediction Based on CNN, LSTM and CNN- LSTM Model

LSTM model are good prediction models when

pred

icting stocks, but CNN-LSTM fusion model

is a better choice when predicting stock data of

long time series

REFERENCES

Kong Y., et al. Design of stock quantitative trading

algorithms under deep reinforcement learning. Journal

of Nanchang University (Science Edition)

48.01(2024):24-29+35.

Hu Y, Luo D., Hua K, et al. A review and discussion on

deep learning. Journal of Intelligent Systems,

2019, 14(1):1-19.

Zhang H., Research on short-term trend prediction method

of stock based on deep learning. 2023. Beijing

University of Posts and Telecommunications, MA

thesis.

Lecun Y., Boser B, Denker J S, et al. Backpropagation

applied to handwritten zip code recognition. Neural

Computation, 1989, 1(4): 541-551

Qi Liang. Prediction of lithium-ion battery charge state

based on CNN-LSTM. Electronic Quality

.01(2024):1-5.

Auntie X. Chen. A CNN-LSTM stock price prediction

model based on Bayesian optimization. 2023.

Lanzhou University, MA thesis. doi:

10.27204/d.cnki.glzhu.2023.002202.

Luo G., Wang X., and Dai J., Randomized feature graph

neural network model for IoT intrusion detection.

Computer Engineering and Applications 1-12.

Liu Z., et al. A study on the rapid detection of total

flavonoid content in Dendrobium officinale based on

Raman spectroscopy combined with CNN-LSTM deep

learning method. Spectroscopy and Spectral Analysis

44.04(2024):1018-1024.

Hua S., A CNN-LSTM PM2.5 concentration prediction

model based on attention mechanism. 2021. Zhejiang

University of Technology, MA thesis.

Zhou Q., Application of CNN and LSTM in short-term

stock price rise and fall prediction of cyclical

stocks. 2022. Zhejiang University, MA

thesis.doi:10.27461/d.cnki.gzjdx.2022.000387.

Zhang Y., Design of stock price prediction and quantitative

investment based on CNN-LSTM hybrid model.

2021.Huazhong University of Science and Technology,

MA thesis.

IAMPA 2024 - International Conference on Innovations in Applied Mathematics, Physics and Astronomy