Deep Reinforcement Learning for Trading with TD(λ)

ZiHao Wang, Qiang Gao

2022

Abstract

In recent years, reinforcement learning algorithms for financial asset trading have been studied extensively. Because the state observed by the agent in a financial market is not exactly the state of the market itself, which can degrade the performance of reinforcement learning strategies, the trading decision problem is a Partially Observable Markov Decision Process (POMDP). However, few studies have considered how the degree of Markov property in financial markets affects the performance of reinforcement learning strategies. In this paper, we analyze the efficiency and effectiveness of Monte Carlo (MC) and temporal-difference (TD) methods, and then analyze how TD(λ) combines the two to reduce the performance loss caused by the partially observable Markov property through the bootstrap parameter λ and the truncated horizon h. Considering the non-stationary nature of financial markets, we then design a stepwise approach to updating the trading model and update the model online during trading. Finally, we test the model on IF300 (index futures of the Chinese stock market) data; the results show that TD(λ) outperforms the TD and MC methods in terms of return and Sharpe ratio, and that online updating adapts better to changes in the market, increasing profit and reducing the maximum drawdown.
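The abstract does not include code, but the truncated λ-return it refers to, the quantity through which TD(λ) interpolates between one-step TD and the Monte Carlo return via λ and the horizon h, is standard. Below is a minimal sketch of that computation; the function names, array layout (per-step rewards plus a value estimate per state), and hyperparameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def n_step_return(rewards, values, t, n, gamma):
    """n-step return G_t^(n): n discounted rewards, then bootstrap on V(s_{t+n})."""
    G = 0.0
    for k in range(n):
        G += (gamma ** k) * rewards[t + k]
    G += (gamma ** n) * values[t + n]  # bootstrap with the value estimate
    return G

def truncated_lambda_return(rewards, values, t, lam, h, gamma=1.0):
    """Truncated lambda-return: a (1-lam)*lam^(n-1)-weighted mix of the
    1..(h-1)-step returns plus weight lam^(h-1) on the h-step return.

    lam=0 recovers one-step TD; lam=1 with a long horizon h approaches the
    Monte Carlo return, trading bootstrapping bias against return variance.
    """
    G = 0.0
    for n in range(1, h):
        G += (1 - lam) * (lam ** (n - 1)) * n_step_return(rewards, values, t, n, gamma)
    G += (lam ** (h - 1)) * n_step_return(rewards, values, t, h, gamma)
    return G

# Illustrative usage with synthetic per-step trading rewards and value estimates.
rewards = np.random.randn(20)   # rewards r_0 .. r_19
values = np.random.randn(21)    # value estimates V(s_0) .. V(s_20)
G = truncated_lambda_return(rewards, values, t=0, lam=0.8, h=5, gamma=0.99)
```

Setting λ between 0 and 1 is what lets the method hedge against the partial observability the abstract describes: the less Markovian the observed state, the less one wants to rely on bootstrapped value estimates alone.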



Paper Citation


in Harvard Style

Wang Z. and Gao Q. (2022). Deep Reinforcement Learning for Trading with TD(λ). In Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC; ISBN 978-989-758-622-4, SciTePress, pages 488-494. DOI: 10.5220/0011953600003612


in BibTeX Style

@conference{isaic22,
author={ZiHao Wang and Qiang Gao},
title={Deep Reinforcement Learning for Trading with TD(λ)},
booktitle={Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC},
year={2022},
pages={488-494},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011953600003612},
isbn={978-989-758-622-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 3rd International Symposium on Automation, Information and Computing - Volume 1: ISAIC
TI - Deep Reinforcement Learning for Trading with TD(λ)
SN - 978-989-758-622-4
AU - Wang Z.
AU - Gao Q.
PY - 2022
SP - 488
EP - 494
DO - 10.5220/0011953600003612
PB - SciTePress