e-ISSN 2231-8526
ISSN 0128-7680

Article ID: JST-3266-2021


Automated Cryptocurrency Trading Bot Implementing DRL

Aisha Peng, Sau Loong Ang and Chia Yean Lim

Pertanika Journal of Science & Technology, Volume 30, Issue 4, October 2022


Keywords: Automated trading system, deep neural network, reinforcement learning

Published on: 28 September 2022

A year ago, one thousand USD invested in Bitcoin (BTC) alone would have appreciated to three thousand five hundred USD. The recent outstanding performance of deep reinforcement learning (DRL) has opened up the possibility of predicting price fluctuations in changing markets and determining effective trading points, a significant contribution to the finance sector. Several DRL methods have been tested in the trading domain. This research, however, proposes implementing the proximal policy optimisation (PPO) algorithm, which has not previously been integrated into an automated trading system (ATS). Furthermore, behavioural biases in human decision-making often cloud traders' judgement and lead them to act emotionally. An ATS may alleviate these problems by identifying and using the best potential strategy for maximising profit over time. Motivated by these factors, this research aims to develop a stable, accurate, and robust automated trading system that uses a deep neural network and reinforcement learning to predict price movements and maximise investment returns by trading at optimal points. Experiments and evaluations showed that the proposed model outperformed the baseline buy-and-hold method and exceeded the models of other similar works.
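
The two quantities the abstract relies on, the buy-and-hold baseline return and the PPO objective, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the sample inputs, and the clipping parameter `clip_eps=0.2` (a commonly used default) are assumptions made for illustration, and the second function only evaluates the standard PPO clipped surrogate objective for given probability ratios and advantage estimates.

```python
def buy_and_hold_return(initial_value, final_value):
    """Fractional return from buying once and holding until the end."""
    if initial_value <= 0:
        raise ValueError("initial_value must be positive")
    return final_value / initial_value - 1.0


def ppo_clipped_objective(ratios, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch of samples.

    ratios:     new-policy / old-policy action probabilities, one per sample
    advantages: advantage estimates, one per sample
    clip_eps:   clipping range (0.2 is an assumed, commonly used value)
    """
    if len(ratios) != len(advantages) or not ratios:
        raise ValueError("ratios and advantages must be non-empty and equal in length")
    total = 0.0
    for ratio, advantage in zip(ratios, advantages):
        # Restrict the ratio to [1 - eps, 1 + eps] so one update cannot
        # move the policy too far from the one that collected the data.
        clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(ratios)


# The abstract's example: 1,000 USD growing to 3,500 USD.
baseline = buy_and_hold_return(1000.0, 3500.0)
print(f"Buy-and-hold return: {baseline:.0%}")

# Hypothetical ratios and advantages, chosen only to exercise the clipping.
objective = ppo_clipped_objective([1.5, 0.5], [1.0, -1.0])
print(f"Clipped surrogate objective: {objective:.3f}")
```

With the abstract's figures, the baseline return is 2.5, that is, a 250% gain, which is the benchmark a trading agent would need to exceed over the same period.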

