Methods for reinforcement learning policy improvement for a single market maker

  • Abbas Haider

Student thesis: Doctoral Thesis

Abstract

Market Making (MM) is a sequential decision-making problem in finance. Market makers are a particular class of traders who are obligated to trade with other market participants, making profits while providing market liquidity so that buyers can easily find sellers and vice versa. The profit made by a market maker (MMer) is the compensation for holding an inventory position. Profits and market liquidity are related: more profit means more trade, and more trade means higher liquidity. Reinforcement Learning (RL) has been a popular choice for solving MM because the RL framework represents the MMer's objective effectively: reward maximization is synonymous with the MMer's profit-maximizing behaviour, the RL policy denotes the MMer's strategy for making profits, and the state space can effectively represent the market. RL can also solve problems that are challenging to tackle with traditional machine learning; supervised learning, for instance, requires a labelled training dataset, which would be difficult to obtain given market uncertainty.
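
To make that mapping concrete, the sketch below frames single-asset MM as a small Markov decision process in Python: the state summarises inventory, the action places bid/ask quotes around the mid-price, and the reward is the step change in mark-to-market profit. It is purely illustrative; the price dynamics, fill model and state encoding are assumptions and not the environment used in the thesis.

    import random

    class ToyMarketMakingEnv:
        """A toy single-asset market-making environment (illustrative only).

        State  : coarse summary of the inventory position
        Action : index into a small set of (bid offset, ask offset) quote pairs
        Reward : change in mark-to-market profit (cash + inventory valued at mid)
        """

        TICK = 0.01
        # Each action quotes this many ticks away from the mid-price.
        QUOTE_OFFSETS = [(1, 1), (1, 3), (3, 1), (3, 3)]

        def __init__(self, max_inventory=10, horizon=200, seed=0):
            self.max_inventory = max_inventory
            self.horizon = horizon
            self.rng = random.Random(seed)

        def reset(self):
            self.t = 0
            self.mid = 100.0
            self.inventory = 0
            self.cash = 0.0
            self.last_value = 0.0
            return self._state()

        def _state(self):
            # Discrete state: sign of inventory and whether it is near the limit.
            inv_sign = (self.inventory > 0) - (self.inventory < 0)
            near_limit = int(abs(self.inventory) >= self.max_inventory - 2)
            return (inv_sign, near_limit)

        def step(self, action):
            bid_off, ask_off = self.QUOTE_OFFSETS[action]
            bid = self.mid - bid_off * self.TICK
            ask = self.mid + ask_off * self.TICK

            # Random-walk mid-price; closer quotes are more likely to be filled.
            self.mid += self.rng.choice([-1, 0, 1]) * self.TICK
            if self.rng.random() < 0.5 / bid_off and self.inventory < self.max_inventory:
                self.inventory += 1
                self.cash -= bid
            if self.rng.random() < 0.5 / ask_off and self.inventory > -self.max_inventory:
                self.inventory -= 1
                self.cash += ask

            # Reward = change in mark-to-market value.
            value = self.cash + self.inventory * self.mid
            reward = value - self.last_value
            self.last_value = value

            self.t += 1
            done = self.t >= self.horizon
            return self._state(), reward, done

Any RL agent, tabular or deep, that maximises the cumulative reward of such an environment is directly maximising the MMer's profit, which is the correspondence highlighted above.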

This thesis reports three research studies (one for each RL component, i.e. reward, value function and function approximation (FA)) to improve the state-of-the-art MM policies of Spooner et al. (2018) and Kumar (2020). The first study develops a novel reward function, yielding faster convergence to an MM policy than the approach of Spooner et al. (2018). The second study makes two contributions: 1) the "predictive market making" model, and 2) the first solution to the multi-asset market making problem, based on multi-task learning and deep RL. The third study develops a novel FA method. The overall outcome of this work is a set of novel methods that produce a better-generalized and more profitable MM policy than the current state of the art.
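
The abstract does not specify the multi-task architecture, but one common way to realise multi-task learning for multi-asset MM is a shared trunk with one output head per asset, as sketched below in PyTorch. The class name, layer sizes and the choice of Q-value outputs are illustrative assumptions, not the thesis's design.

    import torch
    import torch.nn as nn

    class MultiAssetQuoteNet(nn.Module):
        """Shared-trunk network with one head per asset (illustrative only).

        The trunk learns features shared across assets; each head outputs
        action values (e.g. Q-values over candidate quote placements) for
        one asset.
        """

        def __init__(self, state_dim, n_actions, n_assets, hidden=128):
            super().__init__()
            self.trunk = nn.Sequential(
                nn.Linear(state_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
            )
            self.heads = nn.ModuleList(
                [nn.Linear(hidden, n_actions) for _ in range(n_assets)]
            )

        def forward(self, state):
            features = self.trunk(state)
            # One set of action values per asset, computed from shared features.
            return [head(features) for head in self.heads]

    # Example: 3 assets, 20 state features, 5 candidate quote placements per asset.
    net = MultiAssetQuoteNet(state_dim=20, n_actions=5, n_assets=3)
    q_values_per_asset = net(torch.zeros(1, 20))

Sharing the trunk lets the per-asset tasks regularise one another, which is the usual motivation for multi-task learning in this kind of setting.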

Date of Award: Apr 2023
Original language: English
Sponsors: VCRS
Supervisors: Glenn Hawe (Supervisor), Hui Wang (Supervisor) & Bryan Scotney (Supervisor)

Keywords

  • function approximation
  • value-function
  • deep learning
  • multi-task learning
