Q-Learning: Solutions for Grid World Problem with Forward and Backward Reward Propagations

Snobin Antony, Raghi Roy, Y Bi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Reinforcement learning algorithms have made significant progress on adaptive and responsive problems. This paper presents solutions to the grid world problem using Q-Learning, a model-free reinforcement learning method. The solutions are developed with forward and backward reward propagation, under the assumption that the grid world problem is a Markov decision process. The study details the implementation of these two reward calculations and compares them. The paper also illustrates how an agent interacts with the grid environment during both forward and backward propagation and compares their benefits, followed by hyperparameter tuning to give a better understanding of the model's convergence.
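
The contrast between the two propagation orders can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one common reading of the terms, in which "forward" applies the standard Q-Learning update to the transitions of an episode in the order they occurred and "backward" applies the same update in reverse order, starting from the goal. The episode, the reward of 1.0 at the goal, the values of `ALPHA` and `GAMMA`, and the helper names `q_update` and `replay` are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical settings for illustration; the paper's values are not given here.
ACTIONS = ("up", "down", "left", "right")
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def q_update(q, s, a, r, s2, done):
    """Standard tabular Q-Learning update for one transition."""
    best_next = 0.0 if done else max(q[(s2, b)] for b in ACTIONS)
    q[(s, a)] += ALPHA * (r + GAMMA * best_next - q[(s, a)])


def replay(episode, backward):
    """Apply the update to one recorded episode, in forward or reverse order."""
    q = defaultdict(float)  # every Q-value starts at 0.0
    order = reversed(episode) if backward else episode
    for transition in order:
        q_update(q, *transition)
    return q


# One assumed episode on a single grid row: three moves right, goal at (0, 3).
# Each entry is (state, action, reward, next_state, done).
episode = [
    ((0, 0), "right", 0.0, (0, 1), False),
    ((0, 1), "right", 0.0, (0, 2), False),
    ((0, 2), "right", 1.0, (0, 3), True),
]

forward_q = replay(episode, backward=False)
backward_q = replay(episode, backward=True)

for state in [(0, 0), (0, 1), (0, 2)]:
    print(state, forward_q[(state, "right")], backward_q[(state, "right")])
```

Under these assumptions, a single forward pass only changes the Q-value of the final move into the goal, because the earlier transitions are updated while their successor values are still zero. A single backward pass assigns a discounted value to every move on the path, which is why reverse-order updates can spread reward information in fewer episodes.
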
Original language: English
Title of host publication: AI-2023 Forty-third SGAI International Conference on Artificial Intelligence CAMBRIDGE, ENGLAND 12-14 DECEMBER 2023
Number of pages: 8
Publication status: Published (in print/issue) - 2023
Event: AI-2023 Forty-third SGAI International Conference on Artificial Intelligence CAMBRIDGE, ENGLAND 12-14 DECEMBER 2023, CAMBRIDGE, ENGLAND, United Kingdom
Duration: 12 Dec 2023 to 14 Dec 2023
http://www.bcs-sgai.org/ai2023/

Conference

Conference: AI-2023 Forty-third SGAI International Conference on Artificial Intelligence CAMBRIDGE, ENGLAND 12-14 DECEMBER 2023
Country/Territory: United Kingdom
City: Cambridge, England
Period: 12/12/23 to 14/12/23

Keywords

  • Reinforcement learning
  • Q-learning
  • Markov chain
