Abstract
Reinforcement learning algorithms have made significant progress on adaptive and responsive problems. This paper presents solutions to the grid world problem using Q-learning, a model-free reinforcement learning method. The solutions are developed with forward and backward reward propagation, under the assumption that the grid world problem is a Markov decision process. The study details the implementation of these two reward calculations and compares them. The paper also illustrates how an agent interacts with the grid environment during both forward and backward propagation and compares their benefits, followed by hyperparameter tuning to better understand the model’s convergence.
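For readers unfamiliar with the method, the following is a minimal sketch of standard tabular Q-learning on a small grid world. It is not the paper's implementation: the 4×4 grid, the reward values, the hyperparameters, and all function names are assumptions chosen for illustration, and the sketch does not reproduce the forward and backward reward calculations, which the abstract does not specify. It applies the usual update $Q(s,a) \leftarrow Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a') - Q(s,a)]$.

```python
import random

# Hypothetical 4x4 grid: the agent starts top-left, the goal is bottom-right.
ROWS, COLS = 4, 4
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Deterministic move; bumping into a wall leaves the agent in place."""
    row = min(max(state[0] + action[0], 0), ROWS - 1)
    col = min(max(state[1] + action[1], 0), COLS - 1)
    nxt = (row, col)
    # Assumed reward scheme: +1 on reaching the goal, small penalty otherwise.
    reward = 1.0 if nxt == GOAL else -0.01
    return nxt, reward, nxt == GOAL


def train(episodes=1000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS) for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state = START
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                best = max(q[state])  # exploit, breaking ties at random
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            nxt, reward, done = step(state, ACTIONS[a])
            # Bootstrapped target; no future value beyond a terminal state.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned greedy policy from START and record visited cells."""
    state, path = START, [START]
    for _ in range(max_steps):
        if state == GOAL:
            break
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _, _ = step(state, ACTIONS[a])
        path.append(state)
    return path


q_table = train()
path = greedy_path(q_table)
print(path)
```

Varying `alpha`, `gamma`, and `epsilon` in `train` is one simple way to observe how such hyperparameters affect convergence in a setting like this.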
Original language | English |
---|---|
Title of host publication | AI-2023 Forty-third SGAI International Conference on Artificial Intelligence CAMBRIDGE, ENGLAND 12-14 DECEMBER 2023 |
Number of pages | 8 |
Publication status | Published (in print/issue) - 2023 |
Event | AI-2023 Forty-third SGAI International Conference on Artificial Intelligence CAMBRIDGE, ENGLAND 12-14 DECEMBER 2023 - CAMBRIDGE, ENGLAND, United Kingdom Duration: 12 Dec 2023 → 14 Dec 2023 http://www.bcs-sgai.org/ai2023/ |
Conference
Conference | AI-2023 Forty-third SGAI International Conference on Artificial Intelligence CAMBRIDGE, ENGLAND 12-14 DECEMBER 2023 |
---|---|
Country/Territory | United Kingdom |
City | Cambridge |
Period | 12/12/23 → 14/12/23 |
Internet address |
Keywords
- Reinforcement learning
- Q-learning
- Markov chain