Abstract
Q-learning is a form of reinforcement learning in which an agent performs actions in an environment under a policy to reach a goal. Q-learning can also be viewed as goal-directed learning that maximizes the expected cumulative reward by optimizing the policy. Deterministic and stochastic policies are both commonly used in reinforcement learning; however, they behave quite differently in Markov decision processes. In this study, we conduct a comparative study of these two policies with Q-learning in the context of a grid world problem and provide insight into the superiority of the deterministic policy over the stochastic one.
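To make the comparison concrete, the sketch below is a minimal tabular Q-learning example on a small grid world that evaluates a deterministic (greedy) policy against a stochastic (epsilon-greedy) one. This is an illustrative sketch, not the paper's implementation: the 4x4 grid, reward scheme, and hyperparameters are assumptions chosen for readability.

```python
# Illustrative sketch (not the paper's code): tabular Q-learning on a 4x4
# grid world, then comparing a deterministic (greedy) evaluation policy
# with a stochastic (epsilon-greedy) one. All settings are assumptions.
import numpy as np

SIZE = 4                                        # 4x4 grid; goal at bottom-right
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right

def step(state, action):
    r, c = state
    dr, dc = ACTIONS[action]
    nr = min(max(r + dr, 0), SIZE - 1)          # clip moves at the grid border
    nc = min(max(c + dc, 0), SIZE - 1)
    done = (nr, nc) == (SIZE - 1, SIZE - 1)
    reward = 1.0 if done else -0.01             # small step cost, +1 at the goal
    return (nr, nc), reward, done

def train(episodes=2000, alpha=0.1, gamma=0.95, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((SIZE, SIZE, len(ACTIONS)))
    for _ in range(episodes):
        s, done = (0, 0), False
        while not done:
            # epsilon-greedy exploration during learning
            a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(np.argmax(Q[s]))
            s2, r, done = step(s, a)
            # Q-learning update: bootstrap from the greedy value of the next state
            Q[s][a] += alpha * (r + gamma * np.max(Q[s2]) * (not done) - Q[s][a])
            s = s2
    return Q

def evaluate(Q, stochastic=False, eps=0.1, episodes=100, seed=1):
    rng = np.random.default_rng(seed)
    returns = []
    for _ in range(episodes):
        s, done, total, steps = (0, 0), False, 0.0, 0
        while not done and steps < 100:
            if stochastic and rng.random() < eps:
                a = int(rng.integers(len(ACTIONS)))  # stochastic policy: occasional random action
            else:
                a = int(np.argmax(Q[s]))             # deterministic policy: always greedy
            s, r, done = step(s, a)
            total += r
            steps += 1
        returns.append(total)
    return float(np.mean(returns))

Q = train()
print("deterministic (greedy) return:  ", evaluate(Q, stochastic=False))
print("stochastic (eps-greedy) return: ", evaluate(Q, stochastic=True))
```

Under these assumed settings, the deterministic greedy policy typically attains a slightly higher average return than the epsilon-greedy one, since the random actions occasionally lengthen the path to the goal; this mirrors the kind of comparison the abstract describes, though the paper's exact setup may differ.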
Original language | English |
---|---|
Title of host publication | 2023 4th International Conference on Artificial Intelligence, Robotics and Control, AIRC 2023 |
Publisher | IEEE |
Pages | 110-114 |
Number of pages | 5 |
ISBN (Electronic) | 9798350348248 |
ISBN (Print) | 979-8-3503-4825-5 |
DOIs | |
Publication status | Published online - 2 Nov 2023 |
Event | The 4th International Conference on Artificial Intelligence, Robotics and Control (AIRC 2023) - British University, Cairo, Egypt. Duration: 9 May 2023 → 11 May 2023 |
Publication series
Name | 2023 4th International Conference on Artificial Intelligence, Robotics and Control, AIRC 2023 |
---|
Conference
Conference | The 4th International Conference on Artificial Intelligence, Robotics and Control (AIRC 2023) |
---|---|
Abbreviated title | AIRC 2023 |
Country/Territory | Egypt |
City | Cairo |
Period | 9/05/23 → 11/05/23 |
Bibliographical note
Publisher Copyright: © 2023 IEEE.
Keywords
- Reinforcement Learning
- Q-Learning
- Markov Decision Process
- Deterministic and stochastic policies
- GridWorld