A Comparative Study of Deterministic and Stochastic Policies for Q-learning

Y Bi, Adam Thomas-Mitchell, Wei Zhai, Naveed Khan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review


Abstract

Q-learning is a form of reinforcement learning in which agents perform actions in an environment under a policy in order to reach a goal. Q-learning can also be viewed as goal-directed learning that maximizes the expected cumulative reward by optimizing the policy. Deterministic and stochastic policies are both commonly used in reinforcement learning; however, they behave quite differently in Markov decision processes. In this study, we conduct a comparative study of these two policies in the context of a grid world problem solved with Q-learning, and provide insight into the superiority of the deterministic policy over the stochastic one.
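For readers unfamiliar with the setup, the sketch below illustrates the kind of comparison described in the abstract: tabular Q-learning on a small grid world, where the behaviour policy is either deterministic (greedy) or stochastic (epsilon-greedy). The grid size, reward scheme, and hyperparameters here are assumptions chosen for illustration only, not the experimental configuration used in the paper.

```python
# Illustrative sketch only: tabular Q-learning on a small grid world,
# comparing a deterministic (greedy) policy with a stochastic
# (epsilon-greedy) one. Grid size, rewards and hyperparameters are
# assumptions, not the paper's exact experimental setup.
import random

N = 4                                           # 4x4 grid, start (0, 0)
GOAL = (N - 1, N - 1)                           # goal in the far corner
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]    # up, down, left, right
ALPHA, GAMMA, EPISODES = 0.1, 0.9, 500

def step(state, action):
    """Move on the grid: -1 per step, +10 on reaching the goal."""
    r, c = state
    dr, dc = ACTIONS[action]
    nxt = (min(max(r + dr, 0), N - 1), min(max(c + dc, 0), N - 1))
    if nxt == GOAL:
        return nxt, 10.0, True
    return nxt, -1.0, False

def select(Q, state, epsilon):
    """epsilon = 0 gives the deterministic greedy policy;
    epsilon > 0 gives a stochastic (epsilon-greedy) policy."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    q = Q[state]
    return max(range(len(ACTIONS)), key=lambda a: q[a])

def train(epsilon):
    Q = {(r, c): [0.0] * len(ACTIONS) for r in range(N) for c in range(N)}
    returns = []
    for _ in range(EPISODES):
        state, done, total = (0, 0), False, 0.0
        while not done:
            a = select(Q, state, epsilon)
            nxt, reward, done = step(state, a)
            # Q-learning update:
            # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            target = reward + (0.0 if done else GAMMA * max(Q[nxt]))
            Q[state][a] += ALPHA * (target - Q[state][a])
            state, total = nxt, total + reward
        returns.append(total)
    return sum(returns[-50:]) / 50   # mean return over the last 50 episodes

if __name__ == "__main__":
    random.seed(0)
    print("deterministic (greedy):      ", train(epsilon=0.0))
    print("stochastic (epsilon-greedy): ", train(epsilon=0.1))
```

With a -1 step penalty and zero-initialised Q-values, even the purely greedy policy still tries previously unvisited actions, so both variants eventually learn a path to the goal; the interesting difference, as the paper investigates, is how the two action-selection schemes affect the learning behaviour.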
Original language: English
Title of host publication: 2023 4th International Conference on Artificial Intelligence, Robotics and Control, AIRC 2023
Publisher: IEEE
Pages: 110-114
Number of pages: 5
ISBN (Electronic): 9798350348248
ISBN (Print): 979-8-3503-4825-5
DOIs
Publication status: Published online - 2 Nov 2023
Event: The 4th International Conference on Artificial Intelligence, Robotics and Control (AIRC 2023) - British University, Cairo, Egypt
Duration: 9 May 2023 - 11 May 2023

Publication series

Name: 2023 4th International Conference on Artificial Intelligence, Robotics and Control, AIRC 2023

Conference

Conference: The 4th International Conference on Artificial Intelligence, Robotics and Control (AIRC 2023)
Abbreviated title: AIRC 2023
Country/Territory: Egypt
City: Cairo
Period: 9/05/23 - 11/05/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Keywords

  • Reinforcement Learning
  • Q-Learning
  • Markov Decision Process
  • Deterministic and stochastic policies
  • GridWorld
