
Q_learning based on active backup and memory mechanism

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Exploration is used in Q_learning because blind exploitation traps the agent in locally optimal policies. However, excessive exploration degrades the performance of Q_learning, and the trade-off between exploration and exploitation is difficult to satisfy. In this paper, active backup is introduced into Q_learning, and the corresponding algorithm, AB_Q_learning, based on Dijkstra backup in dynamic programming, is proposed. Then, the memory-mechanism-based MEAB_Q_learning algorithm is given so that the agent can learn in a completely unknown environment. Experimental results show that the two algorithms not only converge more quickly but also solve the problem of local optima.
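The paper itself is not reproduced on this page, so the sketch below is only an illustration of the general idea of "active backup": after each real transition, value changes are propagated to previously observed predecessor states in priority order of their magnitude, Dijkstra-style (closely related to prioritized sweeping). The environment (a small deterministic chain), the hyperparameters, and all names are illustrative assumptions, not the authors' AB_Q_learning.

```python
import heapq
import random

def q_learning_active_backup(n=5, episodes=100, alpha=0.5, gamma=0.9,
                             epsilon=0.1, theta=1e-3, seed=0):
    """Q-learning on an n-state chain with goal at the right end.

    After each real step, a priority queue backs up predecessor
    state-action pairs in order of largest expected value change
    (a Dijkstra-style "active backup"; an assumption, not the
    paper's exact algorithm).
    """
    rng = random.Random(seed)
    actions = (-1, +1)                      # move left / right along the chain
    Q = {(s, a): 0.0 for s in range(n) for a in actions}
    preds = {s: set() for s in range(n)}    # observed predecessors of each state

    def step(s, a):
        s2 = min(max(s + a, 0), n - 1)
        r = 1.0 if s2 == n - 1 else 0.0     # reward only on reaching the goal
        return s2, r

    for _ in range(episodes):
        s = 0
        while s != n - 1:
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:                            # greedy with random tie-breaking
                best = max(Q[(s, a_)] for a_ in actions)
                a = rng.choice([a_ for a_ in actions if Q[(s, a_)] == best])
            s2, r = step(s, a)
            preds[s2].add((s, a))
            # standard Q-learning backup for the real transition
            target = r + gamma * max(Q[(s2, a_)] for a_ in actions)
            delta = target - Q[(s, a)]
            Q[(s, a)] += alpha * delta
            # active backups: propagate large changes to predecessors first
            pq = [(-abs(delta), s)]
            while pq:
                p, u = heapq.heappop(pq)
                if -p < theta:
                    break
                for (sp, ap) in preds[u]:
                    u2, rp = step(sp, ap)    # deterministic model replay
                    t = rp + gamma * max(Q[(u2, a_)] for a_ in actions)
                    d = t - Q[(sp, ap)]
                    if abs(d) >= theta:
                        Q[(sp, ap)] += alpha * d
                        heapq.heappush(pq, (-abs(d), sp))
            s = s2
    return Q
```

Because the queue processes the largest value changes first, a reward discovered at the goal is pushed back through the chain within a single episode instead of one state per episode, which is the convergence-speed effect the abstract attributes to active backup.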

Original language: English
Title of host publication: Proceedings of 2004 International Conference on Machine Learning and Cybernetics
Pages: 271-275
Number of pages: 5
State: Published - 2004
Externally published: Yes
Event: Proceedings of 2004 International Conference on Machine Learning and Cybernetics - Shanghai, China
Duration: 26 Aug 2004 - 29 Aug 2004

Publication series

Name: Proceedings of 2004 International Conference on Machine Learning and Cybernetics
Volume: 1

Conference

Conference: Proceedings of 2004 International Conference on Machine Learning and Cybernetics
Country/Territory: China
City: Shanghai
Period: 26/08/04 - 29/08/04

Keywords

  • Dijkstra backup
  • Q_learning
  • Reinforcement learning
