An average-reward reinforcement learning algorithm based on Schweitzer's Transformation

  • Jianjun Li*
  • Jiangong Ren
  • Yanjie Li
  • *Corresponding author for this work
  • Harbin Institute of Technology Shenzhen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this paper, we propose a relative value iteration reinforcement learning (RVI-RL) algorithm based on Schweitzer's Transformation for Markov decision processes (MDPs) with average reward. An equivalent average-reward optimality equation and a new form of the action-value function are derived via Schweitzer's Transformation. Combined with the theory of relative value iteration, the resulting RVI-RL algorithm not only removes the need to estimate the average reward during learning but also improves the convergence rate. Finally, a simulation experiment on the navigation of an autonomous mobile robot illustrates the effectiveness and applicability of the algorithm.

Original language: English
Title of host publication: Proceedings of the 31st Chinese Control Conference, CCC 2012
Pages: 2966-2970
Number of pages: 5
State: Published - 2012
Externally published: Yes
Event: 31st Chinese Control Conference, CCC 2012 - Hefei, China
Duration: 25 Jul 2012 → 27 Jul 2012

Publication series

Name: Chinese Control Conference, CCC
ISSN (Print): 1934-1768
ISSN (Electronic): 2161-2927

Conference

Conference: 31st Chinese Control Conference, CCC 2012
Country/Territory: China
City: Hefei
Period: 25/07/12 → 27/07/12

Keywords

  • Average reward
  • Reinforcement Learning
  • Relative value iteration
  • Robotic navigation
