
Evolutionary algorithm based reinforcement learning in the uncertain environments

  • Hai Tao Liu*
  • Bing Rong Hong
  • Song Hao Piao
  • Xue Mei Wang

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Reinforcement learning (RL) problems with uncertainty and hidden state present significant obstacles to prevailing RL methods. In this paper, a novel approximate algorithm, called Memetic Algorithm based Q-Learning (MA-Q-Learning), is proposed to solve POMDP problems that exhibit such uncertainty. Policies are evolved using memetic algorithms, while an improved Q-learning procedure computes predictive rewards that serve as the fitness of the evolved policies. To address the hidden-state problem, historical information is combined with the current belief state to aid in finding the optimal policy. Finally, search efficiency is improved by a hybrid search method in which an adjustment factor helps maintain population diversity and guide crossover, based on a combination of multiple kinds of crossover and mutation operators. Experiments on benchmark problems show that the proposed method is superior to other state-of-the-art approximate POMDP methods.
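The overall scheme described in the abstract — a population of policies refined by genetic operators plus a local-search (memetic) step, with an estimated return acting as fitness, and actions conditioned on the current observation together with recent history to cope with hidden state — can be sketched as follows. This is an illustrative toy only, not the paper's actual algorithm: the two-state POMDP, the tabular policy keyed on (observation, previous observation), and all function names are assumptions made for demonstration.

```python
import random

random.seed(0)

# Toy two-state POMDP: the hidden state is 0 or 1; the observation matches
# the state with probability 0.8 (uncertainty).  A policy is a lookup table
# from (observation, previous observation) to an action, so short history
# augments the current information state, echoing the paper's idea.
ACTIONS = [0, 1]
KEYS = [(o, p) for o in (0, 1) for p in (0, 1)]

def rollout(policy, steps=30):
    """Total reward of one episode under `policy`."""
    state, prev_obs, total = random.choice([0, 1]), 0, 0.0
    for _ in range(steps):
        obs = state if random.random() < 0.8 else 1 - state
        action = policy[(obs, prev_obs)]
        total += 1.0 if action == state else 0.0   # reward for matching hidden state
        state = 1 - state if random.random() < 0.3 else state  # hidden state drifts
        prev_obs = obs
    return total

def fitness(policy, episodes=20):
    """Average return stands in for the Q-learning-derived predictive reward."""
    return sum(rollout(policy) for _ in range(episodes)) / episodes

def random_policy():
    return {k: random.choice(ACTIONS) for k in KEYS}

def crossover(a, b):
    # Uniform crossover; the paper combines several crossover/mutation kinds.
    return {k: (a[k] if random.random() < 0.5 else b[k]) for k in KEYS}

def mutate(policy, rate=0.1):
    return {k: (random.choice(ACTIONS) if random.random() < rate else v)
            for k, v in policy.items()}

def local_search(policy):
    """Memetic step: greedily flip single table entries if fitness improves."""
    best, best_f = policy, fitness(policy)
    for k in KEYS:
        cand = dict(best)
        cand[k] = 1 - cand[k]
        f = fitness(cand)
        if f > best_f:
            best, best_f = cand, f
    return best

# Evolve: select elites, locally refine them, and breed the rest.
pop = [random_policy() for _ in range(10)]
for gen in range(15):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:4]
    children = [mutate(crossover(random.choice(elite), random.choice(elite)))
                for _ in range(len(pop) - len(elite))]
    pop = [local_search(p) for p in elite] + children

best = max(pop, key=fitness)
print(round(fitness(best, episodes=50), 2))
```

A random policy averages about 15 reward per 30-step episode on this toy problem, while the observation-following policy averages about 24, so the evolved population's best fitness landing well above 15 shows the memetic loop doing useful work. The real MA-Q-Learning additionally maintains a belief state and an adjustment factor over multiple crossover/mutation operators, which this sketch omits.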

Original language: English
Pages (from-to): 1356-1360
Number of pages: 5
Journal: Tien Tzu Hsueh Pao/Acta Electronica Sinica
Volume: 34
Issue number: 7
State: Published - Jul 2006
Externally published: Yes

Keywords

  • Belief state
  • Hidden state
  • Memetic algorithm
  • POMDP
  • Q-learning
