
Value function optimistic initialization with uncertainty and confidence awareness in lifelong reinforcement learning

  • Soumia Mehimeh*
  • Xianglong Tang
  • Wei Zhao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

We study value function initialization for a reinforcement learning agent that faces a set of tasks varying in their rewards and arriving in a lifelong manner. Existing work on this transfer reinforcement learning setting typically assumes uniform task sampling in its experiments and transfers knowledge optimistically by initializing with the maximum outcome seen so far. In the real world, however, infrequent events make the task distribution non-uniform. Purely optimistic initialization then becomes impractical: it assigns equally high initial values to frequent and infrequent tasks alike, which increases sample complexity. We argue that, to overcome this limitation, the agent must assess how its optimism should be moderated by its uncertainty and its confidence, two interrelated notions that play a crucial role in decision-making. We therefore propose UCOI (Uncertainty and Confidence aware Optimistic Initialization), a novel approach that applies optimism only when it is warranted, and we show that it yields advantageous results over existing methods, especially for tasks drawn from a non-uniform distribution.
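The contrast the abstract draws can be sketched in a few lines: the baseline transfers the maximum outcome seen across previous tasks, while an uncertainty-aware rule shrinks that optimistic value toward the mean when observed outcomes disagree. The `confidence_weighted_init` rule below is a hypothetical illustration of this idea, not the paper's actual UCOI update:

```python
def optimistic_init(seen_values):
    """Plain optimistic transfer (the baseline the paper critiques):
    initialize new-task values with the maximum outcome observed
    across previously encountered tasks."""
    return max(seen_values)

def confidence_weighted_init(seen_values):
    """Hypothetical uncertainty-aware variant (an illustrative
    assumption, not the paper's UCOI rule): blend the optimistic
    maximum with the mean, trusting the maximum less when the
    spread of observed outcomes is large -- so a single rare
    high-reward task no longer dominates the initialization."""
    mean = sum(seen_values) / len(seen_values)
    spread = max(seen_values) - min(seen_values)
    # Confidence in the optimistic value decays with the spread.
    confidence = 1.0 / (1.0 + spread)
    return confidence * max(seen_values) + (1.0 - confidence) * mean
```

With outcomes `[1.0, 1.0, 10.0]` (one infrequent high-reward task), `optimistic_init` returns 10.0, while `confidence_weighted_init` returns a value much closer to the frequent outcomes, which is the behavior the abstract motivates for non-uniform task distributions.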

Original language: English
Article number: 111036
Journal: Knowledge-Based Systems
Volume: 280
DOIs
State: Published - 25 Nov 2023

Keywords

  • Knowledge transfer
  • Lifelong learning
  • Optimistic initialization
  • Reinforcement learning
