Improving Intrinsic Exploration by Creating Stationary Objectives

AI-generated keywords: Exploration bonuses, Reinforcement learning, Intrinsic objectives, Stationary Objectives For Exploration (SOFE) framework, Count-based methods

AI-generated Key Points

  • Exploration bonuses guide long-horizon exploration in reinforcement learning by defining custom intrinsic objectives
  • Count-based methods perform well in small, finite state spaces but introduce unstable learning dynamics in larger state spaces and continuous environments
  • The Stationary Objectives For Exploration (SOFE) framework transforms non-stationary rewards into stationary ones through an augmented state representation
  • SOFE improves agents' performance in challenging exploration problems
  • Experiments demonstrate SOFE's effectiveness in sparse-reward tasks, pixel-based observations, 3D navigation, and procedurally generated environments
  • The paper introduces a novel framework for improving intrinsic exploration in reinforcement learning
  • The results are promising for addressing the challenges of count-based methods and for optimizing agents' objectives in complex environments

Authors: Roger Creus Castanyer, Joshua Romoff, Glen Berseth

Under Review at ICLR 2024
License: CC BY 4.0

Abstract: Exploration bonuses in reinforcement learning guide long-horizon exploration by defining custom intrinsic objectives. Count-based methods use the frequency of state visits to derive an exploration bonus. In this paper, we identify that any intrinsic reward function derived from count-based methods is non-stationary and hence induces a difficult objective to optimize for the agent. The key contribution of our work lies in transforming the original non-stationary rewards into stationary rewards through an augmented state representation. For this purpose, we introduce the Stationary Objectives For Exploration (SOFE) framework. SOFE requires identifying sufficient statistics for different exploration bonuses and finding an efficient encoding of these statistics to use as input to a deep network. SOFE is based on proposing state augmentations that expand the state space but hold the promise of simplifying the optimization of the agent's objective. Our experiments show that SOFE improves the agents' performance in challenging exploration problems, including sparse-reward tasks, pixel-based observations, 3D navigation, and procedurally generated environments.
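The non-stationarity described in the abstract can be made concrete with a short worked equation. This is only a sketch under stated assumptions: the inverse-square-root bonus below is one common count-based form and is not quoted from the paper, and the symbols β (bonus scale), N_t (visit-count table at step t), and the tilde notation for the augmented state are introduced here for illustration.

```latex
% Assumed count-based bonus: depends on the visit count N_t(s), which changes over time.
\[
  N_t(s) = \sum_{k=0}^{t} \mathbb{1}[s_k = s],
  \qquad
  r_t(s) = \frac{\beta}{\sqrt{N_t(s)}} .
\]
% The same state s yields different rewards at different times, so r_t is non-stationary:
\[
  t' > t \ \text{and}\ N_{t'}(s) > N_t(s)
  \;\Longrightarrow\;
  r_{t'}(s) < r_t(s).
\]
% Augmenting the state with the count table (a sufficient statistic for this bonus)
% gives a reward that is a fixed function of its input:
\[
  \tilde{s}_t = (s_t, N_t),
  \qquad
  \tilde{r}(\tilde{s}_t) = \frac{\beta}{\sqrt{N_t(s_t)}} .
\]
```

Under these assumptions the reward no longer depends on t except through the augmented state itself, which is the sense in which the objective becomes stationary.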

Submitted to arXiv on 27 Oct. 2023

Results of the summarizing process for the arXiv paper: 2310.18144v1

This paper discusses the use of exploration bonuses in reinforcement learning to guide long-horizon exploration by defining custom intrinsic objectives. It addresses the limitations of count-based methods, which have been shown to perform well in MDPs with a finite and small set of states but introduce unstable learning dynamics in larger state spaces and continuous environments. The authors propose the Stationary Objectives For Exploration (SOFE) framework to transform non-stationary rewards into stationary ones through an augmented state representation. This approach improves agents' performance in challenging exploration problems by simplifying the optimization of their objective. Experiments demonstrate that SOFE enhances agents' performance in various scenarios, including sparse-reward tasks, pixel-based observations, 3D navigation, and procedurally generated environments. Overall, this paper introduces a novel framework for improving intrinsic exploration in reinforcement learning and shows promising results in addressing challenges related to count-based methods and optimizing agents' objectives in complex environments.
Created on 10 Jan. 2024
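The state augmentation mentioned in the summary can be sketched in a few lines of Python. This is a minimal illustration for a small discrete state space, not the authors' implementation: the names `AugmentedCounter`, `bonus`, and `stationary_reward` are hypothetical, the inverse-square-root bonus is an assumed common choice, and passing the raw count table is a simplification of the efficient encodings the paper refers to.

```python
import math


def bonus(count, beta=1.0):
    """Assumed count-based bonus: beta / sqrt(N(s)). Requires count >= 1."""
    return beta / math.sqrt(count)


class AugmentedCounter:
    """Tracks visit counts and returns the augmented state (state, counts)."""

    def __init__(self, n_states):
        self.counts = [0] * n_states

    def visit(self, state):
        # Count the current visit first, so the count used for the bonus is >= 1.
        self.counts[state] += 1
        # Snapshot the table as a tuple so earlier augmented states are not mutated.
        return (state, tuple(self.counts))


def stationary_reward(aug_state, beta=1.0):
    """Reward as a fixed function of the augmented state only (no hidden counter)."""
    state, counts = aug_state
    return bonus(counts[state], beta)


tracker = AugmentedCounter(n_states=3)
first = tracker.visit(0)   # raw state 0, first visit
second = tracker.visit(0)  # raw state 0 again, second visit

# The raw state is identical, but the augmented states differ,
# so the differing rewards are explained by the input itself.
print(first, stationary_reward(first))
print(second, stationary_reward(second))
```

With only the raw state as input, the agent would see two different rewards for the same observation. With the count table included, each reward corresponds to a distinct input, at the cost of a larger state space.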

