A* Search Without Expansions: Learning Heuristic Functions with Deep Q-Networks

AI-generated keywords: Q* search Artificial Intelligence Deep Q-networks Action Space Heuristic Function

AI-generated Key Points

  • Efficiently solving problems with large action spaces is a challenge in AI
  • A* search algorithm requires significant computation and memory resources
  • Q* search is a novel algorithm that uses deep Q-networks to guide the search process
  • Q* search reduces computation time and only generates one node per iteration
  • Q* search solves the Rubik's cube problem with a large action space more efficiently than A* search
  • Q* search is up to 129 times faster and generates up to 1288 times fewer nodes than A* search
  • Q* search guarantees finding a shortest path with an admissible heuristic function
  • Experiments were conducted on other puzzles like Lights Out and the 35-Pancake puzzle
  • Additional theoretical results support the admissibility of Q* search as a search algorithm.
Also access our AI generated: Comprehensive summary, Lay summary, Blog-like article; or ask questions about this paper to our AI assistant.

Authors: Forest Agostinelli, Alexander Shmakov, Stephen McAleer, Roy Fox, Pierre Baldi

Added theoretical results to show that Q* search is an admissible search algorithm. Added comparisons to deferred heuristic evaluation. Added experiments with Lights Out and the 35-Pancake puzzle
License: CC BY 4.0

Abstract: Efficiently solving problems with large action spaces using A* search has been of importance to the artificial intelligence community for decades. This is because the computation and memory requirements of A* search grow linearly with the size of the action space. This burden becomes even more apparent when A* search uses a heuristic function learned by computationally expensive function approximators, such as deep neural networks. To address this problem, we introduce Q* search, a search algorithm that uses deep Q-networks to guide search in order to take advantage of the fact that the sum of the transition costs and heuristic values of the children of a node can be computed with a single forward pass through a deep Q-network without explicitly generating those children. This significantly reduces computation time and requires only one node to be generated per iteration. We use Q* search to solve the Rubik's cube when formulated with a large action space that includes 1872 meta-actions and find that this 157-fold increase in the size of the action space incurs less than a 4-fold increase in computation time and less than a 3-fold increase in number of nodes generated when performing Q* search. Furthermore, Q* search is up to 129 times faster and generates up to 1288 times fewer nodes than A* search. Finally, although obtaining admissible heuristic functions from deep neural networks is an ongoing area of research, we prove that Q* search is guaranteed to find a shortest path given a heuristic function that neither overestimates the cost of a shortest path nor underestimates the transition cost.

Submitted to arXiv on 08 Feb. 2021

Ask questions about this paper to our AI assistant

You can also chat with multiple papers at once here.

AI assistant instructions?

Results of the summarizing process for the arXiv paper: 2102.04518v2

Efficiently solving problems with large action spaces has long been a challenge in the field of artificial intelligence. The A* search algorithm is effective but requires significant computation and memory resources that grow linearly with the size of the action space. To address this issue, the authors introduce Q* search, a novel search algorithm that utilizes deep Q-networks to guide the search process. This approach significantly reduces computation time and only requires one node to be generated per iteration. The authors apply Q* search to solve the Rubik's cube problem formulated with a large action space consisting of 1872 meta-actions and find that despite the 157-fold increase in action space size, Q* search incurs less than a 4-fold increase in computation time and less than a 3-fold increase in the number of nodes generated compared to A* search. In fact, it is up to 129 times faster and generates up to 1288 times fewer nodes than A* search. Furthermore, they prove that Q* search is guaranteed to find a shortest path given an admissible heuristic function—one that neither overestimates nor underestimates transition costs. While obtaining such heuristic functions from deep neural networks is an ongoing research area, this theoretical result provides confidence in its effectiveness. Experiments were conducted on other puzzles like Lights Out and the 35-Pancake puzzle as well as comparisons with deferred heuristic evaluation and additional theoretical results supporting its admissibility as a search algorithm. Overall, Q* search offers a promising approach for efficiently solving problems with large action spaces by leveraging deep Q-networks through reduced computation time and node generation compared to A* search.
Created on 23 Nov. 2023

Assess the quality of the AI-generated content by voting

Score: 0

Why do we need votes?

Votes are used to determine whether we need to re-run our summarizing tools. If the count reaches -10, our tools can be restarted.

The previous summary was created more than a year ago and can be re-run (if necessary) by clicking on the Run button below.

Similar papers summarized with our AI tools

Navigate through even more similar papers through a

tree representation

Look for similar papers (in beta version)

By clicking on the button above, our algorithm will scan all papers in our database to find the closest based on the contents of the full papers and not just on metadata. Please note that it only works for papers that we have generated summaries for and you can rerun it from time to time to get a more accurate result while our database grows.

Disclaimer: The AI-based summarization tool and virtual assistant provided on this website may not always provide accurate and complete summaries or responses. We encourage you to carefully review and evaluate the generated content to ensure its quality and relevance to your needs.