SparseDrive: End-to-End Autonomous Driving via Sparse Scene Representation

AI-generated keywords: Autonomous driving, Modular systems, End-to-end paradigms, SparseDrive, Safety

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Traditional modular systems in autonomous driving handle perception, prediction, and planning tasks separately.
  • End-to-end paradigms aim to unify these tasks into a single framework for optimization with a focus on planning.
  • Current methods face challenges in performance and efficiency, especially in ensuring planning safety due to computational complexity and simplistic design.
  • SparseDrive introduces a symmetric sparse perception module integrating detection, tracking, and online mapping tasks using a fully sparse representation of the driving scene.
  • SparseDrive adopts a parallel design approach for motion prediction and planning tasks, treating planning as a multi-modal problem with a hierarchical planning selection strategy for safe trajectories.
  • SparseDrive outperforms existing methods in task performance while achieving higher training and inference efficiency.
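
To give a rough feel for why a sparse scene representation can be cheaper than dense BEV features, the sketch below counts how many token pairs a full self-attention layer would have to score in each case. The grid size, the number of sparse instances, and the helper name `pairwise_interactions` are all illustrative assumptions, not values from the paper, and real BEV models often use cheaper attention variants, so this is intuition only.

```python
# Illustrative sizes only; none of these numbers come from the paper.

def pairwise_interactions(num_tokens: int) -> int:
    """Number of token pairs a full self-attention layer would score."""
    return num_tokens * num_tokens


# Hypothetical dense BEV grid: one feature per cell of a 200 x 200 grid.
bev_cells = 200 * 200

# Hypothetical sparse set: 900 object instances plus 100 map instances.
sparse_instances = 900 + 100

dense_cost = pairwise_interactions(bev_cells)
sparse_cost = pairwise_interactions(sparse_instances)
ratio = dense_cost // sparse_cost

print(f"dense: {dense_cost:,} pairs")
print(f"sparse: {sparse_cost:,} pairs")
print(f"ratio: {ratio}x")
```

Under these assumed sizes the dense grid needs three orders of magnitude more pairwise comparisons, which is the kind of gap that motivates representing the scene as a small set of instances instead of a full grid.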

Authors: Wenchao Sun, Xuewu Lin, Yining Shi, Chuang Zhang, Haoran Wu, Sifa Zheng

Abstract: The well-established modular autonomous driving system is decoupled into different standalone tasks, e.g. perception, prediction and planning, suffering from information loss and error accumulation across modules. In contrast, end-to-end paradigms unify multi-tasks into a fully differentiable framework, allowing for optimization in a planning-oriented spirit. Despite the great potential of end-to-end paradigms, both the performance and efficiency of existing methods are not satisfactory, particularly in terms of planning safety. We attribute this to the computationally expensive BEV (bird's eye view) features and the straightforward design for prediction and planning. To this end, we explore the sparse representation and review the task design for end-to-end autonomous driving, proposing a new paradigm named SparseDrive. Concretely, SparseDrive consists of a symmetric sparse perception module and a parallel motion planner. The sparse perception module unifies detection, tracking and online mapping with a symmetric model architecture, learning a fully sparse representation of the driving scene. For motion prediction and planning, we review the great similarity between these two tasks, leading to a parallel design for motion planner. Based on this parallel design, which models planning as a multi-modal problem, we propose a hierarchical planning selection strategy, which incorporates a collision-aware rescore module, to select a rational and safe trajectory as the final planning output. With such effective designs, SparseDrive surpasses previous state-of-the-arts by a large margin in performance of all tasks, while achieving much higher training and inference efficiency. Code will be available at https://github.com/swc-17/SparseDrive for facilitating future research.
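
The idea of rescoring multi-modal plans for collisions can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`collides`, `rescore`, `select_plan`), the fixed safety radius, and the rule of zeroing the score of a colliding candidate are all hypothetical, and the hierarchical part of the selection strategy is not modelled because the abstract does not describe it.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Trajectory = Sequence[Point]  # one (x, y) waypoint per future time step


def collides(ego: Trajectory, agent: Trajectory, safety_radius: float) -> bool:
    """True if ego and agent are closer than safety_radius at the same time step."""
    return any(math.dist(p, q) < safety_radius for p, q in zip(ego, agent))


def rescore(
    candidates: Sequence[Trajectory],
    scores: Sequence[float],
    predicted_agents: Sequence[Trajectory],
    safety_radius: float = 2.0,  # assumed value, in metres
    penalty: float = 0.0,        # assumed rule: colliding plans get score * 0
) -> List[float]:
    """Scale down the score of every candidate that hits a predicted agent path."""
    new_scores = []
    for traj, score in zip(candidates, scores):
        hit = any(collides(traj, agent, safety_radius) for agent in predicted_agents)
        new_scores.append(score * penalty if hit else score)
    return new_scores


def select_plan(
    candidates: Sequence[Trajectory],
    scores: Sequence[float],
    predicted_agents: Sequence[Trajectory],
) -> Tuple[int, List[float]]:
    """Return the index of the best candidate after rescoring, plus the new scores."""
    new_scores = rescore(candidates, scores, predicted_agents)
    best = max(range(len(candidates)), key=new_scores.__getitem__)
    return best, new_scores


# Two candidate ego plans: go straight (higher initial score) or veer left.
candidates = [
    [(0.0, 5.0), (0.0, 10.0), (0.0, 15.0)],
    [(-1.0, 5.0), (-3.0, 10.0), (-4.0, 15.0)],
]
scores = [0.6, 0.4]

# One predicted agent cutting across the straight path at the second step.
predicted_agents = [[(3.0, 10.0), (1.0, 10.0), (0.5, 14.0)]]

best, new_scores = select_plan(candidates, scores, predicted_agents)
print(best, new_scores)
```

In this toy case the straight plan starts with the higher score but passes within the safety radius of the predicted agent, so after rescoring the left plan is selected. The point of the design is that motion prediction output feeds directly into the choice among planning modes.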

Submitted to arXiv on 30 May. 2024

Results of the summarizing process for the arXiv paper: 2405.19620v2

In the field of autonomous driving, traditional modular systems have been widely used. These systems handle tasks such as perception, prediction, and planning separately, which often leads to information loss and error accumulation across modules. In contrast, end-to-end paradigms unify these tasks into a single, fully differentiable framework, which allows optimization with a focus on planning. Despite this potential, current end-to-end methods still fall short in performance and efficiency, particularly in planning safety. The authors attribute this to the computational cost of bird's eye view (BEV) features and the simplistic design of the prediction and planning components. To address these limitations, they propose a new paradigm called SparseDrive. SparseDrive introduces a symmetric sparse perception module that unifies detection, tracking, and online mapping in a model architecture that learns a fully sparse representation of the driving scene. For motion prediction and planning, a parallel design is adopted based on the observed similarity between the two tasks. The parallel motion planner in SparseDrive models planning as a multi-modal problem and uses a hierarchical planning selection strategy with a collision-aware rescore module to select a rational and safe trajectory as the final planning output. According to the abstract, SparseDrive outperforms previous state-of-the-art methods by a large margin on all tasks while achieving much higher training and inference efficiency. The authors, Wenchao Sun, Xuewu Lin, Yining Shi, Chuang Zhang, Haoran Wu, and Sifa Zheng, state that code will be made available on GitHub to facilitate future research.
Created on 19 Aug. 2024
