Developing a Compressed Object Detection Model based on YOLOv4 for Deployment on Embedded GPU Platform of Autonomous System

AI-generated keywords: YOffleNet, Autonomous System, KITTI Dataset, Embedded GPU System, Object Detection

AI-generated Key Points

  • YOffleNet is a new object detection model designed for real-time and safe driving applications on autonomous systems.
  • Existing CNN-based models are accurate but require high-performance GPUs, making them unsuitable for embedded systems with limited memory space.
  • Existing lightweight detection models are not accurate enough for safe driving applications.
  • YOffleNet is based on the YOLOv4 backbone network architecture but replaces the computationally heavy CSP DenseNet with lighter modules from ShuffleNet.
  • YOffleNet is 4.7 times smaller than YOLOv4-s.
  • Experiments using the KITTI dataset show that YOffleNet achieves real-time performance, running at up to 46 FPS on an embedded GPU system (NVIDIA Jetson AGX Xavier).
  • Despite the high compression ratio, the accuracy of YOffleNet drops only slightly, to 85.8% mAP, which is just 2.6% lower than YOLOv4-s.
  • YOffleNet offers a promising solution for overcoming memory limitations in embedded systems without compromising performance or safety requirements in autonomous vehicles.

Authors: Issac Sim, Ju-Hyung Lim, Young-Wan Jang, JiHwan You, SeonTaek Oh, Young-Keun Kim

License: CC BY 4.0

Abstract: The latest CNN-based object detection models are quite accurate but require a high-performance GPU to run in real time. They remain too heavy, in both memory size and speed, for an embedded system with limited memory space. Since object detection for an autonomous system runs on an embedded processor, it is preferable to compress the detection network as much as possible while preserving detection accuracy. There are several popular lightweight detection models, but their accuracy is too low for safe driving applications. Therefore, this paper proposes a new object detection model, referred to as YOffleNet, which is compressed at a high ratio while minimizing the accuracy loss, for real-time and safe driving applications on an autonomous system. The backbone network architecture is based on YOLOv4, but the network is compressed greatly by replacing the high-calculation-load CSP DenseNet with the lighter modules of ShuffleNet. Experiments with the KITTI dataset showed that the proposed YOffleNet is 4.7 times smaller than YOLOv4-s and runs at up to 46 FPS on an embedded GPU system (NVIDIA Jetson AGX Xavier). Despite the high compression ratio, the accuracy is reduced only slightly, to 85.8% mAP, which is only 2.6% lower than YOLOv4-s. Thus, the proposed network shows high potential for deployment on the embedded system of an autonomous system for real-time and accurate object detection applications.
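
The abstract does not say which ShuffleNet modules replace the CSP DenseNet blocks. As a rough illustration only, the sketch below shows channel shuffle, the operation that ShuffleNet uses to mix information between grouped convolutions. Whether YOffleNet uses it in exactly this form is an assumption; the function name and the list-based representation are hypothetical and chosen for readability.

```python
from typing import List, TypeVar

T = TypeVar("T")


def channel_shuffle(channels: List[T], groups: int) -> List[T]:
    """Interleave channels across groups, as in ShuffleNet's channel shuffle.

    Illustrative sketch: `channels` stands in for the channel axis of a
    feature map. Real implementations do the same thing on tensors with a
    reshape, a transpose, and a reshape back.
    """
    if groups <= 0 or len(channels) % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = len(channels) // groups
    # View the channels as a (groups x per_group) grid and read it
    # column by column, so each output block draws from every group.
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


# Six channels in two groups: [0, 1, 2] and [3, 4, 5].
shuffled = channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)
print(shuffled)  # [0, 3, 1, 4, 2, 5]
```

After the shuffle, each consecutive pair of channels contains one channel from each original group, which lets cheap grouped convolutions exchange information without a dense convolution.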

Submitted to arXiv on 01 Aug. 2021

Results of the summarizing process for the arXiv paper: 2108.00392v1

This paper presents a new object detection model called YOffleNet, which is designed to be compressed at a high ratio while minimizing accuracy loss for real-time and safe driving applications on an autonomous system. Existing CNN-based object detection models are accurate but require high-performance GPUs, making them unsuitable for embedded systems with limited memory space. Lightweight detection models are available, but their accuracy is too low for safe driving applications.

To address these challenges, the proposed YOffleNet model is based on the YOLOv4 backbone network architecture. Instead of using the computationally heavy CSP DenseNet, however, the model uses lighter modules from ShuffleNet. This substitution makes YOffleNet 4.7 times smaller than YOLOv4-s. Experiments on the KITTI dataset show that YOffleNet achieves real-time performance, running at up to 46 frames per second (FPS) on an embedded GPU system (NVIDIA Jetson AGX Xavier). Despite the high compression ratio, the accuracy of YOffleNet drops only slightly, to 85.8% mean average precision (mAP), which is just 2.6% lower than YOLOv4-s.

Overall, this study highlights the potential of deploying the proposed YOffleNet network on embedded systems in autonomous vehicles for real-time and accurate object detection applications. By compressing the network while largely preserving accuracy, the model offers a promising way to overcome memory limitations in embedded systems while still meeting real-time requirements.
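
To put the reported throughput in perspective, the small sketch below converts a frame rate into a per-frame time budget. The helper name is hypothetical, and the only figure taken from the paper is the 46 FPS peak rate. The result covers whatever that measurement covers; the summary does not say whether pre- and post-processing are included.

```python
def frame_time_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds, at a given FPS."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


# 46 FPS is the peak rate reported on the NVIDIA Jetson AGX Xavier.
budget = frame_time_ms(46.0)
print(f"{budget:.1f} ms per frame")  # 21.7 ms per frame
```
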
Created on 26 Dec. 2023
