Pelee: A Real-Time Object Detection System on Mobile Devices

AI-generated keywords: Deep Learning, Convolutional Neural Network (CNN), Efficient Architectures, PeleeNet, Real-time Object Detection

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Growing demand for efficient Convolutional Neural Network (CNN) models for mobile devices with limited computational power and memory resources
  • Emergence of efficient architectures like MobileNet, ShuffleNet, and MobileNetV2 relying on depthwise separable convolution
  • Introduction of the PeleeNet architecture by Robert J. Wang, Xiang Li, and Charles X. Ling, built with conventional convolution instead of depthwise separable convolution
  • PeleeNet achieved higher accuracy and ran over 1.8 times faster than MobileNet and MobileNetV2 on NVIDIA TX2 hardware while being only 66% of the size of MobileNet
  • Development of real-time object detection system named Pelee combining PeleeNet with SSD method optimized for speed
  • Results: Pelee achieved 76.4% mean average precision (mAP) on the PASCAL VOC2007 dataset and 22.4 mAP on the MS COCO dataset, at 23.6 FPS on iPhone 8 and 125 FPS on NVIDIA TX2
  • On COCO, Pelee outperformed YOLOv2 with higher precision, 13.6 times lower computational cost, and an 11.3 times smaller model
  • Suggests that efficient model design can make real-time object detection practical on mobile devices while keeping accuracy competitive

Authors: Robert J. Wang, Xiang Li, Charles X. Ling

Accepted to NeurIPS 2018

Abstract: An increasing need of running Convolutional Neural Network (CNN) models on mobile devices with limited computing power and memory resource encourages studies on efficient model design. A number of efficient architectures have been proposed in recent years, for example, MobileNet, ShuffleNet, and MobileNetV2. However, all these models are heavily dependent on depthwise separable convolution which lacks efficient implementation in most deep learning frameworks. In this study, we propose an efficient architecture named PeleeNet, which is built with conventional convolution instead. On ImageNet ILSVRC 2012 dataset, our proposed PeleeNet achieves a higher accuracy and over 1.8 times faster speed than MobileNet and MobileNetV2 on NVIDIA TX2. Meanwhile, PeleeNet is only 66% of the model size of MobileNet. We then propose a real-time object detection system by combining PeleeNet with Single Shot MultiBox Detector (SSD) method and optimizing the architecture for fast speed. Our proposed detection system, named Pelee, achieves 76.4% mAP (mean average precision) on PASCAL VOC2007 and 22.4 mAP on MS COCO dataset at the speed of 23.6 FPS on iPhone 8 and 125 FPS on NVIDIA TX2. The result on COCO outperforms YOLOv2 in consideration of a higher precision, 13.6 times lower computational cost and 11.3 times smaller model size.

Submitted to arXiv on 18 Apr. 2018


Results of the summarizing process for the arXiv paper: 1804.06882v3


Demand is growing for efficient Convolutional Neural Network (CNN) models that can run on mobile devices with limited computing power and memory, which has spurred research into models that are both accurate and fast. Several efficient architectures have emerged in recent years, including MobileNet, ShuffleNet, and MobileNetV2, all of which rely heavily on depthwise separable convolution. That operation, however, lacks an efficient implementation in most deep learning frameworks.

To address this, Robert J. Wang, Xiang Li, and Charles X. Ling propose PeleeNet, an architecture built with conventional convolution instead. On the ImageNet ILSVRC 2012 dataset, PeleeNet achieves higher accuracy than MobileNet and MobileNetV2 and runs over 1.8 times faster on NVIDIA TX2. Its model size is only 66% of MobileNet's.

The authors then build a real-time object detection system, Pelee, by combining PeleeNet with the Single Shot MultiBox Detector (SSD) method and optimizing the architecture for speed. Pelee achieves 76.4% mean average precision (mAP) on the PASCAL VOC2007 dataset and 22.4 mAP on the MS COCO dataset, running at 23.6 frames per second (FPS) on an iPhone 8 and 125 FPS on NVIDIA TX2. On COCO, it outperforms YOLOv2 with higher precision, 13.6 times lower computational cost, and an 11.3 times smaller model.

These results, from work accepted at NeurIPS 2018, suggest that careful architectural choices can make real-time object detection practical on mobile hardware.
Created on 29 Mar. 2024
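
For readers unfamiliar with the distinction between conventional and depthwise separable convolution mentioned in the summary, the sketch below compares their theoretical multiply-accumulate (MAC) counts for a single layer. It uses the standard textbook formulas; the function names and the layer dimensions are hypothetical choices for illustration and are not taken from the paper. A lower MAC count does not by itself mean faster inference, since measured speed also depends on how efficiently a framework implements each operation.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a k x k conventional convolution with an h x w output map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise convolution followed by a 1 x 1 pointwise one."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def mac_ratio(h, w, c_in, c_out, k):
    """Depthwise separable cost as a fraction of conventional cost.

    Algebraically this equals 1 / c_out + 1 / k**2.
    """
    return (depthwise_separable_macs(h, w, c_in, c_out, k)
            / standard_conv_macs(h, w, c_in, c_out, k))


if __name__ == "__main__":
    # Hypothetical layer: 56 x 56 output, 128 -> 128 channels, 3 x 3 kernel.
    layer = dict(h=56, w=56, c_in=128, c_out=128, k=3)
    print("conventional MACs:       ", standard_conv_macs(**layer))
    print("depthwise separable MACs:", depthwise_separable_macs(**layer))
    print("ratio:                   ", round(mac_ratio(**layer), 4))
```

The gap between the theoretical ratio and measured runtime is why an architecture built on conventional convolution can still be faster in practice on a given framework and device.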

