DAMO-YOLO: A Report on Real-Time Object Detection Design

AI-generated keywords: Object Detection

AI-generated Key Points

The paper's license does not allow us to build upon its content, so the key points are generated from the paper metadata rather than the full article.

  • The authors introduce DAMO-YOLO, a real-time object detection method reported to outperform state-of-the-art YOLO-series detectors
  • It combines Neural Architecture Search (NAS), an efficient Reparameterized Generalized-FPN (RepGFPN), a lightweight head with AlignedOTA label assignment, and distillation enhancement
  • The detection backbone is searched with MAE-NAS, a method guided by the principle of maximum entropy, under low-latency and high-performance constraints
  • The search produces ResNet/CSP-like structures with spatial pyramid pooling and focus modules
  • The detector neck is built on Generalized-FPN with accelerated queen-fusion, and its CSPNet is upgraded with efficient layer aggregation networks (ELAN) and reparameterization
  • A study of detector head size finds that a heavy neck with only one task projection layer yields better results
  • AlignedOTA is proposed to solve the misalignment problem in label assignment, and a distillation schema further improves performance
  • General-purpose models DAMO-YOLO-T/S/M/L reach 43.6/47.7/50.2/51.9 mAP on COCO at latencies of 2.78/3.83/5.62/7.95 ms on T4 GPUs; lightweight models DAMO-YOLO-Ns/Nm/Nl for edge devices reach 32.3/38.2/40.5 mAP on COCO at latencies of 4.08/5.05/6.69 ms on X86-CPU
  • The authors report that both the general and lightweight models outperform other YOLO-series models in their respective application scenarios
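
The key points mention reparameterization without describing it, and the metadata available here gives no details. As a minimal sketch of the general idea (in the style popularized by RepVGG, which may differ from what RepGFPN actually does), two parallel convolution branches used at training time can be merged into one kernel for inference, because convolution is linear. The helper names `conv2d_same` and `merge_branches`, and the single-channel, bias-free, pure-Python setting, are assumptions made for illustration only.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    k = len(kernel)  # kernel is k x k, with k odd
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - pad, x + dx - pad
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += kernel[dy][dx] * image[yy][xx]
            out[y][x] = acc
    return out


def merge_branches(k3, k1):
    """Fold a 1x1 kernel into the centre tap of a 3x3 kernel (hypothetical helper)."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1[0][0]
    return merged


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.0, 1.0, 0.0], [1.0, -4.0, 1.0], [0.0, 1.0, 0.0]]
k1 = [[2.0]]

# Training-time view: two parallel branches whose outputs are summed.
a = conv2d_same(image, k3)
b = conv2d_same(image, k1)
two_branch = [[a[y][x] + b[y][x] for x in range(3)] for y in range(3)]

# Inference-time view: a single convolution with the merged kernel.
single_branch = conv2d_same(image, merge_branches(k3, k1))

max_diff = max(abs(two_branch[y][x] - single_branch[y][x])
               for y in range(3) for x in range(3))
```

Both views give the same output, so the extra branch adds no inference cost once merged. Real detectors also fold batch normalization and handle many channels, which this sketch omits.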

Authors: Xianzhe Xu, Yiqi Jiang, Weihua Chen, Yilun Huang, Yuan Zhang, Xiuyu Sun

Project Website: https://github.com/tinyvision/damo-yolo

Abstract: In this report, we present a fast and accurate object detection method dubbed DAMO-YOLO, which achieves higher performance than the state-of-the-art YOLO series. DAMO-YOLO is extended from YOLO with some new technologies, including Neural Architecture Search (NAS), efficient Reparameterized Generalized-FPN (RepGFPN), a lightweight head with AlignedOTA label assignment, and distillation enhancement. In particular, we use MAE-NAS, a method guided by the principle of maximum entropy, to search our detection backbone under the constraints of low latency and high performance, producing ResNet/CSP-like structures with spatial pyramid pooling and focus modules. In the design of necks and heads, we follow the rule of "large neck, small head". We import Generalized-FPN with accelerated queen-fusion to build the detector neck and upgrade its CSPNet with efficient layer aggregation networks (ELAN) and reparameterization. Then we investigate how detector head size affects detection performance and find that a heavy neck with only one task projection layer would yield better results. In addition, AlignedOTA is proposed to solve the misalignment problem in label assignment. And a distillation schema is introduced to improve performance to a higher level. Based on these new techs, we build a suite of models at various scales to meet the needs of different scenarios. For general industry requirements, we propose DAMO-YOLO-T/S/M/L. They can achieve 43.6/47.7/50.2/51.9 mAPs on COCO with the latency of 2.78/3.83/5.62/7.95 ms on T4 GPUs respectively. Additionally, for edge devices with limited computing power, we have also proposed DAMO-YOLO-Ns/Nm/Nl lightweight models. They can achieve 32.3/38.2/40.5 mAPs on COCO with the latency of 4.08/5.05/6.69 ms on X86-CPU. Our proposed general and lightweight models have outperformed other YOLO series models in their respective application scenarios.
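
The slash-separated figures in the abstract are easier to compare once paired up per model. The sketch below tabulates them and derives an approximate throughput from each latency. The mAP and latency values are those quoted in the abstract; the `ReportedResult` class, the `summary_rows` helper, and the assumption that images are processed one at a time (so throughput is simply 1000 / latency in ms) are illustrative and not stated in the source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedResult:
    name: str
    coco_map: float    # mAP on COCO, as quoted in the abstract
    latency_ms: float  # latency in milliseconds, as quoted in the abstract
    device: str

    @property
    def images_per_second(self) -> float:
        # Assumption: one image at a time, no batching or pipelining.
        return 1000.0 / self.latency_ms


RESULTS = [
    ReportedResult("DAMO-YOLO-T", 43.6, 2.78, "T4 GPU"),
    ReportedResult("DAMO-YOLO-S", 47.7, 3.83, "T4 GPU"),
    ReportedResult("DAMO-YOLO-M", 50.2, 5.62, "T4 GPU"),
    ReportedResult("DAMO-YOLO-L", 51.9, 7.95, "T4 GPU"),
    ReportedResult("DAMO-YOLO-Ns", 32.3, 4.08, "X86-CPU"),
    ReportedResult("DAMO-YOLO-Nm", 38.2, 5.05, "X86-CPU"),
    ReportedResult("DAMO-YOLO-Nl", 40.5, 6.69, "X86-CPU"),
]


def summary_rows(results):
    """Format one aligned text row per model."""
    return [
        f"{r.name:<13}{r.coco_map:>5.1f} mAP {r.latency_ms:>5.2f} ms "
        f"~{r.images_per_second:.0f} img/s ({r.device})"
        for r in results
    ]


if __name__ == "__main__":
    print("\n".join(summary_rows(RESULTS)))
```

Note that the GPU and CPU rows are not directly comparable, since the latencies were measured on different hardware.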

Submitted to arXiv on 23 Nov. 2022

Results of the summarizing process for the arXiv paper: 2211.15444v4

This paper's license does not allow us to build upon its content, so the summary below was generated from the paper's metadata rather than the full article.

In their report titled "DAMO-YOLO: A Report on Real-Time Object Detection Design," authors Xianzhe Xu, Yiqi Jiang, Weihua Chen, Yilun Huang, Yuan Zhang, and Xiuyu Sun introduce an object detection method called DAMO-YOLO. According to the authors, the method surpasses the state-of-the-art YOLO series by combining Neural Architecture Search (NAS), an efficient Reparameterized Generalized-FPN (RepGFPN), a lightweight head with AlignedOTA label assignment, and distillation enhancement.

The authors use MAE-NAS, a method guided by the principle of maximum entropy, to search the detection backbone under constraints of low latency and high performance. The search produces ResNet/CSP-like structures with spatial pyramid pooling and focus modules. Following the design rule of "large neck, small head," they build the detector neck from Generalized-FPN with accelerated queen-fusion and upgrade its CSPNet with efficient layer aggregation networks (ELAN) and reparameterization. They also study how detector head size affects detection performance and find that a heavy neck with only one task projection layer yields better results. AlignedOTA is proposed to solve the misalignment problem in label assignment, and a distillation schema improves performance further.

Based on these techniques, the authors build a suite of models at various scales. For general industry requirements, DAMO-YOLO-T/S/M/L achieve mAPs of 43.6/47.7/50.2/51.9 on the COCO dataset with latencies of 2.78/3.83/5.62/7.95 ms on T4 GPUs. For edge devices with limited computing power, the lightweight DAMO-YOLO-Ns/Nm/Nl models achieve mAPs of 32.3/38.2/40.5 on COCO with latencies of 4.08/5.05/6.69 ms on X86-CPU. The authors report that both the general and lightweight models outperform other YOLO-series models in their respective application scenarios.
Created on 21 Feb. 2026
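
The summary mentions focus modules without describing them. In earlier YOLO-family code (YOLOv5), a Focus layer rearranges each 2x2 spatial block of the input into four channels before applying a convolution, halving the resolution without discarding pixels. Whether DAMO-YOLO's searched backbone uses exactly this form is an assumption here, and the `focus_slice` helper below is a hypothetical single-channel, pure-Python sketch of the slicing step only.

```python
def focus_slice(image):
    """Split an H x W grid (H and W even) into four H/2 x W/2 grids.

    Each output grid keeps every second pixel, starting from one of the
    four positions inside a 2x2 block.
    """
    h, w = len(image), len(image[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]  # (row offset, column offset)
    return [[row[dx::2] for row in image[dy::2]] for dy, dx in offsets]


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
channels = focus_slice(image)
```

For the 4x4 input above, `channels[0]` holds the top-left pixel of every 2x2 block and `channels[3]` the bottom-right one, so all 16 values survive in four 2x2 grids.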
