Learning RoI Transformer for Detecting Oriented Objects in Aerial Images

AI-generated keywords: Computer vision

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Object detection in aerial images is challenging because of the bird's-eye-view perspective, complex backgrounds, and the varied appearance of objects
  • Methods that rely on horizontal proposals can produce mismatches between Regions of Interest (RoIs) and the actual objects, so that classification confidence and localization accuracy become misaligned
  • The RoI Transformer addresses this with two components: a Rotated RoI (RRoI) learner that transforms horizontal RoIs into rotated ones, and a Rotated Position Sensitive RoI Align (RPS-RoI-Align) module that extracts rotation-invariant features from them, which helps in particular with densely packed objects
  • The RoI Transformer is lightweight, can be embedded into detectors for oriented object detection, and achieves state-of-the-art performance on the DOTA and HRSC2016 aerial datasets with only a negligible reduction in detection speed
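
The mismatch between a horizontal RoI and an oriented object can be made concrete with a little geometry. The sketch below is illustrative only: the (cx, cy, w, h, theta) parametrization and the function names are assumptions, not taken from the paper. It computes how much of the tightest horizontal box an elongated, rotated object actually fills.

```python
import math

def rroi_corners(cx, cy, w, h, theta):
    """Corners of a rotated box given centre, side lengths and angle (radians).

    The (cx, cy, w, h, theta) parametrization is an assumption made for
    illustration; the source page does not specify one.
    """
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in half]

def enclosing_hroi(corners):
    """Tightest axis-aligned (horizontal) box around the given corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)

def fill_ratio(w, h, theta):
    """Fraction of the enclosing horizontal box covered by the rotated box."""
    x0, y0, x1, y1 = enclosing_hroi(rroi_corners(0.0, 0.0, w, h, theta))
    return (w * h) / ((x1 - x0) * (y1 - y0))

if __name__ == "__main__":
    # A hypothetical 100 x 20 object (e.g. a ship) at 0 and at 45 degrees.
    print(fill_ratio(100, 20, 0.0))
    print(fill_ratio(100, 20, math.pi / 4))
```

For the hypothetical 100 x 20 object, the horizontal box is a perfect fit at 0 degrees but is mostly background at 45 degrees, which is the kind of mismatch the key points describe.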

Authors: Jian Ding, Nan Xue, Yang Long, Gui-Song Xia, Qikai Lu

Abstract: Object detection in aerial images is an active yet challenging task in computer vision because of the birdview perspective, the highly complex backgrounds, and the variant appearances of objects. Especially when detecting densely packed objects in aerial images, methods relying on horizontal proposals for common object detection often introduce mismatches between the Region of Interests (RoIs) and objects. This leads to the common misalignment between the final object classification confidence and localization accuracy. Although rotated anchors have been used to tackle this problem, the design of them always multiplies the number of anchors and dramatically increases the computational complexity. In this paper, we propose a RoI Transformer to address these problems. More precisely, to improve the quality of region proposals, we first designed a Rotated RoI (RRoI) learner to transform a Horizontal Region of Interest (HRoI) into a Rotated Region of Interest (RRoI). Based on the RRoIs, we then proposed a Rotated Position Sensitive RoI Align (RPS-RoI-Align) module to extract rotation-invariant features from them for boosting subsequent classification and regression. Our RoI Transformer is with light weight and can be easily embedded into detectors for oriented object detection. A simple implementation of the RoI Transformer has achieved state-of-the-art performances on two common and challenging aerial datasets, i.e., DOTA and HRSC2016, with a neglectable reduction to detection speed. Our RoI Transformer exceeds the deformable Position Sensitive RoI pooling when oriented bounding-box annotations are available. Extensive experiments have also validated the flexibility and effectiveness of our RoI Transformer. The results demonstrate that it can be easily integrated with other detector architectures and significantly improve the performances.
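
The abstract's remark that rotated anchors multiply the number of anchors can be checked with simple arithmetic. The sketch below uses made-up values (3 scales, 3 aspect ratios, 6 angles, a 64 x 64 feature map) that are assumptions for illustration, not figures from the paper.

```python
def anchors_per_location(num_scales, num_ratios, num_angles=1):
    """Number of anchors placed at each feature-map cell."""
    return num_scales * num_ratios * num_angles

def total_anchors(feat_h, feat_w, per_location):
    """Number of anchors over a whole feature map."""
    return feat_h * feat_w * per_location

if __name__ == "__main__":
    # Illustrative settings only; none of these numbers come from the paper.
    horizontal = anchors_per_location(num_scales=3, num_ratios=3)
    rotated = anchors_per_location(num_scales=3, num_ratios=3, num_angles=6)
    print(horizontal, rotated)
    print(total_anchors(64, 64, horizontal), total_anchors(64, 64, rotated))
```

Every extra angle adds a full set of scale and ratio combinations at every location, which is the cost the RoI Transformer is meant to avoid by learning rotated RoIs from horizontal ones.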

Submitted to arXiv on 01 Dec. 2018

Results of the summarizing process for the arXiv paper: 1812.00155v1

In computer vision, object detection in aerial images is challenging because of the bird's-eye-view perspective, complex backgrounds, and the varied appearance of objects. Densely packed objects are particularly difficult: methods that rely on horizontal proposals often produce mismatches between Regions of Interest (RoIs) and the actual objects, which leaves classification confidence and localization accuracy misaligned. Rotated anchors have been used to address this, but they multiply the number of anchors and dramatically increase computational complexity.

To tackle these problems, the paper introduces the RoI Transformer. It consists of two components: a Rotated RoI (RRoI) learner that transforms Horizontal RoIs (HRoIs) into Rotated RoIs (RRoIs), and a Rotated Position Sensitive RoI Align (RPS-RoI-Align) module that extracts rotation-invariant features from the RRoIs to improve subsequent classification and regression. The RoI Transformer is lightweight and can be easily embedded into detectors for oriented object detection.

According to the abstract, a simple implementation of the RoI Transformer achieves state-of-the-art performance on two challenging aerial datasets, DOTA and HRSC2016, with only a negligible reduction in detection speed. When oriented bounding-box annotations are available, it also outperforms deformable Position Sensitive RoI pooling. The authors report extensive experiments that validate the flexibility and effectiveness of the approach, showing that it can be integrated with other detector architectures and significantly improves their performance.
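
To give a rough idea of how features can be read from a rotated RoI, here is a minimal sketch that lays a sampling grid in the RoI's own frame and bilinearly interpolates a feature map at those points. It is a simplification under stated assumptions, not the paper's RPS-RoI-Align: it takes one sample per bin, ignores position-sensitive score maps, and the (cx, cy, w, h, theta) layout and all function names are hypothetical.

```python
import math

def rotated_sample_points(cx, cy, w, h, theta, bins_x, bins_y):
    """Bin centres of a bins_y x bins_x grid laid out in the RoI's own frame."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    grid = []
    for j in range(bins_y):
        row = []
        for i in range(bins_x):
            # Bin centre in the RoI's local frame, origin at the RoI centre.
            u = ((i + 0.5) / bins_x - 0.5) * w
            v = ((j + 0.5) / bins_y - 0.5) * h
            # Rotate into image coordinates, then translate to the centre.
            row.append((cx + u * cos_t - v * sin_t, cy + u * sin_t + v * cos_t))
        grid.append(row)
    return grid

def bilinear(feature, x, y):
    """Bilinearly interpolate a 2-D list feature[y][x] at a real-valued point."""
    height, width = len(feature), len(feature[0])
    # Clamp to the map so points just outside the border stay valid.
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = feature[y0][x0] * (1 - fx) + feature[y0][x1] * fx
    bottom = feature[y1][x0] * (1 - fx) + feature[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def rotated_roi_align(feature, roi, bins_x, bins_y):
    """Pool a rotated RoI (cx, cy, w, h, theta) into a bins_y x bins_x grid."""
    points = rotated_sample_points(*roi, bins_x, bins_y)
    return [[bilinear(feature, x, y) for x, y in row] for row in points]

if __name__ == "__main__":
    # Toy feature map whose value equals the x coordinate.
    feature = [[float(x) for x in range(8)] for _ in range(8)]
    print(rotated_roi_align(feature, (3.5, 3.5, 4.0, 2.0, 0.0), 2, 1))
    print(rotated_roi_align(feature, (3.5, 3.5, 4.0, 2.0, math.pi / 2), 2, 1))
```

Because the grid is defined relative to the RoI's own axes, the same bin always corresponds to the same part of the object regardless of its orientation in the image, which is the intuition behind extracting rotation-invariant features from RRoIs.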
Created on 08 Mar. 2026
