Dense Contrastive Learning for Self-Supervised Visual Pre-Training
AI-generated Key Points
⚠ The license of the paper does not allow us to build upon its content, so the key points are generated from the paper metadata rather than the full article.
- Most existing self-supervised learning methods are designed and optimized for image classification; the paper addresses the limitations this creates for dense prediction
- The resulting pre-trained models can be sub-optimal for dense prediction tasks because of the discrepancy between image-level and pixel-level prediction
- The proposed approach, dense contrastive learning, works directly at the level of pixels (or local features)
- It implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images
- The loss takes into account the correspondence between local features across the two views
- Incurs negligible computational overhead compared to the baseline method MoCo-v2 (less than 1% slower)
- Outperforms MoCo-v2 in downstream dense prediction tasks such as object detection, semantic segmentation, and instance segmentation
- Significant improvements over the MoCo-v2 baseline:
  - 2.0% AP on PASCAL VOC object detection
  - 1.1% AP on COCO object detection
  - 0.9% AP on COCO instance segmentation
  - 3.0% mIoU on PASCAL VOC semantic segmentation
  - 1.8% mIoU on Cityscapes semantic segmentation
- Provides an effective solution for self-supervised visual pre-training by considering pixel-level features and their correspondence directly
Authors: Xinlong Wang, Rufeng Zhang, Chunhua Shen, Tao Kong, Lei Li
Abstract: To date, most existing self-supervised learning methods are designed and optimized for image classification. These pre-trained models can be sub-optimal for dense prediction tasks due to the discrepancy between image-level prediction and pixel-level prediction. To fill this gap, we aim to design an effective, dense self-supervised learning method that directly works at the level of pixels (or local features) by taking into account the correspondence between local features. We present dense contrastive learning, which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images. Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only <1% slower), but demonstrates consistently superior performance when transferring to downstream dense prediction tasks including object detection, semantic segmentation and instance segmentation; and outperforms the state-of-the-art methods by a large margin. Specifically, over the strong MoCo-v2 baseline, our method achieves significant improvements of 2.0% AP on PASCAL VOC object detection, 1.1% AP on COCO object detection, 0.9% AP on COCO instance segmentation, 3.0% mIoU on PASCAL VOC semantic segmentation and 1.8% mIoU on Cityscapes semantic segmentation. Code is available at: https://git.io/AdelaiDet
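The abstract describes the loss only at a high level, so the following is a minimal sketch of what a pixel-level contrastive loss between two views could look like, not the authors' implementation. Several details are assumptions made for illustration: the function name dense_contrastive_loss, the rule that matches each local feature to its most similar local feature in the other view, the InfoNCE form of the loss, the source of the negatives, and the temperature of 0.2. Each "view" is represented as a flat list of local feature vectors, one per spatial location.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense_contrastive_loss(view_q, view_k, negatives, temperature=0.2):
    """Hypothetical dense contrastive loss, averaged over local features.

    view_q, view_k: lists of feature vectors, one per spatial location,
        extracted from two augmented views of the same image.
    negatives: list of feature vectors taken from other images.
    temperature: softmax temperature (0.2 is an assumed value).
    """
    q = [l2_normalize(v) for v in view_q]
    k = [l2_normalize(v) for v in view_k]
    neg = [l2_normalize(v) for v in negatives]

    total = 0.0
    for qi in q:
        # Assumed correspondence rule: the positive for this location is
        # the most similar local feature in the other view.
        pos = max(k, key=lambda kj: dot(qi, kj))

        # Positive logit first, then one logit per negative.
        logits = [dot(qi, pos) / temperature]
        logits += [dot(qi, n) / temperature for n in neg]

        # Cross-entropy with the positive as the target class,
        # computed with the log-sum-exp trick for numerical stability.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_sum - logits[0]

    return total / len(q)
```

Under these assumptions the loss is small when every local feature in one view has a close match in the other view and is far from the negatives, and it grows when no good match exists. An image-level contrastive loss would instead compare a single pooled vector per view, which is the image-level versus pixel-level discrepancy the abstract refers to.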