A Study on the Intersection of GPU Utilization and CNN Inference

AI-generated keywords: GPU utilization, CNN inference, Neural Architecture Search, Deep Learning Applications, Resource Usage

AI-generated Key Points

The license of the paper does not allow us to build upon its content, so the key points are generated from the paper metadata rather than the full article.

  • Significant progress in developing neural network architectures for high predictive performance and application-level inference throughput
  • Importance of GPU utilization during inference
  • High GPU utilization crucial for increasing application-level throughput and ROI
  • Analysis of GPU utilization of convolutional neural network (CNN) inference
  • Many CNNs have room to enhance their GPU utilization
  • Exploration of GPU utilization within a neural architecture search (NAS) search space
  • Exploration of GPU utilization as a metric that could potentially accelerate NAS itself
  • Potential to design more efficient networks by considering GPU utilization during the architecture search process
  • Need to improve inference-time GPU utilization of CNNs
  • Knowledge of GPU utilization can benefit applications beyond optimizing resource usage
  • The authors hope their findings will inspire future innovation in designing more efficient, GPU-utilization-friendly neural networks

Authors: Jack Kosaian, Amar Phanishayee

License: CC BY-NC-ND 4.0

Abstract: There has been significant progress in developing neural network architectures that both achieve high predictive performance and that also achieve high application-level inference throughput (e.g., frames per second). Another metric of increasing importance is GPU utilization during inference: the measurement of how well a deployed neural network uses the computational capabilities of the GPU on which it runs. Achieving high GPU utilization is critical to increasing application-level throughput and ensuring a good return on investment for deploying GPUs. This paper analyzes the GPU utilization of convolutional neural network (CNN) inference. We first survey the GPU utilization of CNNs to show that there is room to improve the GPU utilization of many of these CNNs. We then investigate the GPU utilization of networks within a neural architecture search (NAS) search space, and explore how using GPU utilization as a metric could potentially be used to accelerate NAS itself. Our study makes the case that there is room to improve the inference-time GPU utilization of CNNs and that knowledge of GPU utilization has the potential to benefit even applications that do not target utilization itself. We hope that the results of this study will spur future innovation in designing GPU-efficient neural networks.

Submitted to arXiv on 15 Dec. 2022


Results of the summarizing process for the arXiv paper: 2212.07936v1

This paper's license doesn't allow us to build upon its content, so the summary below is generated from the paper's metadata rather than the full article.

In recent years, there has been significant progress in developing neural network architectures that achieve both high predictive performance and high application-level inference throughput. Another metric of growing importance is GPU utilization during inference, which measures how effectively a deployed neural network uses the computational capabilities of the GPU on which it runs. Achieving high GPU utilization is crucial for increasing application-level throughput and ensuring a good return on investment for deploying GPUs.

This paper analyzes the GPU utilization of convolutional neural network (CNN) inference. The authors first survey the GPU utilization of CNNs and find that many of them have room for improvement. They then investigate the GPU utilization of networks within a neural architecture search (NAS) search space and explore how GPU utilization, used as a metric, could potentially accelerate NAS itself. Considering GPU utilization during the architecture search process may help researchers design networks that make better use of available computational resources.

The study makes the case that the inference-time GPU utilization of CNNs can be improved and that knowledge of GPU utilization can benefit even applications that do not target utilization itself. The authors hope that their findings will spur future innovation in designing GPU-efficient neural networks. Overall, this research examines the intersection of GPU utilization and CNN inference, with the aim of improving efficiency and the return on deploying GPUs in deep learning applications.
Created on 09 Aug. 2023
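
The summary describes GPU utilization only qualitatively, as how well a network uses the computational capabilities of its GPU. A minimal sketch of one common way to quantify this is shown below, assuming utilization is the ratio of achieved to peak floating-point throughput. That definition, the function names, and all numbers are illustrative assumptions and are not taken from the paper or its metadata.

```python
def conv2d_flops(h_out, w_out, c_in, c_out, k_h, k_w, batch=1):
    """FLOPs for one dense 2D convolution (hypothetical helper).

    Counts each multiply-add as 2 FLOPs and ignores bias and padding effects.
    """
    return 2 * batch * h_out * w_out * c_in * c_out * k_h * k_w


def gpu_utilization(total_flops, elapsed_seconds, peak_flops_per_second):
    """Achieved FLOP/s divided by the GPU's peak FLOP/s (assumed definition)."""
    if elapsed_seconds <= 0 or peak_flops_per_second <= 0:
        raise ValueError("elapsed time and peak throughput must be positive")
    achieved_flops_per_second = total_flops / elapsed_seconds
    return achieved_flops_per_second / peak_flops_per_second


# Hypothetical layer and measurements, chosen only for illustration:
# a 3x3 convolution, 64 -> 64 channels, 56x56 output, batch size 8,
# measured at 1 ms on a GPU with an assumed 10 TFLOP/s peak.
flops = conv2d_flops(h_out=56, w_out=56, c_in=64, c_out=64, k_h=3, k_w=3, batch=8)
util = gpu_utilization(flops, elapsed_seconds=1e-3, peak_flops_per_second=10e12)
print(f"GPU utilization: {util:.1%}")
```

Under this assumed definition, a network can have high throughput in frames per second and still leave much of the GPU's arithmetic capacity idle, which is the kind of gap the summary says the authors survey.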

