Hunting CUDA Bugs at Scale with cuFuzz

AI-generated keywords: GPUs, software development, memory safety, concurrency bugs, cuFuzz

AI-generated Key Points

GPUs are increasingly important in modern software development
GPU programs face challenges such as memory-safety and concurrency bugs
Fuzz-testing combined with dynamic error checking tools is a promising solution for detecting bugs in GPU programs
Prior GPU fuzzing efforts have encountered obstacles like kernel-level fuzzing, lack of device-side coverage feedback, and compatibility issues between tools
cuFuzz is a CUDA-oriented fuzzer that addresses these challenges effectively
cuFuzz discovered 43 previously unknown bugs across 14 CUDA programs, including illegal memory accesses, uninitialized reads, and data races
cuFuzz outperforms baseline approaches by uncovering more edges and unique inputs, especially on closed-source targets
The artifact for cuFuzz is publicly available on Zenodo with source code, usage instructions, and evaluation scripts
Acknowledgments are extended to reviewers and contributors who helped address bug reports uncovered by cuFuzz

Authors: Mohamed Tarek Ibn Ziad, Christos Kozyrakis

arXiv: 2603.12485v1 (cs.CR)

Accepted for publication at the International Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2026)

License: CC BY 4.0

Abstract: GPUs play an increasingly important role in modern software. However, the heterogeneous host-device execution model and expanding software stacks make GPU programs prone to memory-safety and concurrency bugs that evade static analysis. While fuzz-testing, combined with dynamic error checking tools, offers a plausible solution, it remains underutilized for GPUs. In this work, we identify three main obstacles limiting prior GPU fuzzing efforts: (1) kernel-level fuzzing leading to false positives, (2) lack of device-side coverage-guided feedback, and (3) incompatibility between coverage and sanitization tools. We present cuFuzz, the first CUDA-oriented fuzzer that makes GPU fuzzing practical by addressing these obstacles. cuFuzz uses whole program fuzzing to avoid false positives from independently fuzzing device-side kernels. It leverages NVBit to instrument device-side instructions and merges the resultant coverage with compiler-based host coverage. Finally, cuFuzz decouples sanitization from coverage collection by executing host- and device-side sanitizers in separate processes. cuFuzz uncovers 43 previously unknown bugs (19 in commercial libraries) across 14 CUDA programs, including illegal memory accesses, uninitialized reads, and data races. cuFuzz achieves significantly more discovered edges and unique inputs compared to baseline approaches, especially on closed-source targets. Moreover, we quantify the execution time overheads of the different cuFuzz components and add persistent-mode support to improve the overall fuzzing throughput. Our results demonstrate that cuFuzz is an effective and deployable addition to the GPU testing toolbox. cuFuzz is publicly available at https://github.com/NVlabs/cuFuzz/.

Submitted to arXiv on 12 Mar. 2026

Results of the summarizing process for the arXiv paper: 2603.12485v1

Comprehensive Summary

GPUs play an increasingly important role in modern software. However, the heterogeneous host-device execution model and expanding software stacks make GPU programs susceptible to memory-safety and concurrency bugs that are hard to detect through static analysis alone. Fuzz-testing combined with dynamic error checking tools is a promising remedy, yet it remains underutilized for GPUs. The paper attributes this gap to three obstacles in prior GPU fuzzing efforts: (1) kernel-level fuzzing that leads to false positives, (2) the absence of device-side coverage-guided feedback, and (3) incompatibility between coverage and sanitization tools.

cuFuzz, a CUDA-oriented fuzzer, addresses each obstacle. It fuzzes the whole program rather than individual device-side kernels, which avoids the false positives that arise when kernels are fuzzed independently. It uses NVBit to instrument device-side instructions and merges the resulting coverage with compiler-based host coverage. It also decouples sanitization from coverage collection by executing host- and device-side sanitizers in separate processes.

In the evaluation, cuFuzz uncovered 43 previously unknown bugs (19 of them in commercial libraries) across 14 CUDA programs, including illegal memory accesses, uninitialized reads, and data races. It discovered significantly more edges and unique inputs than baseline approaches, particularly on closed-source targets. The authors also quantify the execution time overheads of the individual cuFuzz components and add persistent-mode support to improve overall fuzzing throughput. They conclude that cuFuzz is an effective and deployable addition to the GPU testing toolbox.
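The idea of merging device-side coverage with host coverage can be pictured with a small model. The sketch below is not cuFuzz's implementation: it assumes an AFL-style byte map of hit counters, and every name in it (`MAP_SIZE`, `DEVICE_SALT`, `edge_index`, `merge_coverage`, `has_new_coverage`) is hypothetical. Host and device traces are plain lists of basic-block IDs standing in for what compiler instrumentation and NVBit would report.

```python
from typing import Iterable

MAP_SIZE = 1 << 16     # assumed size of the shared coverage map
DEVICE_SALT = 0x8000   # assumed offset that keeps device edges apart from host edges


def edge_index(prev_block: int, cur_block: int, salt: int = 0) -> int:
    """Hash an edge (prev -> cur) to a slot in the coverage map."""
    return ((prev_block >> 1) ^ cur_block ^ salt) % MAP_SIZE


def record_trace(bitmap: bytearray, blocks: Iterable[int], salt: int = 0) -> None:
    """Add one execution trace to the map using saturating hit counters."""
    prev = 0
    for cur in blocks:
        idx = edge_index(prev, cur, salt)
        bitmap[idx] = min(bitmap[idx] + 1, 255)
        prev = cur


def merge_coverage(host_blocks: Iterable[int], device_blocks: Iterable[int]) -> bytearray:
    """Build a single per-run map that contains both host and device edges."""
    bitmap = bytearray(MAP_SIZE)
    record_trace(bitmap, host_blocks)
    record_trace(bitmap, device_blocks, salt=DEVICE_SALT)
    return bitmap


def has_new_coverage(global_map: bytearray, run_map: bytearray) -> bool:
    """Return True (and update global_map) if the run hit any unseen edge."""
    new = False
    for i, hits in enumerate(run_map):
        if hits and not global_map[i]:
            global_map[i] = 1
            new = True
    return new


# Two inputs that take the same host path but different device paths.
seen = bytearray(MAP_SIZE)
host_trace = [0x10, 0x20, 0x30]
print(has_new_coverage(seen, merge_coverage(host_trace, [0x1000, 0x1010])))
print(has_new_coverage(seen, merge_coverage(host_trace, [0x1000, 0x1010])))
print(has_new_coverage(seen, merge_coverage(host_trace, [0x1000, 0x1020])))
```

In this model, a fuzzer that only sees host edges would treat the third input as uninteresting, because its host path is unchanged. With the device edges folded into the same map, the input is kept for further mutation.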
The artifact is publicly available on Zenodo [44], comprising source code, usage instructions, and evaluation scripts for replicating key experiments outlined in this paper. Acknowledgments are extended to reviewers for their insightful feedback and individuals who contributed towards addressing HeCBench bug reports uncovered by cuFuzz as well as handling reported bugs within NVIDIA's CUDA-accelerated libraries. Valuable technical discussions were also facilitated by Aamer Jaleel, Mark Stephenson, Sana Damani, and members of the Architecture Research Group at NVIDIA Research.

Layman's Summary

Summary
- Graphics processing units (GPUs) are an important part of modern computer programs.
- Programs for GPUs can have memory errors and bugs that happen when many things run at the same time.
- Fuzz-testing combined with error checkers can help find these problems in GPU programs.
- A tool called cuFuzz finds such bugs in programs written with CUDA, NVIDIA's platform for programming GPUs.
- cuFuzz found 43 previously unknown bugs in 14 CUDA programs, such as illegal memory accesses and data races.

Definitions
- GPUs: Graphics Processing Units, special computer parts that help draw images and run programs faster.
- Bugs: Mistakes or problems in computer programs that need to be fixed.
- Fuzz-testing: Automatically feeding a program many varied inputs to see whether any of them cause unexpected results or errors.
- Concurrency: Doing multiple things at the same time in a program.
- Kernel-level fuzzing: Testing a single GPU kernel (a function that runs on the GPU) in isolation, rather than the whole program that launches it.

Blog article

In recent years, GPUs have become an integral part of modern software. Their ability to run highly parallel computations over large amounts of data has made them essential in areas such as gaming, artificial intelligence, and scientific research. However, the heterogeneous host-device execution model and expanding software stacks also make GPU programs susceptible to memory-safety and concurrency bugs that are difficult to detect through static analysis.

To address this, the paper introduces cuFuzz, a CUDA-oriented fuzzer. It targets three obstacles encountered in prior GPU fuzzing efforts: kernel-level fuzzing that leads to false positives, the absence of device-side coverage-guided feedback, and incompatibility between coverage and sanitization tools.

The first obstacle is false positives caused by kernel-level fuzzing. Previous approaches fuzzed device-side kernels independently, which can report bugs that are unreachable in practice, for example when a kernel is fed arguments that the host code would never pass to it. cuFuzz instead fuzzes the whole program, so host- and device-side code are exercised together and kernels only see inputs the host can actually produce.

The second obstacle is the lack of coverage feedback from the device. cuFuzz uses NVBit to instrument device-side instructions and merges the resulting device coverage with compiler-based host coverage, so the fuzzer receives feedback from both sides of the program.

The third obstacle is incompatibility between coverage tools and sanitization tools. cuFuzz resolves this by decoupling sanitization from coverage collection: host- and device-side sanitizers run in separate processes. The paper also quantifies the execution time overhead of each cuFuzz component and adds persistent-mode support to improve fuzzing throughput.

In the evaluation, cuFuzz discovered 43 previously unknown bugs across 14 CUDA programs, including 19 in commercial libraries. The bugs include illegal memory accesses, uninitialized reads, and data races. cuFuzz also discovered significantly more edges and unique inputs than baseline approaches, particularly on closed-source targets.

The artifact is publicly available on Zenodo and includes source code, usage instructions, and evaluation scripts for replicating the key experiments in the paper. The authors also acknowledge valuable technical discussions and the individuals who helped address the bug reports uncovered by cuFuzz.

Overall, cuFuzz is a useful addition to the GPU testing toolbox. Its combination of whole-program fuzzing, merged host and device coverage feedback, and decoupled sanitization makes it effective at detecting memory-safety and concurrency bugs in GPU programs.
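The false-positive argument for whole-program fuzzing can be illustrated with a small model. The sketch below is only a hedged illustration in plain Python, not CUDA and not cuFuzz code: `scale_kernel`, `host_main`, `fuzz_kernel_only`, and `IllegalAccess` are all hypothetical stand-ins for a device kernel, the host program that launches it, a kernel-level harness, and a sanitizer report.

```python
class IllegalAccess(Exception):
    """Models an out-of-bounds access reported by a device-side checker."""


def scale_kernel(buf: list, n: int, factor: int) -> None:
    """Model of a kernel that trusts its caller and touches buf[0..n-1]."""
    for tid in range(n):
        if tid >= len(buf):
            raise IllegalAccess(f"thread {tid} accessed past {len(buf)} elements")
        buf[tid] *= factor


def host_main(data: bytes) -> list:
    """Model of the host program: n is always derived from the allocation."""
    if not data:
        return []
    factor = data[0]
    buf = list(data[1:])
    scale_kernel(buf, len(buf), factor)
    return buf


def fuzz_kernel_only(data: bytes) -> None:
    """Kernel-level harness: the fuzzer chooses n independently of the buffer."""
    if len(data) < 2:
        return
    n, factor = data[0], data[1]
    scale_kernel(list(data[2:]), n, factor)


sample = bytes([5, 2, 1, 2])

# Whole-program run: the host keeps n consistent, so no error is reported.
print(host_main(sample))

# Kernel-only run: n=5 with a 2-element buffer triggers a report that the
# real program can never reach.
try:
    fuzz_kernel_only(sample)
except IllegalAccess as err:
    print("kernel-level report:", err)
```

In this model the kernel-only harness reports an illegal access for an argument combination that `host_main` never constructs, which is the kind of false positive that fuzzing the whole program avoids.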

Created on 30 Mar. 2026
