BaCO: A Fast and Portable Bayesian Compiler Optimization Framework

AI-generated keywords: Bayesian Compiler Optimization Autotuning Performance Versatile Efficiency

Authors: Erik Hellsten, Artur Souza, Johannes Lenfers, Rubens Lacouture, Olivia Hsu, Adel Ejjeh, Fredrik Kjolstad, Michel Steuwer, Kunle Olukotun, Luigi Nardi

arXiv: 2212.11142v2 (cs.PL)

License: CC BY 4.0

Abstract: We introduce the Bayesian Compiler Optimization framework (BaCO), a general purpose autotuner for modern compilers targeting CPUs, GPUs, and FPGAs. BaCO provides the flexibility needed to handle the requirements of modern autotuning tasks. Particularly, it deals with permutation, ordered, and continuous parameter types along with both known and unknown parameter constraints. To reason about these parameter types and efficiently deliver high-quality code, BaCO uses Bayesian optimization algorithms specialized towards the autotuning domain. We demonstrate BaCO's effectiveness on three modern compiler systems: TACO, RISE & ELEVATE, and HPVM2FPGA for CPUs, GPUs, and FPGAs respectively. For these domains, BaCO outperforms current state-of-the-art autotuners by delivering on average 1.36x-1.56x faster code with a tiny search budget, and BaCO is able to reach expert-level performance 2.9x-3.9x faster.

Submitted to arXiv on 01 Dec. 2022

Results of the summarizing process for the arXiv paper: 2212.11142v2

Comprehensive Summary

The Bayesian Compiler Optimization framework (BaCO) is a versatile autotuner designed for modern compilers that target CPUs, GPUs, and FPGAs. BaCO handles the requirements of modern autotuning tasks by supporting permutation, ordered, and continuous parameter types along with both known and unknown parameter constraints. To deliver high-quality code efficiently, BaCO uses Bayesian optimization algorithms specialized for the autotuning domain.

In this study, we demonstrate the effectiveness of BaCO on three real-world compiler systems: TACO, RISE & ELEVATE, and HPVM2FPGA, for CPUs, GPUs, and FPGAs respectively. Our evaluation shows that BaCO outperforms current state-of-the-art autotuners by delivering on average 1.36x-1.56x faster code with a tiny search budget. Additionally, BaCO can reach expert-level performance 2.9x-3.9x faster than other methods. We validate the efficiency, effectiveness, and generalizability of BaCO through extensive empirical results on all the frameworks and benchmarks in Appendix A.

We answer four research questions to evaluate the performance of BaCO:

RQ1: Does BaCO achieve high performance with a limited autotuning budget?
RQ2: Can BaCO handle different types of parameters effectively across benchmarks from different domains, such as image processing or machine learning applications?
RQ3: Is the method portable across different hardware platforms, such as CPUs or GPUs, without requiring any changes to its architecture or algorithmic design?
RQ4: What are the findings of a qualitative user study with domain experts?

Our results show that, even with a small budget of 20-40 evaluations depending on benchmark complexity, BaCO achieves significantly better performance than state-of-the-art baselines. It performs well across all domains tested, it is highly portable and can deliver expert-level performance across different hardware platforms, and it is feasible and effective for real-world applications because it can deliver high-quality code with minimal effort from the user. In conclusion, BaCO is a fast and portable Bayesian compiler optimization framework that outperforms state-of-the-art autotuners in delivering high-quality code across various domains and hardware platforms.
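
To make the terminology concrete, the following is a minimal Python sketch of an autotuning search space with a permutation parameter, ordered parameters, and a continuous parameter, plus one known constraint (checked before evaluation) and one unknown constraint (discovered only when an evaluation fails). Everything here is an illustrative assumption: the parameter names, the constraint, and the toy cost function are hypothetical and are not taken from BaCO or from any of the compilers above. The search strategy is plain random sampling, used as a simple stand-in; it is not the Bayesian optimization method the paper describes.

```python
import math
import random
from itertools import permutations

# Hypothetical search space (illustrative names, not BaCO's API).
LOOP_ORDERS = list(permutations(("i", "j", "k")))  # permutation parameter
TILE_SIZES = [2, 4, 8, 16, 32, 64]                 # ordered parameter
THREADS = [1, 2, 4, 8]                             # ordered parameter
ALPHA_RANGE = (0.0, 1.0)                           # continuous parameter


def sample_config(rng):
    """Draw one configuration uniformly at random from the space."""
    return {
        "loop_order": rng.choice(LOOP_ORDERS),
        "tile": rng.choice(TILE_SIZES),
        "threads": rng.choice(THREADS),
        "alpha": rng.uniform(*ALPHA_RANGE),
    }


def satisfies_known_constraints(cfg):
    """Known constraint (assumed): can be checked without running anything."""
    return cfg["tile"] * cfg["threads"] <= 128


def evaluate(cfg):
    """Toy stand-in for compile-and-run.

    Returns a synthetic runtime, or None when the configuration hits an
    unknown constraint (for example, a build that fails).
    """
    if cfg["tile"] == 64 and cfg["loop_order"][0] == "k":
        return None  # unknown constraint: only visible after evaluating
    order_penalty = 0.1 * LOOP_ORDERS.index(cfg["loop_order"])
    tile_penalty = 0.2 * abs(math.log2(cfg["tile"]) - 4)
    alpha_penalty = (cfg["alpha"] - 0.3) ** 2
    return 1.0 + order_penalty + tile_penalty + alpha_penalty + 1.0 / cfg["threads"]


def tune(budget, seed=0):
    """Random-search baseline limited to `budget` evaluations."""
    rng = random.Random(seed)
    best_cfg, best_time = None, math.inf
    evaluations = failures = 0
    while evaluations < budget:
        cfg = sample_config(rng)
        if not satisfies_known_constraints(cfg):
            continue  # rejected without spending any of the budget
        evaluations += 1
        runtime = evaluate(cfg)
        if runtime is None:
            failures += 1  # budget is spent even though nothing was measured
            continue
        if runtime < best_time:
            best_cfg, best_time = cfg, runtime
    return best_cfg, best_time, failures


if __name__ == "__main__":
    cfg, runtime, failures = tune(budget=40)
    print("best configuration:", cfg)
    print("best synthetic runtime:", runtime)
    print("failed evaluations:", failures)
```

The sketch separates the two kinds of constraints by cost: a known constraint filters candidates for free, while an unknown constraint consumes an evaluation from the budget. A model-based tuner would replace `sample_config` with a step that proposes the next configuration from the results observed so far.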

Created on 29 May. 2023
