Matching Linear Algebra and Tensor Code to Specialized Hardware Accelerators
AI-generated Key Points
⚠ The license of the paper does not allow us to build upon its content, so the key points are generated from the paper metadata rather than the full article.
- The paper starts from the observation that dedicated tensor accelerators demonstrate the importance of linear algebra in modern applications.
- Dedicated tensor accelerators have the potential for impressive performance gains, but adoption is hindered because programmers must rewrite code using vendor APIs.
- Existing approaches to matching and replacing patterns within code are often fragile and fail to cope with the diversity of real-world codes.
- The authors develop ATC, a compiler that uses program synthesis to map regions of code to specific APIs.
- ATC explores a combinatorially large mapping space, requiring the development of program classification, dynamic analysis, variable constraint generation, and lexical distance matching techniques to make it tractable.
- The authors apply ATC to real-world tensor and linear algebra codes and evaluate them against four state-of-the-art approaches.
- ATC accelerates between 2.6x and 7x more programs than existing methods, leading to over an order of magnitude performance improvement.
- The takeaway is that program synthesis can map existing code to accelerator APIs without extensive manual rewriting.
Authors: Pablo Antonio Martínez, Jackson Woodruff, Jordi Armengol-Estapé, Gregorio Bernabé, José Manuel García, Michael F. P. O'Boyle
Abstract: Dedicated tensor accelerators demonstrate the importance of linear algebra in modern applications. Such accelerators have the potential for impressive performance gains, but require programmers to rewrite code using vendor APIs - a barrier to wider scale adoption. Recent work overcomes this by matching and replacing patterns within code, but such approaches are fragile and fail to cope with the diversity of real-world codes. We develop ATC, a compiler that uses program synthesis to map regions of code to specific APIs. The mapping space that ATC explores is combinatorially large, requiring the development of program classification, dynamic analysis, variable constraint generation and lexical distance matching techniques to make it tractable. We apply ATC to real-world tensor and linear algebra codes and evaluate them against four state-of-the-art approaches. We accelerate between 2.6x and 7x more programs, leading to over an order of magnitude performance improvement.
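The abstract names the techniques ATC uses but does not describe how they work, so the following is only a toy illustration of the general idea of mapping a region of code to an API by searching over candidate bindings and checking them by running both versions on the same inputs. It is not ATC's algorithm. Every name here (`user_kernel`, `accel_gemm`, `synthesize_binding`) is hypothetical, and `accel_gemm` is a plain-Python stand-in for a vendor GEMM call, not a real library function.

```python
import itertools
import random


def user_kernel(out, x, y, p, q, r):
    """Hypothetical user code: out[p x r] = x[p x q] * y[q x r], flat row-major lists."""
    for i in range(p):
        for j in range(r):
            acc = 0
            for t in range(q):
                acc += x[i * q + t] * y[t * r + j]
            out[i * r + j] = acc


def accel_gemm(m, n, k, a, b, c):
    """Hypothetical stand-in for a vendor GEMM: c[m x n] = a[m x k] * b[k x n]."""
    for i in range(m):
        for j in range(n):
            c[i * n + j] = sum(a[i * k + t] * b[t * n + j] for t in range(k))


def _agrees(arr_perm, int_perm, rng):
    """Run user code and one candidate API binding on the same random input."""
    p, q, r = 2, 3, 4  # distinct sizes so wrong dimension bindings are observable
    env = {
        "p": p, "q": q, "r": r,
        "x": [rng.randint(1, 9) for _ in range(p * q)],
        "y": [rng.randint(1, 9) for _ in range(q * r)],
        "out": [0] * (p * r),
    }
    expected = list(env["out"])
    user_kernel(expected, env["x"], env["y"], p, q, r)

    # Work on copies so a wrong binding cannot corrupt the reference inputs.
    trial = {name: (list(v) if isinstance(v, list) else v) for name, v in env.items()}
    try:
        accel_gemm(trial[int_perm[0]], trial[int_perm[1]], trial[int_perm[2]],
                   trial[arr_perm[0]], trial[arr_perm[1]], trial[arr_perm[2]])
    except IndexError:
        return False  # binding reads or writes out of bounds, so reject it
    return (trial["out"] == expected
            and trial["x"] == env["x"]
            and trial["y"] == env["y"])


def synthesize_binding(trials=5, seed=0):
    """Enumerate every argument binding and keep those that pass all I/O tests."""
    rng = random.Random(seed)
    survivors = []
    for arr_perm in itertools.permutations(("out", "x", "y")):
        for int_perm in itertools.permutations(("p", "q", "r")):
            if all(_agrees(arr_perm, int_perm, rng) for _ in range(trials)):
                survivors.append({
                    "m": int_perm[0], "n": int_perm[1], "k": int_perm[2],
                    "a": arr_perm[0], "b": arr_perm[1], "c": arr_perm[2],
                })
    return survivors


if __name__ == "__main__":
    print(synthesize_binding())
```

Even this tiny example has 36 candidate bindings for a single API with six parameters, which hints at why the abstract calls the real mapping space combinatorially large and why pruning techniques are needed to make the search tractable.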