Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning

AI-generated keywords: Alpa, Deep Learning, Model-Parallel Training, Inter-Operator Parallelism, Intra-Operator Parallelism

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Alpa is a system that automates the model-parallel training of large deep learning (DL) models
  • Alpa generates execution plans that unify data, operator, and pipeline parallelism
  • Alpa views parallelisms as two hierarchical levels: inter-operator and intra-operator parallelisms
  • Alpa constructs a new hierarchical space for massive model-parallel execution plans
  • Alpa designs a number of compilation passes to automatically derive the optimal parallel execution plan in each independent parallelism level and implements an efficient runtime to orchestrate the two-level parallel execution on distributed compute devices
  • The evaluation shows that Alpa generates parallelization plans that match or outperform hand-tuned model-parallel training systems even on models they are designed for.
  • Unlike specialized systems, Alpa generalizes to models with heterogeneous architectures and models without manually designed plans.
  • The authors of Alpa are Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, Yanping Huang, Yida Wang, Yuanzhong Xu, Danyang Zhuo, Joseph E. Gonzalez, and Ion Stoica.
  • Overall, Alpa is a promising system that has the potential to significantly improve the efficiency and scalability of deep learning training on distributed compute devices.

Authors: Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, Yanping Huang, Yida Wang, Yuanzhong Xu, Danyang Zhuo, Joseph E. Gonzalez, Ion Stoica

Abstract: Alpa automates model-parallel training of large deep learning (DL) models by generating execution plans that unify data, operator, and pipeline parallelism. Existing model-parallel training systems either require users to manually create a parallelization plan or automatically generate one from a limited space of model parallelism configurations, which does not suffice to scale out complex DL models on distributed compute devices. Alpa distributes the training of large DL models by viewing parallelisms as two hierarchical levels: inter-operator and intra-operator parallelisms. Based on it, Alpa constructs a new hierarchical space for massive model-parallel execution plans. Alpa designs a number of compilation passes to automatically derive the optimal parallel execution plan in each independent parallelism level and implements an efficient runtime to orchestrate the two-level parallel execution on distributed compute devices. Our evaluation shows Alpa generates parallelization plans that match or outperform hand-tuned model-parallel training systems even on models they are designed for. Unlike specialized systems, Alpa also generalizes to models with heterogeneous architectures and models without manually-designed plans.
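The two levels named in the abstract can be illustrated with a toy sketch. This is not Alpa's API or its algorithm: the "devices" are plain Python lists, and every function name below (`matmul`, `split_columns`, `intra_op_matmul`, `inter_op_forward`) is a hypothetical helper invented for this example. It only shows the difference between splitting one operator across devices (intra-operator) and assigning whole operators to different stages (inter-operator).

```python
# Toy illustration only: "devices" are plain Python lists, not real accelerators.
# All helper names are hypothetical and are not part of Alpa.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def split_columns(m, parts):
    """Split a matrix into `parts` column blocks of equal width."""
    width = len(m[0]) // parts
    return [[row[i * width:(i + 1) * width] for row in m] for i in range(parts)]

def intra_op_matmul(a, b, num_devices):
    """Intra-operator parallelism: one matmul, its weight sharded by columns."""
    shards = split_columns(b, num_devices)             # each device holds one shard
    partials = [matmul(a, shard) for shard in shards]  # would run concurrently
    # Concatenate the per-device column blocks back into the full result.
    return [sum((p[r] for p in partials), []) for r in range(len(a))]

def inter_op_forward(x, weights, num_stages):
    """Inter-operator parallelism: whole layers assigned to pipeline stages."""
    per_stage = len(weights) // num_stages
    stages = [weights[i * per_stage:(i + 1) * per_stage] for i in range(num_stages)]
    for stage in stages:   # each stage would live on its own device group
        for w in stage:
            x = matmul(x, w)
    return x

x = [[1, 2], [3, 4]]
w1 = [[1, 0, 2, 1], [0, 1, 1, 3]]        # 2x4 weight of the first layer
w2 = [[1, 0], [0, 1], [1, 1], [2, 0]]    # 4x2 weight of the second layer

sharded = intra_op_matmul(x, w1, num_devices=2)
pipelined = inter_op_forward(x, [w1, w2], num_stages=2)
```

In this sketch the sharded result equals the unsharded product, and the staged forward pass equals applying both layers in sequence; a real system would additionally have to account for communication between devices, which this toy omits.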

Submitted to arXiv on 28 Jan. 2022


Results of the summarizing process for the arXiv paper: 2201.12023v1

This paper's license doesn't allow us to build upon its content and the summarizing process is here made with the paper's metadata rather than the article.

Alpa is a system that automates the model-parallel training of large deep learning (DL) models by generating execution plans that unify data, operator, and pipeline parallelism. Existing model-parallel training systems either require users to manually create a parallelization plan or automatically generate one from a limited space of model parallelism configurations. Alpa instead distributes the training of large DL models by viewing parallelisms as two hierarchical levels: inter-operator and intra-operator parallelisms. This approach allows Alpa to construct a new hierarchical space for massive model-parallel execution plans. To search this space, Alpa designs a number of compilation passes that automatically derive the optimal parallel execution plan in each independent parallelism level, and it implements an efficient runtime to orchestrate the two-level parallel execution on distributed compute devices. The evaluation shows that Alpa generates parallelization plans that match or outperform hand-tuned model-parallel training systems, even on models they are designed for. Moreover, unlike specialized systems, Alpa generalizes to models with heterogeneous architectures and models without manually designed plans. The authors of Alpa are Lianmin Zheng, Zhuohan Li, Hao Zhang, Yonghao Zhuang, Zhifeng Chen, Yanping Huang, Yida Wang, Yuanzhong Xu, Danyang Zhuo, Joseph E. Gonzalez, and Ion Stoica. Their paper, titled "Alpa: Automating Inter- and Intra-Operator Parallelism for Distributed Deep Learning", addresses the problem of scaling out complex DL models on distributed compute devices by automating inter- and intra-operator parallelism. Overall, Alpa has the potential to improve the efficiency and scalability of deep learning training on distributed compute devices.
Created on 29 Apr. 2023
