Federated Fine-tuning of Billion-Sized Language Models across Mobile Devices

AI-generated keywords: Mobile Intelligence

AI-generated Key Points

  • Large Language Models (LLMs) are transforming mobile intelligence
  • Federated Learning (FL) is often used to fine-tune LLMs for downstream mobile tasks, an approach known as FedLLM
  • FedLLM preserves user data privacy but suffers from high memory consumption and slow convergence on mobile devices
  • FwdLLM is an FL protocol that uses backpropagation (BP)-free training, so devices only run "perturbed inferences", improving memory and time efficiency
  • FwdLLM combines BP-free training with parameter-efficient methods, adaptive allocation of computational load across devices, and selective sampling of the perturbed predictions most valuable to convergence
  • Reported benefits include up to three orders of magnitude faster convergence, a 14.6x smaller memory footprint, and federated learning of billion-parameter LLMs on commercial off-the-shelf (COTS) mobile devices

Authors: Mengwei Xu, Yaozong Wu, Dongqi Cai, Xiang Li, Shangguang Wang

under review
License: CC BY 4.0

Abstract: Large Language Models (LLMs) are transforming the landscape of mobile intelligence. Federated Learning (FL), a method to preserve user data privacy, is often employed in fine-tuning LLMs to downstream mobile tasks, an approach known as FedLLM. Though recent efforts have addressed the network issue induced by the vast model size, they have not practically mitigated vital challenges concerning integration with mobile devices, such as significant memory consumption and sluggish model convergence. In response to these challenges, this work introduces FwdLLM, an innovative FL protocol designed to enhance FedLLM efficiency. The key idea of FwdLLM is to employ backpropagation (BP)-free training methods, requiring devices only to execute "perturbed inferences". Consequently, FwdLLM delivers substantially better memory efficiency and time efficiency (expedited by mobile NPUs and an expanded array of participant devices). FwdLLM centers around three key designs: (1) it combines BP-free training with parameter-efficient training methods, an essential way to scale the approach to the LLM era; (2) it systematically and adaptively allocates computational loads across devices, striking a careful balance between convergence speed and accuracy; (3) it discriminatively samples perturbed predictions that are more valuable to model convergence. Comprehensive experiments with five LLMs and three NLP tasks illustrate FwdLLM's significant advantages over conventional methods, including up to three orders of magnitude faster convergence and a 14.6x reduction in memory footprint. Uniquely, FwdLLM paves the way for federated learning of billion-parameter LLMs such as LLaMA on COTS mobile devices, a feat previously unattained.
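
To make the "perturbed inferences" idea concrete, here is a minimal sketch of one common BP-free estimator: the loss is evaluated at randomly perturbed copies of the weights, and the resulting slopes are averaged into a gradient estimate. The abstract does not specify which estimator FwdLLM uses, so the central-difference form, the toy quadratic loss, and every name below (TARGET, loss, perturbed_gradient, train) are assumptions made for illustration only.

```python
import random

# Hypothetical toy problem: the "model" has three weights and the loss is
# the squared distance to TARGET. It stands in for a real forward pass.
TARGET = [1.0, -2.0, 3.0]

def loss(w):
    """Forward pass only: no gradients are computed or stored."""
    return sum((wi - ti) ** 2 for wi, ti in zip(w, TARGET))

def perturbed_gradient(loss_fn, w, num_perturbations, rng, eps=1e-3):
    """Estimate the gradient of loss_fn at w from perturbed forward passes."""
    grad = [0.0] * len(w)
    for _ in range(num_perturbations):
        # Random Gaussian direction with the same shape as the weights.
        v = [rng.gauss(0.0, 1.0) for _ in w]
        plus = loss_fn([wi + eps * vi for wi, vi in zip(w, v)])
        minus = loss_fn([wi - eps * vi for wi, vi in zip(w, v)])
        # Central difference approximates the directional derivative along v.
        slope = (plus - minus) / (2.0 * eps)
        # Averaging slope * v over many directions approximates the gradient.
        for i, vi in enumerate(v):
            grad[i] += slope * vi / num_perturbations
    return grad

def train(steps=300, lr=0.02, num_perturbations=16, seed=0):
    """Plain gradient descent driven by the forward-only estimate."""
    rng = random.Random(seed)
    w = [0.0] * len(TARGET)
    for _ in range(steps):
        g = perturbed_gradient(loss, w, num_perturbations, rng)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w
```

Because each step needs only forward evaluations, no activations have to be kept for a backward pass, which is the intuition behind the memory savings the abstract reports. The estimate is noisy, so more perturbations per step trade extra inference work for lower variance.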

Submitted to arXiv on 26 Aug. 2023


Results of the summarizing process for the arXiv paper: 2308.13894v1

Large Language Models (LLMs) are transforming mobile intelligence, and Federated Learning (FL) is often used to fine-tune these models for downstream mobile tasks. This approach, known as FedLLM, preserves user data privacy. However, integrating such large models with mobile devices remains difficult because of high memory consumption and slow model convergence. To address these issues, the authors introduce FwdLLM, an FL protocol built on backpropagation (BP)-free training methods that require devices only to execute "perturbed inferences". This improves both memory efficiency and time efficiency, the latter aided by mobile NPUs and an expanded array of participant devices. FwdLLM rests on three key designs: combining BP-free training with parameter-efficient methods to scale to the LLM era; systematically and adaptively allocating computational loads across devices to balance convergence speed and accuracy; and selectively sampling the perturbed predictions that are more valuable to model convergence. Experiments with five LLMs and three NLP tasks show advantages over conventional methods, including up to three orders of magnitude faster convergence and a 14.6x reduction in memory footprint. FwdLLM also makes federated learning of billion-parameter LLMs such as LLaMA feasible on commercial off-the-shelf (COTS) mobile devices, which the authors describe as previously unattained. The research was conducted by Mengwei Xu, Yaozong Wu, Dongqi Cai, Xiang Li, and Shangguang Wang.
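
As a rough illustration of how perturbed inferences could be spread over many participant devices, the sketch below has each client evaluate a few seeded perturbations on its own data and return one scalar per seed, while the server regenerates the directions from the seeds and averages them into an update. This is one possible organization under stated assumptions, not the protocol from the paper: the seed scheme, the tiny linear model, and all function names (make_clients, direction, client_update, server_round, run) are hypothetical.

```python
import random

def make_clients(num_clients=4, points=10):
    """Hypothetical non-IID data: each client sees a different x range of y = 2x + 1."""
    clients = []
    for c in range(num_clients):
        lo = -1.0 + 2.0 * c / num_clients
        xs = [lo + (2.0 / num_clients) * j / points for j in range(points)]
        clients.append([(x, 2.0 * x + 1.0) for x in xs])
    return clients

def local_loss(w, data):
    """Mean squared error of the model y = w[0] * x + w[1] on one client's data."""
    return sum((w[0] * x + w[1] - y) ** 2 for x, y in data) / len(data)

def direction(seed, dim):
    """Regenerate a perturbation direction from a seed shared by server and client."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def client_update(w, data, seeds, eps=1e-3):
    """Client side: forward passes only; returns one scalar slope per seed."""
    slopes = []
    for s in seeds:
        v = direction(s, len(w))
        plus = local_loss([wi + eps * vi for wi, vi in zip(w, v)], data)
        minus = local_loss([wi - eps * vi for wi, vi in zip(w, v)], data)
        slopes.append((plus - minus) / (2.0 * eps))
    return slopes

def server_round(w, clients, round_id, seeds_per_client=8, lr=0.05):
    """Server side: assign seeds, collect scalars, rebuild directions, update w."""
    grad = [0.0] * len(w)
    total = len(clients) * seeds_per_client
    for cid, data in enumerate(clients):
        # Distinct seeds per round and client (valid while counts stay below 100).
        seeds = [round_id * 10_000 + cid * 100 + k for k in range(seeds_per_client)]
        for s, slope in zip(seeds, client_update(w, data, seeds)):
            v = direction(s, len(w))
            for i, vi in enumerate(v):
                grad[i] += slope * vi / total
    return [wi - lr * gi for wi, gi in zip(w, grad)]

def run(rounds=400):
    clients = make_clients()
    w = [0.0, 0.0]
    for r in range(rounds):
        w = server_round(w, clients, r)
    return w
```

In this sketch, adding clients increases the number of perturbations averaged per round without increasing the work on any single device, which is one way to read the summary's point about an expanded array of participants. The number of seeds per client is fixed here; an adaptive allocation like the one the summary describes would vary it.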
Created on 21 Apr. 2024

