An Empirical Study of Instruction-tuning Large Language Models in Chinese

AI-generated keywords: Empirical Study

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Authors Qingyi Si, Tong Wang, Zheng Lin, Xu Zhang, Yanan Cao, and Weiping Wang conduct an empirical study on instruction-tuning Large Language Models (LLMs) in Chinese.
  • The study aims to provide insights for customizing LLMs to effectively respond to Chinese instructions.
  • The study systematically explores the three elements the authors consider most important for instruction-tuning: LLM bases, parameter-efficient methods, and instruction data types.
  • Experiments analyze the impact of factors like chain-of-thought data and human-value alignment on instruction-tuning.
  • The authors hope the findings will contribute to an open Chinese version of ChatGPT, and they plan to release a Chinese LLM comparable to ChatGLM.

Authors: Qingyi Si, Tong Wang, Zheng Lin, Xu Zhang, Yanan Cao, Weiping Wang

EMNLP 2023

Abstract: The success of ChatGPT validates the potential of large language models (LLMs) in artificial general intelligence (AGI). Subsequently, the release of LLMs has sparked the open-source community's interest in instruction-tuning, which is deemed to accelerate ChatGPT's replication process. However, research on instruction-tuning LLMs in Chinese, the world's most spoken language, is still in its early stages. Therefore, this paper makes an in-depth empirical study of instruction-tuning LLMs in Chinese, which can serve as a cookbook that provides valuable findings for effectively customizing LLMs that can better respond to Chinese instructions. Specifically, we systematically explore the impact of LLM bases, parameter-efficient methods, and instruction data types, which are the three most important elements for instruction-tuning. In addition, we conduct experiments to study the impact of other factors, e.g., chain-of-thought data and human-value alignment. We hope that this empirical study can make a modest contribution to the open Chinese version of ChatGPT. This paper will release a powerful Chinese LLM that is comparable to ChatGLM. The code and data are available at https://github.com/PhoebusSi/Alpaca-CoT.

Submitted to arXiv on 11 Oct. 2023

Results of the summarizing process for the arXiv paper: 2310.07328v2

In their paper titled "An Empirical Study of Instruction-tuning Large Language Models in Chinese," authors Qingyi Si, Tong Wang, Zheng Lin, Xu Zhang, Yanan Cao, and Weiping Wang examine large language models (LLMs) and their potential in artificial general intelligence (AGI). The success of ChatGPT has highlighted the significance of LLMs, prompting interest in instruction-tuning within the open-source community as a way to accelerate ChatGPT's replication. Despite this progress, research on instruction-tuning LLMs in Chinese, which the authors describe as the world's most spoken language, is still in its early stages. To address this gap, the authors conduct an in-depth empirical study of instruction-tuning LLMs in Chinese, intended to provide practical findings for customizing LLMs that respond better to Chinese instructions. The study systematically explores three elements: LLM bases, parameter-efficient methods, and instruction data types. Additional experiments analyze the impact of other factors, such as chain-of-thought data and human-value alignment. The authors hope the study will contribute to an open Chinese version of ChatGPT, and they state that they will release a Chinese LLM comparable to ChatGLM. The code and data are available at https://github.com/PhoebusSi/Alpaca-CoT for further exploration and replication. The investigation sheds light on optimizing LLMs for Chinese and may inform future work on large language models in other linguistic contexts.
Created on 17 Nov. 2024
