ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools

AI-generated keywords: ChatGLM, large language models, GLM-4 series, open-sourcing, democratizing

AI-generated Key Points

  • Introduction of the ChatGLM family of large language models, focusing on GLM-4 series (GLM-4, GLM-4-Air, and GLM-4-9B)
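  • Pre-training on ten trillion tokens, mostly in Chinese and English, plus a small corpus from 24 languages
  • Performance that rivals or exceeds GPT-4 on general benchmarks (MMLU, GSM8K, MATH, BBH, GPQA, HumanEval) and matches GPT-4 Turbo (128K) and Claude 3 on long-context tasks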
  • Autonomy in decision-making for using external tools in the GLM-4 All Tools model
  • Practical applications include web browsing, Python interpretation, accessing online information, and solving math problems
  • Open-sourcing of various models within the ChatGLM family, with over 10 million downloads on Hugging Face in 2023 alone
  • Commitment to promoting accessibility and safety of large language models by openly releasing model weights and techniques
  • Continuous refinement based on lessons learned from previous generations
  • Democratizing cutting-edge LLM technologies through open sourcing efforts to push boundaries towards teaching machines to think more like humans
  • Acknowledgments to contributors at Zhipu AI, Tsinghua University, collaborators, partners, and organizations supporting open-sourcing efforts
  • Represents a significant step forward in understanding and executing complex tasks autonomously in natural language processing

Authors: Team GLM: Aohan Zeng, Bin Xu, Bowen Wang, Chenhui Zhang, Da Yin, Diego Rojas, Guanyu Feng, Hanlin Zhao, Hanyu Lai, Hao Yu, Hongning Wang, Jiadai Sun, Jiajie Zhang, Jiale Cheng, Jiayi Gui, Jie Tang, Jing Zhang, Juanzi Li, Lei Zhao, Lindong Wu, Lucen Zhong, Mingdao Liu, Minlie Huang, Peng Zhang, Qinkai Zheng, Rui Lu, Shuaiqi Duan, Shudan Zhang, Shulin Cao, Shuxun Yang, Weng Lam Tam, Wenyi Zhao, Xiao Liu, Xiao Xia, Xiaohan Zhang, Xiaotao Gu, Xin Lv, Xinghan Liu, Xinyi Liu, Xinyue Yang, Xixuan Song, Xunkai Zhang, Yifan An, Yifan Xu, Yilin Niu, Yuantao Yang, Yueyan Li, Yushi Bai, Yuxiao Dong, Zehan Qi, Zhaoyu Wang, Zhen Yang, Zhengxiao Du, Zhenyu Hou, Zihan Wang

License: CC BY 4.0

Abstract: We introduce ChatGLM, an evolving family of large language models that we have been developing over time. This report primarily focuses on the GLM-4 language series, which includes GLM-4, GLM-4-Air, and GLM-4-9B. They represent our most capable models that are trained with all the insights and lessons gained from the preceding three generations of ChatGLM. To date, the GLM-4 models are pre-trained on ten trillion tokens mostly in Chinese and English, along with a small set of corpus from 24 languages, and aligned primarily for Chinese and English usage. The high-quality alignment is achieved via a multi-stage post-training process, which involves supervised fine-tuning and learning from human feedback. Evaluations show that GLM-4 1) closely rivals or outperforms GPT-4 in terms of general metrics such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, 2) gets close to GPT-4 Turbo in instruction following as measured by IFEval, 3) matches GPT-4 Turbo (128K) and Claude 3 for long context tasks, and 4) outperforms GPT-4 in Chinese alignments as measured by AlignBench. The GLM-4 All Tools model is further aligned to understand user intent and autonomously decide when and which tool(s) to use -- including web browser, Python interpreter, text-to-image model, and user-defined functions -- to effectively complete complex tasks. In practical applications, it matches and even surpasses GPT-4 All Tools in tasks like accessing online information via web browsing and solving math problems using a Python interpreter. Over the course, we have open-sourced a series of models, including ChatGLM-6B (three generations), GLM-4-9B (128K, 1M), GLM-4V-9B, WebGLM, and CodeGeeX, attracting over 10 million downloads on Hugging Face in the year 2023 alone. The open models can be accessed through https://github.com/THUDM and https://huggingface.co/THUDM.

Submitted to arXiv on 18 Jun. 2024

Results of the summarizing process for the arXiv paper: 2406.12793v1

This report introduces the ChatGLM family of large language models, focusing on the GLM-4 series: GLM-4, GLM-4-Air, and GLM-4-9B. The models are pre-trained on ten trillion tokens, mostly in Chinese and English, along with a small corpus from 24 languages. They are aligned through a multi-stage post-training process that combines supervised fine-tuning with learning from human feedback. In the reported evaluations, GLM-4 closely rivals or outperforms GPT-4 on general benchmarks such as MMLU, GSM8K, MATH, BBH, GPQA, and HumanEval, approaches GPT-4 Turbo in instruction following (IFEval), matches GPT-4 Turbo (128K) and Claude 3 on long-context tasks, and outperforms GPT-4 on Chinese alignment (AlignBench).

A key advance is the GLM-4 All Tools model, which is further aligned to understand user intent and to decide autonomously when and which tools to use, including a web browser, a Python interpreter, a text-to-image model, and user-defined functions. In practical applications such as accessing online information through web browsing and solving math problems with the Python interpreter, it matches or even surpasses GPT-4 All Tools.

The team has also open-sourced a series of models, including ChatGLM-6B (three generations), GLM-4-9B (128K, 1M), GLM-4V-9B, WebGLM, and CodeGeeX, which attracted over 10 million downloads on Hugging Face in 2023 alone. By openly releasing model weights and techniques, the team aims to promote the accessibility and safety of large language models (LLMs) and to democratize cutting-edge LLM technologies, while continuing to refine its models using lessons learned from previous generations. The stated long-term goal is to push model capabilities toward teaching machines to think more like humans.

The report acknowledges contributors to the ChatGLM models at Zhipu AI and Tsinghua University, along with collaborators and partners, and gives special thanks to individuals from various organizations who assisted the open-sourcing efforts. Overall, the ChatGLM family is presented as a significant step forward in understanding and autonomously executing complex tasks in natural language processing.
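
The autonomous tool selection attributed to GLM-4 All Tools in the summary can be pictured as a dispatch loop: decide whether a tool is needed, pick one, and call it. The sketch below is purely illustrative and is not the paper's method. The names `choose_tool`, `answer`, and `TOOLS` are hypothetical, the keyword rules stand in for a decision that the real model makes itself, and the browser is a stub that performs no network access.

```python
import ast
import operator
from typing import Callable, Dict, Optional, Tuple

# Arithmetic operators the toy "interpreter" tool supports.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def _eval(node: ast.AST) -> float:
    """Recursively evaluate a parsed arithmetic expression."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
    raise ValueError("unsupported expression")


def python_interpreter(expression: str) -> str:
    """Hypothetical stand-in for a sandboxed Python interpreter tool."""
    return str(_eval(ast.parse(expression, mode="eval").body))


def web_browser(query: str) -> str:
    """Hypothetical stand-in for a web browsing tool (stub, no network)."""
    return f"[stub search results for: {query}]"


# Hypothetical tool registry; user-defined functions could be added here.
TOOLS: Dict[str, Callable[[str], str]] = {
    "python": python_interpreter,
    "browser": web_browser,
}


def choose_tool(request: str) -> Optional[str]:
    """Assumed keyword heuristic; the real model makes this decision itself."""
    if request.lower().startswith(("search", "find", "look up")):
        return "browser"
    if any(ch.isdigit() for ch in request) and any(op in request for op in "+-*/"):
        return "python"
    return None  # No tool needed: answer directly.


def answer(request: str) -> Tuple[str, str]:
    """Return (tool name or 'none', result text) for a request."""
    tool = choose_tool(request)
    if tool is None:
        return ("none", f"[direct answer to: {request}]")
    return (tool, TOOLS[tool](request))


print(answer("12 * (3 + 4)"))
print(answer("search GLM-4 release date"))
print(answer("Tell me a joke"))
```

The registry keeps tool selection separate from tool execution, so replacing the keyword heuristic with a model-driven decision would leave the dispatch code unchanged.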
Created on 02 Jul. 2025
