LLaMA: Open and Efficient Foundation Language Models

AI-generated keywords: LLaMA, Language Models, Transformer Architecture, Benchmarks, Responsible AI

AI-generated Key Points

  • Hugo Touvron, Thibaut Lavril, Edouard Grave, Guillaume Lample, and their co-authors introduce the LLaMA collection of foundation language models
  • Models range from 7B to 65B parameters and are trained on trillions of tokens using publicly available datasets exclusively
  • State-of-the-art models can be trained without proprietary and inaccessible datasets
  • A smaller model trained for longer can ultimately be cheaper at inference
  • The focus is to train language models that achieve the best possible performance at various inference budgets by training on more tokens than what is typically used
  • LLaMA-13B outperforms GPT-3 (175B) on most benchmarks despite being more than ten times smaller, and LLaMA-65B is competitive with Chinchilla-70B and PaLM-540B
  • All their models are released to the research community and use only publicly available data sources for training
  • Relying only on public data makes the work compatible with open-sourcing and democratizes access to and study of LLMs
  • Modifications made to the transformer architecture (Vaswani et al., 2017) and their training method are presented
  • Performance of their models compared with other LLMs on a set of standard benchmarks is reported
  • Biases and toxicity encoded in their models using some of the latest responsible AI benchmarks are exposed

Authors: Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, Aurelien Rodriguez, Armand Joulin, Edouard Grave, Guillaume Lample

License: CC BY 4.0

Abstract: We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

Submitted to arXiv on 27 Feb. 2023

Results of the summarizing process for the arXiv paper: 2302.13971v1

In this work, Hugo Touvron, Thibaut Lavril, Edouard Grave, Guillaume Lample, and their co-authors introduce LLaMA, a collection of foundation language models. The models range from 7B to 65B parameters and are trained on trillions of tokens using publicly available datasets exclusively. The authors demonstrate that it is possible to train state-of-the-art models without resorting to proprietary and inaccessible datasets. They also argue that a smaller model trained for longer can ultimately be cheaper at inference. The focus of the work is therefore to train language models that achieve the best possible performance at various inference budgets by training on more tokens than is typically used.

The resulting models match or exceed much larger LLMs: LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, while LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. The authors release all their models to the research community. Unlike Chinchilla or PaLM, which rely on data that is either not publicly available or undocumented, the LLaMA models are trained only on publicly available data sources. This makes the work compatible with open-sourcing and democratizes access to and study of LLMs.

In addition to presenting an overview of their modifications to the transformer architecture (Vaswani et al., 2017) and their training method, the authors report the performance of their models against other LLMs on a set of standard benchmarks. They also expose some of the biases and toxicity encoded in their models using recent responsible AI benchmarks. Overall, the work demonstrates that highly performant language models can be trained on publicly available datasets, and it provides a valuable resource for researchers and practitioners in the field.
Created on 25 Mar. 2023
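
The claim that a smaller model can be cheaper at inference can be illustrated with a back-of-the-envelope calculation. The sketch below assumes the common rule of thumb that a dense decoder-only transformer spends roughly 2 × N floating-point operations per generated token, where N is the parameter count. That approximation and the function names are illustrative assumptions, not figures or code from the paper; only the two parameter counts come from the abstract.

```python
# Rough sketch: compare per-token inference compute of two dense models.
# ASSUMPTION (not from the paper): forward-pass cost is about 2 * N FLOPs
# per generated token, ignoring attention over context and memory bandwidth.

def inference_flops_per_token(n_params: float) -> float:
    """Approximate FLOPs needed to generate one token (hypothetical helper)."""
    return 2.0 * n_params


def relative_inference_cost(small_params: float, large_params: float) -> float:
    """Per-token cost of the small model as a fraction of the large model's."""
    return inference_flops_per_token(small_params) / inference_flops_per_token(large_params)


# Parameter counts as stated in the abstract.
LLAMA_13B = 13e9
GPT3_175B = 175e9

ratio = relative_inference_cost(LLAMA_13B, GPT3_175B)
print(f"LLaMA-13B per-token compute relative to GPT-3 (175B): {ratio:.3f}")
```

Under this assumption the per-token cost scales linearly with parameter count, so the extra tokens spent in training are a one-time cost while the savings recur on every generated token.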
