Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

AI-generated keywords: Phi-3 Technical Report, phi-3-mini, groundbreaking language model, training dataset

AI-generated Key Points

  • Phi-3-mini is a 3.8 billion parameter language model trained on a dataset of 3.3 trillion tokens
  • Phi-3-mini competes with leading models like Mixtral 8x7B and GPT-3.5, achieving scores of 69% on MMLU and 8.38 on MT-bench
  • The key innovation is the training dataset, a scaled-up version of the one used for phi-2, composed of heavily filtered web data and synthetic data; the model is further aligned for robustness, safety, and chat format
  • Two larger models, phi-3-small (7B) and phi-3-medium (14B), are each trained on 4.8 trillion tokens and score 75% and 78% on MMLU and 8.7 and 8.9 on MT-bench, respectively
  • The author team includes Marah Abdin, Sam Ade Jacobs, and Ammar Ahmad Awan, among others

Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Parul Chopra, Allie Del Giorno, Gustavo de Rosa, Matthew Dixon, Ronen Eldan, Dan Iter, Abhishek Goswami, Suriya Gunasekar, Emman Haider, Junheng Hao, Russell J. Hewett, Jamie Huynh, Mojan Javaheripi, Xin Jin, Piero Kauffmann, Nikos Karampatziakis, Dongwoo Kim, Mahoud Khademi, Lev Kurilenko, James R. Lee, Yin Tat Lee, Yuanzhi Li, Chen Liang, Weishung Liu, Eric Lin, Zeqi Lin, Piyush Madan, Arindam Mitra, Hardik Modi, Anh Nguyen, Brandon Norick, Barun Patra, Daniel Perez-Becker, Thomas Portet, Reid Pryzant, Heyang Qin, Marko Radmilac, Corby Rosset, Sambudha Roy, Olli Saarikivi, Amin Saied, Adil Salim, Michael Santacroce, Shital Shah, Ning Shang, Hiteshi Sharma, Xia Song, Olatunji Ruwase, Xin Wang, Rachel Ward, Guanhua Wang, Philipp Witte, Michael Wyatt, Can Xu, Jiahang Xu, Sonali Yadav, Fan Yang, Ziyi Yang, Donghan Yu, Chengruidong Zhang, Cyril Zhang, Jianwen Zhang, Li Lyna Zhang, Yi Zhang, Yunan Zhang, Xiren Zhou

12 pages
License: CC BY 4.0

Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset for training, a scaled-up version of the one used for phi-2, composed of heavily filtered web data and synthetic data. The model is also further aligned for robustness, safety, and chat format. We also provide some initial parameter-scaling results with 7B and 14B models trained for 4.8T tokens, called phi-3-small and phi-3-medium, both significantly more capable than phi-3-mini (e.g., respectively 75% and 78% on MMLU, and 8.7 and 8.9 on MT-bench).
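
To give a rough sense of what "small enough to be deployed on a phone" could mean, the following is a back-of-the-envelope sketch built only from the parameter and token counts quoted in the abstract. The bit widths (16, 8, 4) and the assumption that weight storage dominates memory are illustrative choices made here, not figures taken from the abstract, and the helper names are hypothetical.

```python
# Back-of-the-envelope arithmetic from the parameter and token counts
# quoted in the abstract. The precisions are illustrative assumptions,
# and runtime overhead (activations, KV cache) is ignored.

MODELS = {
    # name: (parameter count, training tokens)
    "phi-3-mini": (3.8e9, 3.3e12),
    "phi-3-small": (7e9, 4.8e12),
    "phi-3-medium": (14e9, 4.8e12),
}


def weight_gigabytes(params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in GB (10**9 bytes)."""
    return params * bits_per_weight / 8 / 1e9


def tokens_per_parameter(params: float, tokens: float) -> float:
    """Ratio of training tokens to model parameters."""
    return tokens / params


if __name__ == "__main__":
    for name, (params, tokens) in MODELS.items():
        sizes = ", ".join(
            f"{bits}-bit: {weight_gigabytes(params, bits):.1f} GB"
            for bits in (16, 8, 4)
        )
        ratio = tokens_per_parameter(params, tokens)
        print(f"{name}: {sizes}; about {ratio:.0f} tokens per parameter")
```

Under these assumptions, the weights of the 3.8 billion parameter model would occupy roughly 7.6 GB at 16 bits per weight and roughly 1.9 GB at 4 bits per weight. These are estimates for weight storage only, not measured on-device footprints.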

Submitted to arXiv on 22 Apr. 2024

Results of the summarizing process for the arXiv paper: 2404.14219v1

The Phi-3 Technical Report introduces phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens. Its overall performance, measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5: phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench while remaining small enough to be deployed on a phone. According to the authors, the innovation lies entirely in the training dataset, a scaled-up version of the one used for phi-2, composed of heavily filtered web data and synthetic data. The model is also further aligned for robustness, safety, and chat format. The report additionally presents initial parameter-scaling results for two larger models, phi-3-small (7B) and phi-3-medium (14B), each trained on 4.8 trillion tokens. Both are significantly more capable than phi-3-mini, scoring 75% and 78% on MMLU and 8.7 and 8.9 on MT-bench, respectively. The author team includes Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, and Hany Awadalla, among others. In conclusion, the report shows that a carefully curated training dataset can bring a model small enough to run on a phone close to the benchmark performance of models such as Mixtral 8x7B and GPT-3.5.
Created on 07 Sep. 2024
