Seed Diffusion: A Large-Scale Diffusion Language Model with High-Speed Inference

AI-generated keywords: Seed Diffusion Preview

AI-generated Key Points

  • Uses discrete-state diffusion for fast inference
  • Employs non-sequential, parallel generation to reduce the latency of token-by-token decoding
  • Builds on an approach recently demonstrated by Mercury Coder and Gemini Diffusion
  • Achieves an inference speed of 2,146 token/s on H20 GPUs
  • Maintains competitive results across standard code evaluation benchmarks
  • Reported as significantly faster than contemporary Mercury and Gemini Diffusion
  • Establishes a new state of the art on the speed-quality Pareto frontier for code models

Authors: Yuxuan Song, Zheng Zhang, Cheng Luo, Pengyang Gao, Fan Xia, Hao Luo, Zheng Li, Yuehang Yang, Hongli Yu, Xingwei Qu, Yuwei Fu, Jing Su, Ge Zhang, Wenhao Huang, Mingxuan Wang, Lin Yan, Xiaoying Jia, Jingjing Liu, Wei-Ying Ma, Ya-Qin Zhang, Yonghui Wu, Hao Zhou

Demo is available at https://studio.seed.ai/exp/seed_diffusion/; Project page is https://seed.bytedance.com/seed_diffusion
License: CC BY 4.0

Abstract: We present Seed Diffusion Preview, a large-scale language model based on discrete-state diffusion, offering remarkably fast inference speed. Thanks to non-sequential, parallel generation, discrete diffusion models provide a notable speedup to mitigate the inherent latency of token-by-token decoding, as demonstrated recently (e.g., Mercury Coder, Gemini Diffusion). Seed Diffusion Preview achieves an inference speed of 2,146 token/s over H20 GPUs while maintaining competitive performance across a sweep of standard code evaluation benchmarks, significantly faster than contemporary Mercury and Gemini Diffusion, establishing new state of the art on the speed-quality Pareto frontier for code models.
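
The contrast the abstract draws between token-by-token decoding and non-sequential, parallel generation can be illustrated with a toy sketch. This is not the Seed Diffusion Preview algorithm: `toy_model`, `TARGET`, and the fixed `tokens_per_step` schedule are hypothetical stand-ins chosen only to count how many sequential model calls each decoding style needs.

```python
from typing import List, Optional, Tuple

# Hypothetical target sequence; a real model would predict tokens, not look them up.
TARGET = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "a", "+", "b"]


def toy_model(tokens: List[Optional[str]]) -> List[str]:
    """Stand-in for one forward pass: returns a prediction for every position."""
    return list(TARGET[: len(tokens)])


def autoregressive_decode(length: int) -> Tuple[List[str], int]:
    """Token-by-token decoding: each model call commits exactly one new token."""
    out: List[str] = []
    calls = 0
    while len(out) < length:
        preds = toy_model(out + [None])  # None marks the position to fill
        calls += 1
        out.append(preds[len(out)])
    return out, calls


def parallel_decode(length: int, tokens_per_step: int) -> Tuple[List[Optional[str]], int]:
    """Parallel decoding: start fully masked, commit several positions per call."""
    seq: List[Optional[str]] = [None] * length  # None = still masked
    calls = 0
    while None in seq:
        preds = toy_model(seq)
        calls += 1
        masked = [i for i, tok in enumerate(seq) if tok is None]
        for i in masked[:tokens_per_step]:  # assumed fixed unmasking schedule
            seq[i] = preds[i]
    return seq, calls


ar_tokens, ar_calls = autoregressive_decode(len(TARGET))
par_tokens, par_calls = parallel_decode(len(TARGET), tokens_per_step=4)
print(ar_calls, par_calls)
```

In this toy setting both decoders recover the same sequence, but the parallel one needs fewer sequential model calls. Real speedups depend on the number of denoising steps, the cost of each step, and how much quality is retained when several tokens are committed at once, none of which this sketch models.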

Submitted to arXiv on 04 Aug. 2025

Results of the summarizing process for the arXiv paper: 2508.02193v1

Seed Diffusion Preview is a large-scale language model that uses discrete-state diffusion to achieve very fast inference. By employing non-sequential, parallel generation, it offers a significant speedup over traditional token-by-token decoding. This approach has recently been demonstrated in models such as Mercury Coder and Gemini Diffusion. Seed Diffusion Preview achieves an inference speed of 2,146 token/s on H20 GPUs while maintaining competitive results across standard code evaluation benchmarks. According to the abstract, this is significantly faster than contemporary Mercury and Gemini Diffusion, placing the model at the forefront of the speed-quality Pareto frontier for code models. The authors are Yuxuan Song, Zheng Zhang, Cheng Luo, Pengyang Gao, Fan Xia, Hao Luo, Zheng Li, Yuehang Yang, Hongli Yu, Xingwei Qu, Yuwei Fu, Jing Su, Ge Zhang, Wenhao Huang, Mingxuan Wang, Lin Yan, Xiaoying Jia, Jingjing Liu, Wei-Ying Ma, Ya-Qin Zhang, Yonghui Wu, and Hao Zhou. A demo of the model is available at https://studio.seed.ai/exp/seed_diffusion/, and more information about the project can be found at https://seed.bytedance.com/seed_diffusion.
Created on 22 Aug. 2025
