CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward

AI-generated keywords: CAD-Coder, Text-to-CAD Generation, Chain-of-Thought, Geometric Reward, Reinforcement Learning

AI-generated Key Points

⚠The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

The paper introduces CAD-Coder, a framework that reformulates text-to-CAD as the generation of CadQuery scripts.
The authors are Yandong Guan, Xilin Wang, Xingxi Ming, Jing Zhang, Dong Xu, and Qian Yu.
CadQuery is a Python-based, parametric CAD language, so the generated scripts allow direct geometric validation.
The representation also offers a richer modeling vocabulary and integrates with existing large language models (LLMs).
The authors propose a two-stage learning pipeline, supervised fine-tuning on paired text-CadQuery data followed by reinforcement learning with Group Reward Policy Optimization (GRPO), to improve code validity and geometric fidelity.
The reinforcement learning stage is guided by a CAD-specific reward that combines a geometric reward (Chamfer Distance) with a format reward.
The paper introduces a chain-of-thought (CoT) planning process to improve model reasoning.
The authors construct a large-scale dataset of 110K text-CadQuery-3D model triplets and 1.5K CoT samples using an automated pipeline.
Extensive experiments demonstrate that CAD-Coder enables LLMs to generate diverse, valid, and complex CAD models directly from natural language inputs.
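
To make the CadQuery representation mentioned above more concrete, the following is a minimal sketch of what a generated script might look like, together with a purely syntactic check of its structure. The example script, the `result` variable convention, and the `check_script_format` helper are illustrative assumptions; they are not taken from the paper, whose metadata does not describe the actual format criteria.

```python
import ast

# Hypothetical script of the kind a text-to-CAD model might emit for a prompt
# such as "a 10 x 20 x 5 plate with a hole of diameter 3 through the top face".
# It is stored as a string and only parsed here, never executed.
EXAMPLE_SCRIPT = '''
import cadquery as cq

result = (
    cq.Workplane("XY")
    .box(10, 20, 5)
    .faces(">Z")
    .workplane()
    .hole(3)
)
'''


def check_script_format(script: str, result_name: str = "result") -> bool:
    """Return True if the script parses, imports cadquery, and assigns `result_name`.

    This is an assumed, simplified notion of "well-formed output".
    """
    try:
        tree = ast.parse(script)
    except SyntaxError:
        return False

    imports_cadquery = False
    assigns_result = False
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            if any(alias.name == "cadquery" for alias in node.names):
                imports_cadquery = True
        elif isinstance(node, ast.ImportFrom):
            if node.module == "cadquery":
                imports_cadquery = True
        elif isinstance(node, ast.Assign):
            if any(isinstance(t, ast.Name) and t.id == result_name for t in node.targets):
                assigns_result = True
    return imports_cadquery and assigns_result
```

Calling `check_script_format(EXAMPLE_SCRIPT)` accepts the example, while a truncated script such as `"result = ("` is rejected. A real pipeline would additionally execute the script with CadQuery installed to obtain the 3D shape.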


Authors: Yandong Guan, Xilin Wang, Xingxi Ming, Jing Zhang, Dong Xu, Qian Yu

arXiv: 2505.19713v1 (cs.GR)

License: NONEXCLUSIVE-DISTRIB 1.0

Abstract: In this work, we introduce CAD-Coder, a novel framework that reformulates text-to-CAD as the generation of CadQuery scripts - a Python-based, parametric CAD language. This representation enables direct geometric validation, a richer modeling vocabulary, and seamless integration with existing LLMs. To further enhance code validity and geometric fidelity, we propose a two-stage learning pipeline: (1) supervised fine-tuning on paired text-CadQuery data, and (2) reinforcement learning with Group Reward Policy Optimization (GRPO), guided by a CAD-specific reward comprising both a geometric reward (Chamfer Distance) and a format reward. We also introduce a chain-of-thought (CoT) planning process to improve model reasoning, and construct a large-scale, high-quality dataset of 110K text-CadQuery-3D model triplets and 1.5K CoT samples via an automated pipeline. Extensive experiments demonstrate that CAD-Coder enables LLMs to generate diverse, valid, and complex CAD models directly from natural language, advancing the state of the art of text-to-CAD generation and geometric reasoning.

Submitted to arXiv on 26 May 2025

Results of the summarizing process for the arXiv paper: 2505.19713v1

Comprehensive Summary
Key points
Layman's Summary
Blog article

The paper "CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward" introduces a framework that reformulates text-to-CAD as the generation of CadQuery scripts. Developed by Yandong Guan, Xilin Wang, Xingxi Ming, Jing Zhang, Dong Xu, and Qian Yu, CAD-Coder targets CadQuery, a Python-based, parametric CAD language. This representation allows direct geometric validation, offers a richer modeling vocabulary, and integrates with existing large language models (LLMs). To improve code validity and geometric fidelity, the authors propose a two-stage learning pipeline: supervised fine-tuning on paired text-CadQuery data, followed by reinforcement learning with Group Reward Policy Optimization (GRPO). The reinforcement learning stage is guided by a CAD-specific reward that combines a geometric reward (Chamfer Distance) with a format reward. The authors also introduce a chain-of-thought (CoT) planning process to improve model reasoning, and they construct a large-scale dataset of 110K text-CadQuery-3D model triplets and 1.5K CoT samples using an automated pipeline. According to the abstract, extensive experiments demonstrate that CAD-Coder enables LLMs to generate diverse, valid, and complex CAD models directly from natural language, advancing the state of the art in text-to-CAD generation and geometric reasoning.
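
The CAD-specific reward described above, a geometric term based on Chamfer Distance plus a format term, can be sketched as follows. This is only an illustration under stated assumptions: the squared-distance, mean-based variant of Chamfer Distance, the exponential mapping from distance to reward, and the weights `w_geo`, `w_fmt`, and `scale` are all hypothetical choices, since the paper's metadata does not specify how the two terms are computed or combined.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def chamfer_distance(a: List[Point], b: List[Point]) -> float:
    """Symmetric Chamfer Distance between two point sets.

    Assumed variant: mean of squared nearest-neighbour distances, summed over
    both directions. Brute force, O(len(a) * len(b)).
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def one_way(src: List[Point], dst: List[Point]) -> float:
        return sum(min(math.dist(p, q) ** 2 for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


def cad_reward(
    cd: Optional[float],
    format_ok: bool,
    w_geo: float = 1.0,   # hypothetical weight of the geometric term
    w_fmt: float = 0.1,   # hypothetical weight of the format term
    scale: float = 1.0,   # hypothetical distance scale
) -> float:
    """Combine a geometric term and a format term into one scalar reward.

    `cd` is None when no shape could be produced (for example, the script
    failed to run), in which case the geometric term is zero.
    """
    geo = 0.0 if cd is None else math.exp(-cd / scale)
    fmt = 1.0 if format_ok else 0.0
    return w_geo * geo + w_fmt * fmt
```

In this sketch, a generated shape whose sampled points coincide with the target's yields a Chamfer Distance of 0 and therefore the maximum geometric term, while a script that produces no shape receives at most the format bonus.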

Summary
- CAD-Coder is a tool that turns written descriptions into CAD models.
- It was made by Yandong Guan, Xilin Wang, Xingxi Ming, Jing Zhang, Dong Xu, and Qian Yu.
- Instead of producing drawings directly, CAD-Coder writes short programs in CadQuery, a Python-based language for describing 3D shapes, so the result can be checked geometrically.
- This gives access to a wider range of modeling operations and works with existing large language models.
- The creators trained it in two stages, supervised fine-tuning followed by reinforcement learning, to make the generated models more accurate.

Definitions
- CAD: Computer-Aided Design, which means using computers to create drawings or models.
- Framework: A basic structure that can be built upon for creating something new.
- Conversion: Changing something from one form to another.
- Parametric: Using specific rules or parameters to define shapes or designs.
- Reinforcement Learning: A type of learning where the system improves based on rewards or punishments.

CAD-Coder: Transforming Text-to-CAD Generation with Chain-of-Thought and Geometric Reward

Converting text into Computer-Aided Design (CAD) models is a challenging task. It requires both natural language understanding and geometric reasoning, which makes it hard for non-experts to create accurate and complex CAD models. The paper "CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward" introduces a framework that treats the problem as code generation.

Developed by Yandong Guan, Xilin Wang, Xingxi Ming, Jing Zhang, Dong Xu, and Qian Yu, CAD-Coder generates scripts in CadQuery, a Python-based parametric CAD language, directly from natural language inputs. This representation offers a richer modeling vocabulary and allows direct geometric validation of the resulting CAD models.

To improve code validity and geometric fidelity, the authors propose a two-stage learning pipeline: supervised fine-tuning on paired text-CadQuery data, followed by reinforcement learning with Group Reward Policy Optimization (GRPO). The reinforcement learning stage is guided by a CAD-specific reward that combines a geometric reward based on Chamfer Distance with a format reward. The geometric term favors shapes that are close to the target, and the format term favors outputs that follow the expected structure.

The authors also introduce a chain-of-thought (CoT) planning process, intended to improve model reasoning by having the model work through a plan before producing the final script.

To support training, the authors constructed a large-scale dataset of 110,000 text-CadQuery-3D model triplets and 1,500 CoT samples using an automated pipeline. According to the abstract, extensive experiments demonstrate that CAD-Coder enables large language models (LLMs) to generate diverse, valid, and complex CAD models directly from natural language inputs, advancing the state of the art in text-to-CAD generation and geometric reasoning.

In conclusion, CAD-Coder combines CadQuery script generation, reinforcement learning guided by a CAD-specific reward, and CoT planning to generate CAD models directly from natural language. Because it could simplify the creation of complex CAD models for non-experts, the work is relevant to industries that rely on CAD software.
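
As a rough illustration of the reinforcement learning stage, GRPO (more commonly expanded as Group Relative Policy Optimization) is usually described as sampling a group of completions for the same prompt and scoring each one relative to the rest of its group. The sketch below shows only that normalization step. The function name, the use of the population standard deviation, and the `eps` constant are assumptions made for illustration; the paper's exact objective is not available from its metadata.

```python
import statistics
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize the rewards of one group of completions sampled for a single prompt.

    Each advantage is (reward - group mean) / (group std + eps), so completions
    that beat their group's average get a positive value and the rest a negative one.
    Population standard deviation is an assumed choice; implementations vary.
    """
    if not rewards:
        return []
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]
```

For example, a group with rewards `[1.0, 0.0]` gives advantages close to `+1` and `-1`, while a group in which every completion earns the same reward gives all zeros and therefore contributes no learning signal in this simplified view.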

Created on 01 Dec. 2025


Disclaimer: The AI-based summarization tool and virtual assistant provided on this website may not always provide accurate and complete summaries or responses. We encourage you to carefully review and evaluate the generated content to ensure its quality and relevance to your needs.