SWE-Pruner: Self-Adaptive Context Pruning for Coding Agents

AI-generated keywords: Large Language Models, Coding Agents, Context Management, Efficiency Gains, Software Development Tasks

AI-generated Key Points

  • Large Language Models (LLMs) have advanced software development tasks with coding agents like Claude Code and Gemini CLI.
  • These agents accumulate long interaction contexts that must fit within the LLM's limited context window, creating a "Context Wall" that hampers performance.
  • Long-context models have been developed to address this issue, but blindly processing extensive code leads to high API costs and latency problems.
  • Existing context compression techniques for natural language tasks struggle when applied to coding agents as they compromise syntactic validity and discard crucial debugging information.
  • SWE-Pruner is introduced as a self-adaptive context pruning framework tailored for coding agents, allowing them to articulate explicit natural language goals alongside retrieval actions.
  • By training a lightweight neural skimmer on synthetic data, SWE-Pruner enables adaptive selection of relevant lines based on given goals while preserving syntactic and structural integrity.
  • Evaluation across four benchmarks and multiple models shows 23-54% token reduction on agent tasks like SWE-Bench Verified and up to 14.84x compression on single-turn tasks like LongCodeQA, with minimal impact on performance.
  • SWE-Pruner delivers efficiency gains through token savings and enhances agent decision quality by reducing interaction rounds through more decisive reasoning.

Authors: Yuhang Wang, Yuling Shi, Mo Yang, Rongrui Zhang, Shilin He, Heng Lian, Yuting Chen, Siyu Ye, Kai Cai, Xiaodong Gu

Code available at https://github.com/Ayanami1314/swe-pruner
License: CC BY 4.0

Abstract: LLM agents have demonstrated remarkable capabilities in software development, but their performance is hampered by long interaction contexts, which incur high API costs and latency. While various context compression approaches such as LongLLMLingua have emerged to tackle this challenge, they typically rely on fixed metrics such as PPL, ignoring the task-specific nature of code understanding. As a result, they frequently disrupt syntactic and logical structure and fail to retain critical implementation details. In this paper, we propose SWE-Pruner, a self-adaptive context pruning framework tailored for coding agents. Drawing inspiration from how human programmers "selectively skim" source code during development and debugging, SWE-Pruner performs task-aware adaptive pruning for long contexts. Given the current task, the agent formulates an explicit goal (e.g., "focus on error handling") as a hint to guide the pruning targets. A lightweight neural skimmer (0.6B parameters) is trained to dynamically select relevant lines from the surrounding context given the goal. Evaluations across four benchmarks and multiple models validate SWE-Pruner's effectiveness in various scenarios, achieving 23-54% token reduction on agent tasks like SWE-Bench Verified and up to 14.84x compression on single-turn tasks like LongCodeQA with minimal performance impact.
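
The abstract reports results in two units: percent token reduction for agent tasks and a compression ratio for single-turn tasks. As a reading aid, the two can be related under the assumption (not stated in the abstract) that a compression ratio r means original tokens divided by kept tokens, so the fraction removed is 1 - 1/r. The helper names below are illustrative and do not come from the paper or its repository.

```python
def reduction_from_ratio(ratio: float) -> float:
    """Fraction of tokens removed, assuming ratio = original tokens / kept tokens."""
    if ratio < 1.0:
        raise ValueError("compression ratio must be >= 1")
    return 1.0 - 1.0 / ratio


def ratio_from_reduction(reduction: float) -> float:
    """Compression ratio (original / kept) for a given fraction of tokens removed."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    # Figures quoted in the abstract, converted under the assumption above.
    print(f"14.84x compression -> {reduction_from_ratio(14.84):.1%} of tokens removed")
    print(f"23% reduction -> {ratio_from_reduction(0.23):.2f}x")
    print(f"54% reduction -> {ratio_from_reduction(0.54):.2f}x")
```

Under that assumed definition, 14.84x compression corresponds to removing roughly 93% of tokens, and a 23-54% token reduction corresponds to roughly 1.3x to 2.2x compression.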

Submitted to arXiv on 23 Jan. 2026


Results of the summarizing process for the arXiv paper: 2601.16746v1

Large Language Models (LLMs) have made significant advances in software development tasks, with coding agents like Claude Code and Gemini CLI able to navigate repositories, run tests, and submit patches. However, these agents accumulate long interaction contexts that must fit within the LLM's limited context window, creating a "Context Wall" that hampers their performance. Long-context models have been developed to address this issue, but blindly processing large amounts of code leads to high API costs and latency. Existing context compression techniques designed for natural language tasks struggle when applied to coding agents, as they compromise syntactic validity and discard crucial debugging information.

To overcome these limitations, we introduce SWE-Pruner, a self-adaptive context pruning framework tailored specifically for coding agents. Drawing inspiration from how human programmers selectively skim through code based on their goals, SWE-Pruner enables agents to articulate explicit natural language goals alongside retrieval actions. By training a lightweight neural skimmer on synthetic data, our approach allows for adaptive selection of relevant lines based on given goals while preserving syntactic and structural integrity.

Evaluation across various benchmarks and models demonstrates the effectiveness of SWE-Pruner in achieving substantial token reduction on agent tasks like SWE-Bench Verified and significant compression on single-turn tasks like LongCodeQA, with minimal impact on performance. SWE-Pruner not only delivers efficiency gains through token savings but also enhances agent decision quality by reducing interaction rounds through more decisive reasoning.

In summary, SWE-Pruner addresses the challenges posed by long interaction contexts in coding agents and improves both efficiency and decision-making in software development tasks.
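
The goal-conditioned line selection described above can be illustrated with a minimal sketch. It is not the authors' implementation: the trained neural skimmer is replaced here by a toy keyword-overlap scorer, and the function names, the threshold, and the placeholder comment format are all assumptions made for illustration. The sketch keeps lines that score above a threshold for the stated goal, keeps their enclosing block headers for readability, and marks removed runs with a placeholder. It does not guarantee that the pruned output is syntactically valid.

```python
import re
from typing import Callable, List

Scorer = Callable[[str, str], float]


def keyword_overlap_score(goal: str, line: str) -> float:
    """Toy stand-in for a learned skimmer: fraction of goal words present in the line."""
    goal_words = set(re.findall(r"[a-z]+", goal.lower()))
    line_words = set(re.findall(r"[a-z]+", line.lower()))
    if not goal_words:
        return 0.0
    return len(goal_words & line_words) / len(goal_words)


def indent_of(line: str) -> int:
    """Number of leading whitespace characters."""
    return len(line) - len(line.lstrip())


def prune(code: str, goal: str, scorer: Scorer = keyword_overlap_score,
          threshold: float = 0.3) -> str:
    """Keep goal-relevant lines plus their enclosing headers; mark pruned runs."""
    lines = code.splitlines()
    keep: List[bool] = [bool(l.strip()) and scorer(goal, l) >= threshold for l in lines]

    # For each selected line, walk upward and keep the nearest less-indented
    # line at each level (e.g. the enclosing def), so context is not lost.
    for i in [idx for idx, kept in enumerate(keep) if kept]:
        level = indent_of(lines[i])
        for j in range(i - 1, -1, -1):
            if level == 0:
                break
            if lines[j].strip() and indent_of(lines[j]) < level:
                keep[j] = True
                level = indent_of(lines[j])

    out: List[str] = []
    omitted = 0
    for line, kept in zip(lines, keep):
        if kept:
            if omitted:
                out.append(f"# ... {omitted} lines pruned ...")
                omitted = 0
            out.append(line)
        else:
            omitted += 1
    if omitted:
        out.append(f"# ... {omitted} lines pruned ...")
    return "\n".join(out)


SAMPLE = '''\
def load_config(path):
    with open(path) as handle:
        text = handle.read()
    try:
        value = int(text)
    except ValueError as error:
        raise RuntimeError("bad config") from error
    return value
'''

if __name__ == "__main__":
    print(prune(SAMPLE, goal="error handling"))
```

With the goal "error handling", the sketch keeps the function header and the except/raise lines and drops the file-reading lines. Because the scorer is passed in as a parameter, a model-based relevance function could be substituted without changing the pruning logic.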
Created on 07 May. 2026

