Uncertainty-Aware Hybrid Retrieval for Long-Document RAG

AI-generated keywords: Retrieval augmented generation; Quality; Granularity; Uncertainty-aware Multi-Granularity RAG (UMG-RAG); UMGP-RAG

AI-generated Key Points

  • Quality and granularity of retrieved evidence are crucial in retrieval augmented generation (RAG)
  • Large retrieval units provide contextual richness but may include irrelevant content
  • Fine-grained units are concise but can pose challenges in reliable retrieval
  • A novel training-free hybrid retrieval framework leverages chunk granularity for query-specific reliability estimation
  • The framework utilizes existing dense and sparse retrievers as complementary experts across various chunk granularities
  • It transforms expert-granularity score lists into an evidence distribution, assesses reliability based on distribution entropy, and merges candidates considering query-specific factors
  • An extension employs fine-grained hits to pinpoint relevant evidence while returning broader non-redundant parent chunks for enhanced local coherence during generation
  • Experiments on question answering benchmarks in long-document RAG settings, with multiple dense retrievers and generators, show that uncertainty-aware fusion and parent promotion improve generation quality
  • The work formalizes the retrieval-granularity tradeoff for long-document RAG and estimates query-specific reliability for each expert-granularity pair without additional training

Authors: Hoin Jung, Xiaoqian Wang

License: CC BY 4.0

Abstract: Retrieval augmented generation (RAG) depends critically on the quality and granularity of retrieved evidence. Large retrieval units preserve context but often introduce irrelevant content, which can dilute answer-bearing evidence and worsen long-context utilization. Fine-grained units are more compact, but they may be difficult to retrieve reliably because short chunks can lack the semantic, lexical, or bridging cues needed to match the query. We propose Uncertainty-aware Multi-Granularity RAG (UMG-RAG), a training-free hybrid retrieval framework that treats the choice of chunk granularity as a query-specific reliability estimation problem. Instead of training a new retriever or modifying the generator, UMG-RAG uses existing dense and sparse retrievers as complementary experts across multiple chunk granularities. For each query, it converts each expert-granularity score list into an evidence distribution, estimates reliability from distribution entropy, and fuses candidates according to query-specific semantic, lexical, and granularity confidence. We further introduce UMGP-RAG, a parent promotion variant that uses fine-grained hits to locate relevant evidence while returning broader non-redundant parent chunks for local coherence. Experiments on question answering benchmarks show that uncertainty-aware fusion and parent promotion improve generation quality while maintaining a lightweight, plug-and-play retrieval pipeline.
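
The fusion step described in the abstract (score list to evidence distribution, entropy to reliability, reliability-weighted merge) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration and is not taken from the paper: the softmax with a temperature parameter, the use of one minus normalized entropy as the reliability weight, the linear mixture used for fusion, and the function names softmax, reliability, and fuse. In practice dense and sparse scores live on different scales, so some per-list normalization would likely be needed before the softmax.

```python
import math
from collections import defaultdict

def softmax(scores, temperature=1.0):
    """Turn raw retrieval scores into a probability distribution (assumed form)."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reliability(probs):
    """One minus normalized entropy: near 1 for a peaked list, near 0 for a flat one."""
    n = len(probs)
    if n <= 1:
        return 1.0
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return 1.0 - entropy / math.log(n)

def fuse(score_lists, temperature=1.0):
    """Fuse per-(expert, granularity) score lists into one ranking.

    score_lists maps a hypothetical key such as ("dense", 256) to a dict
    {chunk_id: raw_score}. Each list is weighted by its entropy-based
    reliability, so confident lists contribute more to the fused score.
    """
    dists, weights = {}, {}
    for key, scores in score_lists.items():
        if not scores:
            continue  # an empty list carries no evidence
        ids = list(scores)
        probs = softmax([scores[i] for i in ids], temperature)
        dists[key] = dict(zip(ids, probs))
        weights[key] = reliability(probs)

    total_weight = sum(weights.values()) or 1.0  # guard against all-flat lists
    fused = defaultdict(float)
    for key, dist in dists.items():
        w = weights[key] / total_weight
        for chunk_id, p in dist.items():
            fused[chunk_id] += w * p
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    example = {
        ("dense", 256): {"a": 5.0, "b": 0.0, "c": 0.0},   # peaked: high reliability
        ("sparse", 256): {"a": 1.0, "b": 1.0, "c": 1.0},  # flat: low reliability
    }
    for chunk_id, score in fuse(example):
        print(chunk_id, round(score, 3))
```

In this toy input the sparse list is uniform, so its reliability is approximately zero and the fused ranking is driven almost entirely by the dense list. That is the intended behavior of an uncertainty-aware weighting: an expert that cannot discriminate between candidates for a given query is effectively muted for that query.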

Submitted to arXiv on 11 Jun. 2026

Results of the summarizing process for the arXiv paper: 2606.13550v1

In retrieval augmented generation (RAG), the quality and granularity of retrieved evidence play a pivotal role. Large retrieval units offer contextual richness but can bring in irrelevant content that dilutes answer-bearing evidence and hinders effective use of long contexts. Fine-grained units are more concise but can be harder to retrieve reliably, because short chunks may lack the semantic, lexical, or bridging cues needed to match the query. To address these challenges, we introduce UMG-RAG, a training-free hybrid retrieval framework that treats chunk granularity as a query-specific reliability estimation problem. Instead of developing new retrievers or modifying generators, UMG-RAG uses existing dense and sparse retrievers as complementary experts across multiple chunk granularities. For each query, it transforms each expert-granularity score list into an evidence distribution, estimates reliability from the entropy of that distribution, and fuses candidates according to query-specific semantic, lexical, and granularity confidence. We further present UMGP-RAG, an extension that uses fine-grained hits to pinpoint relevant evidence while returning broader, non-redundant parent chunks for better local coherence during generation. In experiments on question answering benchmarks in long-document RAG settings, with multiple dense retrievers and generators, uncertainty-aware fusion and parent promotion improve generation quality while keeping the retrieval pipeline lightweight and adaptable. Our contributions also include formalizing the tradeoff in retrieval granularity for long-document RAG and proposing UMG-RAG as a solution that estimates query-specific reliability for each expert-granularity pair without any additional training.
We evaluate our methods on existing benchmarks to show that uncertainty-aware fusion and parent promotion improve generation quality. We also discuss related work on interventions for the "lost in the middle" problem of language models with long prompts, and highlight why retrieval granularity matters for hybrid approaches in RAG.
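
Parent promotion, as summarized above, uses fine-grained hits to locate evidence and then returns the broader parent chunks without duplicates. The following Python sketch shows one way this could look. The child-to-parent mapping parent_of, the choice of the best child score as the parent's score, and the function name promote_to_parents are all hypothetical choices made for illustration; the paper's actual promotion and scoring rule is not specified in this summary.

```python
def promote_to_parents(fine_hits, parent_of, top_k=3):
    """Map fine-grained hits to non-redundant parent chunks.

    fine_hits: list of (child_id, score) pairs from a fine-grained retriever.
    parent_of: dict mapping each child_id to the id of its enclosing chunk
               (assumed to be built at indexing time).
    Returns up to top_k (parent_id, score) pairs, best first.
    """
    parent_scores = {}
    for child_id, score in fine_hits:
        # A child with no known parent is kept as its own unit.
        parent_id = parent_of.get(child_id, child_id)
        # Assumption: a parent inherits the score of its best-matching child,
        # so several hits inside one parent yield a single entry.
        if parent_id not in parent_scores or score > parent_scores[parent_id]:
            parent_scores[parent_id] = score
    ranked = sorted(parent_scores.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:top_k]

if __name__ == "__main__":
    hits = [("c1", 0.9), ("c2", 0.8), ("c3", 0.5)]
    mapping = {"c1": "P1", "c2": "P1", "c3": "P2"}
    print(promote_to_parents(hits, mapping))
```

Here the two hits c1 and c2 fall inside the same parent P1, so the generator would receive P1 once rather than two overlapping passages, followed by P2.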
Created on 13 Jun. 2026
