DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models
AI-generated Key Points
⚠The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.
- The dynamic retrieval augmented generation (RAG) paradigm actively decides when and what to retrieve during the text generation process of Large Language Models (LLMs).
- The paradigm consists of two essential components: identifying the optimal moment to activate the retrieval module and crafting the appropriate query once retrieval is initiated.
- Existing dynamic RAG methods fall short on both counts: they often decide when to retrieve using static rules, and they typically build the query from only the LLM's most recent sentence or last few tokens, even though the LLM's real-time information needs may span the entire context.
- A new framework called DRAGIN (Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models) has been introduced to address these shortcomings.
- DRAGIN is designed to make informed decisions on when and what information to retrieve based on real-time information requirements during text generation.
- In experiments on four knowledge-intensive generation datasets, DRAGIN was evaluated alongside existing methods and achieved superior performance on all tasks.
- DRAGIN was developed by Weihang Su, Yichen Tang, Qingyao Ai, Zhijing Wu, and Yiqun Liu.
- All code, data, and models associated with DRAGIN have been made openly accessible through GitHub at https://github.com/oneal2000/DRAGIN/tree/main for further exploration or replication of findings.
Authors: Weihang Su, Yichen Tang, Qingyao Ai, Zhijing Wu, Yiqun Liu
Abstract: Dynamic retrieval augmented generation (RAG) paradigm actively decides when and what to retrieve during the text generation process of Large Language Models (LLMs). There are two key elements of this paradigm: identifying the optimal moment to activate the retrieval module (deciding when to retrieve) and crafting the appropriate query once retrieval is triggered (determining what to retrieve). However, current dynamic RAG methods fall short in both aspects. Firstly, the strategies for deciding when to retrieve often rely on static rules. Moreover, the strategies for deciding what to retrieve typically limit themselves to the LLM's most recent sentence or the last few tokens, while the LLM's real-time information needs may span across the entire context. To overcome these limitations, we introduce a new framework, DRAGIN, i.e., Dynamic Retrieval Augmented Generation based on the real-time Information Needs of LLMs. Our framework is specifically designed to make decisions on when and what to retrieve based on the LLM's real-time information needs during the text generation process. We evaluate DRAGIN along with existing methods comprehensively over 4 knowledge-intensive generation datasets. Experimental results show that DRAGIN achieves superior performance on all tasks, demonstrating the effectiveness of our method. We have open-sourced all the code, data, and models in GitHub: https://github.com/oneal2000/DRAGIN/tree/main
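The two decisions described in the abstract, when to retrieve and what to retrieve, can be illustrated with a toy generation loop. This is only a sketch under stated assumptions and is not DRAGIN's algorithm, which this page does not describe: the trigger is a simple confidence threshold used as a stand-in, the query builder just collects content words from the whole context, and every name here (`Token`, `should_retrieve`, `build_query`, `generate`, `STOPWORDS`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical helper names; nothing here comes from the DRAGIN repository.
STOPWORDS = {"the", "a", "an", "of", "in", "was", "is", "and", "to", "by"}


@dataclass
class Token:
    text: str
    prob: float  # probability the model assigned to this token (assumed available)


def should_retrieve(token: Token, threshold: float = 0.4) -> bool:
    """'When': trigger on a low-confidence content word (stand-in rule)."""
    return token.text.lower() not in STOPWORDS and token.prob < threshold


def build_query(context: List[str], max_terms: int = 5) -> str:
    """'What': draw query terms from the whole context, not only the last sentence."""
    seen, terms = set(), []
    for word in context:
        key = word.lower()
        if key not in STOPWORDS and key not in seen:
            seen.add(key)
            terms.append(word)
    return " ".join(terms[-max_terms:])


def generate(
    question: List[str],
    draft: List[Token],
    retriever: Callable[[str], List[str]],
) -> Tuple[List[str], List[Tuple[str, List[str]]]]:
    """Walk over drafted tokens and record each retrieval that would be triggered.

    A real system would feed the retrieved passages back to the LLM and
    regenerate from that position; this sketch only logs the query and results.
    """
    context = list(question)
    log = []
    for token in draft:
        if should_retrieve(token):
            query = build_query(context)
            log.append((query, retriever(query)))
        context.append(token.text)
    return context, log
```

For example, with the question tokens `["Who", "founded", "Acme", "Corp"]` and a draft whose final token has probability 0.2, retrieval fires once, and the query is built from words across the question and the partial answer rather than from the last few tokens alone.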