Learning to Filter Context for Retrieval-Augmented Generation

AI-generated keywords: FILCO

AI-generated Key Points

  • FILCO is a method that improves the context given to generation models by filtering out irrelevant retrieved content at test time
  • The effectiveness of FILCO is demonstrated across knowledge-intensive tasks like extractive question answering, complex multi-hop QA, fact verification, and dialog generation
  • Promising results are shown in tasks like NQ, TQA, HotpotQA, ELI5, FEVER, and WoW using metrics like Exact Match and Unigram F1
  • Implementing FILCO requires training models for both context filtering and output generation; the computational cost depends on the model architecture and size
  • FILCO is reported to outperform existing methods such as RAG, FID, and EVI when the top-5 retrieved passages are filtered at either the passage level or the sentence level

Authors: Zhiruo Wang, Jun Araki, Zhengbao Jiang, Md Rizwan Parvez, Graham Neubig

License: CC BY-SA 4.0

Abstract: On-the-fly retrieval of relevant knowledge has proven an essential element of reliable systems for tasks such as open-domain question answering and fact verification. However, because retrieval systems are not perfect, generation models are required to generate outputs given partially or entirely irrelevant passages. This can cause over- or under-reliance on context, and result in problems in the generated output such as hallucinations. To alleviate these problems, we propose FILCO, a method that improves the quality of the context provided to the generator by (1) identifying useful context based on lexical and information-theoretic approaches, and (2) training context filtering models that can filter retrieved contexts at test time. We experiment on six knowledge-intensive tasks with FLAN-T5 and LLaMa2, and demonstrate that our method outperforms existing approaches on extractive question answering (QA), complex multi-hop and long-form QA, fact verification, and dialog generation tasks. FILCO effectively improves the quality of context, whether or not it supports the canonical output.
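
The abstract describes identifying useful context with lexical approaches before training a filter model. The following is a minimal sketch of one possible lexical criterion, not the authors' implementation: it keeps the sentences that contain a reference string and otherwise falls back to the sentence sharing the most tokens with it. The function names (`split_sentences`, `tokenize`, `filter_context`), the naive sentence splitter, and the fallback rule are all assumptions made for illustration. A reference output is only available when building training data; at test time the trained filtering model would take the place of this rule.

```python
import re


def split_sentences(passage: str) -> list[str]:
    # Naive splitter (an assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", passage) if s.strip()]


def tokenize(text: str) -> list[str]:
    # Lowercased word tokens; punctuation is discarded.
    return re.findall(r"\w+", text.lower())


def filter_context(passages: list[str], reference: str) -> list[str]:
    """Hypothetical lexical filter over retrieved passages.

    Keeps every sentence that contains the reference string. If no sentence
    does, keeps the single sentence with the largest token overlap.
    """
    sentences = [s for p in passages for s in split_sentences(p)]
    ref_norm = reference.lower()
    hits = [s for s in sentences if ref_norm in s.lower()]
    if hits or not sentences:
        return hits
    ref_tokens = set(tokenize(reference))
    best = max(sentences, key=lambda s: len(ref_tokens & set(tokenize(s))))
    return [best]


if __name__ == "__main__":
    retrieved = [
        "Paris is the capital of France. It hosts the Louvre.",
        "Berlin is the capital of Germany.",
    ]
    print(filter_context(retrieved, "Paris"))
```

Filtering at the sentence level rather than the passage level is a design choice in this sketch; the same function works on whole passages if `split_sentences` is replaced by the identity.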

Submitted to arXiv on 14 Nov. 2023


Results of the summarizing process for the arXiv paper: 2311.08377v1

In "Learning to Filter Context for Retrieval-Augmented Generation," the authors introduce FILCO, a method that improves the quality of the context provided to generation models. It identifies useful context with lexical and information-theoretic approaches and trains context filtering models that remove irrelevant content from retrieved passages at test time. The authors evaluate FILCO on knowledge-intensive tasks: extractive question answering, complex multi-hop and long-form QA, fact verification, and dialog generation.

They report promising results on NQ, TQA, HotpotQA, ELI5, FEVER, and WoW using automatic metrics such as Exact Match and Unigram F1. Because automatic measures can be inaccurate, they encourage further evaluation with neural or human assessments. Implementing FILCO requires training models for both context filtering and output generation, and the computational cost depends on the chosen model architecture and size. The study also compares FILCO with existing methods such as RAG, FID, and EVI, and reports better results in various settings where the top-5 retrieved passages are filtered at either the passage level or the sentence level. The authors also discuss related work on augmented generation, where additional context has proven effective, and stress that the granularity and strategy of retrieval should be optimized to improve generation accuracy.

Overall, the paper presents a novel approach to improving context quality for generation models through filtering. The method shows promising results across diverse knowledge-intensive tasks, and the authors encourage further exploration and validation before generalizing the conclusions to specialized domain datasets.
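
The two automatic metrics named in the summary, Exact Match and Unigram F1, can be sketched as follows. This is an illustrative version that assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); the summary does not specify the normalization the paper uses, and the function names are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    # Assumed normalization: lowercase, drop punctuation and articles, squeeze spaces.
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> bool:
    # True when the normalized strings are identical.
    return normalize(prediction) == normalize(reference)


def unigram_f1(prediction: str, reference: str) -> float:
    # Harmonic mean of unigram precision and recall over normalized tokens.
    pred = normalize(prediction).split()
    ref = normalize(reference).split()
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The Eiffel Tower.", "eiffel tower"))
    print(unigram_f1("the cat sat", "cat sat down"))
```

Exact Match suits short extractive answers, while Unigram F1 gives partial credit and is the more natural fit for longer outputs such as long-form answers and dialog responses.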
Created on 11 Sep. 2024

