Active Retrieval Augmented Generation

AI-generated keywords: Retrieval-augmented, FLARE, Knowledge-intensive, Factually inaccurate, Generative Model

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • The paper proposes a retrieval-augmented generation method to address the issue of large language models generating factually inaccurate output.
  • Existing retrieval-augmented LMs retrieve information only once, based on the input, which is limiting for long-form generation, where information must be gathered continually.
  • The authors propose a generalized view of active retrieval augmented generation which actively decides when and what to retrieve across the course of the generation process.
  • This is implemented in Forward-Looking Active REtrieval augmented generation (FLARE), a generic retrieval-augmented generation method that iteratively predicts the upcoming sentence to anticipate future content, uses that prediction as a query to retrieve relevant documents, and regenerates the sentence if it contains low-confidence tokens.
  • FLARE is evaluated alongside baselines on four long-form knowledge-intensive generation tasks/datasets and achieves superior or competitive performance on all tasks.
  • Code and datasets are available at https://github.com/jzbjyb/FLARE.
  • The paper is authored by Zhengbao Jiang, Frank F. Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan and Graham Neubig.
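
The forward-looking loop described in the key points can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `generate` and `retrieve` callables, the (token, probability) representation, the 0.5 threshold, the sentence cap, and the toy stand-ins are all assumptions made for the example. The choice to draft each tentative sentence without documents is likewise an assumption of this sketch.

```python
from typing import Callable, List, Tuple

# (token, probability) pairs; a hypothetical stand-in for an LM's scored output.
ScoredToken = Tuple[str, float]
# Hypothetical generator: (question, documents, text so far) -> next sentence.
Generator = Callable[[str, List[str], str], List[ScoredToken]]
# Hypothetical retriever: query -> documents.
Retriever = Callable[[str], List[str]]


def has_low_confidence(sentence: List[ScoredToken], threshold: float) -> bool:
    """Return True if any token's probability is below the threshold."""
    return any(prob < threshold for _, prob in sentence)


def to_text(sentence: List[ScoredToken]) -> str:
    """Join the tokens of a scored sentence into plain text."""
    return " ".join(token for token, _ in sentence)


def flare_generate(
    question: str,
    generate: Generator,
    retrieve: Retriever,
    threshold: float = 0.5,   # assumed value, not taken from the paper
    max_sentences: int = 10,  # assumed safety cap
) -> str:
    answer = ""
    for _ in range(max_sentences):
        # 1. Tentatively predict the upcoming sentence without documents.
        tentative = generate(question, [], answer)
        if not tentative:  # an empty sentence signals the end in this sketch
            break
        # 2. Retrieve only when the prediction contains low-confidence tokens.
        if has_low_confidence(tentative, threshold):
            docs = retrieve(to_text(tentative))  # the prediction is the query
            # 3. Regenerate the sentence conditioned on the retrieved documents.
            tentative = generate(question, docs, answer)
        answer = (answer + " " + to_text(tentative)).strip()
    return answer


# Toy stand-ins so the sketch runs end to end (both are hypothetical).
def toy_generate(question: str, docs: List[str], so_far: str) -> List[ScoredToken]:
    if so_far:
        return []  # stop after one sentence
    if docs:
        return [("Completed", 0.9), ("in", 0.9), ("1889.", 0.9)]
    return [("Completed", 0.9), ("in", 0.9), ("1887.", 0.2)]


def toy_retrieve(query: str) -> List[str]:
    return ["The Eiffel Tower was completed in 1889."]


print(flare_generate("When was the Eiffel Tower completed?", toy_generate, toy_retrieve))
# prints "Completed in 1889."
```

In the toy run, the unsure token "1887." falls below the threshold, which triggers retrieval and regeneration. With a threshold of 0.1 the tentative sentence would be accepted as is and no retrieval would happen.
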

Authors: Zhengbao Jiang, Frank F. Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan, Graham Neubig

Abstract: Despite the remarkable ability of large language models (LMs) to comprehend and generate language, they have a tendency to hallucinate and create factually inaccurate output. Augmenting LMs by retrieving information from external knowledge resources is one promising solution. Most existing retrieval-augmented LMs employ a retrieve-and-generate setup that only retrieves information once based on the input. This is limiting, however, in more general scenarios involving generation of long texts, where continually gathering information throughout the generation process is essential. There have been some past efforts to retrieve information multiple times while generating outputs, which mostly retrieve documents at fixed intervals using the previous context as queries. In this work, we provide a generalized view of active retrieval augmented generation, methods that actively decide when and what to retrieve across the course of the generation. We propose Forward-Looking Active REtrieval augmented generation (FLARE), a generic retrieval-augmented generation method which iteratively uses a prediction of the upcoming sentence to anticipate future content, which is then utilized as a query to retrieve relevant documents to regenerate the sentence if it contains low-confidence tokens. We test FLARE along with baselines comprehensively over 4 long-form knowledge-intensive generation tasks/datasets. FLARE achieves superior or competitive performance on all tasks, demonstrating the effectiveness of our method. Code and datasets are available at https://github.com/jzbjyb/FLARE.
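
To illustrate why the choice of query matters, the sketch below contrasts a query built from the previous context (as in the fixed-interval approaches the abstract mentions) with a query built from a predicted upcoming sentence. The word-overlap retriever, the two-document corpus, and the example strings are assumptions chosen for illustration; they are not the retriever or data used in the paper.

```python
from typing import List


def overlap_score(query: str, doc: str) -> int:
    """Count distinct lowercase words shared by the query and the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def retrieve_top1(query: str, corpus: List[str]) -> str:
    """Return the document with the highest word overlap (toy retriever)."""
    return max(corpus, key=lambda doc: overlap_score(query, doc))


# Hypothetical two-document corpus.
corpus = [
    "the eiffel tower is located in paris france",
    "the eiffel tower was completed in 1889",
]

# Text already generated, versus a tentative (and unsure) next sentence.
previous_context = "the eiffel tower is located in paris"
predicted_next = "it was completed in 1887"

# Querying with the past returns the document about what was already said.
backward_hit = retrieve_top1(previous_context, corpus)
# Querying with the predicted future returns the document needed next.
forward_hit = retrieve_top1(predicted_next, corpus)

print(backward_hit)  # prints "the eiffel tower is located in paris france"
print(forward_hit)   # prints "the eiffel tower was completed in 1889"
```

Even though the predicted sentence contains a wrong year, its remaining words point the retriever at the document that can correct it, which is the intuition behind using the upcoming sentence as the query.
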

Submitted to arXiv on 11 May 2023


Results of the summarizing process for the arXiv paper: 2305.06983v1


The paper "Active Retrieval Augmented Generation" addresses the tendency of large language models (LMs) to generate factually inaccurate output by proposing a retrieval-augmented generation method. Existing retrieval-augmented LMs typically retrieve information only once, based on the input, which is limiting for long-form generation, where information must be gathered continually. To address this, the authors propose a generalized view of active retrieval augmented generation: methods that actively decide when and what to retrieve over the course of generation. They implement this view in Forward-Looking Active REtrieval augmented generation (FLARE), a generic method that iteratively predicts the upcoming sentence to anticipate future content, uses that prediction as a query to retrieve relevant documents, and regenerates the sentence if it contains low-confidence tokens. The authors evaluate FLARE alongside baselines on four long-form knowledge-intensive generation tasks/datasets, where it achieves superior or competitive performance on all tasks. Code and datasets are available at https://github.com/jzbjyb/FLARE. The paper is authored by Zhengbao Jiang, Frank F. Xu, Luyu Gao, Zhiqing Sun, Qian Liu, Jane Dwivedi-Yu, Yiming Yang, Jamie Callan and Graham Neubig.
Created on 03 Jun. 2023

