Multi-Meta-RAG: Improving RAG for Multi-Hop Queries using Database Filtering with LLM-Extracted Metadata

AI-generated keywords: Multi-Meta-RAG

AI-generated Key Points

  • Multi-Meta-RAG method enhances retrieval-augmented generation (RAG) for multi-hop queries using database filtering and large language model (LLM)-extracted metadata
  • Results show significant improvements in chunk retrieval and LLM generation compared to Graph RAG
  • Limitations: metadata extraction depends on questions from a specific domain and in a specific format, prompt templates must be created manually, and results still fall short of those achieved by feeding the LLM precise ground-truth facts
  • Evaluation results compare GPT-4 and Google PaLM on different query types, with both models achieving accuracy above 0.9 on inference queries
  • Future plans include exploring more generic prompt templates for metadata extraction and testing alternative LLMs such as Llama 3.1 on datasets with more recent cut-off dates

Authors: Mykhailo Poliakov, Nadiya Shvai

Accepted to ICTERI 2024 Posters Track
License: CC BY 4.0

Abstract: The retrieval-augmented generation (RAG) enables retrieval of relevant information from an external knowledge source and allows large language models (LLMs) to answer queries over previously unseen document collections. However, it was demonstrated that traditional RAG applications perform poorly in answering multi-hop questions, which require retrieving and reasoning over multiple elements of supporting evidence. We introduce a new method called Multi-Meta-RAG, which uses database filtering with LLM-extracted metadata to improve the RAG selection of the relevant documents from various sources, relevant to the question. While database filtering is specific to a set of questions from a particular domain and format, we found out that Multi-Meta-RAG greatly improves the results on the MultiHop-RAG benchmark. The code is available at https://github.com/mxpoliakov/Multi-Meta-RAG.

Submitted to arXiv on 19 Jun. 2024

Results of the summarizing process for the arXiv paper: 2406.13213v2

The authors of this study introduce Multi-Meta-RAG, a method that uses database filtering with large language model (LLM)-extracted metadata to enhance retrieval-augmented generation (RAG) for multi-hop queries. Experiments show that Multi-Meta-RAG significantly improves chunk retrieval and LLM generation compared to alternative solutions such as Graph RAG. The approach has limitations: metadata extraction depends on questions from a specific domain and in a specific format, prompt templates must be created manually, and results still fall short of those achieved by feeding the LLM precise ground-truth facts.

The evaluation results in Table 4 compare the performance of GPT-4 and Google PaLM on different types of questions. Both models achieve accuracy above 0.9 on inference queries; Google PaLM outperforms GPT-4 on comparison and temporal queries but struggles with null questions. The authors suggest that combining both models for different query types could further improve overall accuracy.

In the future, the authors plan to explore more generic prompt templates for metadata extraction using multi-hop datasets from various domains, and to test alternative LLMs such as Llama 3.1 on datasets with more recent cut-off dates. Despite its limitations, Multi-Meta-RAG shows promise in enhancing RAG performance for multi-hop queries and offers a relatively straightforward and explainable solution compared to other methods. This research was partially funded by the OpenAI Researcher Access Program (Application 0000005294). An Appendix provides a Metadata Extraction Prompt Template for extracting metadata from questions to filter database sources related to article content. The template includes XML feed data related to the study's topic on Multi-Meta-RAG.
Created on 06 Jan. 2025
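
The core idea described above, extracting metadata from the question and using it to filter the database before ranking chunks, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the source names, the sample chunks, the `extract_sources` string match (a stand-in for the LLM extraction step), and the word-overlap score (a stand-in for embedding similarity) are all hypothetical and chosen only to keep the example self-contained.

```python
from dataclasses import dataclass
import re


@dataclass
class Chunk:
    text: str
    source: str  # metadata stored alongside the chunk, e.g. the publishing outlet


# Hypothetical list of sources known to the database (assumption for this sketch).
KNOWN_SOURCES = ["TechCrunch", "The Verge", "Engadget"]


def extract_sources(query: str) -> list[str]:
    """Stand-in for the LLM metadata-extraction step: a plain substring match."""
    lowered = query.lower()
    return [s for s in KNOWN_SOURCES if s.lower() in lowered]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, chunks: list[Chunk], top_k: int = 2) -> list[Chunk]:
    sources = extract_sources(query)
    # Database filtering: keep only chunks whose metadata matches the query.
    # If no metadata was extracted, fall back to the unfiltered collection.
    candidates = [c for c in chunks if c.source in sources] if sources else chunks
    query_tokens = tokenize(query)
    # Toy relevance score: word overlap, standing in for embedding similarity.
    ranked = sorted(
        candidates,
        key=lambda c: len(query_tokens & tokenize(c.text)),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical toy data, not taken from the MultiHop-RAG benchmark.
chunks = [
    Chunk("OpenAI announced a new model for developers.", "TechCrunch"),
    Chunk("OpenAI announced a new model and pricing.", "Engadget"),
    Chunk("A review of the latest smartphone camera.", "The Verge"),
    Chunk("The Verge covers the new OpenAI model launch.", "The Verge"),
]

query = "Did TechCrunch and The Verge both report on the new OpenAI model?"
results = retrieve(query, chunks, top_k=2)
for chunk in results:
    print(chunk.source, "|", chunk.text)
```

In this toy run the Engadget chunk is excluded by the filter even though its wording closely matches the query, which is the effect the filtering step is meant to have: evidence is drawn only from the sources the question actually names.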
