You Only Need One Model for Open-domain Question Answering

AI-generated keywords: Open-domain QA, Singular Model Architecture, Hard-attention Mechanisms, Pre-training Methodology, End-to-end Training

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • The paper proposes a new approach to Open-domain Question Answering (QA) using a singular model architecture instead of the traditional three-model approach.
  • The existing approach involves separate retriever, reranker, and reader models with weakly coupled parameters during training.
  • The proposed method casts the retriever and the reranker as hard-attention mechanisms applied sequentially within the transformer architecture, and feeds the resulting representations to the reader.
  • This singular model architecture progressively refines hidden representations from the retriever to the reranker to the reader, leading to better gradient flow when trained in an end-to-end manner.
  • A pre-training methodology is proposed to effectively train this architecture.
  • The authors evaluate their model on the Natural Questions and TriviaQA open datasets and report that, for a fixed parameter budget, it outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match points, respectively.
  • In summary, the contributions are: a singular model architecture for Open-domain QA that uses model capacity more efficiently; hard-attention mechanisms within the transformer that enable end-to-end training with improved gradient flow; and a pre-training methodology to train this architecture effectively.
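
The key points above describe retrieval and reranking as two hard-attention steps applied in sequence. The toy sketch below illustrates only that general idea and is not the authors' implementation: hard attention is modeled as a top-k selection over dot-product scores, and the names `hard_attention`, `select_passages`, and the "coarse"/"fine" vectors (stand-ins for earlier and later hidden representations) are hypothetical.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product used as the attention score."""
    return sum(x * y for x, y in zip(a, b))


def hard_attention(query: Vector, keys: List[Vector], k: int) -> List[int]:
    """Hard attention as top-k selection: return the indices of the k
    highest-scoring keys instead of a soft weighted average."""
    scores = [dot(query, key) for key in keys]
    ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def select_passages(
    question_coarse: Vector,
    question_fine: Vector,
    passages: List[Dict[str, Vector]],
    k_retrieve: int,
    k_rerank: int,
) -> List[int]:
    """Apply two hard-attention steps in sequence (assumed toy setup).

    Step 1 (retriever): score every passage with its coarse vector.
    Step 2 (reranker): rescore only the survivors with their fine vector.
    The returned indices are the passages a reader would then consume.
    """
    retrieved = hard_attention(
        question_coarse, [p["coarse"] for p in passages], k_retrieve
    )
    reranked = hard_attention(
        question_fine, [passages[i]["fine"] for i in retrieved], k_rerank
    )
    # Map positions within the retrieved subset back to passage indices.
    return [retrieved[j] for j in reranked]


# Hypothetical toy data: four passages with 2-d representations.
toy_passages = [
    {"coarse": [1.0, 0.0], "fine": [0.2, 0.1]},
    {"coarse": [0.9, 0.1], "fine": [0.9, 0.8]},
    {"coarse": [0.0, 1.0], "fine": [1.0, 1.0]},
    {"coarse": [0.8, 0.0], "fine": [0.5, 0.5]},
]
selected = select_passages([1.0, 0.0], [1.0, 1.0], toy_passages, 3, 2)
print(selected)  # -> [1, 3]
```

In this toy run, passage 2 has the best fine-grained score but is never seen by the reranker because the retriever step already discarded it, which shows why the two selection steps are tightly coupled.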

Authors: Haejun Lee, Akhil Kedia, Jongwon Lee, Ashwin Paranjape, Christopher D. Manning, Kyoung-Gu Woo

preprint

Abstract: Recent works for Open-domain Question Answering refer to an external knowledge base using a retriever model, optionally rerank the passages with a separate reranker model and generate an answer using another reader model. Despite performing related tasks, the models have separate parameters and are weakly-coupled during training. In this work, we propose casting the retriever and the reranker as hard-attention mechanisms applied sequentially within the transformer architecture and feeding the resulting computed representations to the reader. In this singular model architecture the hidden representations are progressively refined from the retriever to the reranker to the reader, which is a more efficient use of model capacity and also leads to better gradient flow when we train it in an end-to-end manner. We also propose a pre-training methodology to effectively train this architecture. We evaluate our model on Natural Questions and TriviaQA open datasets and for a fixed parameter budget, our model outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match scores.

Submitted to arXiv on 14 Dec. 2021

Results of the summarizing process for the arXiv paper: 2112.07381v1

The paper "You Only Need One Model for Open-domain Question Answering" proposes a singular model architecture for Open-domain Question Answering (QA) in place of the usual pipeline of up to three models. In the existing approach, a retriever model consults an external knowledge base, a separate reranker model optionally reranks the retrieved passages, and another reader model generates the answer. Although these models perform related tasks, they have separate parameters and are only weakly coupled during training.

The proposed method casts the retriever and the reranker as hard-attention mechanisms applied sequentially within the transformer architecture and feeds the resulting representations to the reader. Hidden representations are thus progressively refined from the retriever to the reranker to the reader, which uses model capacity more efficiently and leads to better gradient flow when the model is trained end to end. The authors also propose a pre-training methodology to train this architecture effectively. Evaluated on the Natural Questions and TriviaQA open datasets, the model outperforms the previous state-of-the-art model by 1.0 and 0.7 exact match points, respectively, for a fixed parameter budget.
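
The results above are reported as exact match scores. As a rough illustration of how such a metric is commonly computed, the sketch below uses a SQuAD-style answer normalization (lowercasing, removing punctuation and English articles, collapsing whitespace). This normalization is an assumption; the paper's own evaluation script may differ, and the function names are hypothetical.

```python
import re
import string
from typing import Iterable, List


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: Iterable[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: List[str], references: List[List[str]]) -> float:
    """Percentage of predictions that exactly match a gold answer."""
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


# Hypothetical toy example: one of two predictions matches.
print(em_score(["The Eiffel Tower", "Paris"], [["Eiffel Tower"], ["London"]]))  # -> 50.0
```

On this scale, a gain of 1.0 corresponds to one additional exactly matched answer per 100 questions.
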
Created on 26 Apr. 2023
