Unlimiformer: Long-Range Transformers with Unlimited Length Input

AI-generated keywords: Unlimiformer, Transformer-based models, kNN index, Long-document summarization, Machine translation

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Unlimiformer is a novel approach that addresses the input length limitation of transformer-based models
  • Transformer models have a predefined bound to their input length because they need to attend to every token in the input, which can be computationally expensive
  • Unlimiformer proposes a general approach that can wrap any existing pretrained encoder-decoder transformer and offload the attention computation across all layers to a single $k$-nearest-neighbor index
  • This index can be kept on either the GPU or CPU memory and queried in sub-linear time, allowing extremely long input sequences to be indexed without truncation at test time
  • Every attention head in every decoder layer retrieves only its top-$k$ keys from the index instead of attending to every key, so the cost of each lookup no longer depends on scoring the full input
  • The efficacy of Unlimiformer was demonstrated on several long-document and multi-document summarization benchmarks, including the BookSum dataset with inputs up to 350k tokens long
  • Unlimiformer improves pretrained models such as BART and Longformer by extending them to unlimited inputs without additional learned weights or modifying their code
  • The authors make their code and models publicly available at https://github.com/abertsch72/unlimiformer
  • This work has significant implications for natural language processing tasks that require processing of lengthy texts, such as document summarization or machine translation. By overcoming the input length limitation of transformer-based models, Unlimiformer opens up new possibilities for more accurate and efficient processing of lengthy texts.
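The top-$k$ retrieval described in the key points can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the array sizes, and the single-query, single-head setup are assumptions made for illustration, and the brute-force search below stands in for a real $k$-nearest-neighbor index, which would avoid scoring every key.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    x = x - np.max(x)
    e = np.exp(x)
    return e / e.sum()


def full_attention(query, keys, values):
    """Standard attention: one query attends to every key."""
    scores = keys @ query / np.sqrt(query.shape[0])
    return softmax(scores) @ values


def topk_attention(query, keys, values, k):
    """Attention restricted to the k highest-scoring keys.

    Assumption: the exhaustive scoring here is only a stand-in for
    a k-nearest-neighbor index that returns the top-k keys directly.
    """
    scores = keys @ query / np.sqrt(query.shape[0])
    top = np.argpartition(scores, -k)[-k:]  # indices of the k largest scores
    return softmax(scores[top]) @ values[top]


# Hypothetical sizes: a long input of 10,000 encoded tokens.
rng = np.random.default_rng(0)
n, d = 10_000, 64
keys = rng.normal(size=(n, d))
values = rng.normal(size=(n, d))
query = rng.normal(size=d)

context = topk_attention(query, keys, values, k=16)
```

With `k` equal to the number of keys, `topk_attention` reduces to `full_attention`; with a small `k`, the softmax is computed over only the retrieved keys, which is the behavior the key points describe.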

Authors: Amanda Bertsch, Uri Alon, Graham Neubig, Matthew R. Gormley

Preprint

Abstract: Transformer-based models typically have a predefined bound to their input length, because of their need to potentially attend to every token in the input. In this work, we propose Unlimiformer: a general approach that can wrap any existing pretrained encoder-decoder transformer, and offload the attention computation across all layers to a single $k$-nearest-neighbor index; this index can be kept on either the GPU or CPU memory and queried in sub-linear time. This way, we can index extremely long input sequences, while every attention head in every decoder layer retrieves its top-$k$ keys, instead of attending to every key. We demonstrate Unlimiformer's efficacy on several long-document and multi-document summarization benchmarks, showing that it can summarize even 350k token-long inputs from the BookSum dataset, without any input truncation at test time. Unlimiformer improves pretrained models such as BART and Longformer by extending them to unlimited inputs without additional learned weights and without modifying their code. We make our code and models publicly available at https://github.com/abertsch72/unlimiformer.
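The abstract says a single index serves the attention computation across all layers and heads. One way this could work, sketched below under stated assumptions, is to index only the encoder hidden states and fold each head's key projection into the query, since $(h_d W_q)(h_e W_k)^\top = (h_d W_q W_k^\top) h_e^\top$. The matrices `W_q` and `W_k`, the dimensions, and the function names are hypothetical, and bias terms are ignored; the page's metadata does not confirm that this is the exact mechanism used.

```python
import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, d_head, n_heads = 500, 32, 8, 4  # hypothetical sizes

# The only vectors that would be stored in the shared index.
encoder_states = rng.normal(size=(n_tokens, d_model))
decoder_state = rng.normal(size=d_model)

# Hypothetical per-head query and key projections (no bias terms).
W_q = rng.normal(size=(n_heads, d_model, d_head))
W_k = rng.normal(size=(n_heads, d_model, d_head))


def scores_with_per_head_keys(head):
    """Conventional route: project every encoder state into this head's key space."""
    keys = encoder_states @ W_k[head]   # (n_tokens, d_head)
    query = decoder_state @ W_q[head]   # (d_head,)
    return keys @ query


def scores_with_shared_index(head):
    """Fold the key projection into the query; search the raw encoder states."""
    folded_query = W_k[head] @ (W_q[head].T @ decoder_state)  # (d_model,)
    return encoder_states @ folded_query


for head in range(n_heads):
    a = scores_with_per_head_keys(head)
    b = scores_with_shared_index(head)
    print(head, np.allclose(a, b))
```

Because both routes give the same scores, the same set of stored vectors can answer top-$k$ queries for every head, and only the query changes from head to head.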

Submitted to arXiv on 02 May 2023

Results of the summarizing process for the arXiv paper: 2305.01625v1

Unlimiformer is a novel approach that addresses the input length limitation of transformer-based models. These models have a predefined bound to their input length because they need to attend to every token in the input, which can be computationally expensive. Unlimiformer proposes a general approach that can wrap any existing pretrained encoder-decoder transformer and offload the attention computation across all layers to a single $k$-nearest-neighbor index. This index can be kept on either the GPU or CPU memory and queried in sub-linear time, allowing extremely long input sequences to be indexed without truncation at test time. Every attention head in every decoder layer retrieves only its top-$k$ keys from the index instead of attending to every key, so the cost of each lookup no longer depends on scoring the full input. The efficacy of Unlimiformer was demonstrated on several long-document and multi-document summarization benchmarks, including the BookSum dataset with inputs up to 350k tokens long. Unlimiformer improves pretrained models such as BART and Longformer by extending them to unlimited inputs without additional learned weights or modifying their code. The authors make their code and models publicly available at https://github.com/abertsch72/unlimiformer. This work has significant implications for natural language processing tasks that require processing of lengthy texts, such as document summarization or machine translation. By overcoming the input length limitation of transformer-based models, Unlimiformer opens up new possibilities for more accurate and efficient processing of lengthy texts.

Created on 03 May 2023
