Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System

AI-generated keywords: Self-Controlled Memory, Large-scale Language Models, Memory Stream, Memory Controller, Evaluation

AI-generated Key Points

  • The authors propose the Self-Controlled Memory (SCM) system to address the limitation of Large-scale Language Models (LLMs) in processing lengthy inputs.
  • The SCM system is composed of three key modules: the language model agent, the memory stream, and the memory controller.
  • The language model agent iteratively processes ultra-long inputs and stores all historical information in the memory stream.
  • The memory controller provides the agent with both long-term memory (archived memory) and short-term memory (flash memory) so that it can generate precise and coherent responses.
  • The SCM system can be integrated with any LLM to enable it to process ultra-long texts without any modification or fine-tuning.
  • Experimental results show that the SCM system enables LLMs that are not optimized for multi-turn dialogue to achieve multi-turn dialogue capabilities comparable to ChatGPT, and to outperform ChatGPT in scenarios involving ultra-long document summarization or long-term conversations.
  • The evaluation of extremely lengthy text handling is limited by a lack of appropriate datasets for comprehensive and objective evaluation.
  • Future work will focus on releasing a comprehensive test set and its manual evaluation criteria, and on testing the system on various open-source models currently available.

Authors: Xinnian Liang, Bing Wang, Hui Huang, Shuangzhi Wu, Peihao Wu, Lu Lu, Zejun Ma, Zhoujun Li

Work in progress
License: CC BY 4.0

Abstract: Large-scale Language Models (LLMs) are constrained by their inability to process lengthy inputs. To address this limitation, we propose the Self-Controlled Memory (SCM) system to unleash infinite-length input capacity for large-scale language models. Our SCM system is composed of three key modules: the language model agent, the memory stream, and the memory controller. The language model agent iteratively processes ultra-long inputs and stores all historical information in the memory stream. The memory controller provides the agent with both long-term memory (archived memory) and short-term memory (flash memory) to generate precise and coherent responses. The controller determines which memories from archived memory should be activated and how to incorporate them into the model input. Our SCM system can be integrated with any LLMs to enable them to process ultra-long texts without any modification or fine-tuning. Experimental results show that our SCM system enables LLMs, which are not optimized for multi-turn dialogue, to achieve multi-turn dialogue capabilities that are comparable to ChatGPT, and to outperform ChatGPT in scenarios involving ultra-long document summarization or long-term conversations. Additionally, we will supply a test set, which covers common long-text input scenarios, for evaluating the abilities of LLMs in processing long documents (work in progress; https://github.com/wbbeyourself/SCM4LLMs).
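
The abstract describes a controller that keeps recent items as flash memory, decides which archived memories to activate, and folds both into the model input. The following is a minimal sketch of that idea, not the authors' implementation: the class and function names, the bag-of-words similarity, and the `flash_size`, `top_k`, and `threshold` parameters are all assumptions made for illustration.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MemoryItem:
    turn: int   # position of this item in the memory stream
    text: str   # stored observation or response


def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts (a stand-in for a real embedding model)."""
    va = Counter(re.findall(r"[a-z0-9]+", a.lower()))
    vb = Counter(re.findall(r"[a-z0-9]+", b.lower()))
    dot = sum(count * vb[word] for word, count in va.items())
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


@dataclass
class MemoryController:
    flash_size: int = 1       # assumed: number of latest items treated as flash memory
    top_k: int = 1            # assumed: maximum number of archived items to activate
    threshold: float = 0.2    # assumed: minimum similarity for activation
    stream: List[MemoryItem] = field(default_factory=list)

    def store(self, text: str) -> None:
        """Append an item to the memory stream; nothing is ever discarded."""
        self.stream.append(MemoryItem(turn=len(self.stream), text=text))

    def _split(self) -> Tuple[List[MemoryItem], List[MemoryItem]]:
        """Return (archived memory, flash memory) as two slices of the stream."""
        cut = max(len(self.stream) - self.flash_size, 0)
        return self.stream[:cut], self.stream[cut:]

    def flash_memory(self) -> List[MemoryItem]:
        return self._split()[1]

    def activate(self, query: str) -> List[MemoryItem]:
        """Pick the archived items most similar to the query, above the threshold."""
        archived, _ = self._split()
        scored = [(similarity(query, item.text), item) for item in archived]
        relevant = [pair for pair in scored if pair[0] >= self.threshold]
        relevant.sort(key=lambda pair: pair[0], reverse=True)
        return [item for _, item in relevant[: self.top_k]]

    def build_input(self, query: str) -> str:
        """Combine activated archived memory, flash memory, and the query into one prompt."""
        lines: List[str] = []
        activated = sorted(self.activate(query), key=lambda item: item.turn)
        if activated:
            lines.append("Relevant history:")
            lines.extend(f"- {item.text}" for item in activated)
        flash = self.flash_memory()
        if flash:
            lines.append("Recent turns:")
            lines.extend(f"- {item.text}" for item in flash)
        lines.append(f"User: {query}")
        return "\n".join(lines)


if __name__ == "__main__":
    controller = MemoryController()
    controller.store("My dog is named Biscuit and he loves the beach.")
    controller.store("I am planning a trip to Lisbon in May.")
    controller.store("The weather has been rainy this week.")
    print(controller.build_input("What is the name of my dog?"))
```

In this sketch the prompt stays bounded because only the activated archived items and the flash memory are included, while the full history remains in the stream. A real system would likely replace the word-count similarity with learned embeddings or with an LLM-based relevance judgment.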

Submitted to arXiv on 26 Apr. 2023


Results of the summarizing process for the arXiv paper: 2304.13343v1

In this paper, the authors propose the Self-Controlled Memory (SCM) system to address the limitation of Large-scale Language Models (LLMs) in processing lengthy inputs. The SCM system is composed of three key modules: the language model agent, the memory stream, and the memory controller. The language model agent iteratively processes ultra-long inputs and stores all historical information in the memory stream. The memory controller provides the agent with both long-term memory and short-term memory so that it can generate precise and coherent responses.

The authors report that the SCM system can be integrated with any LLM to enable it to process ultra-long texts without any modification or fine-tuning. Experimental results show that the SCM system enables LLMs to achieve multi-turn dialogue capabilities comparable to ChatGPT and to outperform ChatGPT in scenarios involving ultra-long document summarization or long-term conversations.

However, the evaluation of extremely lengthy text handling is limited by a lack of appropriate datasets for comprehensive and objective evaluation. The authors therefore aim to construct a specific test set that incorporates various key indicators essential for processing long texts in diverse settings. They also plan to assess the efficacy of their system on more open-source models that possess single-turn instruction comprehension capability.

In conclusion, this paper proposes an effective method for extending input length for LLMs without requiring any training or modification of models. Future work will focus on releasing a comprehensive test set and its manual evaluation criteria, and on testing the system on various open-source models currently available.
Created on 02 May. 2023
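
The summary says the language model agent processes ultra-long inputs iteratively and stores all historical information in the memory stream. The loop below is a rough sketch of one way such iteration could look for document summarization. It is an illustration under assumptions, not the paper's procedure: `split_into_chunks`, `process_long_document`, the prompt wording, and the word-based chunking are hypothetical, and `llm` is any callable that maps a prompt string to a response string.

```python
from typing import Callable, List, Tuple


def split_into_chunks(text: str, max_words: int) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def process_long_document(
    document: str,
    llm: Callable[[str], str],
    max_words: int = 200,
) -> Tuple[str, List[str]]:
    """Feed the document to the model one chunk at a time.

    Returns the final running summary and the memory stream, which keeps
    every intermediate summary rather than only the latest one.
    """
    memory_stream: List[str] = []
    running_summary = ""
    for chunk in split_into_chunks(document, max_words):
        # Each call sees only the summary so far plus one new chunk,
        # so the prompt length stays bounded regardless of document size.
        prompt = (
            f"Summary so far: {running_summary}\n"
            f"New text: {chunk}\n"
            "Update the summary to cover the new text."
        )
        running_summary = llm(prompt)
        memory_stream.append(running_summary)
    return running_summary, memory_stream


if __name__ == "__main__":
    # A placeholder model that only reports prompt size; swap in a real LLM call.
    def placeholder_llm(prompt: str) -> str:
        return f"(summary of a {len(prompt.split())}-word prompt)"

    text = " ".join(f"word{i}" for i in range(450))
    final_summary, stream = process_long_document(text, placeholder_llm)
    print(len(stream), final_summary)
```

Because the model is passed in as a plain callable, the same loop works with any LLM that follows single-turn instructions, which matches the summary's claim that no modification or fine-tuning of the model is needed.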


Disclaimer: The AI-based summarization tool and virtual assistant provided on this website may not always provide accurate and complete summaries or responses. We encourage you to carefully review and evaluate the generated content to ensure its quality and relevance to your needs.