Reflexion: an autonomous agent with dynamic memory and self-reflection
AI-generated Key Points
- Recent advancements in large language model (LLM) agents have shown remarkable performance across various benchmarks.
- A team of researchers proposed Reflexion to address the limitation of LLM agents lacking certain qualities inherent to human decision-making processes, such as the ability to learn from mistakes through self-reflection.
- Reflexion endows an agent with dynamic memory and self-reflection capabilities to enhance its existing reasoning trace and task-specific action choice abilities.
- To achieve full automation, the team introduced a heuristic that lets the agent pinpoint hallucination instances, avoid repeating action sequences, and, in some environments, construct an internal memory map of the environment.
- The team evaluated their approach by assessing the agent's ability to complete decision-making tasks in AlfWorld environments and knowledge-intensive search-based question-and-answer tasks in HotPotQA environments. They observed success rates of 97% and 51%, respectively.
- Building on ReAct, the agent solved 130 of the 134 AlfWorld tasks (97%) within 12 trials, failing only four.
- In HotPotQA environments equipped with a Wikipedia search engine, the agent had to search across multiple documents and then give an answer that was scored by exact match (EM).
- Reflexion demonstrates the emergent property of self-reflection in an agent's decision-making process, which could lead to more efficient problem solving through trial and error.
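The key points mention a heuristic for spotting hallucination and repeated actions but do not spell it out. The Python sketch below shows one plausible form, assuming the agent is treated as stuck when the same action keeps producing the same observation, or when an episode runs too long. The function name `should_reflect` and the thresholds `max_repeats` and `max_steps` are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# One step of an episode: (action taken, observation returned by the environment).
Step = Tuple[str, str]


def should_reflect(
    history: List[Step],
    max_repeats: int = 3,   # hypothetical threshold, not from the paper
    max_steps: int = 30,    # hypothetical threshold, not from the paper
) -> bool:
    """Return True when the trajectory looks stuck or inefficient."""
    # An overly long episode suggests inefficient planning.
    if len(history) > max_steps:
        return True
    if not history:
        return False

    # Count how many times the latest (action, observation) pair
    # repeats consecutively at the end of the trajectory.
    last = history[-1]
    repeats = 0
    for step in reversed(history):
        if step != last:
            break
        repeats += 1

    # The same action yielding the same result again and again suggests
    # the agent believes something about the environment that is not true.
    return repeats >= max_repeats


stuck = [("go to desk 1", "Nothing happens.")] * 3
print(should_reflect(stuck))
```

A signal like this could be used to end a failing trial early and trigger a self-reflection step, instead of waiting for a human to flag the failure.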
Authors: Noah Shinn, Beck Labash, Ashwin Gopinath
Abstract: Recent advancements in decision-making large language model (LLM) agents have demonstrated impressive performance across various benchmarks. However, these state-of-the-art approaches typically necessitate internal model fine-tuning, external model fine-tuning, or policy optimization over a defined state space. Implementing these methods can prove challenging due to the scarcity of high-quality training data or the lack of well-defined state space. Moreover, these agents do not possess certain qualities inherent to human decision-making processes, specifically the ability to learn from mistakes. Self-reflection allows humans to efficiently solve novel problems through a process of trial and error. Building on recent research, we propose Reflexion, an approach that endows an agent with dynamic memory and self-reflection capabilities to enhance its existing reasoning trace and task-specific action choice abilities. To achieve full automation, we introduce a straightforward yet effective heuristic that enables the agent to pinpoint hallucination instances, avoid repetition in action sequences, and, in some environments, construct an internal memory map of the given environment. To assess our approach, we evaluate the agent's ability to complete decision-making tasks in AlfWorld environments and knowledge-intensive, search-based question-and-answer tasks in HotPotQA environments. We observe success rates of 97% and 51%, respectively, and provide a discussion on the emergent property of self-reflection.
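The trial-and-error process described in the abstract can be pictured as a loop in which each failed trial adds a short reflection to a memory that the next trial can read. The following Python sketch illustrates that idea under stated assumptions: `attempt` and `reflect` are hypothetical stand-ins for the LLM-driven agent and its self-reflection step, and `memory_size` is an illustrative cap. Only the 12-trial budget comes from the reported AlfWorld setup.

```python
from typing import Callable, List, Tuple


def run_with_reflection(
    attempt: Callable[[List[str]], Tuple[bool, str]],
    reflect: Callable[[str], str],
    max_trials: int = 12,   # trial budget reported for AlfWorld
    memory_size: int = 3,   # hypothetical cap on stored reflections
) -> Tuple[bool, int, List[str]]:
    """Retry a task, carrying reflections on past failures between trials."""
    memory: List[str] = []
    for trial in range(1, max_trials + 1):
        # The agent sees its earlier reflections when attempting the task.
        success, trace = attempt(memory)
        if success:
            return True, trial, memory
        # On failure, summarize what went wrong and keep the newest notes.
        memory.append(reflect(trace))
        memory = memory[-memory_size:]
    return False, max_trials, memory


# Toy stand-ins: the task succeeds only once memory mentions the drawer.
def toy_attempt(memory: List[str]) -> Tuple[bool, str]:
    if any("drawer" in note for note in memory):
        return True, "looked in drawer, found key"
    return False, "looked on desk, found nothing"


def toy_reflect(trace: str) -> str:
    return "The desk was empty; try the drawer next."


print(run_with_reflection(toy_attempt, toy_reflect))
```

In the toy run, the first trial fails, the reflection is stored, and the second trial succeeds because the agent reads that note. No model weights are updated; the only thing that changes between trials is the text held in memory.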