Agent Workflow Memory

AI-generated keywords: Language Model-based Agents, Agent Workflow Memory (AWM), Sub-routine-based Induction, Web Navigation Benchmarks, Reusable Workflows

AI-generated Key Points

Study focuses on enhancing performance of language model-based agents in real-world tasks like web navigation
Challenge is dealing with long-horizon tasks involving complex action trajectories
Humans efficiently solve tasks by learning reusable task workflows from past experiences
Researchers introduce Agent Workflow Memory (AWM) to bridge gap and help agents benefit from similar process
AWM involves inducing commonly reused routines or workflows for future task-solving assistance
Abstract, sub-routine-based induction methods using Language Models (LMs) compared to rule-based methods without context and sub-routine abstraction
LM-based workflow induction more efficient by using fewer steps and preventing unnecessary actions, improving task-solving efficiency
AWM tested on Mind2Web and WebArena benchmarks, significantly enhancing baseline results with relative success rate improvements of 24.6% and 51.1%
Online AWM demonstrates robust generalization capabilities across different evaluations, outperforming baselines by up to 14.0 absolute points as train-test task distribution gaps widen
Importance of abstract, reusable workflows in improving agent performance on complex tasks emphasized


Authors: Zora Zhiruo Wang, Jiayuan Mao, Daniel Fried, Graham Neubig

arXiv: 2409.07429v1 (cs.CL)

License: CC BY-SA 4.0

Abstract: Despite the potential of language model-based agents to solve real-world tasks such as web navigation, current methods still struggle with long-horizon tasks with complex action trajectories. In contrast, humans can flexibly solve complex tasks by learning reusable task workflows from past experiences and using them to guide future actions. To build agents that can similarly benefit from this process, we introduce Agent Workflow Memory (AWM), a method for inducing commonly reused routines, i.e., workflows, and selectively providing workflows to the agent to guide subsequent generations. AWM flexibly applies to both offline and online scenarios, where agents induce workflows from training examples beforehand or from test queries on the fly. We experiment on two major web navigation benchmarks -- Mind2Web and WebArena -- that collectively cover 1000+ tasks from 200+ domains across travel, shopping, and social media, among others. AWM substantially improves the baseline results by 24.6% and 51.1% relative success rate on Mind2Web and WebArena while reducing the number of steps taken to solve WebArena tasks successfully. Furthermore, online AWM robustly generalizes in cross-task, website, and domain evaluations, surpassing baselines from 8.9 to 14.0 absolute points as train-test task distribution gaps widen.

Submitted to arXiv on 11 Sep. 2024


Results of the summarizing process for the arXiv paper: 2409.07429v1


This study aims to improve language model-based agents on real-world tasks such as web navigation. Current methods struggle with long-horizon tasks that involve complex action trajectories. Humans, by contrast, solve such tasks flexibly by learning reusable task workflows from past experience and using them to guide future actions. To give agents a similar capability, the researchers introduce Agent Workflow Memory (AWM), a method that induces commonly reused routines (workflows) and selectively provides them to the agent when it solves later tasks. AWM applies both offline, where workflows are induced from training examples beforehand, and online, where they are induced from test queries on the fly.

The study also compares abstract, sub-routine-based workflow induction using Language Models (LMs) with rule-based induction that lacks context and sub-routine abstraction. The two approaches perform comparably in success rate, but the LM-based method is more efficient: it uses fewer steps, because its finer-grained workflows keep agents from following the unnecessary steps present in rule-induced workflows.

AWM is evaluated on two major web navigation benchmarks, Mind2Web and WebArena, which together cover more than 1000 tasks from more than 200 domains, including travel, shopping, and social media. AWM improves the baseline results by 24.6% and 51.1% relative success rate on Mind2Web and WebArena respectively, while reducing the number of steps taken to solve WebArena tasks successfully. Online AWM also generalizes robustly in cross-task, cross-website, and cross-domain evaluations, surpassing baselines by 8.9 to 14.0 absolute points as train-test task distribution gaps widen.

Overall, the study highlights the importance of abstract, reusable workflows for agent performance on complex tasks. By inducing and applying workflows with AWM, agents can improve their problem-solving ability and adaptability over time in dynamic environments.
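The online setting described above (solve a query, induce a workflow from the result, reuse it on later queries) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: `Workflow`, `WorkflowMemory`, `run_online`, `toy_solve`, and `toy_induce` are hypothetical names, the word-overlap retrieval is a stand-in for whatever selection the method actually uses, and inducing only from successful attempts is an assumption made for this sketch. In a real system, `solve` and `induce` would call a language model.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    name: str          # short description of what the workflow does
    steps: list[str]   # abstracted action sequence with placeholders


@dataclass
class WorkflowMemory:
    workflows: list[Workflow] = field(default_factory=list)

    def add(self, workflow: Workflow) -> None:
        # Skip workflows whose step sequence is already stored.
        if all(w.steps != workflow.steps for w in self.workflows):
            self.workflows.append(workflow)

    def retrieve(self, query: str, k: int = 2) -> list[Workflow]:
        # Toy relevance score (an assumption): words shared by query and name.
        words = set(query.lower().split())
        scored = [(len(words & set(w.name.lower().split())), w)
                  for w in self.workflows]
        scored = [pair for pair in scored if pair[0] > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [w for _, w in scored[:k]]


def run_online(queries, solve, induce, memory):
    """Process queries as a stream, growing the memory on the fly."""
    results = []
    for query in queries:
        hints = memory.retrieve(query)              # selectively provide workflows
        trajectory, success = solve(query, hints)   # agent attempts the task
        if success:                                 # assumption: induce from successes only
            memory.add(induce(query, trajectory))
        results.append(success)
    return results


# Hypothetical stand-ins for the LM-backed agent and workflow inducer.
def toy_solve(query, hints):
    return [f"search('{query}')", "click('first result')"], True


def toy_induce(query, trajectory):
    # Abstract the example-specific value into a placeholder.
    steps = [step.replace(query, "{query}") for step in trajectory]
    return Workflow(name=query, steps=steps)


if __name__ == "__main__":
    memory = WorkflowMemory()
    run_online(["buy running shoes", "buy hiking boots"],
               toy_solve, toy_induce, memory)
    for wf in memory.workflows:
        print(wf.name, wf.steps)
```

Replacing the concrete query with a `{query}` placeholder is what makes the stored routine reusable across tasks; in this toy run the second query induces the same abstracted steps, so the memory keeps a single workflow.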


Summary

- Researchers are trying to make computer programs that understand and do things on the internet better.
- It's hard because the tasks they want these programs to do are complicated and involve many steps.
- People are good at these tasks because they learn how to do them efficiently from past experiences.
- The researchers have created a memory system for the programs to remember and reuse common ways of doing tasks.
- This memory system helps the programs work faster and better at solving problems online.

Definitions

- Language model-based agents: Computer programs that use language models to understand and interact with information on the internet.
- Agent Workflow Memory (AWM): A memory system designed to help computer programs remember and reuse common task-solving methods.
- Induction methods: Techniques used to teach or guide computer programs in learning new ways of doing tasks efficiently.
- Task-solving efficiency: How well a computer program can complete tasks accurately and quickly.

Introduction: The field of artificial intelligence (AI) has advanced significantly in recent years, with language model-based agents at the forefront. These agents have shown great potential in solving real-world tasks such as web navigation. However, one major challenge remains: long-horizon tasks that involve complex action trajectories. Humans solve such intricate tasks efficiently by learning reusable task workflows from past experiences and using them to guide future actions. To bridge this gap and let agents benefit from a similar process, the researchers introduce Agent Workflow Memory (AWM). The method induces commonly reused routines, or workflows, and selectively provides them to the agent to assist with future tasks. This blog article looks at the details of the paper and at how AWM can improve the performance of language model-based agents on complex tasks.

Methodology: The study compares abstract, sub-routine-based induction methods using Language Models (LMs) with rule-based methods without context and sub-routine abstraction. The goal is to see which method performs better in terms of success rate and efficiency when applied to long-horizon tasks.

Results: Rule-based and LM-based workflow induction perform comparably in terms of success rate, but the LM-based method is more efficient because it uses fewer steps. The finer-grained workflows produced by LM-based induction keep agents from following unnecessary steps present in rule-induced workflows. AWM was also tested on two major web navigation benchmarks, Mind2Web and WebArena, which cover a wide range of tasks across domains such as travel, shopping, and social media. AWM improved baseline results on both benchmarks, with relative success rate improvements of 24.6% on Mind2Web and 51.1% on WebArena.

Generalization: One of the key strengths of AWM is its robust generalization. The study evaluated online AWM in cross-task, cross-website, and cross-domain settings and found that it surpassed baselines by 8.9 to 14.0 absolute points as the gap between training and test task distributions widened. This highlights the adaptability of AWM in dynamic environments.

Conclusion: The paper concludes that abstract, reusable workflows are crucial to improving agent performance on complex tasks. By using AWM to induce and apply workflows, agents can improve their problem-solving abilities and adaptability over time. The study also underlines the value of incorporating human-like learning processes into language model-based agents: Agent Workflow Memory shows promising gains in both efficiency and success rate, especially on long-horizon tasks with complex action trajectories.
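The figures above mix two kinds of improvement: relative success rate gains (the 24.6% and 51.1% figures) and absolute point gains (the 14.0-point figure). The short sketch below shows how the two are computed. The function names are hypothetical, and the baseline and improved success rates in the usage lines are made-up illustrative numbers, not results from the paper.

```python
def absolute_gain(baseline: float, improved: float) -> float:
    """Difference in success rate, in percentage points."""
    return improved - baseline


def relative_gain(baseline: float, improved: float) -> float:
    """Improvement expressed as a percentage of the baseline success rate."""
    if baseline <= 0:
        raise ValueError("baseline success rate must be positive")
    return (improved - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Illustrative numbers only: a baseline at 20.0% success rising to 25.0%.
    baseline, improved = 20.0, 25.0
    print(absolute_gain(baseline, improved))  # 5.0 points
    print(relative_gain(baseline, improved))  # 25.0 percent
```

The same absolute gain yields a larger relative gain when the baseline is low, so a relative improvement cannot be compared directly with an absolute one.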

Created on 12 Oct. 2025

