CREATOR: Disentangling Abstract and Concrete Reasonings of Large Language Models through Tool Creation

AI-generated keywords: CREATOR, LLM, Tool Creation, Knowledge Transfer, AI

AI-generated Key Points

The license of the paper does not allow us to build upon its content and the key points are generated using the paper metadata rather than the full article.

  • Large Language Models (LLMs) have made advancements in utilizing external APIs for various tasks
  • LLMs' tool use is limited by the availability of suitable APIs and the instability of implicit reasoning
  • CREATOR framework empowers LLMs to create their own tools through documentation and code realization
  • CREATOR separates the LLM's ability into abstract tool creation and concrete decision execution phases
  • CREATOR improves the performance of LLMs by separating these phases
  • Experiments on MATH and TabMWP benchmarks show that CREATOR outperforms existing baselines
  • A new dataset called Creation Challenge highlights the necessity and benefits of LLMs' tool creation ability
  • Leveraging LLMs as tool creators facilitates knowledge transfer
  • LLMs exhibit varying levels of tool creation abilities, enabling them to tackle diverse situations
  • The study represents a promising avenue for maximizing the potential of LLMs and advancing toward intelligent AI systems

Authors: Cheng Qian, Chi Han, Yi R. Fung, Yujia Qin, Zhiyuan Liu, Heng Ji

Abstract: Large Language Models (LLMs) have demonstrated significant progress in utilizing external APIs as tools for various tasks. However, their tool-using ability is limited by the availability of suitable APIs and the instability of implicit reasoning, particularly when simultaneously engaging in reasoning about plans and actual calculations. To address these limitations, we propose CREATOR, a novel framework that empowers LLMs to create their own tools through documentation and code realization. CREATOR disentangles the LLM's ability into two distinct phases: abstract tool creation and concrete decision execution, which results in improved LLM performance. We evaluate CREATOR on two established benchmarks: MATH, which consists of challenging math competition problems, and TabMWP, which includes diverse tabular contents for problem-solving. Remarkably, CREATOR significantly outperforms existing chain-of-thought (CoT), program-of-thought (PoT), and tool-using baselines on these two benchmarks. Additionally, we present a new dataset, Creation Challenge, comprising 2K diverse questions, to highlight the necessity and benefits of LLMs' tool creation ability in effectively addressing these problems. Furthermore, our research reveals that leveraging LLMs as tool creators facilitates knowledge transfer, and LLMs exhibit varying levels of tool creation abilities, enabling them to flexibly tackle diverse situations. Our study represents a promising avenue for maximizing the potential of LLMs and advancing toward truly intelligent and adaptable AI systems.
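
The two phases named in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation, which this page (built from metadata only) does not show. Both phases are hard-coded strings standing in for model output, and every name here (`TOOL_SOURCE`, `DECISION_SOURCE`, `solve_linear`, `run_two_phases`) is hypothetical. The point is only that the tool carries the general, documented logic, while the decision step supplies the numbers from one specific question.

```python
# Hypothetical sketch: these strings stand in for LLM output; no model is called.

# Phase 1 (abstract tool creation): documentation plus code realization,
# written without any question-specific values.
TOOL_SOURCE = '''
def solve_linear(a, b, c):
    """Return x such that a * x + b == c (requires a != 0)."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return (c - b) / a
'''

# Phase 2 (concrete decision execution): bind one question's values to the tool.
DECISION_SOURCE = "answer = solve_linear(a=3, b=4, c=19)"


def run_two_phases(tool_source: str, decision_source: str) -> float:
    """Define the created tool, then run the decision code that calls it."""
    namespace: dict = {}
    exec(tool_source, namespace)      # makes solve_linear available
    exec(decision_source, namespace)  # applies it to concrete values
    return namespace["answer"]


if __name__ == "__main__":
    print(run_two_phases(TOOL_SOURCE, DECISION_SOURCE))
    # The same abstract tool can be reused for a different concrete question.
    print(run_two_phases(TOOL_SOURCE, "answer = solve_linear(a=2, b=-1, c=7)"))
```

Because the tool contains no question-specific numbers, a mistake in the plan (the tool) and a mistake in the values (the decision) can be inspected separately, which is one plausible reading of why the separation might help.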

Submitted to arXiv on 23 May 2023

Results of the summarizing process for the arXiv paper: 2305.14318v1

This paper's license does not allow us to build upon its content, so the summary below was generated from the paper's metadata rather than the full article.

In recent years, Large Language Models (LLMs) have made significant progress in using external APIs as tools for various tasks. However, their tool use is limited by the availability of suitable APIs and by the instability of implicit reasoning, particularly when they must reason about plans and carry out calculations at the same time. To overcome these limitations, the authors propose CREATOR, a framework that empowers LLMs to create their own tools through documentation and code realization. CREATOR disentangles the LLM's ability into two distinct phases, abstract tool creation and concrete decision execution, and this separation improves LLM performance.

The authors evaluate CREATOR on two established benchmarks: MATH, which consists of challenging math competition problems, and TabMWP, which includes diverse tabular contents for problem-solving. CREATOR significantly outperforms existing chain-of-thought (CoT), program-of-thought (PoT), and tool-using baselines on both benchmarks. The authors also introduce a new dataset, Creation Challenge, comprising 2K diverse questions, which highlights the necessity and benefits of LLMs' tool creation ability.

The research further reveals that leveraging LLMs as tool creators facilitates knowledge transfer, and that LLMs exhibit varying levels of tool creation ability, enabling them to flexibly tackle diverse situations. Overall, the study represents a promising avenue for maximizing the potential of LLMs and advancing toward truly intelligent and adaptable AI systems.
Created on 09 Aug. 2023
