Evolving Deeper LLM Thinking

AI-generated keywords: StegPoet, Mind Evolution, Large Language Models, Refinement through Critical Conversation, Natural language planning

AI-generated Key Points

  • StegPoet is a new benchmark problem involving encoding hidden messages in essays, stories, or poems
  • Implementation of a hidden message detector makes solving this steganography challenge achievable
  • Gemini 1.5 Pro using the Mind Evolution approach achieves an 87% success rate in this task
  • The study explores combining Large Language Models (LLMs) with evolutionary search for optimization tasks in natural language spaces without extensive formalization
  • Comparison to other methods shows superior performance on benchmarks like TravelPlanner with Gemini 1.5 Flash achieving over 95% success rate
  • Introduction of Refinement through Critical Conversation (RCC) for enhancing critical thinking abilities of LLMs and improving solution quality based on feedback
  • Study demonstrates effectiveness of Mind Evolution in scaling inference time compute in Large Language Models across various tasks including natural language planning and steganography

Authors: Kuang-Huei Lee, Ian Fischer, Yueh-Hua Wu, Dave Marwood, Shumeet Baluja, Dale Schuurmans, Xinyun Chen

License: CC BY 4.0

Abstract: We explore an evolutionary search strategy for scaling inference time compute in Large Language Models. The proposed approach, Mind Evolution, uses a language model to generate, recombine and refine candidate responses. The proposed approach avoids the need to formalize the underlying inference problem whenever a solution evaluator is available. Controlling for inference cost, we find that Mind Evolution significantly outperforms other inference strategies such as Best-of-N and Sequential Revision in natural language planning tasks. In the TravelPlanner and Natural Plan benchmarks, Mind Evolution solves more than 98% of the problem instances using Gemini 1.5 Pro without the use of a formal solver.
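
The generate, recombine and refine loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function `evolve`, its parameters, and the toy itinerary task are all hypothetical, plain Python callables stand in for the LLM calls, and simple truncation selection with elitism is swapped in for whatever selection scheme the paper actually uses.

```python
import random
from typing import Callable, List, Optional, Tuple

Evaluator = Callable[[str], Tuple[float, str]]  # returns (score in [0, 1], textual feedback)


def evolve(
    generate: Callable[[], str],
    recombine: Callable[[str, str], str],
    refine: Callable[[str, str], str],
    evaluate: Evaluator,
    population_size: int = 8,
    generations: int = 10,
    rng: Optional[random.Random] = None,
) -> str:
    """Hypothetical evolutionary search over natural-language candidates."""
    rng = rng or random.Random(0)
    population: List[str] = [generate() for _ in range(population_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda c: evaluate(c)[0], reverse=True)
        if evaluate(ranked[0])[0] >= 1.0:  # evaluator accepts the best candidate
            return ranked[0]
        # Assumption: keep the top half as parents (truncation selection).
        parents = ranked[: max(2, population_size // 2)]
        children: List[str] = []
        while len(children) < population_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = recombine(a, b)
            _, feedback = evaluate(child)
            children.append(refine(child, feedback))  # refine using evaluator feedback
        population = parents + children
    return max(population, key=lambda c: evaluate(c)[0])


# Toy task (an assumption for illustration): visit every required city once.
REQUIRED = ["Rome", "Paris", "Berlin", "Madrid"]


def toy_evaluate(plan: str) -> Tuple[float, str]:
    visited = plan.split(",")
    missing = [city for city in REQUIRED if city not in visited]
    score = 1.0 - len(missing) / len(REQUIRED)
    feedback = ("missing: " + ",".join(missing)) if missing else "ok"
    return score, feedback


def toy_recombine(a: str, b: str) -> str:
    left, right = a.split(","), b.split(",")
    half = len(left) // 2
    return ",".join(left[:half] + right[half:])  # first half of a, second half of b


def toy_refine(plan: str, feedback: str) -> str:
    if not feedback.startswith("missing: "):
        return plan
    wanted = feedback[len("missing: "):].split(",")[0]
    visited = plan.split(",")
    for i, city in enumerate(visited):
        if visited.count(city) > 1:  # overwrite the first duplicated stop
            visited[i] = wanted
            break
    return ",".join(visited)


def make_toy_generate(rng: random.Random) -> Callable[[], str]:
    return lambda: ",".join(rng.choices(REQUIRED, k=len(REQUIRED)))
```

In a real setting, `generate`, `recombine` and `refine` would each be prompts to a language model, and `evaluate` would be the task's solution evaluator. The sketch only needs that evaluator, not a formal model of the task, which is the property the abstract emphasizes.
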

Submitted to arXiv on 17 Jan. 2025

Results of the summarizing process for the arXiv paper: 2501.09891v1

In this study, we present StegPoet, a new benchmark problem that involves encoding hidden messages in generated essays, stories, or poems. This form of steganography is difficult to formalize and to solve, but it becomes tractable once a hidden message detector is implemented to guide the search programmatically. Our goal is to show that evolutionary search applies beyond natural language domains that are easy to formalize. Using the Mind Evolution approach, Gemini 1.5 Pro achieves a success rate of 87% on this task.

We also discuss related work that combines Large Language Models (LLMs) with evolutionary search for numerical and combinatorial optimization tasks. While previous studies have focused on evolving solutions in formal spaces, our work evolves solutions in natural language spaces without extensive task formalization, which removes the significant effort and expert knowledge otherwise required for each task instance. We further compare our approach to other works that apply evolutionary search to prompt optimization and problem solving. In contrast to methods that evolve new LLM agents or perform evolutionary search directly on plans, our approach achieves a success rate of over 95% on benchmarks like TravelPlanner with Gemini 1.5 Flash.

We also introduce Refinement through Critical Conversation (RCC), an iterative process in which an initial solution is evaluated and critiqued by a critic character and then refined by an author character. This structured, prompt-driven conversation aims to strengthen the critical thinking of LLMs and to improve solution quality based on the feedback received.
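
The critic and author turns of RCC can be pictured with a short loop. This is a hedged sketch under stated assumptions: the function name `refine_through_critical_conversation`, the `max_turns` parameter, and the toy critic, author and evaluator are all hypothetical, and ordinary Python functions stand in for the two prompted LLM roles.

```python
from typing import Callable, List, Tuple


def refine_through_critical_conversation(
    initial: str,
    critic: Callable[[str, str], str],
    author: Callable[[str, str], str],
    evaluate: Callable[[str], Tuple[float, str]],
    max_turns: int = 3,
) -> Tuple[str, List[Tuple[str, str]]]:
    """Alternate critic and author turns until the evaluator is satisfied."""
    solution = initial
    transcript: List[Tuple[str, str]] = []  # (critique, revised solution) per turn
    for _ in range(max_turns):
        score, evaluator_feedback = evaluate(solution)
        if score >= 1.0:
            break
        critique = critic(solution, evaluator_feedback)  # critic analyzes the solution
        solution = author(solution, critique)            # author proposes a revision
        transcript.append((critique, solution))
    return solution, transcript


# Toy stand-ins (assumptions for illustration only).
def toy_evaluate(text: str) -> Tuple[float, str]:
    issues = []
    if not text[:1].isupper():
        issues.append("capitalize")
    if not text.endswith("."):
        issues.append("period")
    return 1.0 - len(issues) / 2, ";".join(issues)


def toy_critic(solution: str, feedback: str) -> str:
    # Focus the critique on the first issue the evaluator reported.
    return "Critic: please address '" + feedback.split(";")[0] + "' first."


def toy_author(solution: str, critique: str) -> str:
    if "capitalize" in critique:
        return solution[0].upper() + solution[1:]
    if "period" in critique:
        return solution + "."
    return solution
```

Keeping the critic and author as separate callables mirrors the two characters described above: the critic turns evaluator feedback into targeted instructions, and the author only ever sees the current solution plus that critique.
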
Overall, our study showcases the effectiveness of Mind Evolution in scaling inference time compute in Large Language Models across various tasks including natural language planning and steganography. The results highlight the potential of evolutionary search strategies in optimizing plans and generating high-quality responses without the need for extensive formalization of underlying problems.
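
To make the role of a hidden message detector in StegPoet concrete, here is one way such a check could look. The encoding scheme is an assumption made for this sketch and is not confirmed by the summary: the hidden message is taken to be a sequence of integers, a substitution cipher maps code words to those integers, and the text succeeds if the code words appear in the right order. The names `detect_hidden_message` and `score_encoding` are hypothetical.

```python
import re
from typing import Dict, List, Tuple


def detect_hidden_message(text: str, cipher: Dict[str, int]) -> List[int]:
    """Decode by reading, in order, every word of the text that is a code word."""
    words = re.findall(r"[a-z']+", text.lower())
    return [cipher[word] for word in words if word in cipher]


def score_encoding(text: str, cipher: Dict[str, int], message: List[int]) -> Tuple[float, str]:
    """Return (score in [0, 1], feedback) for use as a search evaluator."""
    decoded = detect_hidden_message(text, cipher)
    matched = 0
    for got, want in zip(decoded, message):
        if got != want:
            break
        matched += 1  # length of the correctly decoded prefix
    # Assumption: extra or missing code words lower the score.
    score = matched / max(len(message), len(decoded), 1)
    feedback = f"decoded {decoded}, expected {message}"
    return score, feedback
```

A function shaped like `score_encoding` is what lets a search procedure rank candidate poems or stories programmatically, even though "write a natural text that hides this message" has no obvious formal model.
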
Created on 23 Jun. 2025
