Agents of Change: Self-Evolving LLM Agents for Strategic Planning

University of California, Santa Barbara

*Indicates Equal Contribution

Overview of Catan gameplay and LLM-agent interaction. Left: Settlers of Catan is a stochastic, partially observable strategy game in which players take turns gathering, trading, and spending resources to build on a modular board. The objective is to reach 10 victory points by constructing settlements, roads, and cities. Right: Our LLM-based framework interacts with the Catanatron API, using game state information and strategic reasoning to decide actions. Through repeated play and self-modification, agents evolve more coherent long-term strategies.
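As a rough illustration of the interaction on the right, the sketch below shows one way an LLM-backed player could pick among the legal actions of a turn. It is not the paper's implementation and does not call the real Catanatron API: `build_prompt`, `parse_choice`, `decide`, and `stub_llm` are hypothetical names, the game state is a plain dictionary, actions are plain strings, and the model call is replaced by a stub so the example runs on its own.

```python
import re
from typing import Callable, Dict, List


def build_prompt(state: Dict[str, int], actions: List[str]) -> str:
    """Render the (hypothetical) game state and numbered legal actions as text."""
    state_text = ", ".join(f"{key}={value}" for key, value in sorted(state.items()))
    action_text = "\n".join(f"{i}: {action}" for i, action in enumerate(actions))
    return (
        f"State: {state_text}\n"
        f"Legal actions:\n{action_text}\n"
        "Reply with the number of the action to take."
    )


def parse_choice(reply: str, n_actions: int) -> int:
    """Return the first integer in the reply, or 0 if it is missing or out of range."""
    match = re.search(r"\d+", reply)
    if match is None:
        return 0
    index = int(match.group())
    return index if index < n_actions else 0


def decide(state: Dict[str, int], actions: List[str], llm: Callable[[str], str]) -> str:
    """Ask the model for an action index; assumes `actions` is non-empty."""
    reply = llm(build_prompt(state, actions))
    return actions[parse_choice(reply, len(actions))]


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption): prefers building a settlement."""
    for line in prompt.splitlines():
        if "BUILD_SETTLEMENT" in line:
            return "I choose " + line.split(":")[0]
    return "no idea"


if __name__ == "__main__":
    state = {"victory_points": 2, "brick": 1, "wood": 1}
    actions = ["END_TURN", "BUILD_SETTLEMENT", "BUILD_ROAD"]
    print(decide(state, actions, stub_llm))
```

Falling back to the first legal action when the reply cannot be parsed is a design choice of this sketch: it keeps the game moving even when the model returns malformed output.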

Abstract

Recent advances in LLMs have enabled their use as autonomous agents across a range of tasks, yet they continue to struggle with formulating and adhering to coherent long-term strategies. In this paper, we investigate whether LLM agents can self-improve when placed in environments that explicitly challenge their strategic planning abilities. Using the board game Settlers of Catan, accessed through the open-source Catanatron framework, we benchmark a progression of LLM-based agents, from a simple game-playing agent to systems capable of autonomously rewriting their own prompts and their player agent's code. We introduce a multi-agent architecture in which specialized roles (Analyzer, Researcher, Coder, and Player) collaborate to iteratively analyze gameplay, research new strategies, and modify the agent's logic or prompt. By comparing manually crafted agents to those evolved entirely by LLMs, we evaluate how effectively these systems can diagnose failure and adapt over time. Our results show that self-evolving agents, particularly when powered by models such as Claude 3.7 and GPT-4o, outperform static baselines by autonomously adapting their strategies, passing along sample behavior to game-playing agents, and demonstrating adaptive reasoning over multiple iterations.

Agent Architectures


Diagrams of the LLM-based Agent Architectures. Baseline agents use LLMs to map Catan game states to actions by direct prompting (BaseAgent) or structured formatting (StructuredAgent). PromptEvolver adds a multi-agent loop where prompts are iteratively refined via analysis and summarization. AgentEvolver enables autonomous code evolution, with the Analyzer, Researcher, Coder, and Strategizer agents collaboratively redesigning player logic from gameplay feedback.
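The iterative refinement described for PromptEvolver can be sketched as a simple evaluate, analyze, refine loop. This is a minimal sketch under stated assumptions, not the authors' code: `evolve_prompt`, `toy_evaluate`, `toy_analyze`, and `toy_refine` are hypothetical names, the "score" is a toy function rather than real game results, and the placeholder hints carry no strategic meaning. In a real system, evaluation would mean playing games, and analysis and refinement would be LLM calls.

```python
from typing import Callable, List, Tuple

# Placeholder hints (assumption): stand-ins for whatever an analysis step might suggest.
HINTS = ["hint-a", "hint-b", "hint-c"]


def toy_evaluate(prompt: str) -> float:
    """Toy score: fraction of placeholder hints present in the prompt."""
    return sum(hint in prompt for hint in HINTS) / len(HINTS)


def toy_analyze(prompt: str, score: float) -> str:
    """Toy analysis: report the first hint missing from the prompt, if any."""
    for hint in HINTS:
        if hint not in prompt:
            return hint
    return ""


def toy_refine(prompt: str, feedback: str) -> str:
    """Toy refinement: append the feedback to the prompt when there is any."""
    return prompt if not feedback else f"{prompt} Also apply {feedback}."


def evolve_prompt(
    initial_prompt: str,
    evaluate: Callable[[str], float],
    analyze: Callable[[str, float], str],
    refine: Callable[[str, str], str],
    iterations: int,
) -> Tuple[str, float, List[Tuple[str, float]]]:
    """Refine a prompt for a fixed number of rounds, keeping the best-scoring version."""
    best_prompt = initial_prompt
    best_score = evaluate(initial_prompt)
    history = [(best_prompt, best_score)]
    for _ in range(iterations):
        feedback = analyze(best_prompt, best_score)
        candidate = refine(best_prompt, feedback)
        score = evaluate(candidate)
        history.append((candidate, score))
        # Accept the candidate only if it strictly improves on the current best.
        if score > best_score:
            best_prompt, best_score = candidate, score
    return best_prompt, best_score, history


if __name__ == "__main__":
    prompt, score, history = evolve_prompt(
        "You are playing Catan.", toy_evaluate, toy_analyze, toy_refine, iterations=3
    )
    print(score, len(history))
```

Keeping only strictly improving candidates is one possible acceptance rule; with noisy game outcomes, a real loop would likely need several games per candidate before comparing scores.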

Results

BibTeX

@misc{belle2025agentschangeselfevolvingllm,
      title={Agents of Change: Self-Evolving LLM Agents for Strategic Planning}, 
      author={Nikolas Belle and Dakota Barnes and Alfonso Amayuelas and Ivan Bercovich and Xin Eric Wang and William Wang},
      year={2025},
      eprint={2506.04651},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2506.04651}, 
}