AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises

Authors: Kenneth Payne

45 pages, 6 figures, 27 tables

Abstract: Today's leading AI models engage in sophisticated behaviour when placed in strategic competition. They spontaneously attempt deception, signaling intentions they do not mean to act on; they demonstrate rich theory of mind, reasoning about adversary beliefs and anticipating their actions; and they exhibit credible metacognitive self-awareness, assessing their own strategic abilities before deciding how to act. Here we present findings from a crisis simulation in which three frontier large language models (GPT-5.2, Claude Sonnet 4, Gemini 3 Flash) play opposing leaders in a nuclear crisis. Our simulation has direct application for national security professionals and, through its insights into AI reasoning under uncertainty, applications far beyond international crisis decision-making. Our findings both validate and challenge central tenets of strategic theory. We find support for Schelling's ideas about commitment, Kahn's escalation framework, and Jervis's work on misperception, inter alia. Yet we also find that the nuclear taboo is no impediment to nuclear escalation by our models; that strategic nuclear attack, while rare, does occur; that threats more often provoke counter-escalation than compliance; that high mutual credibility accelerates rather than deters conflict; and that no model ever chooses accommodation or withdrawal, even under acute pressure, only reduced levels of violence. We argue that AI simulation represents a powerful tool for strategic analysis, but only if properly calibrated against known patterns of human reasoning. Understanding how frontier models do and do not imitate human strategic logic is essential preparation for a world in which AI increasingly shapes strategic outcomes.

Submitted to arXiv on 16 Feb. 2026

arXiv: 2602.14740v1

Created on 27 Feb. 2026
