Adversarial Policies Beat Superhuman Go AIs

AI-generated keywords: Adversarial Policies, Superhuman Go AIs, KataGo, ICML 2023, AI

AI-generated Key Points

The license of the paper does not allow us to build upon its content, so the key points are generated from the paper metadata rather than the full article.

  • Authors present an attack on the state-of-the-art Go-playing AI system KataGo
  • The attack achieves a win rate of over 97% against KataGo running at superhuman settings
  • Adversaries do not win by playing Go well; they trick KataGo into making serious blunders
  • Attack transfers zero-shot to other superhuman Go-playing AIs
  • Attack is comprehensible enough that human experts can implement it without algorithmic assistance and consistently beat superhuman AIs
  • Core vulnerability uncovered by this attack persists even in adversarially trained KataGo agents
  • Example games are available on the authors' website (https://goattack.far.ai/)
  • Results demonstrate that even superhuman AI systems may harbor surprising failure modes

Authors: Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, Stuart Russell

Accepted to ICML 2023; see the paper for the changelog

Abstract: We attack the state-of-the-art Go-playing AI system KataGo by training adversarial policies against it, achieving a >97% win rate against KataGo running at superhuman settings. Our adversaries do not win by playing Go well. Instead, they trick KataGo into making serious blunders. Our attack transfers zero-shot to other superhuman Go-playing AIs, and is comprehensible to the extent that human experts can implement it without algorithmic assistance to consistently beat superhuman AIs. The core vulnerability uncovered by our attack persists even in KataGo agents adversarially trained to defend against our attack. Our results demonstrate that even superhuman AI systems may harbor surprising failure modes. Example games are available at https://goattack.far.ai/.

Submitted to arXiv on 01 Nov. 2022

Results of the summarizing process for the arXiv paper: 2211.00241v4

In their paper "Adversarial Policies Beat Superhuman Go AIs," Tony T. Wang, Adam Gleave, Tom Tseng, Kellin Pelrine, Nora Belrose, Joseph Miller, Michael D. Dennis, Yawen Duan, Viktor Pogrebniak, Sergey Levine, and Stuart Russell present an attack on the state-of-the-art Go-playing AI system KataGo. By training adversarial policies against KataGo, they achieve a win rate of over 97% against KataGo running at superhuman settings. The adversaries do not win by playing Go well; instead, they trick KataGo into making serious blunders. The attack transfers zero-shot to other superhuman Go-playing AIs and is comprehensible enough that human experts can implement it without algorithmic assistance to consistently beat superhuman AIs. The core vulnerability uncovered by the attack persists even in KataGo agents that have been adversarially trained to defend against it, which demonstrates that even superhuman AI systems may harbor surprising failure modes. The authors provide example games on their website (https://goattack.far.ai/). The paper was accepted to ICML 2023; see the paper for the changelog.
Created on 24 Jul. 2023
