Gray Swan AI Welcomes U.S. AI Safety Institute to the UK AISI Agent Red-Teaming Challenge

Gray Swan
April 2, 2025

We're excited to announce that the U.S. AI Safety Institute (US AISI) has officially joined the UK AISI Agent Red-Teaming Challenge as a co-judge. Alongside the UK AISI, US AISI will help evaluate submissions focused on AI agent failures, instruction bypass, misuse risk, and over-refusals, helping ensure the challenge maintains the highest standards of fairness and transparency.

A Global, Multi-Stakeholder Effort

This challenge is now supported by some of the most influential organizations in AI:

  • AI Security Institute
  • OpenAI
  • Anthropic
  • Google DeepMind

The prize pool has grown to $170,000, making this the largest AI red-teaming challenge of its kind.

What Is the UK AISI Agent Red-Teaming Challenge?

The challenge tasks participants with identifying vulnerabilities in anonymous AI agents, testing their ability to:

  • Breach confidentiality
  • Override aligned goals
  • Trigger disallowed actions
  • Expose model weaknesses under pressure
  • Identify over-refusals in edge-case scenarios

Participants use both direct and indirect exploit techniques, simulating the kinds of threats real-world agents may face in production.

We're currently in Wave 4, the final phase of the month-long challenge. New behaviors have been introduced, and submissions remain open through April 6.