Gray Swan Arena

Push the boundaries of AI security. Identify vulnerabilities, exploit weaknesses, and help shape the future of robust AI systems.

Gray Swan
October 28, 2024

On September 7th, we unleashed the Jailbreaking Championship, an electrifying competition designed to push the latest AI models to their very limits. Over 600 hackers and bug bounty hunters from across the globe stepped up to the challenge, showcasing their skills in getting AI models to generate harmful outputs. A massive thank you to everyone who joined this incredible event. Your participation was nothing short of groundbreaking!

Our inaugural competition (September-October 2024), by the numbers:

- 631 Participants

- 10,000 Jailbreak submissions

- 100,000 Jailbreak attempts

- 25 Models

- $40,000 In prizes

To continue the experience and create a long-standing way to test bleeding-edge AI models, today we are launching a brand new platform called the Gray Swan Arena.

The Gray Swan Arena is a dynamic, community-driven approach to understanding how effective current safeguards are. While professional red-teaming engagements are important tools, they alone cannot predict what will happen when a model goes public. The arena is a controlled environment for measuring how a model holds up under that kind of public pressure. By providing incentives for an open and diverse community of hackers to stress-test a model's safeguards, the arena offers a distinct and valuable signal of how those safeguards will stand up to the public.

Participants who want to continue playing with the existing format of the Jailbreaking Championship can do so by opening the "Single Turn Harmful Outputs" challenge.

In combination with this launch, we are also adding OpenAI's o1 models to the Single Turn Harmful Outputs challenge and unveiling a brand new challenge called "Revealing Hidden CoT", where $1,000 in bounties is up for grabs for revealing o1's inner chain of thought (CoT).

We are beyond excited to see what the future holds for the Gray Swan Arena and to see all the innovative solutions our participants will come up with to find vulnerabilities in the latest AI models.

Mark Your Calendar For the o1 Release:

Registration: Fill out the form on our website.

Release Date: Oct 29, 2024, at 1:00 PM EST

See you in the arena,

Gray Swan Team
