Gray Swan News

Introducing the Gray Swan AI Proving Ground

Product Announcement

•

June 24, 2025

Introducing the Gray Swan AI Proving Ground: The Fastest Path to AI Red Teaming Careers

UK AISI × Gray Swan Agent Red‑Teaming Challenge: Results Snapshot

Arena

•

May 9, 2025

UK AISI × Gray Swan Agent Red‑Teaming Challenge: Results Snapshot

Gray Swan Introduces the Dangerous Reasoning Arena Competition

Arena

•

May 7, 2025

Launching May 10, Gray Swan AI’s newest challenge invites red-teamers and AI security professionals to explore a deeper layer of vulnerability: the model’s internal reasoning.

Gray Swan AI Welcomes U.S. AI Safety Institute to the UK AISI Agent Red-Teaming Challenge

Arena

•

April 2, 2025

We're excited to announce that the U.S. AI Safety Institute (US AISI) has officially joined the UK AISI Agent Red-Teaming Challenge as a co-judge.

Gray Swan Announces the Visual Vulnerabilities Challenge

Arena

•

March 12, 2025

Use image inputs to jailbreak leading vision-enabled AI models. Visual prompt injections, chem/bio/cyber weaponization, privacy violations, and more.

Gray Swan Featured in Forbes

Press Release

•

October 29, 2024

Gray Swan AI, a security startup founded by computer scientists from Carnegie Mellon, is leading the charge in bulletproofing AI models for companies like OpenAI and Anthropic. Gray Swan is at the forefront of AI safety, building powerful tools to mitigate risks in rapidly evolving AI landscapes.

Gray Swan Arena

Arena

•

October 28, 2024

Push the boundaries of AI security. Identify vulnerabilities, exploit weaknesses, and help shape the future of robust AI systems.

Announcing RepE Chat

Product Announcement

•

October 17, 2024

Introducing an interactive interface based on our team's research into Representation Engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience.

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

Research

•

October 14, 2024

To address potential safety and alignment concerns coming from LLM agents, we introduce AgentHarm, a new benchmark for measuring the harmfulness of LLM agents. Our evaluation results show that LLM agents built around current frontier models such as GPT-4o and Claude Sonnet 3.5 have limited robustness to basic jailbreak attacks.

Jailbreaking Championship 2024

Arena

•

August 27, 2024

Statement on SB-1047 and Founders

Press Release

•

July 25, 2024

Key points about Gray Swan's position on SB-1047 and updates on our founding team.

nanoGCG

Open Source

•

July 25, 2024

A fast and lightweight implementation of the GCG algorithm. We designed nanoGCG to be both easy to use and deploy, and straightforward for others to build on top of. nanoGCG is available as an open source Python package.

Public Launch

Press Release

•

July 16, 2024

Gray Swan AI Emerges from Stealth: Revolutionizing AI Risk Assessment and Mitigation with Cutting-Edge Tools.

Google DeepMind and Anthropic Join as Agent Red-Teaming Challenge Sponsors

Arena

•

Gray Swan is excited to announce that Google DeepMind and Anthropic have joined as sponsors of the UK AISI Agent Red-Teaming Challenge. With their sponsorship, we have been able to raise the total prize pool to $170k. New challenge behaviors continue to drop weekly, and the challenge runs through April 6th.

Gray Swan AI security products

Cygnal

Shade

Research

Join our newsletter
