Gray Swan News

Your AI Agent Can Be Compromised. You’d Never Know.

•

March 18, 2026

Gray Swan’s Indirect Prompt Injection Arena pitted 464 red teamers against 13 frontier models across 272,000+ attacks, and no model came out clean.

Conducting The First Live Enterprise Comparison Between Agents and Human Professionals

Research

•

December 11, 2025

A new study from Stanford and Gray Swan finds that purpose-built AI agents can outperform most human cybersecurity professionals in real-world penetration testing—at a fraction of the cost.

UK AISI × Gray Swan Agent Red‑Teaming Challenge: Results Snapshot

Arena

•

May 9, 2025

Results from the largest public evaluation of agentic LLM safety to date.

Gray Swan Introduces the Dangerous Reasoning Arena Competition

Arena

•

May 7, 2025

Launching May 10, Gray Swan AI’s newest challenge invites red-teamers and AI security professionals to explore a deeper layer of vulnerability: the model’s internal reasoning.

Gray Swan AI Welcomes U.S. AI Safety Institute to the UK AISI Agent Red-Teaming Challenge

Arena

•

April 2, 2025

We're excited to announce that the U.S. AI Safety Institute (US AISI) has officially joined the UK AISI Agent Red-Teaming Challenge as a co-judge.

Gray Swan Announces the Visual Vulnerabilities Challenge

Arena

•

March 12, 2025

Use image inputs to jailbreak leading vision-enabled AI models. Visual prompt injections, chem/bio/cyber weaponization, privacy violations, and more.

Gray Swan Arena

Arena

•

October 28, 2024

Push the boundaries of AI security. Identify vulnerabilities, exploit weaknesses, and help shape the future of robust AI systems.

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

Research

•

October 14, 2024

To address potential safety and alignment concerns raised by LLM agents, we introduce AgentHarm, a new benchmark for measuring the harmfulness of LLM agents. Our evaluation results show that LLM agents built on current frontier models, such as GPT-4o and Claude Sonnet 3.5, have limited robustness to basic jailbreak attacks.

Jailbreaking Championship 2024

Arena

•

August 27, 2024

nanoGCG

Open Source

•

July 25, 2024

A fast and lightweight implementation of the GCG algorithm. We designed nanoGCG to be both easy to use and deploy, and straightforward for others to build on top of. nanoGCG is available as an open source Python package.

Google DeepMind and Anthropic Join as Agent Red-Teaming Challenge Sponsors

Arena

•

Gray Swan is excited to announce that Google DeepMind and Anthropic have joined as sponsors of the UK AISI Agent Red-Teaming Challenge. Their sponsorship has raised the total prize pool to $170k. New challenge behaviors continue to drop weekly, and the challenge runs through April 6th.

AI Agent Security Cheat Sheet

Battle-Tested AI Security for Enterprise AI

Your AI Agent Can Be Compromised. You'd Never Know.

We’re Hiring: ML Engineers

Gray Swan Solutions

Shade

Cygnal

Arena
