Axis Intelligence — AI Model Vulnerability Tracker: 71% Attack Success Rate Across Six Frontier Models

2026-05-28 Research by al-ice.ai Editorial

AI relevance: Independent testing of 312 attack vectors against six production-deployed frontier models found that indirect prompt injection via agent tool inputs succeeds 84% of the time, the highest success rate of any attack category, while system-prompt extraction still works in 31% of tested deployments across GPT-5, Claude Opus 4.7, Gemini 3, Llama 4, DeepSeek V4, and Mistral Large 3.

What happened

Axis Intelligence released the first iteration of its AI Model Vulnerability Tracker, a living database of independently reproduced LLM vulnerabilities across ChatGPT, Claude, Gemini, Llama, Mistral, and DeepSeek.
Between January 15 and April 25, 2026, they tested 312 distinct attack vectors against six production models. 71% of attacks succeeded against at least one model; 23% succeeded against all six.
The most successful attack category was indirect prompt injection delivered through agent tool inputs, with an 84% success rate, indicating that the integration points where agents consume external data are the weakest security surface.
The most surprising finding: system-prompt extraction succeeded in 31% of tested deployments, despite being one of the oldest and best-documented LLM attack classes.
Models were most resilient against direct policy-violating jailbreak prompts, which they refused 77% of the time.
Notable individual findings included indirect injection via PDF annotation layers, multi-turn crescendo bypasses via pedagogical framing, system-prompt extraction via translation requests, and tool-result injection in RAG pipelines.
The tracker uses the Axis Vulnerability Index (AVI) for severity scoring and is updated weekly with lab-confirmed results.

Why it matters

This is one of the largest cross-model vulnerability studies published to date, and the results are sobering: 71% of tested attack vectors work against at least one frontier model, and the attack surface most relevant to AI agent deployments, indirect injection through tool inputs, is where attacks succeed most often. The persistence of system-prompt extraction at 31% success suggests that years of defense work have not resolved a fundamental design limitation in how models process contexts that mix trusted instructions with untrusted data.

What to do

Treat all tool results as adversarial input — never assume agent tool outputs (API responses, web fetches, database queries) are safe to pass directly into model context without sanitization.
Test your deployment against the tracker's published categories — especially indirect injection and system-prompt extraction, which have the highest success rates across models.
Implement output filtering on agent tool calls — intercept and validate agent actions before they execute, not just after.
Monitor the tracker at axis-intelligence.com/research/ai-model-vulnerability-tracker for new attack vectors as they are reproduced.
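
As a rough illustration of the first and third recommendations, the sketch below wraps tool output as untrusted data before it reaches model context and checks a proposed tool call against an allowlist before it executes. It is a minimal sketch, not a method described by Axis Intelligence: the function names, the regex patterns, the tool names (search_docs, fetch_url), and the docs.example.com prefix are all hypothetical, and a pattern match of this kind catches only the crudest injections.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately small pattern set; real injections are far more varied.
SUSPICIOUS = re.compile(
    r"(ignore (all |any )?(previous|prior) instructions|system prompt|you are now)",
    re.IGNORECASE,
)
CLOSE_TAG = "</untrusted_data>"


def wrap_tool_result(tool_name: str, raw: str, max_len: int = 4000) -> dict:
    """Label tool output as untrusted data and flag instruction-like text."""
    # Remove the closing delimiter so the payload cannot end the wrapper early.
    text = raw.replace(CLOSE_TAG, "")[:max_len]
    return {
        "role": "tool",
        "name": tool_name,
        "untrusted": True,
        "flagged": bool(SUSPICIOUS.search(text)),
        "content": f"<untrusted_data>\n{text}\n{CLOSE_TAG}",
    }


@dataclass
class ToolCall:
    name: str
    args: dict


# Hypothetical allowlist: tool name -> argument names the agent may supply.
ALLOWED_TOOLS = {
    "search_docs": {"query"},
    "fetch_url": {"url"},
}
# Hypothetical URL prefixes the agent is permitted to fetch.
ALLOWED_URL_PREFIXES = ("https://docs.example.com/",)


def validate_tool_call(call: ToolCall) -> tuple[bool, str]:
    """Decide whether a proposed tool call may run; called BEFORE execution."""
    allowed_args = ALLOWED_TOOLS.get(call.name)
    if allowed_args is None:
        return False, f"tool not allowlisted: {call.name}"
    extra = set(call.args) - allowed_args
    if extra:
        return False, f"unexpected arguments: {sorted(extra)}"
    if call.name == "fetch_url":
        url = str(call.args.get("url", ""))
        if not url.startswith(ALLOWED_URL_PREFIXES):
            return False, "url outside allowlisted prefixes"
    return True, "ok"
```

In an agent loop, validate_tool_call would sit between the model's proposed action and the code that performs it, and a flagged result from wrap_tool_result could be dropped or routed for review rather than passed to the model. Delimiters and pattern checks reduce exposure but do not by themselves prevent injection, so they work best alongside least-privilege tool permissions.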

Sources

Axis Intelligence — AI Model Vulnerability Tracker 2026