NVIDIA Triton — CVE-2026-24207 Critical Auth Bypass in Inference Server
AI relevance: Triton is the default inference-serving runtime for many LLM deployments — a critical auth bypass lets unauthenticated attackers reach model-serving APIs, swap or poison models, and steal inference data.
- CVE-2026-24207 (CVSS 9.8, CWE-288) is an authentication bypass in NVIDIA Triton Inference Server that requires no credentials and no user interaction, and is exploitable over the network with low attack complexity.
- Successful exploitation can lead to unauthorized code execution, privilege escalation, data tampering, denial of service, and information disclosure on the inference server.
- The flaw was reported by security researcher deayzl and disclosed in NVIDIA's May 2026 security bulletin alongside six additional Triton CVEs numbered between CVE-2026-24208 and CVE-2026-24215.
- Other notable findings in the same bulletin include CVE-2026-24208 (auth bypass, reported by Mohamed Lemine Ahmed Jidou) and three CVEs reported by Navtej Kathuria covering additional privilege-escalation paths.
- No public proof-of-concept exists yet, but the low barrier to exploitation (no authentication needed, low complexity) makes unpatched servers a high-priority target for model-infrastructure attackers.
- NVIDIA released a patch on May 18, 2026; Triton instances without it should be treated as exposed.
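The 9.8 score is consistent with the attributes listed above. As a worked check, the sketch below computes a CVSS v3.1 base score for the vector AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H. That vector is an assumption inferred from the description (network-exploitable, low complexity, no privileges, no user interaction, high impact across the board); the vector published in NVIDIA's bulletin should be treated as authoritative. The metric weights and rounding rule follow the CVSS v3.1 specification, and only the scope-unchanged case is handled.

```python
import math

# CVSS v3.1 metric weights for the ASSUMED vector
# AV:N / AC:L / PR:N / UI:N / S:U / C:H / I:H / A:H
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
IMPACT_HIGH = 0.56


def roundup(value):
    """Round up to one decimal place using the integer-based method
    recommended by the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    """CVSS v3.1 base score for a vector whose scope is unchanged."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)        # impact sub-score
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    AV_NETWORK, AC_LOW, PR_NONE, UI_NONE,
    IMPACT_HIGH, IMPACT_HIGH, IMPACT_HIGH,
)
print(score)  # 9.8
```

With these weights the impact term is about 5.87 and the exploitability term about 3.89; their sum, 9.76, rounds up to 9.8.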
Why it matters
Triton Inference Server is one of the most widely deployed serving backends for production LLMs. An authentication bypass at the inference layer lets an attacker send arbitrary requests to loaded models, exfiltrate training data through inference-side channels, poison model weights, or pivot into the host container runtime. That places the flaw at the core of the AI infrastructure attack surface, not at its edge.
What to do
- Upgrade Triton immediately to the patched version listed in NVIDIA's May 2026 security bulletin.
- Audit network exposure — Triton should never be directly internet-facing; place it behind an API gateway with independent authentication.
- Review inference logs for anomalous unauthenticated requests during the exposure window.
- Rotate any secrets or credentials stored on inference-server hosts.
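To support the network-exposure audit, the sketch below sends one credential-free GET request to a Triton HTTP endpoint and classifies the response. It assumes Triton's standard HTTP API, where `/v2/health/ready` reports readiness and `/v2` returns server metadata; the function names `probe` and `classify` are illustrative. A 2xx answer only shows that the endpoint is reachable without gateway authentication. It does not test for CVE-2026-24207 itself, and it should be run only against hosts you are authorized to assess.

```python
import urllib.error
import urllib.request


def classify(status):
    """Map the HTTP status of an unauthenticated request to a finding."""
    if status is None:
        return "unreachable"
    if status in (401, 403):
        return "auth-enforced"
    if 200 <= status < 300:
        return "open"
    return "inconclusive"


def probe(base_url, path="/v2/health/ready", timeout=3.0):
    """Send one GET with no credentials; return the status code or None."""
    url = base_url.rstrip("/") + path
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 401 or 403.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout.
        return None
```

For example, `classify(probe("http://triton.internal.example:8000"))` uses a hypothetical hostname and Triton's default HTTP port. Because some gateways exempt health checks from authentication, repeating the probe with `path="/v2"` gives a more meaningful result. Any host reported as `open` from outside the trusted network is a candidate for the gateway placement described above.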