GitHub Advisory — vLLM multimodal SSRF (CVE-2026-24779)

• Category: AI CVEs

AI relevance: vLLM is a common LLM inference/serving layer; if your agents or apps pass user-controlled image/audio/video URLs into multimodal endpoints, an SSRF in the fetch path can become a pivot into your internal AI platform (other pods/services, metadata endpoints, control planes).

  • Issue: CVE-2026-24779 is a Server-Side Request Forgery (SSRF) in vLLM’s multimodal MediaConnector URL loading.
  • Affected: vLLM versions prior to 0.14.1 (per NVD/GitHub advisory).
  • Root cause (per advisory): the hostname allowlist check parsed URLs with urllib.parse.urlparse, while the actual fetch went through the urllib3/requests stack, whose URL parsing treats backslashes differently; a crafted URL can therefore pass validation yet cause the request to reach a different host, bypassing the allowlist.
  • What an attacker gets: ability to coerce the vLLM server into making HTTP requests to internal network resources (scan RFC1918 space, hit cluster services, probe control endpoints).
  • Why it’s nasty in k8s: a compromised vLLM pod can become an internal client against “private” services (metrics, service meshes, internal admin UIs) that were never intended to be internet-facing.
  • Realistic failure mode: internal endpoints can be spammed or fed attacker-crafted payloads, causing DoS, state corruption, or data exposure depending on what’s reachable.
  • Fix: vLLM 0.14.1 validates with the same parser the request stack uses (urllib3.util.parse_url) and handles the related edge cases (see the patch/commit).
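
The underlying fix pattern — validate with the same parser that will actually make the request — can be sketched roughly like this. This is illustrative only, not vLLM's actual code; `ALLOWED_MEDIA_HOSTS` and `is_allowed_media_url` are hypothetical names:

```python
# Illustrative sketch of parser-consistent allowlist validation.
# NOT vLLM's actual patch; the names below are hypothetical.
from urllib3.util import parse_url

ALLOWED_MEDIA_HOSTS = {"media.example.com", "cdn.example.com"}

def is_allowed_media_url(url: str) -> bool:
    """Validate a media URL with urllib3's parser -- the same parser
    the requests/urllib3 stack uses when the fetch happens -- so the
    host we check is the host we actually connect to."""
    try:
        parsed = parse_url(url)
    except Exception:  # urllib3 raises LocationParseError on junk input
        return False
    if parsed.scheme not in ("http", "https"):
        return False
    if parsed.host is None:
        return False
    return parsed.host.lower() in ALLOWED_MEDIA_HOSTS
```

The broader lesson: whenever validation and fetching use different URL parsers, any input on which the two parsers disagree is a potential bypass.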

Why it matters

  • “LLM server as a network oracle”: SSRF turns your model-serving layer into an internal reconnaissance tool.
  • Multimodal widens the input surface: letting users supply media URLs is convenient, but it creates a classic SSRF funnel unless the fetch path is tightly constrained.
  • AI platform blast radius: inference pods often sit close to secrets (model registries, object stores, monitoring tokens) and privileged internal APIs.

What to do

  1. Patch: upgrade vLLM to 0.14.1+ (and redeploy your serving containers).
  2. Assume URL fetch is hostile: if you don’t truly need URL-based media ingestion, disable it or gate it behind authenticated, trusted callers.
  3. Network policy: add egress controls for vLLM pods (deny by default; explicitly allow only required destinations like object storage/CDN).
  4. Block high-value internal targets: explicitly deny cloud metadata IPs/hosts (e.g., IMDS) and internal control-plane endpoints from inference workloads.
  5. Telemetry: alert on unexpected outbound traffic from inference pods (new destinations, spikes, redirects) — SSRF often shows up as “weird egress”.
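
Items 4–5 can also be backed by a defense-in-depth check in application code: even with network policy in place, resolving the target and refusing private, link-local, and other internal address ranges is cheap. A minimal sketch, assuming a hypothetical helper (not part of vLLM); note that resolve-then-fetch checks remain vulnerable to DNS rebinding unless the fetch pins the resolved IP:

```python
# Defense-in-depth egress check: resolve the target host and refuse
# private / link-local / metadata address ranges. Hypothetical helper,
# not part of vLLM.
import ipaddress
import socket

def resolve_public_only(host: str) -> list[str]:
    """Return the resolved IPs for `host`, raising ValueError if any
    of them falls in a range an inference pod should never talk to
    (RFC1918, loopback, link-local incl. 169.254.169.254 IMDS, etc.)."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    ips = {info[4][0] for info in infos}
    for ip_str in ips:
        ip = ipaddress.ip_address(ip_str)
        if (ip.is_private or ip.is_loopback or ip.is_link_local
                or ip.is_reserved or ip.is_multicast):
            raise ValueError(f"blocked egress target: {host} -> {ip_str}")
    return sorted(ips)
```

This belongs alongside, not instead of, NetworkPolicy: the in-process check catches obvious targets early and produces a loggable signal, while the network layer enforces the boundary even if the application is bypassed.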

Sources