vLLM CVE-2026-22778/34756 — Heap Leak and DoS in Multimodal Serving

AI relevance: vLLM is one of the most widely deployed LLM inference servers, and the CVEs below affect production model-serving surfaces (multimodal input handling and the OpenAI-compatible API), not just development tooling.

  • CVE-2026-22778 — An invalid image sent to vLLM's multimodal endpoint triggers an error path that leaks a heap address, weakening ASLR and providing an information-leak primitive that can help chain a full RCE. Affects v0.8.3 through v0.14.0; fixed in v0.14.1.
  • CVE-2026-34756 — An unbounded n parameter in the OpenAI-compatible API server allows an attacker to trigger denial of service by requesting arbitrarily many outputs per call. Fixed in v0.19.0.
  • These are distinct from the earlier CVE-2026-4944 (hardcoded trust_remote_code in HuggingFace model loading) — different code paths, different fix branches.
  • Together they cover two layers: image preprocessing (CVE-2026-22778) and API request handling (CVE-2026-34756). An operator might fix one while remaining exposed to the other.
  • Both CVEs have published NVD entries referencing the GitHub advisories and corresponding release notes, making them easy to track in vulnerability scanners.
  • The two fixes shipped in separate releases (v0.14.1 and v0.19.0), and vLLM releases frequently, so teams running pinned versions (common in production ML pipelines) may not have picked up either patch automatically.
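
The version ranges in the list above can be turned into a quick exposure check. The following is a minimal sketch, not an official tool: the helper names parse_version and exposed_cves are hypothetical, and because the text gives no lower bound for CVE-2026-34756, the sketch assumes every release before v0.19.0 is affected. Suffixes such as .post1 or .dev3 are ignored, so treat pre-release builds with extra care.

```python
import re


def parse_version(text):
    """Parse the leading 'X.Y.Z' of a version string into a tuple of ints."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def exposed_cves(version_text):
    """Return the CVEs from this advisory that a given vLLM version is exposed to."""
    version = parse_version(version_text)
    exposed = []

    # CVE-2026-22778: affects v0.8.3 through v0.14.0, fixed in v0.14.1.
    if (0, 8, 3) <= version <= (0, 14, 0):
        exposed.append("CVE-2026-22778")

    # CVE-2026-34756: fixed in v0.19.0.
    # ASSUMPTION: all earlier releases are treated as affected, because the
    # advisory text does not state a lower bound.
    if version < (0, 19, 0):
        exposed.append("CVE-2026-34756")

    return exposed
```

On a host where vLLM is installed, the input string can be obtained with importlib.metadata.version("vllm") and passed to exposed_cves.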

Why it matters

vLLM powers inference for a large fraction of self-hosted LLM deployments. Heap-address leaks in multimodal handling are particularly concerning because vision-capable models are increasingly common in production. An attacker who can submit crafted images to a multimodal endpoint gains an information leak that weakens ASLR and can serve as one step in a chain toward RCE. The DoS vector in the OpenAI-compatible API is trivial to trigger remotely and requires no special access beyond network reachability.

What to do

  • Check your vLLM version: upgrade to at least v0.14.1 for CVE-2026-22778 and to at least v0.19.0 for CVE-2026-34756. Since v0.19.0 is the later release, moving to v0.19.0 or newer should cover both.
  • If you expose multimodal endpoints to untrusted input, add image validation/wrapping at the reverse-proxy layer as defense-in-depth, so that malformed images are rejected before they reach vLLM.
  • Rate-limit requests and cap the n parameter at your API gateway to mitigate the DoS vector even on older versions.
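
The n cap from the last item can be sketched as a small request check that a gateway or middleware runs before forwarding the body to vLLM. This is a minimal illustration under stated assumptions, not vLLM's own fix: the function name check_n and the MAX_N value are hypothetical, and the sketch only inspects the n field of a JSON request body.

```python
import json

MAX_N = 4  # hypothetical cap; choose a value that fits your workload


def check_n(raw_body, max_n=MAX_N):
    """Parse a JSON request body and reject it if `n` exceeds max_n.

    Returns the parsed payload on success; raises ValueError otherwise.
    """
    payload = json.loads(raw_body)
    if not isinstance(payload, dict):
        raise ValueError("request body must be a JSON object")

    n = payload.get("n")
    if n is None:
        return payload  # absent or null: the server default applies

    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(n, bool) or not isinstance(n, int) or n < 1:
        raise ValueError("n must be a positive integer")
    if n > max_n:
        raise ValueError(f"n={n} exceeds the configured cap of {max_n}")
    return payload
```

A gateway would typically translate the ValueError into an HTTP 400 response. Rejecting oversized values, rather than silently clamping them, keeps the behavior visible to legitimate clients.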

Sources: