vLLM CVE-2026-22778/34756 — Heap Leak and DoS in Multimodal Serving
AI relevance: vLLM is one of the most widely deployed LLM inference servers, and the CVEs below affect production model-serving surfaces (multimodal input handling and the OpenAI-compatible API), not just development tooling.
- CVE-2026-22778 — An invalid image sent to vLLM's multimodal endpoint triggers an error path that leaks a heap address, weakening ASLR and providing an information-leak primitive that can help chain a full RCE. Affects v0.8.3 through v0.14.0; fixed in v0.14.1.
- CVE-2026-34756 — An unbounded `n` parameter in the OpenAI-compatible API server allows an attacker to trigger denial of service by requesting arbitrarily many outputs per call. Fixed in v0.19.0.
- These are distinct from the earlier CVE-2026-4944 (hardcoded `trust_remote_code` in HuggingFace model loading) — different code paths, different fix branches.
- Together they cover two layers: image preprocessing (CVE-2026-22778) and API request handling (CVE-2026-34756). An operator might fix one while remaining exposed to the other.
- Both CVEs have published NVD entries referencing the GitHub advisories and corresponding release notes, making them easy to track in vulnerability scanners.
- vLLM's rapid release cadence (0.14.1 and 0.19.0 fixes) means teams running pinned versions — common in production ML pipelines — may not have received these patches automatically.
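The version ranges listed above can be turned into a quick exposure check for pinned deployments. The following is a minimal sketch, not an official tool: the `ADVISORIES` table and the `parse_version` and `exposed_cves` helpers are hypothetical names, and because no first affected version is stated for CVE-2026-34756, the sketch assumes every release before v0.19.0 is affected.

```python
import re

# Ranges taken from the summary above.
# CVE-2026-22778: affects 0.8.3 through 0.14.0, fixed in 0.14.1.
# CVE-2026-34756: fixed in 0.19.0; the first affected version is not stated,
# so None means "assume every earlier release is affected" (an assumption).
ADVISORIES = {
    "CVE-2026-22778": {"first_affected": (0, 8, 3), "fixed": (0, 14, 1)},
    "CVE-2026-34756": {"first_affected": None, "fixed": (0, 19, 0)},
}


def parse_version(text):
    """Turn 'v0.14.0' or '0.14.0.post1' into (0, 14, 0).

    Suffixes such as 'rc1' or '.post1' are ignored, so a release candidate
    of a fixed version is treated as fixed. Verify those cases by hand.
    """
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part or 0) for part in match.groups())


def exposed_cves(version_text):
    """Return the CVE IDs from ADVISORIES that apply to the given version."""
    version = parse_version(version_text)
    exposed = []
    for cve, info in ADVISORIES.items():
        if version >= info["fixed"]:
            continue  # at or past the fixed release
        first = info["first_affected"]
        if first is None or version >= first:
            exposed.append(cve)
    return exposed
```

In a live environment the installed version string can be read with `importlib.metadata.version("vllm")` and passed to `exposed_cves`. A result of `[]` only means the version is at or past both fixed releases according to the ranges above; the advisories themselves remain the authority.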
Why it matters
vLLM powers inference for a large fraction of self-hosted LLM deployments. Heap-address leaks in multimodal handling are particularly concerning because vision-capable models are now common in production. An attacker who can submit crafted images to a multimodal endpoint gains an ASLR-bypass primitive, which removes one of the main obstacles to exploiting a separate memory-corruption bug. The DoS vector in the OpenAI-compatible API is trivial to trigger remotely and requires no special access beyond network reachability.
What to do
- Check your vLLM version: upgrade to at least v0.14.1 for CVE-2026-22778 and v0.19.0 for CVE-2026-34756. Because v0.19.0 is the later release, running v0.19.0 or newer should cover both.
- If you expose multimodal endpoints to untrusted input, add image validation/wrapping at the reverse-proxy layer as defense-in-depth.
- Rate-limit or cap the `n` parameter at your API gateway to mitigate the DoS vector even on older versions.
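Capping `n` in front of the server could look like the sketch below. It is an illustration under stated assumptions, not vLLM code: `MAX_N`, `RequestRejected`, and `enforce_n_cap` are hypothetical names, the limit of 4 is an arbitrary example, and the function assumes the gateway can read the raw JSON request body before forwarding it.

```python
import json

MAX_N = 4  # hypothetical policy limit; pick a value that fits your workload


class RequestRejected(Exception):
    """Raised when a request body should not be forwarded to the server."""


def enforce_n_cap(raw_body, max_n=MAX_N):
    """Parse a JSON request body and reject it if `n` exceeds max_n.

    Returns the parsed payload so the caller can forward it unchanged.
    """
    try:
        payload = json.loads(raw_body)
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        raise RequestRejected("request body is not valid JSON") from exc
    if not isinstance(payload, dict):
        raise RequestRejected("request body must be a JSON object")

    n = payload.get("n")
    if n is None:
        n = 1  # the OpenAI-style API treats a missing or null `n` as 1
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(n, bool) or not isinstance(n, int) or n < 1:
        raise RequestRejected("n must be a positive integer")
    if n > max_n:
        raise RequestRejected(f"n={n} exceeds the configured limit of {max_n}")
    return payload
```

Rejecting oversized values, rather than silently clamping them, keeps the behavior visible to legitimate clients. A gateway would typically map `RequestRejected` to an HTTP 400 response.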
Sources: