vLLM CVE-2026-22778/34756 — Heap Leak and DoS in Multimodal Serving

2026-05-31 AI CVEs by al-ice.ai Editorial

AI relevance: vLLM is among the most widely deployed LLM inference servers, and the CVEs below affect production model-serving surfaces (multimodal input handling and the OpenAI-compatible API), not just development tooling.

CVE-2026-22778 — An invalid image sent to vLLM's multimodal endpoint triggers an error path that leaks a heap address, weakening ASLR and providing an information-leak primitive that can help chain a full RCE. Affects v0.8.3 through v0.14.0; fixed in v0.14.1.
CVE-2026-34756 — An unbounded n parameter in the OpenAI-compatible API server allows an attacker to trigger denial of service by requesting arbitrarily many outputs per call. Fixed in v0.19.0.
These are distinct from the earlier CVE-2026-4944 (hardcoded trust_remote_code in Hugging Face model loading): they sit on different code paths and were fixed on different branches.
Together they cover two layers: image preprocessing (CVE-2026-22778) and API request handling (CVE-2026-34756). An operator might fix one while remaining exposed to the other.
Both CVEs have published NVD entries referencing the GitHub advisories and corresponding release notes, making them easy to track in vulnerability scanners.
vLLM's rapid release cadence (the fixes landed in v0.14.1 and v0.19.0 respectively) means teams running pinned versions, which is common in production ML pipelines, may not have received these patches automatically.
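
The version ranges above can be turned into a quick exposure check. The following is a minimal sketch, not an official tool: the helper names (parse_release, exposed_cves, installed_vllm_version) are hypothetical, pre-release and local suffixes are ignored, and because the advisory gives no lower bound for CVE-2026-34756, the sketch assumes every release before v0.19.0 is affected.

```python
from __future__ import annotations

import re
from importlib import metadata

# Version bounds taken from the advisory text above.
CVE_22778_FIRST_AFFECTED = (0, 8, 3)
CVE_22778_FIXED = (0, 14, 1)
CVE_34756_FIXED = (0, 19, 0)


def parse_release(version: str) -> tuple[int, int, int]:
    """Return the leading major.minor.patch numbers of a version string.

    Suffixes such as '.post1', 'rc1' or '+cu121' are ignored, so this is
    only a coarse comparison key, not a full version parser.
    """
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def exposed_cves(version: str) -> list[str]:
    """List which of the two CVEs a given vLLM version is exposed to."""
    release = parse_release(version)
    exposed = []
    if CVE_22778_FIRST_AFFECTED <= release < CVE_22778_FIXED:
        exposed.append("CVE-2026-22778")
    # Assumption: no lower bound is stated for this CVE, so every
    # release before the fix is treated as affected.
    if release < CVE_34756_FIXED:
        exposed.append("CVE-2026-34756")
    return exposed


def installed_vllm_version() -> str | None:
    """Return the installed vllm package version, or None if absent."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_vllm_version()
    if found is None:
        print("vllm is not installed in this environment")
    else:
        print(found, exposed_cves(found) or "not exposed to these two CVEs")
```

Running the script inside each serving environment (container image, virtualenv) reports the pinned version as actually installed, which may differ from what a requirements file suggests.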

Why it matters

vLLM powers inference for a large fraction of self-hosted LLM deployments. Heap-address leaks in multimodal handling are particularly concerning because vision-capable models are increasingly common in production. An attacker who can submit crafted images to a multimodal endpoint gains an information leak that weakens ASLR and can serve as one step in an RCE chain. The DoS vector in the OpenAI-compatible API is trivial to trigger remotely and requires no special access beyond network reachability.

What to do

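Check your vLLM version: upgrade to at least v0.14.1 to fix CVE-2026-22778 and to at least v0.19.0 to fix CVE-2026-34756. Because v0.19.0 is the later release, upgrading to v0.19.0 or newer addresses both.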
If you expose multimodal endpoints to untrusted input, add image validation/wrapping at the reverse-proxy layer as defense-in-depth.
Rate-limit requests, or cap the n parameter, at your API gateway to mitigate the DoS vector even on older versions.
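
Capping n at the gateway can be sketched as a small request filter that runs before the body is forwarded to vLLM. This is an illustration under stated assumptions, not vLLM's own fix: the function name enforce_n_limit, the RequestRejected exception, and the ceiling MAX_N = 4 are all hypothetical, and the request is assumed to be an OpenAI-style JSON object in which n is optional and defaults to 1.

```python
from __future__ import annotations

import json

MAX_N = 4  # hypothetical ceiling; choose a value that fits your workload


class RequestRejected(ValueError):
    """Raised when a request should be refused before reaching vLLM."""


def enforce_n_limit(raw_body: bytes, max_n: int = MAX_N) -> dict:
    """Parse an OpenAI-style request body and reject an oversized `n`.

    Malformed JSON raises json.JSONDecodeError, which the caller can
    map to an HTTP 400 in the same way as RequestRejected.
    """
    payload = json.loads(raw_body)
    if not isinstance(payload, dict):
        raise RequestRejected("request body must be a JSON object")

    n = payload.get("n")
    if n is None:
        n = 1  # assumed default when the field is omitted or null

    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(n, bool) or not isinstance(n, int):
        raise RequestRejected("n must be an integer")
    if not 1 <= n <= max_n:
        raise RequestRejected(f"n must be between 1 and {max_n}")
    return payload
```

Rejecting the request outright, rather than silently clamping n, keeps the client's view of the response consistent with what it asked for; a gateway that prefers clamping could instead overwrite payload["n"] with max_n before forwarding.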

Sources: