vLLM CVE-2026-4944 — Hardcoded trust_remote_code Bypass Enables RCE
AI relevance: vLLM is the dominant open-source model serving engine; hardcoded trust_remote_code in model loaders means even operators who explicitly disable remote code trust can still be compromised through malicious HuggingFace model repositories.
- CVE-2026-4944 (CVSS 8.8) was published May 28, 2026 for vllm-project/vllm 0.14.1.
- The vulnerability originates in two model implementation files, nemotron_vl.py and kimi_k25.py, which hardcode trust_remote_code=True.
- This bypasses any user-configured --trust-remote-code=False setting, nullifying an explicit security control.
- An attacker hosting a malicious model on HuggingFace can achieve RCE when a vulnerable vLLM instance loads a NemotronVL or KimiK25 model.
- This is an incomplete fix for two prior CVEs, CVE-2025-66448 and CVE-2026-22807: the code paths patched for those issues were addressed, but these two model files were missed.
- Exploitation requires the attacker to host the malicious model where the victim instance can reach it over the network; no authentication to the victim vLLM instance is needed.
- No public PoC at time of writing; attack complexity is rated High.
- The pattern repeats a known class of failures in HuggingFace model loading: running custom tokenizer or model code from a repository crosses a trust boundary, so that opt-in should never be forced on in code.
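The hardcoded-override pattern described in the list above is easy to search for statically. The following is a minimal sketch, not the vLLM project's own tooling: it walks Python source with the standard ast module and reports every call that passes trust_remote_code=True as a literal. The function names find_hardcoded_trust and scan_tree are hypothetical, and a literal-only check will miss values passed through variables or dictionaries.

```python
import ast
from pathlib import Path


def find_hardcoded_trust(source, filename="<string>"):
    """Return (line, callee) for each call that passes trust_remote_code=True literally."""
    hits = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            # kw.arg is None for **kwargs expansion, so those are skipped here.
            if (
                kw.arg == "trust_remote_code"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append((node.lineno, ast.unparse(node.func)))
    return hits


def scan_tree(root):
    """Scan every .py file under root; return {path: hits} for files with findings."""
    results = {}
    for path in Path(root).rglob("*.py"):
        try:
            hits = find_hardcoded_trust(path.read_text(encoding="utf-8"), str(path))
        except (SyntaxError, UnicodeDecodeError):
            continue  # unreadable or unparsable files are skipped in this sketch
        if hits:
            results[str(path)] = hits
    return results
```

Pointing scan_tree at the directory of an installed package (or a source checkout) gives a quick list of call sites to review by hand. A finding is not proof of a vulnerability, only a place where the user's setting is not being consulted.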
Why it matters
vLLM processes millions of inference requests daily across enterprises and cloud providers. Hardcoded trust overrides mean that security-conscious operators who explicitly disable remote code trust are still exposed. That this is the third related CVE in this codebase suggests the sprawling model-loading surface is systematically hard to audit.
What to do
- Audit your vLLM deployment version; if running 0.14.1, check for patches or restrict model loading to vetted repositories only.
- Network-isolate model serving infrastructure from untrusted internet sources.
- Monitor HuggingFace model uploads for known project names if you auto-load models by identifier.
- Track the fix status; the advisory contained no patch details at time of publication.
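The first step above (check the version, restrict loading to vetted repositories) can be sketched as a small pre-flight check. This is an illustration under stated assumptions: AFFECTED_VERSIONS contains only the release named in the advisory, VETTED_REPOS and its example entry are hypothetical, and check_deployment is not part of vLLM.

```python
from importlib import metadata

# Only the release named in the advisory; whether other releases are affected is not stated.
AFFECTED_VERSIONS = {"0.14.1"}

# Hypothetical allowlist of repositories your team has reviewed.
VETTED_REPOS = frozenset({"example-org/reviewed-model"})


def installed_vllm_version():
    """Return the installed vllm version string, or None if vllm is not installed."""
    try:
        return metadata.version("vllm")
    except metadata.PackageNotFoundError:
        return None


def check_deployment(version, model_id, allowlist=VETTED_REPOS):
    """Return a list of human-readable problems; an empty list means no finding."""
    problems = []
    if version in AFFECTED_VERSIONS:
        problems.append(f"vllm {version} is listed as affected by CVE-2026-4944")
    if model_id not in allowlist:
        problems.append(f"model {model_id!r} is not on the vetted allowlist")
    return problems
```

Running check_deployment(installed_vllm_version(), model_id) before the server starts, and refusing to launch when the list is non-empty, turns the allowlist into an enforced control. Exact string matching is deliberate here, so a lookalike repository name does not pass.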
Sources: