GitHub Advisory — vLLM model-load RCE risk via auto_map (CVE-2026-22807)

2026-01-30 • Category: AI CVEs

What happened: the GitHub Advisory Database entry for CVE-2026-22807 reports that vLLM may execute attacker-controlled Python code during model resolution and startup via Hugging Face auto_map dynamic modules.
Key detail: the advisory claims this can occur even when trust_remote_code is false, due to how the code path delegates to Transformers’ dynamic module loader.
Why this is dangerous: the code runs at model load, before any request handling, so authentication on the inference API offers no protection if the model source or path is attacker-influenced.
Threat model: any environment where model identifiers/paths can be influenced (automation that pulls the “latest” model, user-provided model names, compromised internal model registry, poisoned mirror).
Fix: the advisory links to a vLLM PR intended to gate dynamic module loading appropriately.
Practical takeaway: treat model repos like code dependencies, with provenance checks, pinning, review, and controlled rollout.
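
The pinning and provenance part of that takeaway can be sketched as a small pre-load policy check. This is a minimal illustration, not part of vLLM or the advisory: the names APPROVED_REPOS and is_allowed_model and the example repo ID are hypothetical, and it assumes models are identified by a Hugging Face style repo ID plus a git revision.

```python
import re
from typing import Optional

# Hypothetical allowlist for illustration; replace with your own approved repos.
APPROVED_REPOS = {"my-org/approved-llm"}

# A full git commit hash is 40 lowercase hex characters.
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def is_allowed_model(repo_id: str, revision: Optional[str]) -> bool:
    """Return True only for an approved repo pinned to a full commit SHA."""
    if repo_id not in APPROVED_REPOS:
        return False
    # Branch names and tags such as "main" or "v1" are mutable and can be
    # repointed; a full commit hash cannot.
    return revision is not None and FULL_SHA.fullmatch(revision) is not None
```

A deployment script could call is_allowed_model before handing the repo ID and revision to whatever loader it uses, and refuse to start otherwise. Running the check before the loader matters here because the risk described above is triggered during model resolution itself.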

Why it matters

Model supply chain: teams often treat models as “data assets,” but frameworks can interpret parts of model configs as executable module references.
High-privilege runtimes: inference servers frequently run with GPU access, broad filesystem access, and network reach, which makes them valuable footholds for attackers.
Silent failure mode: you can be compromised at startup with no suspicious prompts or requests.

What to do

Patch: update vLLM to a release that includes the fix (track the PR and release notes referenced by the advisory).
Pin models: avoid pulling mutable tags; pin to immutable revisions/SHAs where possible and use a curated internal registry.
Restrict model sources: block arbitrary remote model loading in production; only allow approved repos/paths.
Defensive validation (safe): on vLLM hosts you own, inventory deployed models and check whether any model config references dynamic modules (auto_map entries). Then confirm that your deployment policy actually prevents remote code loading.
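
The inventory step can be sketched as a read-only scan of local model directories. This is a hedged example under stated assumptions: the function name find_auto_map_configs is hypothetical, it only parses JSON and never imports or executes model code, and it looks only at files named config.json. Other files in a model repo may also carry auto_map entries, so treat an empty result as a starting point rather than proof of safety.

```python
import json
from pathlib import Path


def find_auto_map_configs(root):
    """Scan root recursively for config.json files that declare auto_map.

    Returns (findings, unreadable):
      findings   maps each config path to its auto_map value
      unreadable lists config paths that could not be read or parsed
    """
    findings = {}
    unreadable = []
    for cfg_path in sorted(Path(root).rglob("config.json")):
        try:
            cfg = json.loads(cfg_path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Report rather than skip: a config that cannot be parsed
            # still deserves a manual look.
            unreadable.append(str(cfg_path))
            continue
        if isinstance(cfg, dict) and cfg.get("auto_map"):
            findings[str(cfg_path)] = cfg["auto_map"]
    return findings, unreadable
```

Pointing this at the directory where your hosts cache or store models gives a list of configs that reference dynamic modules, which you can then review against your deployment policy.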