GitHub Advisory — vLLM model-load RCE risk via auto_map (CVE-2026-22807)

• Category: AI CVEs

  • What happened: CVE-2026-22807 (GitHub Advisory Database) reports that vLLM may execute attacker-controlled Python code during model resolution / startup via Hugging Face auto_map dynamic modules.
  • Key detail: the advisory claims this can occur even when trust_remote_code is false, due to how the code path delegates to Transformers’ dynamic module loader.
  • Why this is dangerous: the code runs at model load, before any request handling, so “we have auth on the inference API” doesn’t help if the model source/path is attacker-influenced.
  • Threat model: any environment where model identifiers/paths can be influenced (automation that pulls the “latest” model, user-provided model names, compromised internal model registry, poisoned mirror).
  • Fix: the advisory links to a vLLM PR intended to gate dynamic module loading appropriately.
  • Practical takeaway: treat model repos like dependencies: provenance, pinning, review, and controlled rollout.
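
To make the "pinning and provenance" takeaway concrete, here is a minimal sketch of a pre-deployment policy check. It is not part of vLLM or the advisory: the ModelRef type, the check_model_ref function, and the APPROVED_REPOS allowlist are hypothetical names introduced for illustration, and the sketch assumes models are addressed by a Hugging Face style repo id plus a git revision.

```python
import re
from dataclasses import dataclass

# Hypothetical allowlist; in practice this would come from your own registry.
APPROVED_REPOS = frozenset({"example-org/approved-model"})

# A full git commit SHA is immutable; branch names and tags can be moved.
_COMMIT_SHA = re.compile(r"[0-9a-f]{40}")


@dataclass(frozen=True)
class ModelRef:
    """Hypothetical description of a model a deployment wants to load."""
    repo_id: str
    revision: str


def check_model_ref(ref, approved=APPROVED_REPOS):
    """Return a list of policy problems; an empty list means the ref passes."""
    problems = []
    if ref.repo_id not in approved:
        problems.append(f"repo {ref.repo_id!r} is not on the approved list")
    if not _COMMIT_SHA.fullmatch(ref.revision):
        problems.append(
            f"revision {ref.revision!r} is not a full 40-character commit SHA"
        )
    return problems
```

A check like this would run in CI or a deploy hook, before the inference server is started, and fail the rollout when the returned list is non-empty. A ref that passes could then be handed to the server as an explicit revision rather than a branch name such as "main".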

Why it matters

  • Model supply chain: teams often treat models as “data assets,” but frameworks can interpret parts of model configs as executable module references.
  • High-privilege runtimes: inference servers frequently run with GPU access, broad filesystem access, and network reach, so a compromised server gives an attacker a valuable foothold.
  • Silent failure mode: you can be compromised at startup with no suspicious prompts or requests.

What to do

  1. Patch: update vLLM to a version that includes the fix (track the PR/release notes referenced by the advisory).
  2. Pin models: avoid pulling mutable tags; pin to immutable revisions/SHAs where possible and use a curated internal registry.
  3. Restrict model sources: block arbitrary remote model loading in production; only allow approved repos/paths.
  4. Defensive validation (safe): on vLLM hosts you own, inventory deployed models and check whether any model config references dynamic modules (auto_map entries) and whether your deployment policy actually prevents remote code loading.
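
The inventory in step 4 can be sketched as a read-only scan of model directories. This is an illustration under stated assumptions, not tooling from the advisory: the function names and the CONFIG_FILENAMES list are hypothetical, and the choice of which files to inspect is an assumption you should adjust for your models. The scan only parses JSON and never imports Transformers or vLLM, so it does not exercise the loading path it is checking for.

```python
import json
from pathlib import Path

# Assumption: these are the config files worth checking; adjust for your models.
CONFIG_FILENAMES = ("config.json", "tokenizer_config.json")


def auto_map_references(config):
    """Return every class reference listed under a config's auto_map key."""
    auto_map = config.get("auto_map") if isinstance(config, dict) else None
    if not isinstance(auto_map, dict):
        return []
    refs = []
    for value in auto_map.values():
        # Values are usually strings; some entries hold a list of references.
        items = value if isinstance(value, (list, tuple)) else [value]
        refs.extend(item for item in items if isinstance(item, str))
    return refs


def scan_model_root(root):
    """Map each config file under root to the auto_map references it contains."""
    findings = {}
    for path in sorted(Path(root).rglob("*.json")):
        if path.name not in CONFIG_FILENAMES:
            continue
        try:
            config = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed file; skipped in this sketch
        refs = auto_map_references(config)
        if refs:
            findings[str(path)] = refs
    return findings
```

Calling scan_model_root on the directory that holds your deployed models (for example a local model cache) returns a dict keyed by config path. Any non-empty result is a model that asks for dynamic module loading and deserves review. References containing "--" are worth particular attention, since that form is conventionally used to point at code in a different repository.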

Sources