GitHub Advisory — vLLM trust_remote_code bypass RCE

AI relevance: vLLM is widely used to serve LLMs; a trust boundary bypass in model config loading can turn routine model pulls into host-level code execution in AI inference stacks.

  • vLLM’s Nemotron_Nano_VL_Config resolves auto_map entries via dynamic module loading and immediately instantiates the resolved class.
  • That code path does not enforce trust_remote_code=False, so attacker-controlled code can execute during config loading even when trust is explicitly disabled.
  • An attacker can publish a benign-looking frontend repo whose config.json points to a separate malicious backend repo.
  • The issue sits in a common model-loading path, so services, CI jobs, or dev machines using vLLM’s transformer utils are exposed.
  • The fix adds checks to prevent dynamic loading when trust is disabled.
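The flaw and its fix can be sketched as follows. This is a hypothetical illustration of the pattern described above, not vLLM's actual internals: a loader resolves an auto_map entry by importing and instantiating a class, and the guard shown is the kind of check the patch adds.

```python
import importlib


def load_config_class(auto_map: dict, *, trust_remote_code: bool):
    """Resolve an auto_map entry like {"AutoConfig": "module.ClassName"}.

    Illustrative sketch only; function and key names are assumptions.
    """
    module_name, _, class_name = auto_map["AutoConfig"].rpartition(".")
    # The vulnerable path skips this guard and imports unconditionally,
    # running arbitrary module-level code from the referenced source.
    # The fix refuses dynamic loading when trust is disabled:
    if not trust_remote_code:
        raise ValueError(
            "auto_map requires dynamic code loading; refusing because "
            "trust_remote_code=False"
        )
    module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

With the guard in place, a config that needs dynamic code fails loudly instead of silently executing it; without the guard, the import on the last lines runs whatever the referenced module contains.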

Why it matters

  • Model serving pipelines often auto-fetch configs; a trust bypass converts that convenience into RCE in AI infrastructure.
  • Attackers can keep the malicious code in a separate repo, so manual review of the published model repo turns up nothing suspicious.
  • Any environment that loads unvetted models is a potential entry point into inference hosts.
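The two-repo pattern above can be made concrete. Assuming the Hugging Face cross-repo auto_map convention ("owner/other-repo--module.Class"), the frontend repo's config.json can look harmless while its auto_map pulls code from elsewhere; the repo and class names below are hypothetical.

```python
# Illustrative shape of a frontend config whose auto_map references a
# second, attacker-controlled repo. All names are made up for this sketch.
frontend_config = {
    "model_type": "nemotron_nano_vl",
    "auto_map": {
        # Cross-repo reference: code is fetched from attacker/backend-repo,
        # not from the repo a reviewer actually inspected.
        "AutoConfig": "attacker/backend-repo--configuration.EvilConfig"
    },
}


def references_external_repo(auto_map: dict) -> bool:
    """Flag auto_map entries that point at a different repository."""
    return any("--" in target for target in auto_map.values())


print(references_external_repo(frontend_config["auto_map"]))  # prints True
```

A check like this is one cheap way to surface cross-repo auto_map references before a loader ever touches them.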

What to do

  • Patch: Update vLLM to a release that includes the advisory's fix.
  • Audit model sources: Allowlist trusted repos and block unknown auto_map references.
  • Harden pipelines: Run model loading in sandboxes or isolated containers to limit blast radius.
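The "audit model sources" step can be as simple as an owner allowlist applied before any model pull. A minimal sketch, assuming repo IDs of the form "owner/name"; the allowlist contents and helper name are illustrative, not part of vLLM:

```python
# Example trusted owners; populate from your own policy.
ALLOWED_REPO_OWNERS = {"nvidia", "meta-llama", "mistralai"}


def is_allowed_source(repo_id: str) -> bool:
    """Permit only repos owned by an allowlisted organization."""
    owner, _, name = repo_id.partition("/")
    return bool(name) and owner in ALLOWED_REPO_OWNERS
```

Combined with a block on unknown auto_map references, this keeps the dynamic-loading path reachable only for vetted sources.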

Sources