• The Vulnerability: A critical chain of flaws in vLLM's video handling (CVE-2026-22778) enables Remote Code Execution (RCE).
  • Attack Vector: Attackers can send a malicious video URL (e.g., via the chat/completions or invocations endpoints) to a vLLM instance serving a video model.
  • The Chain:
    • Info Leak: PIL error messages leak heap memory addresses, allowing ASLR bypass.
    • Heap Overflow: A bug in the JPEG2000 decoder (OpenCV/FFmpeg) allows overwriting heap memory when processing malicious "cdef" boxes.
  • Impact: Successful exploitation allows arbitrary system command execution on the server running vLLM.
  • Affected Versions: vLLM versions prior to 0.14.1 are vulnerable. Default configurations (no auth) are immediately exploitable; auth-enabled instances are vulnerable if the attacker has access to the API.
  • Scope: "Millions of AI servers" potentially exposed if running vulnerable versions with video models enabled.
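The attack vector above amounts to an ordinary API call whose message content points at an attacker-controlled video URL. A minimal sketch of the request shape, for illustration only (the endpoint, model name, and URL below are placeholder assumptions, not from the advisory):

```python
import json

# Hypothetical request body for a vLLM OpenAI-compatible endpoint serving a
# video-capable model. Model name and URL are placeholders; the key point is
# that the server fetches and decodes whatever URL appears in the content part.
payload = {
    "model": "some-video-model",  # placeholder model name
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this video."},
                {
                    "type": "video_url",
                    "video_url": {"url": "https://attacker.example/clip.mp4"},
                },
            ],
        }
    ],
}

# This JSON body is what would be POSTed to the serving endpoint.
body = json.dumps(payload)
```

Nothing about the request is malformed at the HTTP or JSON layer; the malicious input lives entirely inside the fetched video file, which is why request-level filtering alone is a weak defense here.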

Why it matters

vLLM is a core component of the modern AI inference stack. This vulnerability demonstrates how dependencies in multimodal processing pipelines (like OpenCV and FFmpeg) introduce classic memory corruption risks into high-level AI services. The ability to trigger RCE via a standard API request makes this a highly critical flaw for any provider offering public or semi-public vLLM endpoints.

What to do

  • Upgrade immediately: Update vLLM to version 0.14.1 or later, which contains the fix.
  • Disable video models: If upgrading is not possible, disable video model support to mitigate the specific attack vector.
  • Restrict network access: Ensure vLLM instances are not directly exposed to the public internet without strict authentication and access controls (though authentication alone does not stop an attacker who already holds API access).
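For triage across a fleet, the version cutoff above can be checked programmatically. A minimal sketch, assuming plain X.Y.Z version strings (real tooling should use a proper version library such as `packaging.version` to handle pre-release and dev suffixes):

```python
def is_vulnerable(version: str, fixed: str = "0.14.1") -> bool:
    """Return True if `version` is older than the first fixed release.

    Naive numeric comparison; assumes plain dotted-integer version
    strings with no pre-release or dev suffixes.
    """
    def parse(v: str) -> tuple:
        return tuple(int(part) for part in v.split("."))

    return parse(version) < parse(fixed)


print(is_vulnerable("0.13.0"))  # True: upgrade required
print(is_vulnerable("0.14.1"))  # False: contains the fix
```

Tuple comparison in Python is element-wise and left-to-right, so `(0, 13, 0) < (0, 14, 1)` evaluates correctly without string-comparison pitfalls like "0.9" sorting after "0.14".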

Read the GitHub Advisory

Read the OX Security Analysis