Wiz — ZeroDay.cloud: cloud + AI infra zero-days

• Category: Security

AI relevance: If you run self-hosted inference (e.g., vLLM/Ollama) on shared Kubernetes/GPU clusters, “cloud infra” vulns like RCE and container escape translate directly into model-serving compromise and cross-tenant risk.

  • Wiz recaps the ZeroDay.cloud 2025 competition, focused on open-source components used in critical cloud infrastructure (and AI infrastructure).
  • Reported outcome: multiple critical RCE-class findings across foundational layers (databases, container runtimes, Linux kernel), several with public CVEs cited.
  • One highlighted class is container escape: breaking isolation in multi-tenant environments where “containers are the security boundary.”
  • They also mention attempted exploit demonstrations against vLLM and Ollama (popular for running open-source models), with the stated goal of accessing private AI artifacts (models/datasets/prompts).
  • Even without the vLLM/Ollama demos landing in the time window, the broader lesson stands: the AI stack inherits all the sharp edges of the cloud stack it sits on.

Why it matters

A lot of “AI security” discussions fixate on prompt injection and model behavior. But the fastest path to owning an AI system is often boring: compromise the model-serving host via RCE, break out of a container, and grab the weights, prompts, caches, and credentials.

What to do

  • Assume model-serving is high-value: isolate inference workloads (dedicated nodes where possible), and treat GPU clusters as crown-jewel infrastructure.
  • Harden container boundaries: keep kernel/container runtime patched, enable seccomp/AppArmor, drop privileges, and avoid mounting host paths into pods.
  • Use defense-in-depth for multi-tenancy: network policies, workload identity, and separate secrets per service (no shared “cluster-wide” tokens).
  • Log and monitor: unusual process execution in model-serving pods, unexpected outbound egress, and access to model artifact stores.
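The container-boundary checks above can be automated. Below is a minimal sketch that audits a Kubernetes Pod spec (parsed to a dict, e.g. from `kubectl get pod -o json`) for the risky settings listed: privileged mode, privilege escalation, root execution, missing seccomp, and hostPath mounts. Field names follow the Pod v1 API, but the checks themselves are illustrative assumptions, not a substitute for an admission controller such as Pod Security Admission.

```python
# Sketch: audit a parsed Pod spec for the hardening points above.
# Assumes the Pod v1 API shape; policy choices here are illustrative.

def audit_pod(pod: dict) -> list[str]:
    findings = []
    spec = pod.get("spec", {})

    # hostPath volumes mount the node filesystem into the pod
    for vol in spec.get("volumes", []):
        if "hostPath" in vol:
            findings.append(
                f"hostPath volume '{vol.get('name')}' mounts the node filesystem"
            )

    for c in spec.get("containers", []):
        sc = c.get("securityContext") or {}
        name = c.get("name", "?")
        if sc.get("privileged"):
            findings.append(f"container '{name}' runs privileged")
        # allowPrivilegeEscalation defaults to true unless explicitly disabled
        if sc.get("allowPrivilegeEscalation", True):
            findings.append(f"container '{name}' allows privilege escalation")
        if not sc.get("runAsNonRoot"):
            findings.append(f"container '{name}' may run as root")
        if (sc.get("seccompProfile") or {}).get("type") != "RuntimeDefault":
            findings.append(f"container '{name}' has no seccomp profile")
    return findings

# Example: a typical unhardened inference pod (hypothetical manifest)
pod = {
    "spec": {
        "volumes": [{"name": "models", "hostPath": {"path": "/srv/models"}}],
        "containers": [{"name": "vllm", "image": "vllm/vllm-openai"}],
    }
}
for f in audit_pod(pod):
    print("FINDING:", f)
```

In practice you would enforce these rules cluster-wide (Pod Security Standards at the `restricted` level, or a policy engine) rather than auditing pod by pod, but the same field checks apply.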

Sources