Microsoft — Framework CVEs: Prompt Injection to RCE Across Semantic Kernel, CrewAI, LangChain
AI relevance: Microsoft's "When Prompts Become Shells" research maps a systemic class of vulnerabilities where prompt injection in AI agent frameworks escalates directly to remote code execution — proving that connected LLMs transform content-security bugs into host-compromise paths.
- Research published May 7, 2026 by Microsoft's security team synthesizes a pattern across the agentic AI ecosystem: when an LLM connects to tools, poisoned content becomes executable code.
- Microsoft Semantic Kernel: CVE-2026-26030 (Critical) —
eval()used to execute lambda expressions in vector store filter functions without sanitization. A simple data lookup becomes an executable payload. Fix: Python ≥ 1.39.4, .NET ≥ 1.71.0. - Microsoft Semantic Kernel: CVE-2026-25592 (Critical) —
DownloadFileAsyncfunction exposed to AI models via[KernelFunction]attribute, enabling arbitrary file writes to the host filesystem from a single malicious prompt. - CrewAI: Sandbox bypass — when Docker is unavailable, framework defaults to SandboxPython which fails to block
ctypescalls; attackers invokectypes.CDLL("libc.so.6").system()for arbitrary command execution. - CrewAI: No continuous Docker availability check — if Docker goes offline mid-session, the system silently degrades to insecure sandbox mode without alerting operators.
- LangChain: CVE-2026-34070 (CVSS 7.5) — path traversal in prompt loading. LangGraph also affected via deserialization vulnerability (CVSS 9.3) leaking API keys.
- LangFlow: CVE-2026-33017 (CVSS 9.8) — unauthenticated RCE via HTTP endpoint.
- Attack vector is consistent: attacker embeds malicious instructions in documents, emails, web pages, or database records that the agent later retrieves; the agent treats poisoned content as legitimate context and executes it via tool access.
- Only two of CrewAI's four identified vulnerabilities have received official vendor statements; tens of thousands of production deployments may remain vulnerable.
Why it matters
This is the clearest articulation yet that prompt injection in agent frameworks is not a content-quality problem — it's a host-compromise problem. When frameworks expose tool-calling functions directly to LLMs, a poisoned retrieval context bypasses the traditional boundary between "what the AI thinks" and "what code runs." The Microsoft research shows this pattern is systemic across the ecosystem, not isolated to one vendor. For teams running agent frameworks in production, the threat model is identical to traditional injection (SQL, command injection): untrusted input reaching an execution context.
What to do
- Audit your agent framework version against the listed CVEs and patch immediately if affected.
- Review which functions your LLM can invoke — any function with file, network, or system access should be treated as callable from untrusted input.
- For CrewAI users: ensure Docker availability is continuously monitored, not just checked at startup.
- Apply defense-in-depth: isolate agent execution in containers with minimal privileges, restrict network egress, and log all tool invocations for anomaly detection.
- Treat retrieved content (documents, web pages, database records) as untrusted — the same standard you'd apply to user input in traditional applications.