Adversa AI — GuardFall: Bash Tricks Bypass Safeguards in 10 of 11 AI Coding Agents

AI relevance: Open-source AI coding agents run shell commands with the developer's full privileges — Adversa AI shows that pattern-based shell guards fail against decades-old Bash expansion tricks, turning poisoned READMEs and Makefiles into remote code execution via supply chain.

What happened

  • Adversa AI published research on GuardFall, a structural vulnerability class affecting open-source AI coding agents and computer-use agents.
  • Researchers tested 11 popular open-source agents (including Hermes, OpenCode, Roo-code, and others) — 10 of 11 left the gap open in at least one of four ways. Only one agent (Continue) blocked the structural majority of bypasses.
  • The root cause: agents use pattern-based shell guards (regex denylists) to inspect commands before execution, but Bash expands, unquotes, and rewrites text before running it — so the guard sees raw text while Bash executes the unwound version.
  • Attack classes ranged from quote removal and $IFS spacing tricks (classes A–D) to alternative argv shapes (class E) that survived even the strongest tokenized guard, because per-flag reasoning requires knowing which flag combinations flip a binary from benign to destructive.
  • The trigger: an engineer uses a vulnerable agent to read a poisoned README, Makefile, or MCP server output from a malicious repository. The agent emits a destructive shell command that passes the guard but runs with the operator's full account authority.
  • Exploitation requires auto-execute mode or local sandbox mode — but these are default configurations in CI pipelines and common developer setups.
  • The research started after finding a NousResearch/hermes-agent approval gate bypass via shell rewrites against a 30-pattern regex denylist.
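The mismatch described in the two root-cause bullets above, as an added illustration (toy code written for this page, not Adversa AI's code or any tested agent's guard): a regex denylist over the raw command string next to a guard that fails closed on anything it cannot tokenize the way Bash will. The denylist, patterns and sample commands are constructed for the demo. The tokenized guard covers the quote-removal and $IFS classes named above but does nothing for the class E argv shapes: it sees the binary and its flags correctly, yet has no model of which flag combinations turn a benign binary destructive.

```python
"""Two toy shell guards illustrating why pattern matching on raw command text
is the wrong layer (added example, not Adversa AI's code). The denylist,
patterns and sample commands are constructed."""
import os
import re
import shlex

# Constructed: binaries the agent should never launch unattended.
DENYLIST = {"rm", "mkfs", "dd", "shred"}

# Guard 1: a regex denylist over the raw string, the kind of guard the post
# says 10 of 11 agents rely on.
RAW_PATTERNS = [
    re.compile(r"\brm\s+-[a-zA-Z]*r[a-zA-Z]*f\b"),   # rm -rf and flag permutations
    re.compile(r"\b(mkfs|shred)\b"),
]

def regex_guard(cmd: str) -> str:
    """Return 'deny' or 'allow' from the raw text alone."""
    return "deny" if any(p.search(cmd) for p in RAW_PATTERNS) else "allow"

# Guard 2: fail closed. Characters that trigger Bash processing this guard does
# not model (parameter expansion, command substitution, separators, redirects)
# are refused outright; the rest is tokenized with shlex, which performs quote
# removal and word splitting the way the shell does, and argv[0] is checked by
# basename rather than by substring.
UNMODELLED = re.compile(r"[$`;|&<>()\n]")

def tokenized_guard(cmd: str) -> str:
    """Return 'deny' or 'allow' after normalizing the command as Bash would."""
    if UNMODELLED.search(cmd):
        return "deny"                      # cannot reproduce Bash's rewrite: refuse
    try:
        argv = shlex.split(cmd)            # quote removal + word splitting
    except ValueError:
        return "deny"                      # unbalanced quotes: refuse
    if not argv or os.path.basename(argv[0]) in DENYLIST:
        return "deny"
    return "allow"

# Constructed commands covering the post's quote-removal and $IFS classes,
# plus a benign command and a redirect that the fail-closed guard refuses.
SAMPLES = [
    "rm -rf ./build",          # plain: both guards deny
    "r''m -rf ./build",        # quote removal: regex allows, tokenized denies
    "rm${IFS}-rf ./build",     # $IFS spacing: regex allows, tokenized denies
    "ls -la",                  # benign: both allow
    "echo done > log.txt",     # redirect: regex allows, tokenized denies
]

def compare(samples=SAMPLES):
    """Map each command to (regex_guard verdict, tokenized_guard verdict)."""
    return {c: (regex_guard(c), tokenized_guard(c)) for c in samples}

if __name__ == "__main__":
    for cmd, (raw, tok) in compare().items():
        print(f"{cmd!r:28} regex={raw:5} tokenized={tok}")
```

The trade-off is visible in the last sample: a guard that refuses everything it cannot model will also refuse ordinary redirects and pipelines, which is why the post's advice centres on removing the agent's reach (scoped $HOME, no auto-yes) rather than on a better denylist.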

Why it matters

  • This is not a bug — it's a design mismatch. A guard inspects raw text while Bash expands and rewrites text before execution. The gap between what the agent thinks it's running and what Bash actually runs is structural, not patchable with better regexes.
  • Supply chain radius is enormous. Developer machines hold AWS credentials, SSH keys, cloud CLI tokens, and CI/CD pipeline access. A malicious repo that exploits GuardFall gets all of it — with no social engineering beyond git clone.
  • CI pipelines are ground zero. Auto-yes modes are default in many CI setups, removing the human-in-the-loop step that might catch a suspicious command.
  • Joins a growing cluster. GuardFall is the latest in a wave of AI coding agent trust boundary failures: CVE-2025-59536 (Claude Code), CVE-2025-54136 (Cursor), CVE-2026-30615 (Windsurf), CVE-2026-12957 (Amazon Q) — all traced to workspace config or ingested content executed without adequate consent.

What to do

  • Run agents from a scoped shell with $HOME redirected. Adversa's strongest stopgap: HOME=$HOME/.agent-sandbox-$RANDOM agent … — keeps the project directory but removes ~/.ssh/, ~/.aws/, shell history, and other secrets from the agent's reach.
  • Disable auto-yes modes in any environment where the agent touches untrusted content (repos, MCP servers, web pages).
  • Audit repo-shipped configs — Makefiles, setup scripts, and agent config files from external repos should be reviewed before the agent processes them.
  • Block agent execution on fork PRs and any untrusted contribution that could carry poisoned content.
  • Check if your agent is vulnerable. Adversa's report lists all 11 tested agents and their specific failure modes. If yours is not Continue, assume exposure and apply the scoped-shell stopgap immediately.
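The scoped-shell stopgap from the first item in this list, as an added example (the wrapper is written for this page, not Adversa AI's; `scoped_env`, `run_scoped`, `PROBE` and `isolation_report` are hypothetical names). It redirects HOME to a fresh, empty directory while keeping the project directory as the working directory, and the probe reports what the child process can actually see. `tempfile.mkdtemp` is swapped in for the post's `$RANDOM` suffix because it creates a unique mode-0700 directory atomically.

```python
"""Run an agent (or any command) with $HOME redirected to a throwaway
directory: the stopgap the post recommends, as an added example (wrapper
written for this page, not Adversa AI's code)."""
import os
import subprocess
import tempfile

def scoped_env(base_env, sandbox_root=None):
    """Copy base_env and point HOME at a fresh, empty, mode-0700 directory.

    Anything that resolves ~, $HOME or the usual dotfile locations (~/.ssh,
    ~/.aws, shell history, credential helpers) now lands in the empty sandbox.
    mkdtemp replaces the post's $RANDOM suffix so the directory is unique and
    created atomically.
    """
    env = dict(base_env)
    root = sandbox_root or env.get("HOME") or tempfile.gettempdir()
    try:
        sandbox = tempfile.mkdtemp(prefix=".agent-sandbox-", dir=root)
    except OSError:                      # real HOME not writable: fall back to tmp
        sandbox = tempfile.mkdtemp(prefix=".agent-sandbox-")
    env["HOME"] = sandbox
    return env

def run_scoped(argv, project_dir, base_env=None):
    """Run argv with cwd=project_dir and the scoped environment."""
    env = scoped_env(os.environ if base_env is None else base_env)
    return subprocess.run(argv, cwd=project_dir, env=env,
                          capture_output=True, text=True, check=False)

# Constructed probe: what an agent's shell would see for HOME, cwd and ~/.ssh.
PROBE = ("printf '%s\\n%s\\n' \"$HOME\" \"$PWD\"; "
         "test -e \"$HOME/.ssh\" && echo ssh-visible || echo ssh-hidden")

def isolation_report(project_dir="."):
    """Run the probe under run_scoped and summarize the isolation achieved."""
    result = run_scoped(["sh", "-c", PROBE], project_dir)
    home, cwd, ssh = result.stdout.splitlines()[:3]
    return {
        "home_redirected": home != os.environ.get("HOME", ""),
        "home_empty": os.path.isdir(home) and not os.listdir(home),
        "project_kept": os.path.realpath(cwd) == os.path.realpath(project_dir),
        "ssh_hidden": ssh == "ssh-hidden",
    }

if __name__ == "__main__":
    print(isolation_report())
```

Replace the `sh -c` probe with the agent's own argv to use it for real; the sandbox directory is left in place so the agent's own config writes survive the run, and the caller removes it afterwards. Note that this only narrows what a destructive command can reach; environment variables carrying cloud tokens are still inherited unless the caller strips them from `base_env`.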

Sources