Trail of Bits Bypasses ClawHub, Cisco & Vercel AI Skill Scanners
AI relevance: Trail of Bits demonstrates that automated scanners used by ClawHub, Cisco AI Defense, and Vercel's skills.sh all fail to detect malicious code hidden in agent skill files — exposing a systemic gap in how AI agent ecosystems vet third-party tooling that executes code on behalf of models.
What Happened
- Trail of Bits bypassed three major AI skill-scanning tools: ClawHub's malicious skill detector, Cisco's open-source skill scanner, and Vercel's skills.sh integrations.
- On ClawHub, researchers evaded detection by inserting roughly 100,000 newline characters ahead of the malicious logic in a skill file. The padding pushed the harmful code beyond the scanner's inspection window and confused integrated engines, including VirusTotal Code Insight.
- Against Cisco's scanner, researchers embedded malicious Python bytecode (.pyc) in a text-formatting skill: the visible source appeared benign, but the compiled bytecode extracted environment variables for data exfiltration.
- Scanners focused on readable source files missed the bytecode payload entirely, classifying the skill as safe.
- An indirect execution attack hid the operational logic in a script concealed inside a document (a .docx archive) that the AI agent was instructed to retrieve. This bypassed both signature-based detection and LLM-based reasoning because the malicious behavior was not in the primary skill definition.
- Prompt injection was used against LLM-based scanners: a skill disguised a malicious package registry configuration as a legitimate enterprise requirement, causing the scanner to downgrade the risk to low severity and approve it.
- These attacks exploit fundamental constraints — limited context windows, selective file inspection, and the assumption that visible source code tells the full story.
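The newline-padding bypass can be illustrated with a minimal Python sketch. It is not the code of any scanner named above: `naive_scan`, its 50,000-character window, and the `os.environ` pattern are hypothetical stand-ins for a scanner that inspects only the start of a file, and `longest_blank_run` is one possible heuristic for flagging suspicious padding.

```python
def naive_scan(text: str, window: int = 50_000) -> bool:
    """Hypothetical scanner: flags a file only if a suspicious pattern
    appears within the first `window` characters (assumed limit)."""
    return "os.environ" in text[:window]


def longest_blank_run(text: str) -> int:
    """Return the length of the longest run of consecutive blank lines."""
    longest = current = 0
    for line in text.split("\n"):
        if line.strip() == "":
            current += 1
            longest = max(longest, current)
        else:
            current = 0
    return longest


# Illustrative skill content; the payload line is a harmless placeholder.
benign_header = "# Text formatting skill\n"
payload = "import os; print(os.environ)\n"

unpadded_skill = benign_header + payload
padded_skill = benign_header + "\n" * 100_000 + payload

if __name__ == "__main__":
    print("unpadded flagged:", naive_scan(unpadded_skill))
    print("padded flagged:", naive_scan(padded_skill))
    print("longest blank run:", longest_blank_run(padded_skill))
```

In this sketch the padded file slips past the windowed check even though the payload is still present in the full text, while the blank-run count makes the padding itself easy to spot. A real pipeline would need to choose its own threshold for what counts as abnormal padding.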
Why It Matters
- All three scanning approaches — static analysis, pattern matching, and LLM-based inspection — were defeated using basic obfuscation techniques that don't require advanced exploitation knowledge.
- AI agent skills are executable code that can influence model behavior, access tools, and interact with external APIs. A malicious skill installed by a developer or user effectively gives an attacker a trusted path inside the agent's execution context.
- Public skill marketplaces are growing rapidly but prioritize usability over security — creating a widening attack surface as organizations deploy third-party skills without rigorous vetting.
- This research validates what the Acronis TRU campaign showed in May 2026: AI skill ecosystems are a proven malware distribution vector, not a theoretical risk.
What to Do
- Treat all public AI skills as untrusted code. Apply the same scrutiny you would to any third-party dependency.
- Adopt supply chain security practices: curated repositories, strict access controls, and version pinning for skills used in production.
- Audit installed skills for compiled bytecode, hidden scripts in archive files, or instructions that direct the agent to fetch external logic.
- Do not rely solely on automated scanning — combine it with code review, sandboxing, and runtime monitoring of agent behavior.
- Restrict skill permissions to the minimum required. Deny skills that request broad filesystem, network, or credential access without justification.
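As a starting point for the audit step above, the following Python sketch walks a skill directory and reports compiled bytecode files and script-like members inside ZIP-based archives such as .docx. The function name `audit_skill_dir`, the suffix lists, and the finding format are assumptions made for illustration, not part of any tool mentioned in this article.

```python
from pathlib import Path
import zipfile

# Assumed suffix lists for illustration; extend them for your environment.
BYTECODE_SUFFIXES = {".pyc", ".pyo"}
SCRIPT_SUFFIXES = {".py", ".sh", ".js", ".ps1", ".bat"}


def audit_skill_dir(root: Path) -> list[str]:
    """Return human-readable findings for files in a skill directory
    that deserve manual review."""
    findings = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        if path.suffix.lower() in BYTECODE_SUFFIXES:
            # Compiled bytecode is not covered by a source-only review.
            findings.append(f"bytecode: {rel}")
        elif zipfile.is_zipfile(path):
            # .docx and similar formats are ZIP containers; list their members.
            with zipfile.ZipFile(path) as archive:
                for member in archive.namelist():
                    if Path(member).suffix.lower() in SCRIPT_SUFFIXES:
                        findings.append(f"script in archive: {rel}!{member}")
    return findings


if __name__ == "__main__":
    import sys

    for finding in audit_skill_dir(Path(sys.argv[1])):
        print(finding)
```

Run it with the path to an installed skill as the only argument. The sketch does not unpack nested archives, decompile bytecode, or detect instructions that tell the agent to fetch external logic, so it supplements manual review and sandboxing and does not replace them.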