arXiv — Securing LLM Agents Needs Intent-to-Execution Integrity (2605.16976)

2026-05-22 Research by al-ice.ai Editorial

AI relevance: This position paper identifies a foundational gap in LLM agent security — the lack of a formal correctness property defining what "secure execution" means — and maps existing defenses against four integrity requirements that any agentic system must satisfy.

Researchers from NUS, UCLA, and Berkeley (Qu et al., May 2026) argue that LLM agent security requires a formal intent-to-execution integrity property — analogous to compiler correctness — that specifies when an agent's actions faithfully reflect the user's intent.
They identify four conjunctive integrity properties: Tool Integrity (tools do what they claim), Instruction Integrity (user instructions aren't hijacked), Judgment Integrity (the model's decisions aren't poisoned), and Data Flow Integrity (data moves only along intended paths).
The paper highlights systems like OpenClaw, whose open skill ecosystems and third-party tool integrations mean tools cannot be assumed trusted. That breaks the trusted-tool assumption implicit in most existing agent-security defenses.
Analysis of existing defenses (NemoClaw, SeClaw, SafeClaw-R, SecureClaw) reveals only partial, non-compositional coverage — each stacks mechanisms without defining what correctness property those mechanisms are intended to achieve.
The authors draw a compiler analogy: just as a compiler must preserve program semantics from source to binary, an LLM agent must preserve user intent from natural language to tool-call execution. Security violations are "mis-executions" in this pipeline.
The paper catalogs real-world evidence of supply-chain risks in agent tool ecosystems: the ClawHavoc campaign (1,000+ malicious ClawHub skills), a 42,447-skill audit finding 26.1% with vulnerabilities, and documented sandbox bypasses.
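
The conjunctive structure of the four properties can be sketched in a few lines of Python. This is a minimal illustration, not the paper's formalization: the `Property` enum and `IntegrityReport` class are hypothetical names introduced here, and the sketch assumes each property can be reduced to a pass/fail verdict for a single agent execution.

```python
from dataclasses import dataclass, field
from enum import Enum


class Property(Enum):
    TOOL = "tool integrity"
    INSTRUCTION = "instruction integrity"
    JUDGMENT = "judgment integrity"
    DATA_FLOW = "data flow integrity"


@dataclass
class IntegrityReport:
    """Hypothetical per-execution verdicts, one boolean per property."""
    verdicts: dict = field(default_factory=dict)  # Property -> bool

    def violated(self):
        # A property missing from the report counts as violated:
        # absence of evidence is not treated as a pass.
        return [p for p in Property if not self.verdicts.get(p, False)]

    def holds(self):
        # Conjunctive: integrity holds only if all four properties hold.
        return not self.violated()


report = IntegrityReport({
    Property.TOOL: True,
    Property.INSTRUCTION: True,
    Property.JUDGMENT: True,
    Property.DATA_FLOW: False,
})
print(report.holds())
print([p.value for p in report.violated()])
```

Because the properties are conjunctive, a single failing verdict is enough to make `holds()` return false, which is why a defense that covers only some properties cannot establish integrity on its own.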

Why it matters

The agentic AI security field is drowning in point solutions (sandboxing, injection filters, policy enforcement) without a unified theory of what security means for agents. This paper provides the missing framework to evaluate whether defense compositions actually close all attack paths — or leave exploitable gaps between mechanisms.

What to do

Read the paper and map your agent architecture against the four integrity properties — identify which are unaddressed.
Treat third-party tools as untrusted by default, especially in open-skill ecosystems.
Audit how your defenses compose: look for attack paths that stay open in the gaps between individually partial mechanisms.
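
A first pass at the steps above might look like the following sketch. Everything in it is an assumption made for illustration: the defense names, the properties each one is claimed to address, and the `REVIEWED_TOOLS` allowlist are hypothetical placeholders to be replaced with your own inventory.

```python
PROPERTIES = ("tool", "instruction", "judgment", "data_flow")

# Hypothetical inventory: the properties each deployed defense claims to address.
defenses = {
    "sandbox": {"data_flow"},
    "injection_filter": {"instruction"},
}


def unaddressed(defenses):
    """Return the properties that no deployed defense claims to address."""
    covered = set().union(*defenses.values())
    return [p for p in PROPERTIES if p not in covered]


# Hypothetical allowlist of tools that have passed an internal review.
REVIEWED_TOOLS = {"internal_search"}


def is_trusted(tool_name):
    # Deny by default: anything not explicitly reviewed is untrusted.
    return tool_name in REVIEWED_TOOLS


print(unaddressed(defenses))            # ['tool', 'judgment']
print(is_trusted("third_party_skill"))  # False
```

A property showing up as covered only means some mechanism targets it. It does not show that the mechanisms compose to close every attack path, so the output is a starting point for the audit rather than its result.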

Sources

arXiv:2605.16976, Securing LLM Agents Needs Intent-to-Execution Integrity