arXiv — Optimizing agent planning for security and autonomy

• Category: Research

AI relevance: The paper targets agentic systems that act under prompt-injection defenses and proposes planning changes that let them complete more actions securely without human approval.

  • The work focuses on indirect prompt injection and system-level defenses that enforce confidentiality and integrity policies.
  • The authors argue that prior evaluations miss a key dimension: how much of an agent's work can proceed without human-in-the-loop (HITL) approval.
  • They define autonomy metrics that capture the share of consequential actions executed without HITL approval while preserving security.
  • They introduce a security-aware agent that plans explicitly for both task progress and policy compliance.
  • The design adds richer HITL interactions to reduce unnecessary approvals.
  • The implementation builds on an information-flow control defense against prompt injection.
  • Evaluations run on the AgentDojo and WASP benchmarks.
  • Results show higher autonomy than prior baselines without sacrificing utility.
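
As a rough illustration of how an information-flow control defense can decide when approval is needed, the sketch below tags each value with an integrity label and asks for HITL approval only when a consequential tool would act on untrusted data. It is a minimal sketch under stated assumptions, not the paper's implementation: the `Labeled` type, the `combine` and `needs_approval` helpers, and the tool names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Labeled:
    """A value tagged with an integrity label (hypothetical type)."""
    value: str
    trusted: bool  # False if the value came from an untrusted source, e.g. a web page


def combine(*parts: Labeled) -> Labeled:
    """Derived data is trusted only if every input it was built from is trusted."""
    return Labeled(
        value=" ".join(p.value for p in parts),
        trusted=all(p.trusted for p in parts),
    )


# Assumption: which tools count as consequential is a deployment decision.
CONSEQUENTIAL_TOOLS = {"send_email", "transfer_funds"}


def needs_approval(tool: str, args: Sequence[Labeled]) -> bool:
    """Ask for HITL approval only when a consequential tool would use untrusted data."""
    return tool in CONSEQUENTIAL_TOOLS and any(not a.trusted for a in args)
```

In this sketch, an email built only from the user's request goes ahead unprompted, while one that mixes in text fetched from a web page is escalated. Under these assumptions, a planner that keeps untrusted data out of consequential calls would trigger fewer approvals.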

Why it matters

  • Operators often disable strict defenses because they slow agents down; autonomy metrics make that trade-off measurable.
  • Planning for policy compliance could reduce alert fatigue and improve approval throughput.
  • Security teams can compare defenses on both safety and operational cost, not just task success.

What to do

  • Add autonomy KPIs to your agent evaluations if you deploy HITL workflows.
  • Test policy-aware planning for high-risk tools and data sources.
  • Benchmark with prompt-injection suites (AgentDojo/WASP) before production rollouts.
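
One way to turn the autonomy KPI into a number is to log every action an agent takes during an evaluation run and compute the share of consequential actions that ran without approval and without a policy violation. The sketch below follows that reading of the metric; the paper's exact definition may differ, and the `ActionRecord` fields and the `autonomy_rate` function are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ActionRecord:
    """One logged agent action (hypothetical schema)."""
    consequential: bool     # the action has side effects worth gating
    human_approved: bool    # a human had to approve it before execution
    policy_violation: bool  # it broke a confidentiality or integrity policy


def autonomy_rate(log: Sequence[ActionRecord]) -> Optional[float]:
    """Share of consequential actions executed securely without HITL approval.

    Returns None when the log holds no consequential actions, since the
    ratio is undefined in that case.
    """
    consequential = [a for a in log if a.consequential]
    if not consequential:
        return None
    autonomous = [
        a for a in consequential
        if not a.human_approved and not a.policy_violation
    ]
    return len(autonomous) / len(consequential)
```

Reporting this rate next to task success and attack success rate makes it visible when a defense buys safety by routing most actions through a human.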

Links