arXiv — Optimizing agent planning for security and autonomy
• Category: Research
AI relevance: The paper targets agentic systems that execute actions under prompt-injection defenses and proposes planning changes to raise secure autonomy.
- The work focuses on indirect prompt injection and system-level defenses that enforce confidentiality and integrity policies.
- Authors argue prior evaluations miss a key dimension: how much autonomy remains without human approval.
- They define autonomy metrics as the share of consequential actions executed without HITL approval while preserving security.
- A security-aware agent is introduced that explicitly plans for task progress and policy compliance.
- The design adds richer HITL interactions to reduce unnecessary approvals.
- Implementation builds atop an information-flow control defense against prompt injection.
- Evaluations run on AgentDojo and WASP benchmarks.
- Results show higher autonomy without sacrificing utility compared to prior baselines.
Why it matters
- Operators often disable strict defenses because they slow agents down; autonomy metrics make that trade-off measurable.
- Planning for policy compliance could reduce alert fatigue and improve approval throughput.
- Security teams can compare defenses on both safety and operational cost, not just task success.
What to do
- Add autonomy KPIs to your agent evaluations if you deploy HITL workflows.
- Test policy-aware planning for high-risk tools and data sources.
- Benchmark with prompt-injection suites (AgentDojo/WASP) before production rollouts.