UCSC / The Register — CHAI: physical prompt injection hijacks self-driving cars and drones via road signs

  • Researchers at UC Santa Cruz and Johns Hopkins introduce CHAI (Command Hijacking against embodied AI), a new class of environmental indirect prompt injection that uses printed signs in the physical world to hijack AI-powered vehicles and drones.
  • How it works: text-based commands (e.g., "proceed", "turn left") are displayed on signs placed in the camera's field of view. The large vision-language model (LVLM) driving the system interprets the sign text as an instruction, overriding its actual mission.
  • The team used AI to optimize both the prompt wording and the sign's visual properties (font, color, placement) to maximize the probability of the LVLM obeying the injected command.
  • Self-driving car tests: 81.8% success rate. Using the DriveLM dataset, CHAI tricked LVLMs into turning left at a crosswalk where pedestrians were present — overriding the correct "slow down" decision.
  • Drone tracking tests: up to 95.5% success rate. CloudTrack was fooled into misidentifying a generic car as a police vehicle by simply displaying "Police Santa Cruz" on the car's roof.
  • Attacks worked across multiple languages (English, Chinese, Spanish, Spanglish) and against both closed (GPT-4o) and open (InternVL) models.
  • Physical-world validation confirmed results: RC cars in real corridors obeyed injected commands at rates comparable to simulation.
  • Green backgrounds with yellow text were the most effective sign design across all tested languages.

Why it matters

  • This takes prompt injection out of chatbots and into the physical world. Embodied AI systems — autonomous vehicles, delivery drones, security robots — are now demonstrated targets for environmental manipulation.
  • Unlike adversarial image patches (which exploit vision model weights), CHAI exploits the language understanding layer, meaning any LVLM-powered system that reads its environment is potentially vulnerable.
  • No digital access needed: the attacker just prints a sign. This has implications for physical security of autonomous systems in public spaces.

What to do

  • Don't trust visual text as commands: embodied AI systems should treat OCR-extracted text from the environment as untrusted input, not instructions.
  • Implement input separation: architect LVLM pipelines so that environmental observations and system commands flow through distinct channels that cannot be confused.
  • Test against environmental injection: red-team autonomous systems with physical-world prompt injection scenarios before deployment.
  • Monitor for anomalous behavior: runtime anomaly detection that flags sudden decision changes (e.g., "turn" when "stop" was expected) can catch injection attempts.

Sources