arXiv — Prompt injection vs LLM rankers

AI relevance: Any AI system that ranks or re-ranks content (search, retrieval, recommendations) can be steered if the candidate items embed prompt injections.

  • The paper evaluates how prompt injections embedded in candidate documents can hijack LLM-based ranking decisions.
  • Authors test three ranking paradigms: pairwise, listwise, and setwise rankers.
  • Two injection styles are evaluated: decision objective hijacking and decision criteria hijacking.
  • They measure both attack success rate and downstream ranking quality degradation (nDCG@10).
  • The study compares vulnerability across model families, architectures, and position sensitivity.
  • One key result: encoder–decoder models show stronger inherent resistance to jailbreak-style injections.
  • Code and extra experiments are released for reproducibility and follow-on testing.

Why it matters

  • LLM rankers are increasingly used in search and RAG pipelines, making ranking integrity a core security property.
  • Attackers can exploit prompt injections to surface malicious content or hide relevant results in AI-driven ranking stacks.

What to do

  • Red-team ranking pipelines with injected candidates before production use.
  • Track architecture choices (encoder–decoder vs decoder-only) as a security control, not just a latency tradeoff.
  • Filter or neutralize untrusted instructions in candidate documents before LLM scoring.

Sources