GreyNoise — 91K attack sessions reveal active targeting of exposed LLM infrastructure

AI relevance: threat actors are running systematic, automated campaigns against exposed LLM infrastructure — scanning for misconfigured Ollama servers, probing 73+ model API endpoints, and exploiting SSRF to force AI servers to phone home to attacker infrastructure.

What happened

  • GreyNoise's Ollama honeypot infrastructure captured 91,403 attack sessions between October 2025 and January 2026, revealing two distinct targeting campaigns against AI deployments.
  • The SSRF campaign exploited Ollama's model-pull functionality, injecting malicious registry URLs to force servers into outbound connections to attacker-controlled infrastructure; activity spiked to 1,688 sessions in a 48-hour window around Christmas.
  • Attackers used ProjectDiscovery's OAST (out-of-band application security testing) infrastructure to confirm successful SSRF exploitation via callback validation; a single JA4H fingerprint (an HTTP client fingerprint) appeared in 99% of attacks, likely indicating shared Nuclei-based automation tooling.
  • The enumeration campaign began December 28, 2025: two IPs systematically probed 73+ model API endpoints, testing OpenAI-compatible and Gemini request formats against GPT-4o, Claude Sonnet/Opus/Haiku, Llama 3.x, DeepSeek-R1, Gemini, Mistral, Qwen, and Grok, generating 80,469 sessions in eleven days.
  • Probe queries were deliberately innocuous ("hi", "How many states are there in the US?") to fingerprint which model responds without triggering security alerts.
  • Infrastructure traced to AS210558 (1337 Services GmbH) and AS51396 (Pfcloud UG) — known bulletproof hosting providers frequently used for scanning infrastructure.
  • A separate 293-day investigation found 175,000 unique publicly accessible Ollama hosts across 130 countries, confirming the attack surface is massive and growing.
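The enumeration pattern described above (one source probing dozens of model names with short, innocuous prompts) is detectable from ordinary API access logs. A minimal sketch, assuming you can parse log records into (source IP, requested model, prompt) tuples; the field layout, sample data, and thresholds here are illustrative, not from the GreyNoise report:

```python
from collections import defaultdict

# Hypothetical parsed access-log records: (source_ip, model_name, prompt)
records = [
    ("203.0.113.7", "gpt-4o", "hi"),
    ("203.0.113.7", "claude-3-opus", "hi"),
    ("203.0.113.7", "llama-3.1-70b", "How many states are there in the US?"),
    ("203.0.113.7", "deepseek-r1", "hi"),
    ("198.51.100.2", "llama-3.1-70b", "Summarize the attached incident report"),
]

# Enumeration heuristic: one source sending short, generic prompts
# to many distinct model names. Thresholds are illustrative; tune
# against your own baseline traffic.
MAX_MODELS = 3
MAX_PROMPT_LEN = 40

short_probes = defaultdict(set)
for ip, model, prompt in records:
    if len(prompt) <= MAX_PROMPT_LEN:
        short_probes[ip].add(model)

suspects = sorted(ip for ip, models in short_probes.items()
                  if len(models) > MAX_MODELS)
print(suspects)  # ['203.0.113.7']
```

Grouping by JA4H fingerprint instead of source IP, where available, would catch the same tooling operating from rotating addresses.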

Why it matters

Many AI teams treat inference servers and model APIs as internal tools, deploying them with default configurations and no perimeter controls. These campaigns prove attackers already know where to look and how to exploit them. The enumeration campaign is especially concerning: it is building a target list of exposed AI infrastructure, mapping which models are running where, a prerequisite for targeted exploitation at scale.

What to do

  • Never expose Ollama (port 11434) or other inference servers directly to the internet — use reverse proxies with authentication.
  • Disable or restrict Ollama's model-pull functionality to allowlisted registries only.
  • Monitor outbound connections from AI infrastructure for SSRF callbacks — unexpected HTTP calls to external domains are a strong indicator of compromise.
  • Apply network segmentation: inference servers should not have unrestricted outbound internet access.
  • Treat AI infrastructure with the same perimeter discipline as any internet-facing service — default credentials, exposed APIs, and unauthenticated endpoints are being actively scanned.
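The outbound-monitoring and segmentation points above reduce to one check: an inference host should only ever connect out to a short, known list of destinations. A minimal sketch of allowlist-based egress review, assuming outbound destinations can be exported from flow logs; the hostnames, allowlist, and callback domain are illustrative:

```python
# Destinations an inference host legitimately needs. Illustrative
# allowlist; registry.ollama.ai is Ollama's default model registry.
ALLOWED = {"registry.ollama.ai"}

# Hypothetical outbound connections exported from flow logs:
# (source_host, destination_domain)
connections = [
    ("inference-01", "registry.ollama.ai"),
    ("inference-01", "abc123.oast.example"),  # OAST-style callback domain
    ("inference-02", "registry.ollama.ai"),
]

# Anything outside the allowlist from an inference host is worth an
# alert: in the SSRF campaign above, the callback IS the compromise signal.
alerts = [(host, dest) for host, dest in connections if dest not in ALLOWED]
for host, dest in alerts:
    print(f"ALERT: {host} -> {dest} (possible SSRF callback)")
```

In practice the same allowlist should also be enforced, not just monitored, via egress firewall rules or proxy policy, so a successful injection cannot complete its callback at all.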

Sources