GreyNoise — Threat actors actively targeting exposed LLM endpoints
- Category: Security
- What happened: GreyNoise summarized findings from an Ollama honeypot that recorded 91,403 attack sessions (Oct 2025–Jan 2026) aimed at AI/LLM deployment surfaces.
- Campaign 1 (SSRF-style callbacks): attackers tried to force servers to make outbound “phone home” requests, including attempts via Ollama model pull URL handling (and co-occurring probes against Twilio SMS webhook patterns).
- Validation channel: GreyNoise says attackers used ProjectDiscovery OAST callback domains to confirm the server made the outbound request.
- Campaign 2 (enumeration): two IPs launched a high-volume probe across 73+ model endpoints, generating 80,469 sessions in ~11 days to fingerprint exposed LLM proxies.
- Fingerprint prompts: the probe used deliberately innocuous queries (e.g., “How many states are there…?”, “What model are you?”), likely chosen to identify which backend is responding without tripping content filters.
- Model coverage: the probe list included OpenAI-compatible and Gemini formats, spanning OpenAI, Anthropic, Meta (Llama), DeepSeek, Google (Gemini), Mistral, Qwen, xAI, etc.
- Attribution posture: GreyNoise frames the SSRF/OAST campaign as possibly research/bug bounty behavior, but assesses the enumeration as a more concerning threat-actor recon pattern.
- Defensive indicators: the post includes suggested blocks (OAST domains, IPs/ASNs, and JA4 fingerprints) and highlights egress filtering + rate limiting.
Why it matters
- LLM endpoints are becoming “internet services”: once you expose an OpenAI-compatible API, you inherit the same recon/scan lifecycle as any other service.
- Misconfigured proxies are a real prize: if a proxy forwards to paid/commercial APIs, attackers can turn it into free inference (or a foothold for deeper access) via simple enumeration.
- Egress is security: the SSRF angle is a reminder that outbound connectivity from model hosts can be the exploit confirmation path.
What to do
- Require auth + isolate: keep LLM APIs off the public internet where possible; enforce strong auth and per-tenant rate limits where not.
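The per-tenant rate-limit point above can be sketched as a token bucket keyed by tenant or API key. This is a minimal illustration, not GreyNoise's recommendation; the capacity and refill numbers are placeholder assumptions:

```python
import time

# Minimal per-tenant token-bucket sketch for rate limiting an LLM API.
# Capacity and refill rate are illustrative values, not recommendations.
class TokenBucket:
    def __init__(self, capacity=20, refill_per_sec=2.0):
        self.capacity = capacity
        self.refill = refill_per_sec
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self):
        """Consume one token if available; refill based on elapsed time."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

buckets = {}  # tenant / API key -> TokenBucket

def check(tenant):
    """Gate each request on the caller's bucket; False means throttle."""
    return buckets.setdefault(tenant, TokenBucket()).allow()
```

A burst up to `capacity` is allowed, then requests are paced at `refill_per_sec`; anything beyond that should get a 429 rather than reaching the model backend.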
- Detect the “innocuous fingerprint” pattern: alert on rapid requests across many model names/endpoints, especially using the exact prompt strings GreyNoise highlighted.
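One way to detect the enumeration pattern above is a sliding-window count of distinct model paths per source IP. The log-record shape, window length, and threshold below are assumptions for illustration:

```python
from collections import defaultdict, deque

# Illustrative sketch: flag source IPs that hit many distinct model
# endpoints within a short window -- the recon pattern GreyNoise
# describes. Window and threshold values are assumptions.
WINDOW_SECS = 300        # sliding window length
DISTINCT_THRESHOLD = 10  # distinct model paths before alerting

class EnumerationDetector:
    def __init__(self, window=WINDOW_SECS, threshold=DISTINCT_THRESHOLD):
        self.window = window
        self.threshold = threshold
        self.hits = defaultdict(deque)  # ip -> deque of (ts, path)

    def observe(self, ts, ip, path):
        """Record one request; return True when the IP should be flagged."""
        q = self.hits[ip]
        q.append((ts, path))
        # Evict entries older than the window.
        while q and ts - q[0][0] > self.window:
            q.popleft()
        distinct = {p for _, p in q}
        return len(distinct) >= self.threshold

detector = EnumerationDetector()
for i in range(12):
    flagged = detector.observe(i, "203.0.113.5", f"/v1/models/model-{i}")
print(flagged)  # True: 10+ distinct paths seen inside the window
```

Matching on the exact prompt strings from the GreyNoise report can be layered on top as a higher-confidence signal.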
- Defensive validation (safe): run an internal scan of your environment for accidentally exposed OpenAI-compatible routes (e.g., /v1/chat/completions, /v1/models) on non-edge hosts.
- Egress filtering: restrict model servers from making arbitrary outbound HTTP requests; explicitly allow only required registries and upstream APIs.
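The defensive-validation step can be sketched as a small stdlib-only probe. The host inventory, port, and timeout are placeholders for your own environment; run this only against assets you own:

```python
import urllib.request

# Illustrative internal-scan sketch: probe hosts on your own network for
# unauthenticated OpenAI-compatible routes. The host list and port below
# are hypothetical -- substitute your actual inventory.
CANDIDATE_PATHS = ["/v1/models", "/v1/chat/completions"]

def probe_host(host, port=8000, timeout=3):
    """Return (path, status) pairs for routes that answered with 2xx."""
    findings = []
    for path in CANDIDATE_PATHS:
        url = f"http://{host}:{port}{path}"
        req = urllib.request.Request(url, method="GET")
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                findings.append((path, resp.status))
        except Exception:
            # Closed port, timeout, or 4xx/5xx: not exposed via this check.
            pass
    return findings

if __name__ == "__main__":
    for host in ["10.0.0.5", "10.0.0.6"]:  # hypothetical inventory
        exposed = probe_host(host)
        if exposed:
            print(f"{host}: unauthenticated routes -> {exposed}")
```

Any hit on a non-edge host means the route is answering without auth and should be firewalled or put behind a gateway.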
Sources
- GreyNoise (primary): Threat Actors Actively Targeting LLMs
- DefusedCyber (referenced by GreyNoise): Post referenced in report
- ProjectDiscovery OAST (background): Interactsh / OAST