Operation ‘Bizarre Bazaar’: LLMjacking campaign targets exposed LLM/MCP endpoints (Pillar Security)

• Category: Security

  • What happened: Pillar Security reports a coordinated operation (“Bizarre Bazaar”) that systematically finds exposed AI endpoints, validates access, and resells it through an online marketplace.
  • Scale signal: their honeypots recorded ~35,000 attack sessions over ~40 days (~875/day), suggesting automation + sustained targeting, not one-off poking.
  • Targets: exposed/unauthenticated LLM inference endpoints (e.g., Ollama on 11434, OpenAI-compatible APIs on 8000), plus publicly reachable production chatbots and dev/staging stacks with public IPs.
  • MCP angle: they highlight exposed MCP servers as high-value pivot points because MCP can bridge the model to filesystems, databases, cloud APIs, Kubernetes, and shells.
  • Monetization: Pillar describes a “scanner → validator → marketplace” supply chain, with a service (reported as silver.inc) reselling discounted access to many model providers.
  • Impact isn’t just cost: beyond compute theft, exposed endpoints can leak sensitive prompt/context data, and can enable lateral movement if tool servers are reachable.
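The "validator" stage described above is straightforward to reproduce against your own hosts: a default Ollama install answers `GET /api/tags` with its model list and requires no authentication. A minimal self-audit sketch (host and port are placeholders for your own infrastructure):

```python
# Probe a host for an unauthenticated Ollama endpoint, for auditing
# your OWN infrastructure. /api/tags lists installed models and needs
# no auth on a default Ollama install (port 11434).
import json
import urllib.error
import urllib.request


def check_ollama_exposed(host: str, port: int = 11434, timeout: float = 3.0):
    """Return the model names if the endpoint answers unauthenticated, else None."""
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = json.load(resp)
            return [m.get("name") for m in data.get("models", [])]
    except (urllib.error.URLError, TimeoutError, json.JSONDecodeError, OSError):
        return None  # closed port, timeout, or non-Ollama response


if __name__ == "__main__":
    models = check_ollama_exposed("127.0.0.1")
    if models is not None:
        print(f"EXPOSED: unauthenticated Ollama, models: {models}")
    else:
        print("no unauthenticated Ollama endpoint found")
```

If this returns a model list from a public IP, anyone running a scanner sees the same thing.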

Why it matters

  • AI endpoints are becoming “shadow infrastructure”: engineers spin up Ollama/vLLM, proxies, and tool servers quickly; security posture often lags.
  • Discovery is fast: once an endpoint appears in internet scans (Shodan/Censys), exploitation attempts can start within hours.
  • Agentic tooling changes the blast radius: an exposed “chatbot API” is bad; an exposed MCP server can be a bridge into internal systems.

What to do

  1. Inventory and scan your external surface: find any public LLM endpoints, proxies, “OpenAI-compatible” APIs, and MCP servers (prod + staging + “temporary”).
  2. Require auth everywhere: don’t rely on obscurity or “it’s just a dev box” — add auth + network ACLs, and bind internal services to private interfaces.
  3. Keep MCP off the public internet: treat MCP servers the way you'd treat a database — reachable only on a private network, with strict authn/z and least privilege.
  4. Add rate limits + usage caps: even public-facing assistants should have WAF/CDN protections and anomaly detection for enumeration patterns.
  5. Log for investigations: keep request logs for AI endpoints (models requested, tokens/volume, auth failures, unusual tool usage).
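Step 1 can be bootstrapped with a quick sweep of your own address space before internet scanners get there. A sketch using a plain TCP connect check on the ports named above (the host list is a placeholder for your inventory; follow up any hit with an HTTP probe and an auth check):

```python
# First-pass inventory sweep of your own hosts for listening
# LLM-related ports. A connect alone proves nothing about auth,
# so treat each hit as a lead to investigate, not a finding.
import socket

# Ports from the report: Ollama (11434), OpenAI-compatible APIs (8000).
# Extend with whatever proxies/tool servers your stack runs.
LLM_PORTS = [11434, 8000]


def open_llm_ports(host: str, ports=LLM_PORTS, timeout: float = 1.0) -> list[int]:
    """Return the subset of ports accepting TCP connections on host."""
    found = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            pass  # closed, filtered, or timed out
    return found


if __name__ == "__main__":
    for host in ["192.0.2.10"]:  # placeholder: your external IPs/hostnames
        hits = open_llm_ports(host, timeout=0.5)
        if hits:
            print(f"{host}: open LLM ports {hits} -- verify auth is enforced")
```

Running this from outside your network (not just inside it) matters: NAT rules and cloud security groups often expose "internal" dev boxes without anyone noticing.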

Sources