Operation ‘Bizarre Bazaar’: LLMjacking campaign targets exposed LLM/MCP endpoints (Pillar Security)

• Category: Security

  • What happened: Pillar Security reports a coordinated operation (“Bizarre Bazaar”) that systematically finds exposed AI endpoints, validates access, and resells it through an online marketplace.
  • Scale signal: their honeypots recorded ~35,000 attack sessions over ~40 days (~875/day), suggesting automation + sustained targeting, not one-off poking.
  • Targets: exposed/unauthenticated LLM inference endpoints (e.g., Ollama on 11434, OpenAI-compatible APIs on 8000), plus publicly reachable production chatbots and dev/staging stacks with public IPs.
  • MCP angle: they highlight exposed MCP servers as high-value pivot points because MCP can bridge the model to filesystems, databases, cloud APIs, Kubernetes, and shells.
  • Monetization: Pillar describes a “scanner → validator → marketplace” supply chain, with a service (reported as silver.inc) reselling discounted access to many model providers.
  • Impact isn’t just cost: beyond compute theft, exposed endpoints can leak sensitive prompt/context data, and can enable lateral movement if tool servers are reachable.
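The "validator" stage described above is straightforward to reproduce against your own hosts: a default Ollama install answers `GET /api/tags` with its model list and requires no authentication. A minimal self-audit sketch (host and port are placeholders for your own infrastructure):

```python
# Probe a host for an unauthenticated Ollama endpoint, for auditing
# your OWN infrastructure. /api/tags lists installed models and needs
# no auth on a default Ollama install (port 11434).
import json
import urllib.error
import urllib.request


def check_ollama_exposed(host: str, port: int = 11434, timeout: float = 3.0):
    """Return the model names if the endpoint answers unauthenticated, else None."""
    url = f"http://{host}:{port}/api/tags"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            data = json.load(resp)
            return [m.get("name") for m in data.get("models", [])]
    except (urllib.error.URLError, TimeoutError, json.JSONDecodeError, OSError):
        return None  # closed port, timeout, or non-Ollama response


if __name__ == "__main__":
    models = check_ollama_exposed("127.0.0.1")
    if models is not None:
        print(f"EXPOSED: unauthenticated Ollama, models: {models}")
    else:
        print("no unauthenticated Ollama endpoint found")
```

If this returns a model list from a public IP, anyone running a scanner sees the same thing.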

Why it matters

  • AI endpoints are becoming “shadow infrastructure”: engineers spin up Ollama/vLLM, proxies, and tool servers quickly; security posture often lags.
  • Discovery is fast: once an endpoint appears in internet scans (Shodan/Censys), exploitation attempts can start within hours.
  • Agentic tooling changes the blast radius: an exposed “chatbot API” is bad; an exposed MCP server can be a bridge into internal systems.

What to do

  1. Inventory and scan your external surface: find any public LLM endpoints, proxies, “OpenAI-compatible” APIs, and MCP servers (prod + staging + “temporary”).
  2. Require auth everywhere: don’t rely on obscurity or “it’s just a dev box” — add auth + network ACLs, and bind internal services to private interfaces.
  3. Keep MCP off the public internet: treat MCP servers the way you'd treat a database — reachable only on a private network, with strict authn/z and least privilege.
  4. Add rate limits + usage caps: even public-facing assistants should have WAF/CDN protections and anomaly detection for enumeration patterns.
  5. Log for investigations: keep request logs for AI endpoints (models requested, tokens/volume, auth failures, unusual tool usage).
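Step 1 can be bootstrapped with a quick sweep of your own address space before internet scanners get there. A sketch using a plain TCP connect check on the ports named above (the host list is a placeholder for your inventory; follow up any hit with an HTTP probe and an auth check):

```python
# First-pass inventory sweep of your own hosts for listening
# LLM-related ports. A connect alone proves nothing about auth,
# so treat each hit as a lead to investigate, not a finding.
import socket

# Ports from the report: Ollama (11434), OpenAI-compatible APIs (8000).
# Extend with whatever proxies/tool servers your stack runs.
LLM_PORTS = [11434, 8000]


def open_llm_ports(host: str, ports=LLM_PORTS, timeout: float = 1.0) -> list[int]:
    """Return the subset of ports accepting TCP connections on host."""
    found = []
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                found.append(port)
        except OSError:
            pass  # closed, filtered, or timed out
    return found


if __name__ == "__main__":
    for host in ["192.0.2.10"]:  # placeholder: your external IPs/hostnames
        hits = open_llm_ports(host, timeout=0.5)
        if hits:
            print(f"{host}: open LLM ports {hits} -- verify auth is enforced")
```

Running this from outside your network (not just inside it) matters: NAT rules and cloud security groups often expose "internal" dev boxes without anyone noticing.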

Sources