Novee — autonomous AI red teaming for LLM applications
AI relevance: This is specifically aimed at testing LLM apps and agent workflows for prompt injection, jailbreaks, tool abuse, and agent manipulation that ordinary web-app scanners do not model well.
- Novee launched AI Red Teaming for LLM Applications, a product pitched as an autonomous pentesting agent for chatbots, copilots, autonomous agents, and LLM-powered workflows.
- The company says the system continuously simulates prompt injection, jailbreaks, data exfiltration, and agent manipulation rather than relying on one-off prompts or static checks.
- The sharp claim is not just “AI for security,” but multi-step adversarial chaining: the agent is supposed to read docs, query APIs, build a model of the target app, and tailor attack paths to that environment.
- That matters because many real LLM failures are not single-request bugs; they depend on state, role boundaries, hidden context, tool access, and sequencing.
- Novee says the product can plug into CI/CD pipelines, which is the more interesting operational angle: testing AI behavior as systems and prompts change, not only before release.
- The company ties the product directly to its own vulnerability research, including a Cursor advisory describing sandbox escape through writable
.gitsettings and Git hooks in versions prior to 2.5. - That linkage gives the launch a bit more weight than generic “agent firewall” marketing: there is at least one concrete, recent example of prompt-injection-to-RCE style impact in a real AI developer tool.
- Still, operators should read this as a signal about continuous AI security testing, not proof that autonomous red teaming can replace human reviewers for complex production systems.
Why it matters
- AI applications change faster than classic pentest cycles. A model swap, prompt tweak, or new connector can alter behavior without a major code release, so annual testing is a weak fit.
- For teams running internal copilots or agent workflows, the hard problems are often cross-step abuse paths — planting malicious context, steering tool use, or crossing RBAC boundaries — which require adaptive testing.
- If products like this become useful in practice, they could push AI security programs toward continuous adversarial validation the same way SAST/DAST shifted appsec left.
What to do
- Test agent flows, not just prompts: include multi-turn, multi-tool, and cross-role scenarios in your security reviews.
- Gate high-risk tools: filesystem writes, shell access, SCM settings, secrets, browsers, and finance actions deserve stronger approvals and audit trails.
- Re-test after behavior changes: new models, prompts, connectors, and retrieval sources should trigger fresh adversarial checks even when the app code barely changes.
- Keep humans in the loop: use automated AI red teaming to widen coverage, but reserve manual review for business-logic abuse, privilege boundaries, and high-impact workflows.