Re: Anthropic Chinese Cyber-Attack. How Do We Protect Open-source Models?
Recently, Anthropic published a report on how they detected and foiled the first reported AI-orchestrated cyber-espionage campaign. Their Claude Code agent was manipulated by a group Anthropic is highly confident was sponsored by the Chinese state into infiltrating about 30 global targets, including large tech companies and financial institutions....
This is a straw man argument. The standard MO of coding agents is to use one consistent LLM throughout their agentic flow. The approach I outlined addresses that default case, and there's obvious utility in that.
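To make the default case concrete, here's a minimal, hypothetical sketch of a typical coding-agent loop. The names (`MODEL`, `call_llm`, `run_agent`) are illustrative stand-ins, not any real API. The structural point is that a single model is pinned once and reused for every step of the task, so a safeguard applied at that one model covers the whole agentic flow:

```python
# Hypothetical sketch of a standard coding-agent loop.
# One model identifier is pinned and reused for every step,
# which is the default MO being discussed.

MODEL = "some-coding-llm"  # pinned once; every agentic step hits this model


def call_llm(model: str, messages: list[dict]) -> str:
    """Stand-in for a chat-completion call against a single hosted model.

    Returns a canned reply so the sketch runs end to end.
    """
    return "DONE (placeholder reply)"


def run_agent(task: str, max_steps: int = 20) -> list[dict]:
    """Toy agent loop: same model on every iteration."""
    messages = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        # Every iteration calls the SAME pinned model.
        reply = call_llm(MODEL, messages)
        messages.append({"role": "assistant", "content": reply})
        if "DONE" in reply:  # toy stopping condition
            break
        # ...parse the reply, run tools (edit files, run tests),
        # and feed the results back into the conversation.
        messages.append({"role": "user", "content": "tool output here"})
    return messages
```

An attacker would have to deliberately break this default, swapping models mid-task, to escape a defense aimed at it; that edge case is exactly what the objection leans on.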
You might as well say there's no point in Anthropic tracking malicious usage of Claude Code in their telemetry data, because attackers are free to switch coding agents (e.g. between Codex, Gemini, etc.) over the course of a multi-step task.