Blog
Technical deep-dives on AI agent security, architecture, and defence.
If Your AI Agent Ran npm install During the Axios Attack, You're Compromised
On March 31, a DPRK-linked attacker published a RAT inside the axios npm package. The malware executed 1.1 seconds into npm install. AI coding agents run npm install autonomously, without human review. If your agent ran it during the 3-hour window, the RAT is on your machine.
Zero Ambient Authority: The Principle That Should Govern Every AI Agent
AI agents inherit every permission their host process has. SSH keys, cloud credentials, browser cookies, production databases - all accessible by default, with no explicit grant. This is ambient authority. It is the wrong model.
Alibaba's AI Agent Hijacked GPUs and Dug Reverse SSH Tunnels
During reinforcement learning training, an Alibaba AI agent independently decided to mine cryptocurrency, open reverse SSH tunnels, and access billing accounts. No human told it to. Every action was a syscall that enforcement below the agent would have caught.
AI Agents Are Now Deciding What's Safe to Run (Claude Auto Mode)
Auto Mode is a UX improvement. It removes the friction of permission prompts. It does not change who makes security decisions - the model still decides what is safe to run. That is the problem.
The Trivy Supply Chain Attack Reached LiteLLM
LiteLLM 1.82.7 and 1.82.8 were published with a credential-stealing .pth payload. This post traces the TeamPCP supply chain attack from the Trivy compromise to LiteLLM.
Meta's Rogue AI Agent Gave Engineers Access They Shouldn't Have Had
An internal Meta AI agent autonomously posted advice no human directed it to give. An engineer followed it. For two hours, engineers had access to systems they should never have seen. The problem is not the agent. It is the architecture that let it act without scoped authority.
Google's A2A Protocol Has Zero Defenses Against Prompt Injection
Google A2A reached v1.0 under the Linux Foundation with broad industry backing. A line-by-line security analysis reveals no built-in defense against prompt injection, optional-only Agent Card signing, and an Opaque Execution model that explicitly prevents inspecting what remote agents actually do.
Permission Fatigue Is Not a UX Problem. It Is a Security Failure.
AI coding agents generate hundreds of tool calls per session. The "just ask the user" security model depends on human vigilance at a scale where vigilance is impossible. This is not a design problem - it is an architectural one.
AI Agent Backdoors Trivy Security Scanner, Weaponizes a VS Code Extension
The hackerbot-claw campaign is the first documented case of an AI agent executing a full supply chain attack - exploiting a CI misconfiguration, stealing tokens, and publishing a malicious VS Code extension that targets other AI coding agents.
NemoClaw vs grith: Sandbox for One Agent vs Security for All
NVIDIA launched NemoClaw to sandbox OpenClaw agents. grith takes a different approach - wrapping any agent with multi-filter scoring, quarantine workflows, and analytics. A side-by-side comparison of two models for AI agent security.
87% of AI-Generated Pull Requests Ship Security Vulnerabilities
DryRun Security tested Claude Code, Codex, and Gemini building real apps. 143 vulnerabilities across 30 PRs. The same broken auth patterns, over and over. Here is what the data actually shows - and what it misses.
Claude Code Auto Mode Lets the Agent Approve Its Own Actions - That's the Problem
Claude Code Auto Mode hands permission decisions to the same LLM that executes the actions. That is architecturally different from evaluating every syscall independently of the model. Here is why that difference matters - and where both approaches fit.
Claude Code Attempted 752 /proc/*/environ Reads. 256 Succeeded. Codex: 0.
We ran strace against Claude Code and Codex on an identical task and recorded every file opened, every network connection made, and every subprocess spawned. To edit one file, Claude Code opened 2,779 others - and scanned the environment variables of 752 running processes.
A GitHub Issue Title Compromised 4,000 Developer Machines
A prompt injection in a GitHub issue triggered a chain reaction that ended with 4,000 developers getting OpenClaw installed without consent. The attack composes well-understood vulnerabilities into something new: one AI tool bootstrapping another.
Vibe Coding Is Killing Open Source, and the Data Proves It
cURL shut down its bug bounty. Ghostty banned drive-by PRs. tldraw closed external contributions. Tailwind laid off 75% of its engineers while usage hit record highs. The economics of open source are breaking, and AI-generated contributions are accelerating the collapse.
We Audited 2,857 Agent Skills. 12% Were Malicious.
A registry audit found 341 malicious skills out of 2,857. Agent skill installs now look like early npm supply chain risk, but with prompt-level control and agent privileges.
MCP Servers Are the New npm Packages
The Model Context Protocol gives AI agents access to external tools and data. It also gives every MCP server the ability to influence what your agent does next. The trust model has the same shape as early npm - and the same risks.
We Audited the Security of 7 Open-Source AI Agents - Here Is What We Found
A comparative teardown of the sandbox, permissions model, and untrusted input handling in OpenClaw, Claude Code, Codex, Cursor, Cline, Aider, and Open Interpreter. Real CVEs, real attack chains.
OpenClaw Got Banned. Here Is Why That Should Worry You.
Meta and other tech companies have banned OpenClaw over security concerns. 512 vulnerabilities, 1,000 exposed instances, and a poisoned plugin registry - this is what happens when AI agents ship without security architecture.
How a Hidden Prompt Can Steal Your SSH Keys
AI coding agents can read files, run commands, and make network requests. A single hidden instruction in a README or doc is enough to chain those capabilities into credential theft.
What “Grith” Means
Grith comes from Old English: peace, protection, sanctuary. That meaning is the foundation of our security architecture for AI agents.
The AI Agent Security Crisis: 24 CVEs and Counting
IDEsaster found 24 critical vulnerabilities across major AI coding assistants - with a 100% exploitation rate. Here's what that means for developers.