The coding-agent library, read in order — from the IDE assistants (Cursor, Windsurf, GitHub Copilot, Claude Code) through the terminal-native CLI agents (Claude Code, Codex CLI, Gemini CLI), the agentic IDEs (Antigravity), autonomous and background agents (Devin, Codex, Jules), the open-source agents (Aider, Cline, Roo, Kilo, OpenHands), the AI app builders (Lovable, Bolt, v0, Replit), how the edit lands (diff vs whole-file, fast-apply models), how you steer them (spec-driven development, AGENTS.md/CLAUDE.md), AI code review (CodeRabbit, Greptile, Qodo), parallel agents with git worktrees, how to evaluate one, and the coding-agent security surface.
The four tools map to four architectural postures — and in a year when the companies keep getting acquired out from under their users, the posture is what you're actually choosing.
Three bets on the same idea: that the command line, not the IDE, is where coding agents live. And this month one of the three changed its name and its terms.
Google's Antigravity, Cursor, and Claude Code now all hit ~80% on SWE-bench. So the real difference isn't who writes better code — it's where each one puts the work of checking it.
The async coding agents have all converged on the same shape: a cloud VM that clones your repo, runs the tests, and opens a PR. So the thing you're actually choosing isn't the coder. It's the harness, and who reviews the flood of PRs.
They aren't ranked by capability. They differ on where the agent runs and who holds the steering wheel — and that decides your blast radius, not your benchmark score.
Three open-source coding agents from one family tree — and the middle child just shut itself down. Its death is the most useful thing in the comparison.
They all promise an app from a prompt. They differ on the question none of them advertises: when you outgrow the tool, do you get to take the code with you?
MCP gives an agent tools. ACP gives an agent an editor. The role swap between them is the whole architecture, and it's the reason the letters ACP now point at three unrelated standards.
Everyone argues about which model to use. The under-discussed variable is how the agent writes its changes to disk — and that edit format is often the real bottleneck.
The bottleneck in a coding agent isn't the smart model deciding what to change. It's the dull mechanical work of writing that change to disk correctly — and that's a different model entirely.
Writing a spec before the agent writes code is the loudest idea in AI coding right now. The pitch isn't better code — it's making intent a durable artifact that survives the context window. Three tools bet on that at three different altitudes.
The config-file war for how you talk to a coding agent didn't end with a winner. It ended with a foundation — and that changes which file you should actually write.
The first rigorous benchmark of repository context files is in, and the answer is uncomfortable: the auto-generated ones make agents slightly worse, the hand-written ones barely help, and both raise your bill ~20%.
Every vendor leads with its bug-catch rate. But code review is the one place in the AI stack where precision beats recall — a reviewer you learn to ignore catches nothing.
Worktrees stop your agents from overwriting each other's files. They do nothing about the shared database, the fight over port 3000, or the review queue that becomes your real bottleneck.
Public leaderboards answer 'which model is smartest,' not 'will it fix my bugs' — the only test that predicts your outcome is a private eval built from your own repo.
Amazon Q auto-ran an MCP config out of any repo you opened, with your live AWS keys in the process. It got a CVE. The identical bug in Claude Code, Cursor, Gemini CLI and Copilot got declared working-as-designed — because the trust prompt you inherited from your editor was never a consent to run code.