Ask which open-source coding agent is "best" and you'll get a leaderboard, which is the wrong artifact. Aider, Cline, and OpenHands are not three rungs you climb toward the most capable one. They are three answers to a question the benchmark screenshots never ask: where does the agent run, and how much does it do before you look? That answer sets your blast radius. The benchmark score, it turns out, mostly belongs to the model you plug in.
Aider: the disciplined pair in your terminal
Aider is the oldest of the three and still the most disciplined. It lives in your terminal, edits files directly in your working tree, and — the load-bearing design choice — auto-commits every change to git with a descriptive message. That makes the version-control system the agent's undo button: every step is a diff you can read, revert, or cherry-pick with the tools you already trust. To stay oriented in a large repo, Aider builds a tree-sitter repomap, a compressed picture of your codebase's symbols across 100-plus languages, so the model gets the right context without you pasting files by hand.
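To make the repomap idea concrete, here is a minimal sketch of a "compressed picture of symbols." It is not Aider's implementation: Aider uses tree-sitter across many languages, whereas this stand-in uses Python's standard-library ast module and therefore only understands Python source. The function name symbol_map and the sample code are hypothetical.

```python
import ast

# Hypothetical sample file contents, used only for illustration.
SAMPLE = '''
class Cart:
    def add(self, item, qty):
        self.items.append((item, qty))

def total(cart):
    return sum(q for _, q in cart.items)
'''


def _params(fn) -> str:
    """Positional parameter names of a function node, comma-separated."""
    return ", ".join(a.arg for a in fn.args.args)


def symbol_map(source: str) -> list:
    """Keep class names and function signatures; drop every function body.

    Assumption: stdlib `ast` stands in for tree-sitter, so this handles
    Python source only.
    """
    funcs = (ast.FunctionDef, ast.AsyncFunctionDef)
    lines = []
    for node in ast.parse(source).body:
        if isinstance(node, funcs):
            lines.append(f"def {node.name}({_params(node)})")
        elif isinstance(node, ast.ClassDef):
            lines.append(f"class {node.name}")
            for item in node.body:
                if isinstance(item, funcs):
                    lines.append(f"    def {item.name}({_params(item)})")
    return lines


print("\n".join(symbol_map(SAMPLE)))
# class Cart
#     def add(self, item, qty)
# def total(cart)
```

The point of the compression is the ratio: the model sees which symbols exist and how they are called without paying for the bodies, and full files are pulled in only when they are needed.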
The shape here is pair programmer. You watch the diffs go by. Aider is least likely to surprise you and best suited to careful, incremental work inside an existing repository — the place where a runaway agent does the most damage.
Cline: the agent that asks first
Cline runs inside your editor — VS Code and JetBrains — and its defining feature is the approval gate. It works in two modes: Plan, where it explores your codebase and lays out a strategy, and Act, where it executes. Crucially, every file edit and every terminal command requires your sign-off (you can flip on auto-approve, but the default is human-in-the-loop). Bring-your-own-key means you choose and pay for the model directly.
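The approval gate is easy to picture as code. The sketch below is an illustration of the pattern, not Cline's source: the names Action and ApprovalGate, and the idea of passing the prompt in as a callback, are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Action:
    kind: str    # "edit" or "command"
    detail: str  # file path or shell command, shown to the human and logged


@dataclass
class ApprovalGate:
    ask: Callable[[Action], bool]   # hypothetical prompt; returns True on sign-off
    auto_approve: bool = False      # off by default: human-in-the-loop
    audit_log: List[Tuple[str, str, str]] = field(default_factory=list)

    def run(self, action: Action, execute: Callable[[Action], None]) -> bool:
        """Execute the action only if approved; record the decision either way."""
        approved = self.auto_approve or self.ask(action)
        verdict = "approved" if approved else "rejected"
        self.audit_log.append((action.kind, action.detail, verdict))
        if approved:
            execute(action)
        return approved


# Stand-in for a human who approves edits but refuses shell commands.
executed = []
gate = ApprovalGate(ask=lambda a: a.kind == "edit")
gate.run(Action("edit", "src/app.py"), executed.append)
gate.run(Action("command", "rm -rf build"), executed.append)

print([a.detail for a in executed])  # ['src/app.py']
print(gate.audit_log[-1])            # ('command', 'rm -rf build', 'rejected')
```

Note that the log records rejections as well as approvals. That is what turns a confirmation dialog into an audit trail.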
The shape here is governed assistant. You are the runtime — nothing touches disk or shell without passing through you. That makes Cline the natural choice for teams that need an audit trail or work under change-control rules, where "the agent did it autonomously" is not an acceptable line in a postmortem.
OpenHands: the agent as its own runtime
OpenHands (formerly OpenDevin) takes the opposite position and makes it pay. Instead of asking before each step, it spins up a sandboxed Docker runtime and lets the agent write code, run terminal commands, browse the web, and open pull requests on its own, iterating until the task is done. You don't approve actions; you read the result.
This is why OpenHands is the one you see on the SWE-bench Verified leaderboard — autonomy is what lets an agent grind on a real GitHub issue end to end. And it's why the sandbox is the actual product decision, not a footnote. With Aider and Cline, your safety boundary is your attention. With OpenHands, your attention is gone, so the container wall is all that stands between an enthusiastic agent and your machine. The more autonomous the agent, the more the isolation matters — which is the same axis as everything else here, viewed from the dangerous end. (If you're weighing where that boundary should live, the agent-sandbox comparison is the next stop.)
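"How tightly to sandbox" comes down to a handful of container settings. The sketch below only builds a docker run command line and never executes it. It does not reflect how OpenHands configures its own runtime; the function name sandbox_argv, the image, the paths, and the default limits are all placeholders chosen for illustration. The flags themselves are standard docker run options.

```python
def sandbox_argv(workspace, image, command, allow_network=False,
                 memory="2g", pids_limit=256):
    """Build (but do not run) a restrictive `docker run` invocation.

    Assumption: every default here is illustrative, not a recommendation
    and not OpenHands' actual configuration.
    """
    argv = [
        "docker", "run", "--rm",
        "--cap-drop", "ALL",                     # drop all Linux capabilities
        "--security-opt", "no-new-privileges",   # block privilege escalation
        "--read-only",                           # root filesystem is read-only
        "--memory", memory,                      # cap RAM
        "--pids-limit", str(pids_limit),         # cap process count
        "--volume", f"{workspace}:/workspace",   # the only writable host path
        "--workdir", "/workspace",
    ]
    if not allow_network:
        # Tightest setting: no network at all. An agent that must browse
        # the web or open pull requests needs this relaxed.
        argv += ["--network", "none"]
    return argv + [image] + list(command)


argv = sandbox_argv("/tmp/task-123", "python:3.12-slim", ["python", "-V"])
print(" ".join(argv))
```

The allow_network switch is where the trade-off shows: the capabilities that make an autonomous agent useful (browsing, pushing branches) are the ones the tightest sandbox removes, so each relaxation is a deliberate decision rather than a default.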
With a pair programmer, your attention is the seatbelt. With an autonomous agent, the sandbox is — because your attention has left the car.
The score belongs to the model, not the harness
Here's the part the comparison blog posts bury. These tools are harnesses, not models. The same OpenHands setup posts wildly different numbers depending on what you plug into it: roughly 70%-plus on SWE-bench Verified with a frontier model, but about 37% when driven by a 32B open-weight model. That spread is the model's, not the harness's.
So the right way to read any "Agent X scores Y% on SWE-bench" headline is: that's the score of a model running inside that harness on that day. Swap the model and the number moves more than swapping the agent would. Choose your harness for its workflow and its safety model — terminal discipline, in-editor approval, or sandboxed autonomy — and choose your LLM for raw capability. They're separate decisions, and conflating them is how people end up disappointed by a "weak" agent that was really just running a weak model.
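One way to keep the two decisions separate is to never record a score without both labels attached. The sketch below uses only the two approximate figures quoted above, rounded (70.0 is the floor of "roughly 70%-plus", 37.0 stands in for "about 37%"); the Run type and the model_spread function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    harness: str
    model: str
    resolved_pct: float  # share of SWE-bench Verified tasks resolved


# Approximate figures from the text above, rounded for illustration.
RUNS = [
    Run("OpenHands", "frontier model", 70.0),
    Run("OpenHands", "32B open-weight model", 37.0),
]


def model_spread(runs, harness):
    """Gap in percentage points between the best and worst model in one harness."""
    scores = [r.resolved_pct for r in runs if r.harness == harness]
    return max(scores) - min(scores)


print(model_spread(RUNS, "OpenHands"))  # 33.0
```

With the harness held fixed, the model alone moves the result by about 33 percentage points.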
A note on the one that left
If you expected a fourth name, it's Continue — and its absence is the story. The Continue repo went read-only in 2026 after the project was acquired by Cursor; the VS Code and JetBrains extensions still install, but new features and model integrations now depend on community forks. The harness layer is consolidating even as the agents get better, which is its own argument for picking tools whose shape — and license — you can live with for more than a quarter.
The decision, made plainly
- Aider when you want disciplined, git-native edits in an existing repo and you intend to watch every diff.
- Cline when you want an in-editor agent but need every action gated and auditable — governance first.
- OpenHands when you want to hand off whole tasks and your real design decision is how tightly to sandbox the runtime.
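The list above can be read as a short decision function. This is a sketch of the article's rule of thumb, not a tool; the function name, the parameter names, and the order in which the questions are asked are assumptions.

```python
def pick_harness(hand_off_whole_tasks: bool, need_gated_audit_trail: bool) -> str:
    """Map 'how much may the agent do before I look?' to a harness shape."""
    if hand_off_whole_tasks:
        # Your attention is gone, so the sandbox becomes the safety boundary.
        return "OpenHands"
    if need_gated_audit_trail:
        # Every edit and command passes through a human sign-off.
        return "Cline"
    # Default: git-native edits you review diff by diff.
    return "Aider"


print(pick_harness(hand_off_whole_tasks=False, need_gated_audit_trail=True))  # Cline
```

The model does not appear as a parameter, because it is chosen separately.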
Don't ask which is most capable. Ask how much you're willing to let the agent do before you look — then pick the shape that matches, and bring the strongest model you can afford.



