You open five terminals. In each, a coding agent is checked out into its own git worktree, on its own branch, chewing on its own ticket. Auth refactor in one. A flaky-test fix in another. Three small features after that. For about eleven minutes it feels like you have grown four extra hands. The diffs accumulate. Nothing overwrites anything.
Then two of them stop. One agent ran a migration against the dev Postgres; the other was mid-transaction against the same database and is now staring at a schema it didn't expect. A third just died because port 3000 was already taken by the first agent's dev server. Your build cache has a half-written artifact from a process that no longer exists. The worktrees are pristine. The system around them is on fire.
What worktrees actually buy you
The mechanic is genuinely good and genuinely simple. Per the official Git docs, git worktree add <path> attaches a new linked working tree to your existing repository: its own directory, its own branch, sharing the one .git object store. Creation is near-instant because no history is copied; Git only checks out the working files. Both major agents now wrap this natively: Claude Code ships a --worktree flag that drops a session into an isolated tree under .claude/worktrees/, per its docs, and the Codex app has built-in worktree support for running threads in parallel.
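As a rough illustration of the mechanic, the sketch below builds one git worktree add command per ticket without running anything. The ticket names, the ../trees parent directory, and the agent/ branch prefix are assumptions invented for this example; only the git worktree add -b <branch> <path> form comes from Git itself.

```python
from pathlib import PurePosixPath


def worktree_commands(tickets, parent="../trees", prefix="agent/"):
    """Build one `git worktree add` command per ticket (nothing is executed)."""
    commands = []
    for ticket in tickets:
        # Hypothetical layout: one directory and one new branch per ticket.
        path = PurePosixPath(parent) / ticket
        branch = f"{prefix}{ticket}"
        commands.append(["git", "worktree", "add", "-b", branch, str(path)])
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands(["auth-refactor", "flaky-test"]):
        print(" ".join(cmd))
```

Each list can be handed to subprocess.run(cmd, check=True) from inside the repository. Keeping command construction separate from execution makes the plan easy to inspect before five agents start writing.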
What this solves is file collision. Agent A editing auth.ts cannot touch Agent B's copy of auth.ts, because they are different files on disk on different branches. That is real, and before worktrees people genuinely lost work to two agents writing the same path in the same directory.
But file isolation is the easy 80%. It's the part that was always going to be solvable with a directory and a branch.
Worktrees isolate the files you track. The bugs come from everything you don't.
The shared substrate nobody sells you on
Here is the load-bearing fact: a worktree isolates tracked files and nothing else. Everything else lives on the host, not in the branch, so every worktree of the same repo ends up pointing at the same copy of it.
That list is longer than it looks. The local database. The Docker daemon and its containers. Dev-server ports. Shared build and test caches, plus any node_modules you symlink to skip a reinstall. The .env file, which is gitignored precisely so it won't be tracked, and therefore won't be isolated: a fresh worktree either lacks it or gets a copy that names the same database and the same port. Penligent puts the consequence plainly: worktrees "isolate code but not the runtime environment, giving you separate file systems but shared ports, databases, and services."
The nasty part is what this does to an agent's reasoning. As Penligent notes, when the runtime isn't isolated per worktree, "the agent's view of cause and effect becomes unreliable." A test fails — but is the branch wrong, or did another agent just mutate the shared database, or grab the port? The agent can't tell, so it starts patching code to fix a problem that lives in the host. Now it's confidently wrong, in parallel, five times.
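One cheap way to give an agent a firmer footing is a preflight check that runs before it blames the branch. The sketch below is only a rough signal, assuming a dev server that listens on IPv4 localhost; the function name port_is_free is made up for this example.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if this process can bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Something else holds the port, possibly another agent's dev server.
            return False
        return True


if __name__ == "__main__":
    print("port 3000 free:", port_is_free(3000))
```

A False here points at the host rather than the code, which is exactly the distinction the agent cannot make on its own.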
The fix is unglamorous and entirely on you: per-worktree port ranges, a scratch database (or database branching) per tree, a separate .env per tree. Upsun's guidance is blunt about sequencing — isolate runtime before you fan out parallel agents, not after the first deadlock. This is the half the "just use worktrees" advice quietly skips.
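A minimal sketch of that wiring, under stated assumptions: each worktree gets a disjoint block of ports, its own scratch database name, and its own .env contents. The names TreeRuntime and plan_runtimes, the PORT and DATABASE_URL variables, the dev_ prefix, and the block size of ten are all hypothetical. Creating the databases themselves is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeRuntime:
    name: str
    port_base: int  # first port of this tree's private block
    database: str   # scratch database name for this tree

    def dotenv(self):
        """Render per-tree .env contents (variable names are illustrative)."""
        return (
            f"PORT={self.port_base}\n"
            f"DATABASE_URL=postgres://localhost:5432/{self.database}\n"
        )


def plan_runtimes(tree_names, first_port=3000, block_size=10):
    """Assign each worktree a disjoint port block and its own database name."""
    plans = []
    for index, name in enumerate(tree_names):
        plans.append(
            TreeRuntime(
                name=name,
                port_base=first_port + index * block_size,
                database="dev_" + name.replace("-", "_"),
            )
        )
    return plans


if __name__ == "__main__":
    for plan in plan_runtimes(["auth-refactor", "flaky-test"]):
        print(plan.name)
        print(plan.dotenv())
```

Assigning ports by index rather than by hashing the tree name keeps the blocks provably disjoint, at the cost of needing one place that knows the full list of trees.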
That gap is also why a whole tooling layer exists. If worktrees alone were sufficient, Conductor, Vibe Kanban, and Claude Squad would have nothing to do. They wrap worktrees with the runtime wiring and, crucially, a single review surface; Nimbalyst's survey catalogs the field. The tools are an admission that the worktree was step one of three: isolate the files, then the runtime, then give the human one place to review. (Choosing among the agents themselves is a separate question; see Claude Code vs Codex CLI vs Gemini CLI and Cursor vs Windsurf vs Copilot vs Claude Code.)
The ceiling isn't technical
Say you do all of it. Per-tree databases, port ranges, clean .envs, an orchestrator with a dashboard. You've bought the full 100% of isolation. You still don't get to run twenty agents.
Because now the bottleneck has simply moved. It was writing code; it's now merging it. Ten agents each producing a diff every fifteen minutes is forty diffs an hour landing on one human who has to read, judge, and integrate every one. Nimbalyst frames the review queue as a bottleneck that "grows linearly with agent count", and no orchestrator removes it; it only relocates it downstream to you. This is why practitioners keep landing on the same number: roughly three to five concurrent agents as the practical sweet spot, above which rate limits and merge conflicts eat the gains. Treat that as a reported range, not a law of physics, but notice it's a range about human throughput, not machine throughput.
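The arithmetic is simple enough to write down. In the sketch below, the ten agents and the fifteen-minute cadence come from the example above; the three minutes of review per diff is a made-up figure for illustration, not a measurement.

```python
def diffs_per_hour(agents, minutes_per_diff):
    """Diffs landing on the reviewer each hour; linear in agent count."""
    return agents * 60 / minutes_per_diff


def review_minutes_per_hour(agents, minutes_per_diff, minutes_per_review):
    """Reviewer minutes demanded per hour of wall-clock time."""
    return diffs_per_hour(agents, minutes_per_diff) * minutes_per_review


if __name__ == "__main__":
    print(diffs_per_hour(10, 15))
    # Hypothetical: 3 minutes of human attention per diff.
    print(review_minutes_per_hour(10, 15, 3))
    print(review_minutes_per_hour(5, 15, 3))
```

Under that assumed three minutes, ten agents demand 120 reviewer minutes out of every 60, and five agents already fill the hour. Change the review time and the break-even point moves, but the load still scales with agent count.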
If you want to push past it, the honest levers are the boring ones: tighter task boundaries so each diff is small, automated quality gates so tests do the first pass of review, and — eventually — genuine sandboxes so runtime isolation stops being something you hand-script. Even orchestration of the tool calls inside a single agent runs into the same wall, as we covered in parallel vs sequential tool calling: parallelism is cheap to start and expensive to reconcile.
So: spin up your worktrees. They're necessary. They're also the part of the problem that was already easy. The work you actually came to do — isolating the runtime, and surviving the merge queue — starts exactly where the worktree's job ends.



