AI-Native Engineering · Jun 9, 2026 · 8 min read

How to Run Ten Coding Agents in Parallel on One Laptop

Running six to ten coding agents at once was never a model problem. It's an environment problem, and the moment you solve it, the constraint moves to the one thing you can't refactor: the RAM on your desk.

Oshri Cohen

Chief Product & Technology Officer

AI-Native · Parallel by default

Right now, as I write this, there are ten coding agents running on my laptop. Not ten browser tabs with a chatbot in each, but ten agents, each on its own branch, each building a different slice of the same product, each able to run that product and prove its own work before it hands it back. You'd think the interesting part is the models, or some clever orchestration trick, but it really isn't. The agents were never the hard part.

When people hear "run six to ten agents at once," they picture the easy version: a row of chat windows, each spitting out code. That part is genuinely easy. Ten agents editing files is a solved problem. The wall you hit, the one nobody shows you in the demo, is that an agent editing code is useless until it can run the thing it changed. And the moment two agents try to run the same product on the same machine at the same time, the whole illusion of parallelism falls apart.

Parallel agents, one shared everything

Most setups break in the same spot. You point every agent at one local environment: one dev server, one database, one queue, one set of ports. Agent A runs a migration to test its feature, and Agent B's tests start failing against a schema it never asked for. Agent C boots the web app and grabs port 3000, and the other nine line up behind it. Two agents seed the same database with conflicting fixtures, and both of their test runs lie to them.

The work fanned out, but the environment didn't. So you aren't really running ten agents in parallel. You're running one environment, single-file, with ten agents elbowing each other for a turn, and paying for it in race conditions you now have to debug. It's actually worse than running them one after another, because now you also have the coordination overhead on top.

Ten agents editing code is easy. Ten agents that can each run the product is the whole problem.

The unit of isolation is the worktree

The fix starts with git worktree. A worktree gives each branch its own working directory on disk, same repository, separate checkout, so ten agents can hold ten different states of the code at once without touching each other's files. I run all of this inside Conductor, a desktop app that spins up worktrees and keeps them organized, so I'm orchestrating agents instead of hand-managing a tangle of directories and branches.
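To make the mechanics concrete, here is a minimal Python sketch of creating one worktree per branch. It only assembles the standard `git worktree add -b <branch> <path>` command; the directory layout (one sibling folder per branch under a `root` directory) and the function names are assumptions for illustration, not how Conductor organizes its worktrees.

```python
import subprocess
from pathlib import Path


def worktree_path(root: Path, branch: str) -> Path:
    """Map a branch to its own directory, e.g. feature/billing -> <root>/feature-billing."""
    return root / branch.replace("/", "-")


def worktree_add_command(root: Path, branch: str) -> list[str]:
    """Build `git worktree add -b <branch> <path>`: a new branch with its own checkout."""
    return ["git", "worktree", "add", "-b", branch, str(worktree_path(root, branch))]


def create_worktree(repo: Path, root: Path, branch: str) -> None:
    """Run the command from inside an existing repository; raises if git fails."""
    subprocess.run(worktree_add_command(root, branch), cwd=repo, check=True)
```

Calling `create_worktree` once for `feature/billing` and once for `feature/search` would leave two separate checkouts of the same repository on disk, which is the property the rest of the setup builds on.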

But a worktree only isolates the code. The running product (the web app, the API, the database, the queue, the workers) still has to live somewhere, and by default every worktree's stack wants the same ports and the same database as every other one. Isolating the files gets you part of the way and then leaves you stuck. You have to isolate the runtime too.

Make the environment ephemeral, and parameterized

So I made the environment as disposable as the branch. The product already ran in Docker Compose; the change was to stop treating Compose as a single fixed thing and make it parameterized. I wrap docker compose in a shell script that derives, from the worktree's branch name, the public ports it binds and the names of every container it starts. Branch feature/billing gets its own ports and its own named stack; feature/search gets a different set. Nothing collides, because nothing shares a name or a number.
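The wrapper described here is a shell script that the article does not show. As a rough illustration of the same idea, the Python sketch below derives a Compose project name and a block of ports from a branch name. The port range, the service list, and the `*_PORT` variable names are all assumptions; they presume a Compose file that publishes ports through variables such as `${WEB_PORT}`.

```python
import hashlib
import re

PORT_BASE = 20000  # assumed start of a free local port range
PORT_SLOTS = 1000  # assumed number of slots; each slot reserves 10 consecutive ports
SERVICES = ["web", "api", "db", "queue"]  # hypothetical services that publish a port


def project_name(branch: str) -> str:
    """Slug that is valid as a Compose project name: lowercase letters, digits, hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    return slug or "default"


def port_map(branch: str) -> dict[str, int]:
    """Map a branch to a block of ports using a stable hash (not Python's salted hash())."""
    digest = hashlib.sha256(branch.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:4], "big") % PORT_SLOTS
    first = PORT_BASE + slot * 10
    return {f"{svc.upper()}_PORT": first + i for i, svc in enumerate(SERVICES)}


def compose_env(branch: str) -> dict[str, str]:
    """Environment variables a wrapper could export before calling docker compose."""
    env = {"COMPOSE_PROJECT_NAME": project_name(branch)}
    env.update({key: str(port) for key, port in port_map(branch).items()})
    return env
```

Two caveats with this particular sketch: hashing into a finite range means two branches can occasionally land on the same slot, and slugging maps `feature/billing` and `feature-billing` to the same project name. A real wrapper would want to detect both cases, for example by checking that a port is free before binding it.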

Then I told Conductor, for this one project, to run that script automatically whenever it creates a new worktree. I start a new piece of work and a complete, isolated, running copy of the entire stack comes up with it. No checklist, no "wait, which port was this one." The environment is born with the branch and dies with it.

Public ports derived from the branch, so two stacks never fight over the same one.
Container names matched to the worktree branch, so each stack is legible at a glance and nothing clobbers anything.
Isolated data, so no two worktrees ever share a database or a queue.
The whole stack per worktree (web app, API, database, queue, workers), not a stripped-down stand-in.
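The "born with the branch and dies with it" lifecycle can be sketched as a pair of Compose invocations scoped to one project name. This is an illustration in Python rather than the author's shell script; the `-p`, `up -d`, and `down --volumes` flags are standard Docker Compose options, while the function names and the idea of passing the derived ports in as environment variables are assumptions.

```python
import os
import subprocess


def up_command(project: str) -> list[str]:
    """Start the whole stack in the background under its own project name."""
    return ["docker", "compose", "-p", project, "up", "-d"]


def down_command(project: str) -> list[str]:
    """Stop the stack and remove its volumes, so the data is discarded with the branch."""
    return ["docker", "compose", "-p", project, "down", "--volumes"]


def run_in_worktree(cmd: list[str], worktree_dir: str, ports: dict[str, str]) -> None:
    """Run from the worktree directory so Compose reads that checkout's compose file."""
    subprocess.run(cmd, cwd=worktree_dir, env={**os.environ, **ports}, check=True)
```

Because Compose scopes the containers, networks, and named volumes it creates to the project name, two stacks started with different `-p` values do not share a database volume, provided the Compose file does not hard-code container names or host ports.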

The hard part was the tests (and I was making dinner)

The dev server was the easy half. The genuinely hard part, and the part that decides whether any of this is worth doing, was getting the full test suite to run inside that same isolated Compose. Both layers: the fast unit tests and the slow, stateful end-to-end ones. A green checkmark is supposed to mean the code works, but that depends entirely on what it ran against. A green run against a database nine other agents are mutating means nothing. A green run against this worktree's own database, its own queue, its own services that no other process can touch, is a result you can actually trust.
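One way to picture "tests inside the same isolated Compose" is to run each test layer as a one-off container in the worktree's own project. The sketch below is an assumption-heavy illustration: the service names (`api`, `e2e`) and the `pytest` commands are placeholders, since the article does not say which test runners or services the product uses. Only `docker compose -p <project> run --rm <service> <command>` is standard.

```python
import subprocess

# Hypothetical layers: service names and test commands are placeholders.
LAYERS = {
    "unit": ("api", ["pytest", "tests/unit"]),
    "e2e": ("e2e", ["pytest", "tests/e2e"]),
}


def layer_command(project: str, service: str, argv: list[str]) -> list[str]:
    """One-off container in this project's network, removed when the command exits."""
    return ["docker", "compose", "-p", project, "run", "--rm", service, *argv]


def plan(project: str) -> list[list[str]]:
    """Commands for every layer, fast unit tests first, then end-to-end."""
    return [layer_command(project, svc, argv) for svc, argv in LAYERS.values()]


def run_all(project: str, worktree_dir: str) -> bool:
    """True only if every layer exits 0; all() stops at the first failing layer."""
    return all(
        subprocess.run(cmd, cwd=worktree_dir).returncode == 0
        for cmd in plan(project)
    )
```

The point of threading `project` through every command is that a passing result then describes this worktree's database and queue, not a shared one.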

How it got built is the whole point. I didn't sit there hand-wiring test harnesses into Compose. I shaped the problem (every agent verifies its own work against its own stack, with no shared state, at both the unit and end-to-end layers), handed it to an agent, and went to make dinner. It did the fiddly, unglamorous plumbing while I cooked. That's what the job looks like now. You describe the outcome precisely enough that you can stand behind it, and then the building happens while you're off doing something else.

A green test run only means something if it ran against an environment no one else could touch.

Isolation is what makes the parallelism honest

With that in place, the parallelism actually does what it claims to. Each agent owns a worktree, a full running stack, and a test suite that proves its slice works in isolation. They stopped being ten autocomplete windows and became ten teammates who can each build something, run it, test it, and hand back work that has actually been checked.

This is the part that never makes the highlight reel. The demos show the swarm of agents and the code flying by, and they never show the environment underneath, because it isn't photogenic. But it's what decides whether running many agents is real engineering or an expensive way to generate merge conflicts. It's the unglamorous layer beneath every engineer becoming a manager of agents, and beneath the "staff specialized sub-agents" and "build the entire testing pyramid" pillars of professional vibe coding. The agents get all the attention, but the environment is where the actual work lives.

Then the constraint moves, to the one thing you can't refactor

And here is where it gets funny, in the way this whole era keeps being funny. Every time you knock down a constraint, the bottleneck doesn't go away, it just moves somewhere else. Building got cheap, so shaping the problem became the scarce skill, and once change got cheap the thinking moved back upstream. I knocked down the environment constraint, and the bottleneck sailed straight past every layer I know how to refactor and landed on physics.

6-10 coding agents in parallel
~4GB memory per worktree stack
16 / 32GB used on an M1 Pro

Each one of these per-worktree stacks costs me about four gigabytes of memory. That covers the web app, the API, the database, the queue, and the workers. Multiply it by every branch I want alive at once and the number climbs fast. I'm sitting at sixteen gigabytes used and I can feel the ceiling getting close. My machine is an M1 Pro with thirty-two gigabytes, which until recently felt like plenty. The thing throttling how many agents I can run is no longer the model, the tooling, or my process. It's the silicon soldered to the board in front of me.
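The arithmetic behind that ceiling is short enough to write down. This sketch uses only the figures quoted above (about 4 GB per stack, 32 GB total); the `reserved_gb` parameter, standing in for the operating system, editor, and the agents themselves, is an assumption with no measured value behind it.

```python
def max_stacks(total_gb: float, per_stack_gb: float, reserved_gb: float = 0.0) -> int:
    """How many full stacks fit once reserved_gb is set aside for everything else."""
    if per_stack_gb <= 0:
        raise ValueError("per_stack_gb must be positive")
    return max(0, int((total_gb - reserved_gb) // per_stack_gb))


# Figures from the article: 32 GB machine, roughly 4 GB per worktree stack.
ceiling = max_stacks(32, 4)                    # nothing reserved for anything else
realistic = max_stacks(32, 4, reserved_gb=8)   # hypothetical 8 GB reserve
```

If every stack really held about 4 GB at once, eight would be the rough upper bound on a 32 GB machine before anything else gets memory, and any reserve lowers it further.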

You can shape a problem. You can't shape a memory chip you already bought.

Where this goes next

I haven't solved this part yet, so I'll tell you what I'm actually weighing rather than pretend I have a tidy answer.

Shrink the footprint per worktree. Does every branch really need all five services running, or can the heavy, stateless ones be shared while only the parts a change actually touches spin up fresh?
Share the stateful pieces, namespace the data. One database and one queue across worktrees, with a schema or namespace per branch, trading a little isolation for a lot of memory.
Get the environments off the laptop. Keep the worktree local but run the heavy stack on a remote or cloud dev environment, the way ephemeral preview environments already work for deploys.
Buy more RAM. The least clever option, and sometimes the right one. When the constraint is hardware, the cheapest fix is occasionally just hardware.

Each of those trades something away (isolation, simplicity, or money), and I don't yet know which trade I'll make. That's genuinely where I am. Ask me in a month.
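For the second option, sharing one database and giving each branch its own namespace, the naming step might look like the sketch below. It is speculative: the article names no database, so the 63-character limit (PostgreSQL's identifier length) and the `wt_` prefix are assumptions chosen purely for illustration.

```python
import hashlib
import re

MAX_LEN = 63  # assumption: PostgreSQL-style identifier limit; the article names no database


def namespace_for(branch: str, prefix: str = "wt_") -> str:
    """Derive a schema or namespace name from a branch: lowercase, digits, underscores."""
    slug = re.sub(r"[^a-z0-9]+", "_", branch.lower()).strip("_") or "default"
    name = prefix + slug
    if len(name) <= MAX_LEN:
        return name
    # Too long: truncate and append a short stable hash so that long branch
    # names sharing a prefix are unlikely to end up with the same namespace.
    suffix = hashlib.sha256(branch.encode("utf-8")).hexdigest()[:8]
    return name[: MAX_LEN - 9] + "_" + suffix
```

With this scheme `feature/billing` becomes `wt_feature_billing`. The trade the option describes shows up immediately: the namespaces separate the data, but every branch now depends on the same database process and the same migrations tooling behaving well for everyone.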

But notice the shape of the thing, because that's the real lesson and it has nothing to do with Docker. Going AI-native never removes the bottleneck. It keeps moving it, and the job is to keep finding where it went. For me, today, it's the memory in my laptop. Next month it'll be something else, somewhere I'm not looking yet. Chasing that down is the actual work, far more than the agents or the tooling ever were, and it's the work I do as an AI-native leader. If you're running agents in parallel too, I want to know where your constraint moved.