An agent just wrote a Python script. In a moment it's going to run somewhere you own. The reflex is to ask "how do I sandbox this?" and to reach for the usual ranking, where lightweight runtimes are fast-but-weak and virtual machines are slow-but-strong, and you pick a point on that line.

That line is a trap. The substrates competing to host your agent's code — WebAssembly, V8 isolates, and microVMs — don't differ mainly in speed. They differ in what they assume the code is allowed to touch. And that single difference, not latency, is what should pick your architecture. The MCP-code-execution shift made this urgent: the moment you let the model write code instead of emitting one tool call at a time, the protocol stops being the bottleneck and the sandbox becomes the product.

Two security models, wearing three costumes

MicroVMs, and containers in a weaker form, do confinement. You start a full Linux environment that can run literally anything (any compiled binary, any pip install with a native C extension, any syscall) and then you wall it off. A container builds that wall while still sharing the host kernel; a microVM boots a real guest with its own kernel and puts a hypervisor between the code and the host. Firecracker does this beautifully: its spec guarantees under 125ms from the start API call to the guest's /sbin/init, with a hypervisor (KVM) as the trust boundary instead of a shared host kernel. E2B builds its agent sandboxes on exactly this, snapshotting pre-warmed microVMs to provision in ~150ms. You get total compatibility. You pay with a guest kernel, a rootfs, and a fleet to operate.

WASM and V8 isolates do the opposite thing: capability. They start the code with zero ambient authority and grant access explicitly. The WASI design principles state it plainly — "there are no global namespaces at runtime, and no global functions at link time." A WASM module can't open a file, hit the network, or make a syscall unless you handed it that specific capability, as an unforgeable handle. Wasmtime's security model is blunt about the consequence: "There is no raw access to system calls or other forms of I/O." V8 isolates — the engine under Cloudflare Workers — apply the same posture at a different layer: thousands of lightweight contexts share one OS process, each memory-isolated, none able to reach a filesystem or a kernel.

Confinement boots a full machine and takes things away. Capability starts with nothing and hands things over. That's not a speed dial — it's a different bet about where danger lives.

When it fits, the capability model is better security, not just a faster one. Deny-by-default beats confine-after-the-fact: there's no operation to forget to restrict, because nothing is permitted until you permit it. The cost is what the code can be. Capability runtimes run JS and WASM, full stop, and that ceiling is sharp.
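To make deny-by-default concrete, here is a minimal sketch of a capability table in plain Python. It is an illustration of the model only, not a real sandbox (Python offers no such isolation), and every name in it (`CapabilityTable`, `read_report`, `open_socket`) is hypothetical.

```python
from typing import Callable, Dict


class CapabilityError(PermissionError):
    """Raised when guest code asks for a capability it was never granted."""


class CapabilityTable:
    """Hypothetical host-side table. It starts empty: no ambient authority."""

    def __init__(self) -> None:
        self._grants: Dict[str, Callable[..., object]] = {}

    def grant(self, name: str, fn: Callable[..., object]) -> None:
        # The host hands over one operation at a time, explicitly.
        self._grants[name] = fn

    def has(self, name: str) -> bool:
        return name in self._grants

    def call(self, name: str, *args: object) -> object:
        # Deny by default: an operation that was not granted does not exist.
        if name not in self._grants:
            raise CapabilityError(f"capability not granted: {name}")
        return self._grants[name](*args)


def read_report(path: str) -> str:
    # Stand-in for a governed tool the host chooses to expose.
    return f"contents of {path}"


caps = CapabilityTable()
caps.grant("read_report", read_report)

print(caps.call("read_report", "q3.csv"))  # granted, so it runs

try:
    caps.call("open_socket", "example.com", 443)  # never granted
except CapabilityError as exc:
    print(exc)
```

Note that there is no blocklist anywhere in the sketch. The socket call fails because nobody ever put it in the table, which is the property the paragraph above is describing.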

Where the capability ceiling actually bites

Pyodide is the honest case study. It runs CPython compiled to WASM and ships pre-built wheels for NumPy, pandas, SciPy, scikit-learn, so a lot of "data science in a sandbox" just works. But an arbitrary pip install of a C-extension package fails unless it's been cross-compiled for Pyodide, and threading, multiprocessing, subprocess, and raw sockets raise errors. So the model writes correct code (import some_obscure_native_lib) and it fails not on a security violation but on a capability gap: the sandbox refused an operation that was never a threat. Confinement never does this; a real Linux box runs whatever the model dreamed up. That asymmetry, where a capability runtime fails closed on missing functionality and not just on attack, is the under-discussed tax of the lightweight path.

V8 isolates carry their own ceiling and their own ghost. The Workers security model is candid that because tenants share a process, "Spectre attacks are possible." The mitigations are clever and load-bearing: Date.now() is frozen while your code runs so you can't build a timer, multithreading and shared memory are banned so you can't race one, and any worker with suspicious performance metrics gets rescheduled into its own process. It works — but notice the boundary you're trusting is a set of side-channel countermeasures, not a hypervisor.
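The frozen-clock idea is easy to model. The toy below is not Cloudflare's implementation; `FrozenClock` and `on_io_complete` are hypothetical names for a clock whose guest-visible value only moves when the host says so.

```python
import time


class FrozenClock:
    """Toy clock: the guest-visible value only changes on a host I/O event."""

    def __init__(self) -> None:
        self._visible_ms = int(time.time() * 1000)

    def now(self) -> int:
        # Guest-facing read: returns the last published value, never live time.
        return self._visible_ms

    def on_io_complete(self) -> None:
        # Host-facing hook: publish a fresh timestamp after an I/O event.
        self._visible_ms = int(time.time() * 1000)


def guest_tries_to_time(clock: FrozenClock) -> int:
    """Guest attempts to measure how long a CPU-bound loop took."""
    start = clock.now()
    total = 0
    for i in range(200_000):
        total += i * i
    return clock.now() - start


clock = FrozenClock()
print(guest_tries_to_time(clock))  # the guest always measures 0 ms
```

Because the guest can never observe time passing during pure computation, it has nothing precise to measure a cache-timing difference against. That is the whole point of the countermeasure, and also why it is a mitigation and not a wall.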

What you're really trusting

Here's the turn nobody slides into the pitch. We tell ourselves microVMs are "strong" and WASM is "lightweight," as if strength were a scalar. It isn't. Confinement trusts the hypervisor. Capability trusts the runtime's compiler. WASM's real-world escapes haven't been breaks in the capability spec at all — they've been miscompilation bugs. CVE-2026-34971 was a critical aarch64 Cranelift bug where a guest could pass a bounds check and then load a different address, yielding an arbitrary read/write primitive — a clean escape, born not in WASM's design but in the JIT that lowers it to machine code. So "is WASM safe?" decomposes into "is your compiler correct on your architecture?" — a very different audit than "is KVM sound?"
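The class of bug is simple to picture with a toy linear memory. To be clear about the assumptions: this is not a reproduction of the CVE or of any Cranelift code, and `LinearMemory`, `load_correct`, and `load_miscompiled` are hypothetical names. It only shows why a bounds check protects nothing when the address checked and the address loaded can differ.

```python
class LinearMemory:
    """Toy guest memory: a window of `size` bytes inside a larger host buffer."""

    def __init__(self, host: bytearray, base: int, size: int) -> None:
        self.host = host
        self.base = base
        self.size = size

    def load_correct(self, addr: int) -> int:
        # The address that is checked is the address that is loaded.
        if not 0 <= addr < self.size:
            raise IndexError("guest address out of bounds")
        return self.host[self.base + addr]

    def load_miscompiled(self, addr: int, offset: int) -> int:
        # Hypothetical lowering bug: the check covers `addr`,
        # but the load uses `addr + offset`.
        if not 0 <= addr < self.size:
            raise IndexError("guest address out of bounds")
        return self.host[self.base + addr + offset]


host = bytearray(32)
host[8:16] = bytes([0x11] * 8)  # the guest's own 8 bytes
host[24] = 0x5A                 # host data the guest must never see
guest = LinearMemory(host, base=8, size=8)

print(hex(guest.load_correct(0)))          # inside the guest window
print(hex(guest.load_miscompiled(0, 16)))  # passes the check, reads host data
```

Both methods contain the same check. The specification-level guarantee is intact in each; only the second one's generated behavior betrays it.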

That reframing dissolves the whole speed-versus-security argument. The decision was never about milliseconds. It's: **is the model writing code that orchestrates your tools, or code that needs the whole world?**

Cloudflare answered by refusing to choose. Its Dynamic Workers — pitched, precisely, as "sandboxing AI agents, 100x faster" — spin a fresh V8 isolate per task in milliseconds for the common case: the model emits TypeScript that chains your governed tools. And when that's not enough, Cloudflare's Containers-based Sandbox SDK runs arbitrary Python with a real filesystem. Two products, split by code type, because no single substrate is right for both.
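A split by code type amounts to a small router in front of two backends. The following is a sketch of that decision only, with hypothetical names throughout (`CodeNeeds`, `Substrate`, `choose_substrate`) and one loud assumption: that your capability runtime can serve JavaScript, TypeScript, and Python built for WASM.

```python
from dataclasses import dataclass
from enum import Enum


class Substrate(Enum):
    CAPABILITY = "capability runtime (V8 isolate or WASM)"
    CONFINEMENT = "confinement (microVM or container)"


# Assumption: languages your capability runtime can actually execute.
CAPABILITY_LANGUAGES = {"javascript", "typescript", "python"}


@dataclass(frozen=True)
class CodeNeeds:
    """Hypothetical summary of what a generated script assumes exists."""
    language: str
    needs_native_packages: bool = False
    needs_shell: bool = False
    needs_real_filesystem: bool = False


def choose_substrate(needs: CodeNeeds) -> Substrate:
    # Anything that assumes a full operating system goes to confinement.
    if needs.needs_native_packages or needs.needs_shell or needs.needs_real_filesystem:
        return Substrate.CONFINEMENT
    # Tool-orchestration code in a supported language fits the capability runtime.
    if needs.language in CAPABILITY_LANGUAGES:
        return Substrate.CAPABILITY
    # Unknown languages fall back to the substrate that runs anything.
    return Substrate.CONFINEMENT


print(choose_substrate(CodeNeeds("typescript")).value)
print(choose_substrate(CodeNeeds("python", needs_native_packages=True)).value)
```

The router never consults a latency number. Its only inputs are what the code assumes it can touch.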

So stop ranking them. Ask what the model's code assumes exists. If it's gluing your APIs together and crunching some numbers, a capability runtime is faster, denser, and arguably safer — and you should want it even if the microVM were free. If it might reach for an arbitrary native package or a shell, you need a real operating system, which means confinement, which means a microVM, which means a fleet. The substrate is downstream of one honest question about your own agent — and most teams answer it by accident, then spend a quarter paying for the guess.