The Model Context Protocol describes how an agent talks to a tool server, and most of that conversation is the same regardless of where the server lives: JSON-RPC messages, back and forth. What changes is the pipe the messages travel through. MCP has defined three such transports over its history, two of them current and one deprecated, and how it got there is the most useful thing a developer can know before deploying a remote server, because the marketing story and the spec disagree on what you actually get.

stdio: the local default

The simplest transport is stdio, and it's the one most MCP servers you've used actually run on. The client launches the server as a subprocess and they talk over its standard input and output. Messages are JSON-RPC, UTF-8, one per line — "delimited by newlines, and MUST NOT contain embedded newlines." stderr is reserved for logging. There's a rule that bites newcomers immediately: the server "MUST NOT write anything to its stdout that is not a valid MCP message," so a stray print() or a logging library that defaults to stdout will corrupt the stream and break the session.
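To make the framing rule concrete, here is a minimal sketch of stdio framing using only the Python standard library. The helper names (encode_message, decode_line, log) are made up for illustration and are not part of any MCP SDK; the point is only that each message becomes exactly one line and that diagnostics go to stderr.

```python
import json
import sys


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single UTF-8 line (hypothetical helper)."""
    # json.dumps without indent emits no raw newlines; a newline inside a
    # string value is escaped as \n, so the only real newline is the delimiter.
    line = json.dumps(message, ensure_ascii=False, separators=(",", ":"))
    return line.encode("utf-8") + b"\n"


def decode_line(line: bytes) -> dict:
    """Parse one line read from the peer back into a message (hypothetical helper)."""
    return json.loads(line.decode("utf-8"))


def log(text: str) -> None:
    """Write diagnostics to stderr so stdout carries only protocol messages."""
    print(text, file=sys.stderr)


if __name__ == "__main__":
    reply = {"jsonrpc": "2.0", "id": 1, "result": {"text": "line one\nline two"}}
    log("sending reply")                       # safe: stderr
    sys.stdout.buffer.write(encode_message(reply))
    sys.stdout.buffer.flush()
```

If you configure Python's logging module in a stdio server, pointing its handler at sys.stderr follows the same rule.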

stdio has no network surface, no auth to configure, no ports. If your server runs on the same machine as the client — an editor plugin, a local filesystem tool — this is the right answer and you can stop reading the transport spec here.

The two-endpoint design that got deprecated

Remote servers are where it gets interesting, and where MCP changed its mind. The original remote transport, from the 2024-11-05 spec, was HTTP+SSE, and it used two endpoints. The client opened a long-lived GET connection to an SSE stream; the server's first event over that stream was an endpoint event telling the client which separate POST URL to send its requests to. So every session held a persistent GET stream open for server-to-client messages while firing POSTs at a different URL for client-to-server messages.

That worked, and it scaled badly. A long-lived connection per client is exactly the thing serverless platforms hate — connections drop, instances scale to zero, sessions desync on cold start. Two coupled endpoints also make routing awkward. So in the 2025-03-26 revision the spec did something unusually clean: it "replaced the previous HTTP+SSE transport with a more flexible Streamable HTTP transport." Deprecated, not patched.

Streamable HTTP: one endpoint, two response shapes

Streamable HTTP collapses everything onto "a single HTTP endpoint path that supports both POST and GET methods."

The mechanics are worth knowing precisely, because they're the whole pitch. Every client message is a fresh HTTP POST, sent with an Accept header that lists both application/json and text/event-stream. For a message that needs no reply (a notification or a response), the server returns 202 Accepted with no body and that's it. For a request that needs an answer, the server gets to choose: it "MUST either return Content-Type: text/event-stream" (an SSE stream, for when it wants to send progress events before the final result) "or Content-Type: application/json," a single plain reply, for when it's a quick call. One POST, two possible response shapes, picked per request.
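The per-POST decision can be sketched as a small pure function. This is an illustration, not SDK code: plan_response and its wants_progress flag are hypothetical names, and a real server would decide whether to stream based on the tool being called rather than on a boolean passed in.

```python
from typing import Optional, Tuple


def is_request(message: dict) -> bool:
    """A JSON-RPC request carries both a method and an id."""
    return "method" in message and "id" in message


def plan_response(message: dict, wants_progress: bool) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, Content-Type) for one POSTed JSON-RPC message.

    wants_progress is a stand-in for whatever logic decides that the
    handler will emit intermediate events before the final result.
    """
    if not is_request(message):
        # Notifications (method, no id) and responses (result or error)
        # need no reply: acknowledge with 202 and no body.
        return 202, None
    if wants_progress:
        # Stream progress events and then the result over SSE.
        return 200, "text/event-stream"
    # Quick call: one JSON object in the response body.
    return 200, "application/json"


if __name__ == "__main__":
    print(plan_response({"jsonrpc": "2.0", "method": "notifications/initialized"}, False))
    print(plan_response({"jsonrpc": "2.0", "id": 1, "method": "tools/list"}, False))
    print(plan_response({"jsonrpc": "2.0", "id": 2, "method": "tools/call"}, True))
```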

Server-initiated messages, the thing the old GET stream existed for, become an optional GET: the client MAY open an SSE stream with a GET if it wants the server to push to it. Sessions are optional too — the server MAY hand out an Mcp-Session-Id header at initialization. And streams are resumable: the server tags SSE events with IDs, and a client that drops can reconnect with a Last-Event-ID header to get redelivery instead of starting over.
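Resumption is easier to reason about with the bookkeeping written out. The sketch below assumes one in-memory log per stream and sequential integer IDs; both are simplifications chosen for illustration (the spec only asks that IDs identify a position within a stream), and EventLog and format_sse are hypothetical names.

```python
from collections import deque
from typing import List, Optional, Tuple


class EventLog:
    """Bounded per-stream buffer of sent SSE events (illustrative only)."""

    def __init__(self, max_events: int = 1000) -> None:
        self._events = deque(maxlen=max_events)  # oldest events fall off
        self._next_id = 1

    def append(self, data: str) -> str:
        """Record an outgoing event and return the ID it was tagged with."""
        event_id = str(self._next_id)
        self._next_id += 1
        self._events.append((event_id, data))
        return event_id

    def replay_after(self, last_event_id: Optional[str]) -> List[Tuple[str, str]]:
        """Events the client missed, given its Last-Event-ID header value."""
        if last_event_id is None:
            return list(self._events)
        cutoff = int(last_event_id)
        return [(eid, data) for eid, data in self._events if int(eid) > cutoff]


def format_sse(event_id: str, data: str) -> str:
    """Render one event in SSE wire format; data is assumed to be one line."""
    return f"id: {event_id}\ndata: {data}\n\n"


if __name__ == "__main__":
    log = EventLog()
    for payload in ('{"progress":1}', '{"progress":2}', '{"result":"done"}'):
        log.append(payload)
    # Client reconnects having seen only event 1.
    for eid, data in log.replay_after("1"):
        print(format_sse(eid, data), end="")
```

Note where the buffer lives: in the memory of one process. That detail is what the next section turns on.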

This is genuinely better than HTTP+SSE. It's also where the misunderstanding lives.

"Streamable HTTP" gets sold as stateless and serverless-ready. That property only holds if you never turn on the features that make it useful.

The serverless trap

Here is the part that the comparison posts skip. Streamable HTTP can be stateless — run it with no Mcp-Session-Id and no SSE resumption, treat every POST as self-contained, and yes, it scales to zero and any instance can serve any request. That's the configuration the "serverless-friendly" claim is true for.

But the features people reach for Streamable HTTP to get, durable sessions and resumable streams that survive a flaky network, are exactly the ones that reintroduce state. Turn on Mcp-Session-Id and, unless session state lives in a shared store, your load balancer now has to pin each session to the one instance that holds it: sticky routing, the enemy of horizontal autoscaling. Turn on resumability and something, your server or your gateway, has to track event IDs. And because the method and target of a call live in the JSON-RPC body rather than in the URL, a proxy that wants to route on them has to parse the payload instead of doing ordinary HTTP routing. These are the same scaling constraints that made HTTP+SSE painful on serverless; they've just moved from "always on" to "on as soon as you use sessions."
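A toy model shows why in-memory sessions force pinning. Everything here is an assumption made for illustration: the Instance class, the choice to answer 404 for a session an instance does not hold, and the hash-modulo routing rule are not taken from the spec or any particular load balancer.

```python
import hashlib
from typing import Dict, List


class Instance:
    """One server process holding its sessions in local memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: Dict[str, dict] = {}

    def initialize(self, session_id: str) -> None:
        self.sessions[session_id] = {"initialized": True}

    def handle(self, session_id: str) -> int:
        # An instance that never saw the session cannot serve it.
        return 200 if session_id in self.sessions else 404


def pick_sticky(instances: List[Instance], session_id: str) -> Instance:
    """Pin a session to one instance by hashing its Mcp-Session-Id value."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


if __name__ == "__main__":
    pool = [Instance("a"), Instance("b"), Instance("c")]
    sid = "session-123"
    owner = pick_sticky(pool, sid)
    owner.initialize(sid)
    for inst in pool:
        print(inst.name, inst.handle(sid))  # only the owner answers 200
```

The modulo in pick_sticky is also the autoscaling problem in miniature: change the size of the pool and most sessions hash to a different instance, one that has never heard of them.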

The MCP maintainers know this. The official transport blog from late 2025 frames the next evolution as making the protocol stateless even while applications stay stateful — folding initialization data into each request so no server-side session storage or sticky sessions are required, with spec work targeted through 2026. Until that lands, the rule is blunt: choosing Streamable HTTP does not make your server serverless. Running it without sessions does.

What to actually do

Local server, same machine as the client: stdio, and keep your logging off stdout. Remote server: Streamable HTTP — the old HTTP+SSE transport is deprecated and you should only implement it for backwards compatibility with old clients. Then make one deliberate decision before you deploy: stateless mode for true serverless scale, or sessions-and-resumability for connection durability, knowing the second one buys you the sticky-routing problem. And whichever you pick, secure it like a real endpoint — validate the Origin header against DNS-rebinding attacks (the 2025-11-25 spec now requires a 403 on a bad Origin), bind local servers to 127.0.0.1, and put real authentication in front of anything remote.
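The Origin check is small enough to sketch in full. The allowlist values and the function name origin_rejection are hypothetical, and letting a request with no Origin header through is a policy choice in this sketch (non-browser clients typically send none), not a recommendation for every deployment.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical allowlist; replace with the origins your clients really use.
ALLOWED_ORIGINS = {"http://localhost:3000", "https://app.example.com"}


def origin_rejection(origin_header: Optional[str]) -> Optional[int]:
    """Return 403 if the Origin header is present and not allowed, else None."""
    if origin_header is None:
        # No Origin header at all: this sketch lets the request continue.
        return None
    parts = urlsplit(origin_header)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        # Covers malformed values and the literal string "null".
        return 403
    normalized = f"{parts.scheme}://{parts.netloc}".lower()
    return None if normalized in ALLOWED_ORIGINS else 403


if __name__ == "__main__":
    for value in (None, "https://app.example.com", "https://evil.example", "null"):
        print(repr(value), origin_rejection(value))
```

This check guards against browsers being tricked into calling your server; it is not a substitute for the authentication step.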

The transport is the cheapest part of MCP to get right and the easiest to get subtly wrong. The spec tells you exactly how each one behaves. The only thing it can't tell you is that the convenient features and the scalable architecture are, for now, the same dial turned two different ways.