The Model Context Protocol's 2026-07-28 release candidate is remembered for deleting things — the session, the handshake, the client-side LLM call. Under that headline sits a change that adds capability, and it's the one most likely to bite a tool author who reads the changelog and celebrates: MCP tool schemas now speak full JSON Schema 2020-12. You can finally write oneOf. You can write $ref. And on the most common way agents run today, the model will cheerfully ignore both.

What SEP-2106 actually unlocks

Until now, a tool's inputSchema was a restricted thing. You declared an object, some typed properties, a required list, and that was roughly the ceiling. If your tool genuinely accepted one of three shapes — a query by ID, or by name, or by a filter object — you couldn't say so. You flattened it into a soup of optional fields and a prose sentence in the description begging the model to pick a coherent subset.

SEP-2106 ends that. Input schemas keep the type: "object" root constraint but now allow composition — oneOf, anyOf, allOf — conditionals via if/then, and references through $ref and $defs. Output schemas go further: they're unrestricted, and structuredContent can now be any JSON value rather than only an object. On paper, MCP tool definitions just became as expressive as the Pydantic or Zod models you were already writing them from.
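To make the three-shapes case concrete, here is a minimal sketch of what such an input schema could look like, written as a Python dict. The tool, its field names (`id`, `name`, `filter`, `status`, `limit`) and the `$defs` names are all hypothetical; only the keywords themselves (`oneOf`, `$ref`, `$defs`, and the `type: "object"` root) come from the description above.

```python
import json

# Hypothetical input schema for an illustrative "lookup" tool.
# Every field and definition name here is invented for the sketch.
LOOKUP_INPUT_SCHEMA = {
    "type": "object",  # the root constraint that input schemas keep
    "oneOf": [         # exactly one of the three shapes must match
        {"$ref": "#/$defs/byId"},
        {"$ref": "#/$defs/byName"},
        {"$ref": "#/$defs/byFilter"},
    ],
    "$defs": {
        "byId": {
            "type": "object",
            "properties": {"id": {"type": "string"}},
            "required": ["id"],
            "additionalProperties": False,
        },
        "byName": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
            "required": ["name"],
            "additionalProperties": False,
        },
        "byFilter": {
            "type": "object",
            "properties": {
                "filter": {
                    "type": "object",
                    "properties": {
                        "status": {"type": "string"},
                        "limit": {"type": "integer", "minimum": 1},
                    },
                    "required": ["status"],
                    "additionalProperties": False,
                }
            },
            "required": ["filter"],
            "additionalProperties": False,
        },
    },
}

if __name__ == "__main__":
    # Print the JSON form a server would advertise as inputSchema.
    print(json.dumps(LOOKUP_INPUT_SCHEMA, indent=2))
```

Because each branch sets `additionalProperties` to false and requires its own key, the three shapes are mutually exclusive, which is the property the flattened soup of optional fields could never express.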

The reflex is to reach for the new toys. A discriminated union as a real oneOf. A recursive tree node as a $ref to itself. This is where the piece earns its keep, because the protocol just moved the expressiveness of a tool schema ahead of the enforcement of it — and nothing in the spec closes that gap for you.

MCP describes; it does not decode

Here is the load-bearing fact, and it survived the stateless rewrite unchanged: MCP does not constrain the model's output. The protocol carries your schema to the client. The client's language model is what emits the tool-call arguments. Whether those arguments are forced to satisfy your schema — token by token, at generation time — is a property of the client's decoding backend, not of MCP. The server receives whatever the model produced and is expected to validate it.

So "will my oneOf be respected?" is not a question about the spec. It's a question about which model, behind which provider, the user happened to point at your server. And you, the tool author, do not control that.

The spec made tool schemas more expressive than the layer that's supposed to enforce them. The richest thing you can now declare is precisely the thing a hosted strict mode will refuse to guarantee.

The subset nobody advertises

Point your MCP client at a hosted provider and the enforcement path is almost certainly strict structured outputs. That path is fast and, on the schemas it accepts, near-perfect — vendors quote roughly 100% compliance versus about 86% for unconstrained function calling. It buys that number by accepting only a subset of JSON Schema.

The subset is exactly the wrong shape for the news. OpenAI's Structured Outputs does not support oneOf, does not support allOf, and does not support $ref at all — no recursive schemas, period. anyOf is allowed only for nullable unions. Keywords like pattern, minimum, and format are accepted and then not enforced. Strict mode additionally demands that every property appear in required and that additionalProperties be false. Feed it the elegant 2020-12 schema the MCP spec now blesses, and it will reject the schema or, worse, quietly enforce only the parts it understands.
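One way to see the mismatch before a client does is to lint your own schema against those rules. The sketch below encodes only the restrictions listed in the paragraph above, not any vendor's current documentation, so treat the rule set as an assumption to re-check against the provider you care about. The function name `strict_subset_problems` is hypothetical, and "nullable union" is interpreted here as exactly two `anyOf` branches, one of them `{"type": "null"}`.

```python
from typing import Any


def strict_subset_problems(schema: Any, path: str = "#") -> list[str]:
    """List the places where a schema leaves the strict subset described above."""
    if not isinstance(schema, dict):
        return []
    problems: list[str] = []

    # Composition and references the subset is said to reject outright.
    for keyword in ("oneOf", "allOf", "$ref"):
        if keyword in schema:
            problems.append(f"{path}: '{keyword}' is outside the subset")

    # anyOf only as a nullable union (assumed: two branches, one of them null).
    branches = schema.get("anyOf")
    if branches is not None:
        nulls = [b for b in branches if isinstance(b, dict) and b.get("type") == "null"]
        if not (len(branches) == 2 and len(nulls) == 1):
            problems.append(f"{path}: 'anyOf' is only allowed for nullable unions")

    # Strict mode: every property required, no additional properties.
    if schema.get("type") == "object":
        missing = sorted(set(schema.get("properties", {})) - set(schema.get("required", [])))
        if missing:
            problems.append(f"{path}: properties missing from 'required': {missing}")
        if schema.get("additionalProperties") is not False:
            problems.append(f"{path}: 'additionalProperties' must be false")

    # Keywords that are accepted but, per the text, not enforced.
    for keyword in ("pattern", "minimum", "format"):
        if keyword in schema:
            problems.append(f"{path}: '{keyword}' is accepted but not enforced")

    # Recurse into the places where subschemas live.
    for key in ("properties", "$defs"):
        for name, sub in schema.get(key, {}).items():
            problems += strict_subset_problems(sub, f"{path}/{key}/{name}")
    for key in ("oneOf", "anyOf", "allOf"):
        for i, sub in enumerate(schema.get(key, [])):
            problems += strict_subset_problems(sub, f"{path}/{key}/{i}")
    problems += strict_subset_problems(schema.get("items"), f"{path}/items")
    return problems


if __name__ == "__main__":
    rich = {
        "type": "object",
        "oneOf": [{"$ref": "#/$defs/byId"}],
        "$defs": {
            "byId": {
                "type": "object",
                "properties": {"id": {"type": "string", "pattern": "^[a-z]+$"}},
                "required": ["id"],
                "additionalProperties": False,
            }
        },
    }
    for line in strict_subset_problems(rich):
        print(line)
```

An empty result does not prove a provider will enforce the schema; it only means the schema avoids the restrictions named here.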

That's the trap. Your tool advertises a oneOf. The client strips it. The model, unconstrained on that branch, emits arguments that look valid and satisfy some flattened shape it invented — a call you never designed, arriving with the confidence of a green checkmark. The failure is silent because every layer did its job: the schema was legal, the transport was clean, the model was fluent.

Where the schema is real

Flip to a self-hosted stack and the story inverts. The constrained-decoding engines that ship inside open inference servers are built on Earley parsers, and they handle composition and references — including recursion. XGrammar is the default structured-generation backend for vLLM, SGLang, and TensorRT-LLM; Microsoft's llguidance is a comparable Rust engine at roughly 50 microseconds per token. Both will genuinely constrain generation to a $ref-laden 2020-12 schema. The JSONSchemaBench results draw the same line from the other side: FSM-based engines like Outlines flatten recursion or time out on complex schemas, while the Earley-based engines enforce them.

So the enforcement gap isn't random. It tracks who runs the inference. Self-hosted with XGrammar or llguidance: your oneOf is law. Hosted strict mode: it's a suggestion. The same MCP server, the same tool definition, two different guarantees — decided entirely downstream of you.

What to actually do

Treat the new expressiveness as documentation you also enforce, never as enforcement you can skip. Use oneOf and $ref — they make the tool honest and they do bind on self-hosted clients — but design every tool so a subset-enforcing client still can't produce an unsafe call, and validate every incoming argument on the server against the full schema. The model may or may not have been constrained; your handler must behave as if it wasn't.
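As a rough illustration of "behave as if it wasn't", here is a minimal sketch of a handler that re-validates arguments before acting on them. It uses a deliberately tiny hand-rolled validator that understands only `type`, `properties`, `required`, `additionalProperties: false`, `oneOf`, and local `$ref` into `$defs`; a real server would more plausibly use a full JSON Schema 2020-12 validator library. The schema, the `handle_lookup` function, and the simplified result shape are all hypothetical.

```python
from typing import Any

MAX_DEPTH = 32  # assumed bound; also stops a $ref cycle from recursing forever

TYPE_CHECKS = {
    "object": lambda v: isinstance(v, dict),
    "array": lambda v: isinstance(v, list),
    "string": lambda v: isinstance(v, str),
    "boolean": lambda v: isinstance(v, bool),
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "null": lambda v: v is None,
}


def is_valid(value: Any, schema: dict, root: dict, depth: int = 0) -> bool:
    """Check value against a small subset of JSON Schema keywords."""
    if depth > MAX_DEPTH:
        raise ValueError("validation recursed too deep")
    ref = schema.get("$ref")
    if ref is not None:
        if not ref.startswith("#/$defs/"):
            raise ValueError(f"refusing non-local $ref: {ref}")
        target = root["$defs"][ref[len("#/$defs/"):]]
        if not is_valid(value, target, root, depth + 1):
            return False
    expected = schema.get("type")
    if expected is not None and not TYPE_CHECKS[expected](value):
        return False
    if "oneOf" in schema:
        # oneOf means exactly one branch matches, not "at least one".
        matches = sum(is_valid(value, b, root, depth + 1) for b in schema["oneOf"])
        if matches != 1:
            return False
    if isinstance(value, dict):
        props = schema.get("properties", {})
        if any(key not in value for key in schema.get("required", [])):
            return False
        if schema.get("additionalProperties") is False and any(k not in props for k in value):
            return False
        for key, sub in props.items():
            if key in value and not is_valid(value[key], sub, root, depth + 1):
                return False
    return True


def _shape(field: str) -> dict:
    """Build one hypothetical branch: an object with a single required string."""
    return {
        "type": "object",
        "properties": {field: {"type": "string"}},
        "required": [field],
        "additionalProperties": False,
    }


SCHEMA = {
    "type": "object",
    "oneOf": [{"$ref": "#/$defs/byId"}, {"$ref": "#/$defs/byName"}],
    "$defs": {"byId": _shape("id"), "byName": _shape("name")},
}


def handle_lookup(arguments: Any) -> dict:
    """Hypothetical tool handler: validate first, act second."""
    if not is_valid(arguments, SCHEMA, SCHEMA):
        text = "arguments do not match inputSchema"
        return {"isError": True, "content": [{"type": "text", "text": text}]}
    return {"isError": False, "content": [{"type": "text", "text": f"lookup {arguments}"}]}
```

Called with `{"id": "42"}` the handler proceeds; called with `{"id": "42", "name": "x"}`, the kind of blended call a flattened schema would let through, it returns an error result instead of guessing which branch the model meant.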

And read the spec's own footnote as the warning it is: implementations must not auto-dereference external $ref URIs and should bound schema depth and validation time. References and deep nesting are a denial-of-service and SSRF surface, which is why the caveat shipped in the same paragraph. The tools got a better vocabulary on 2026-07-28. Whether anything is listening when your tool speaks it is still, quietly, somebody else's decision.
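For the footnote's two structural rules, a pre-flight check on any schema you load or receive might look like the sketch below. It covers only external `$ref` rejection and a nesting bound; bounding validation time is not shown. The limit of 16 and the names `check_schema` and `UnsafeSchemaError` are assumptions made for the example, and the walk is conservative: it measures raw JSON nesting and inspects every `$ref` string it finds, so it may flag a schema a smarter checker would accept.

```python
from typing import Any

MAX_SCHEMA_DEPTH = 16  # assumed limit; tune to the schemas you actually serve


class UnsafeSchemaError(ValueError):
    """Raised when a schema breaks one of the structural safety rules."""


def check_schema(node: Any, depth: int = 0) -> None:
    """Reject external $ref targets and excessive nesting. Never follows a $ref."""
    if depth > MAX_SCHEMA_DEPTH:
        raise UnsafeSchemaError("schema nesting exceeds the configured limit")
    if isinstance(node, dict):
        ref = node.get("$ref")
        # Only fragment references into the same document are allowed.
        if isinstance(ref, str) and not ref.startswith("#"):
            raise UnsafeSchemaError(f"external $ref is not allowed: {ref}")
        for child in node.values():
            check_schema(child, depth + 1)
    elif isinstance(node, list):
        for child in node:
            check_schema(child, depth + 1)


if __name__ == "__main__":
    local = {
        "type": "object",
        "properties": {"node": {"$ref": "#/$defs/node"}},
        "$defs": {"node": {"type": "object"}},
    }
    check_schema(local)  # passes silently
    try:
        check_schema({"$ref": "https://example.com/schema.json"})
    except UnsafeSchemaError as err:
        print(f"rejected: {err}")
```

Because the check never resolves a reference, a self-referential `$ref` cannot make it loop, and no network request is ever issued on behalf of a schema.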