What Does the MCP Stateless Spec Actually Change for Your Team?

The MCP 2026-07-28 spec removes the session handshake and Mcp-Session-Id header, making remote servers horizontally scalable for the first time. Here's what genuinely changes and what the deprecations mean in practice.

MCP's TypeScript and Python SDKs hit 97 million monthly downloads in March 2026. There are over 9,400 public MCP servers in production. A protocol that large does not issue breaking changes casually. The release candidate for MCP 2026-07-28 does it anyway, and the reason is simple: the session model that made sense for a local assistant on your laptop has become the main obstacle to running MCP servers like ordinary web services.

People are calling it "MCP 2.0." That is overstated, but not by much. Understanding what actually changed - and what you have to do about it before July 28 - takes about ten minutes. Here it is.

Why sessions were the problem

Today, Streamable HTTP sessions are stateful. The client sends an initialize request, the server allocates a session, and all subsequent requests must reach that same server instance. This creates friction with load balancers that distribute requests across instances and autoscaling that spins up new instances which don't have existing sessions.

Teams work around this with sticky sessions, external session stores like Redis, or by running single-instance deployments. None of those options are good at scale. Sticky sessions defeat the purpose of a load balancer. A Redis session store adds a dependency and a failure mode. Single-instance deployments are not production infrastructure.

Sessions cause a second problem beyond scaling. Hidden session state is convenient for servers, but opaque to models and harder to debug. Explicit handles make state visible: the model can reference a handle, reason over it, chain it between tools, and include it in multi-step flows. The stateless redesign addresses both problems at once, scaling and opacity.
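To make the contrast concrete, here is a minimal sketch of the two styles behind a round-robin balancer. Everything in it is hypothetical: the Instance class, the tool names, the handle format, and the SHARED_STORE dictionary (a stand-in for whatever durable storage your server already uses) are illustrations, not anything the spec defines.

```python
import itertools
import uuid

# Stand-in for durable storage (a database, an object store). This is an
# assumption of the sketch, not something the spec prescribes.
SHARED_STORE: dict[str, dict] = {}


class Instance:
    """One server replica behind a load balancer (hypothetical)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.local_sessions: dict[str, dict] = {}  # hidden, per-replica state

    # Session style: state lives in this replica's memory, keyed by session id.
    def open_repo_sessionful(self, session_id: str, path: str) -> None:
        self.local_sessions[session_id] = {"repo": path}

    def read_file_sessionful(self, session_id: str, filename: str) -> str:
        repo = self.local_sessions[session_id]["repo"]  # KeyError on other replicas
        return f"{repo}/{filename}"

    # Handle style: the tool returns a handle and later calls pass it back.
    def open_repo(self, path: str) -> str:
        handle = f"repo_{uuid.uuid4().hex[:8]}"
        SHARED_STORE[handle] = {"repo": path}
        return handle

    def read_file(self, repo_handle: str, filename: str) -> str:
        repo = SHARED_STORE[repo_handle]["repo"]
        return f"{repo}/{filename}"


replicas = [Instance("a"), Instance("b")]
round_robin = itertools.cycle(replicas)

# The session flow breaks as soon as the second request lands on replica "b".
next(round_robin).open_repo_sessionful("sess-1", "/srv/project")
try:
    next(round_robin).read_file_sessionful("sess-1", "README.md")
    sessionful_ok = True
except KeyError:
    sessionful_ok = False

# The handle flow works on any replica, and the handle is visible in the
# tool result, so a model or a log line can refer to it.
handle = next(round_robin).open_repo("/srv/project")
result = next(round_robin).read_file(handle, "README.md")
```

The handle version still stores data somewhere. What changes is that the reference to it travels in the open, as a tool result and a tool argument, instead of being implied by which connection the request arrived on.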

What the 2026 MCP release candidate actually removes

SEP-2567 removes the Mcp-Session-Id header and the protocol-level session that came with it. SEP-2575 removes the initialize/initialized handshake - the connection setup that negotiated protocol version, client info, and capabilities at the start of every session. Those values now travel in _meta on every individual request.

The practical result: because each request carries everything needed to interpret it, any server instance can serve any request. A remote MCP server can run behind a plain round-robin load balancer without sticky sessions or a shared session store.
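A rough sketch of what a self-describing request could look like follows. The key names inside _meta (protocolVersion, clientInfo, capabilities) and the placement of _meta under params are assumptions made for illustration; check the published spec for the actual field names before relying on them.

```python
from typing import Any

SUPPORTED_VERSIONS = {"2026-07-28"}


def build_request(
    request_id: int,
    method: str,
    params: dict[str, Any],
    *,
    protocol_version: str,
    client_info: dict[str, str],
    capabilities: dict[str, Any],
) -> dict[str, Any]:
    """Build a JSON-RPC request that carries its own negotiation data."""
    meta = {
        # Hypothetical key names, used here only for illustration.
        "protocolVersion": protocol_version,
        "clientInfo": client_info,
        "capabilities": capabilities,
    }
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": method,
        "params": {**params, "_meta": meta},
    }


def handle(request: dict[str, Any]) -> dict[str, Any]:
    """Any replica can run this: nothing is looked up from a prior handshake."""
    meta = request["params"]["_meta"]
    if meta["protocolVersion"] not in SUPPORTED_VERSIONS:
        return {
            "jsonrpc": "2.0",
            "id": request["id"],
            # -32602 is the JSON-RPC 2.0 "Invalid params" code.
            "error": {"code": -32602, "message": "unsupported protocol version"},
        }
    return {
        "jsonrpc": "2.0",
        "id": request["id"],
        "result": {"handledFor": meta["clientInfo"]["name"]},
    }


request = build_request(
    1,
    "tools/call",
    {"name": "search", "arguments": {"q": "mcp"}},
    protocol_version="2026-07-28",
    client_info={"name": "example-client", "version": "0.1.0"},
    capabilities={},
)
response = handle(request)
```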

Every Streamable HTTP request must now include Mcp-Method (e.g., tools/call) and Mcp-Name (the name of the tool or resource). This lets load balancers, API gateways, and rate limiters route on the operation without buffering and parsing the JSON-RPC body. L7 routing on a header is significantly cheaper than body inspection - your Cloudflare Workers and gateway configs will be simpler, not more complex.
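As an illustration of header-based routing, the sketch below picks an upstream pool from Mcp-Method and Mcp-Name alone and never reads the request body. The pool names, the routing table, and the run_query tool are hypothetical; a real gateway would express the same idea in its own configuration language.

```python
from typing import Mapping, Optional

# Hypothetical routing table: (Mcp-Method, Mcp-Name) -> upstream pool.
# A name of None acts as the fallback for that method.
ROUTES: dict[tuple[str, Optional[str]], str] = {
    ("tools/call", "run_query"): "pool-heavy",
    ("tools/call", None): "pool-tools",
    ("resources/read", None): "pool-resources",
}
DEFAULT_POOL = "pool-default"


def pick_pool(headers: Mapping[str, str]) -> str:
    """Choose an upstream pool using only request headers."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    method = lowered.get("mcp-method")
    name = lowered.get("mcp-name")
    if method is None:
        return DEFAULT_POOL
    return (
        ROUTES.get((method, name))
        or ROUTES.get((method, None))
        or DEFAULT_POOL
    )
```

The same lookup can drive per-tool rate limits: key the limiter on the (method, name) pair instead of on the URL path.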

The spec also standardizes client-side caching for the first time. tools/list, resources/list, and resources/read responses now carry ttlMs and cacheScope fields. Clients can now tell how long a tool list stays fresh and whether it is safe to share across users, so they no longer need to re-fetch lists defensively.
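A client-side cache honouring these fields might look roughly like the sketch below. It assumes ttlMs and cacheScope sit at the top level of the result, and it assumes a cacheScope value of "shared" means the entry may be reused across users; the actual placement and value set should be taken from the spec. The ListCache class and its method names are hypothetical.

```python
import time
from typing import Callable, Optional


class ListCache:
    """Caches list/read responses according to ttlMs and cacheScope."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # (method, user_id or None for shared) -> (expiry time, response)
        self._entries: dict[tuple[str, Optional[str]], tuple[float, dict]] = {}

    def put(self, method: str, user_id: str, response: dict) -> None:
        ttl_ms = response.get("ttlMs")
        if not ttl_ms:
            return  # no freshness information, so do not cache
        # Assumption: only an explicit "shared" scope is safe across users;
        # anything else is keyed per user to stay on the cautious side.
        shared = response.get("cacheScope") == "shared"
        key = (method, None if shared else user_id)
        self._entries[key] = (self._clock() + ttl_ms / 1000.0, response)

    def get(self, method: str, user_id: str) -> Optional[dict]:
        now = self._clock()
        # Prefer the user's own entry, then fall back to a shared one.
        for key in ((method, user_id), (method, None)):
            entry = self._entries.get(key)
            if entry is not None and entry[0] > now:
                return entry[1]
        return None
```

Injecting the clock keeps expiry testable without sleeping; production code can leave the default time.monotonic in place.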

What gets deprecated - and whether it matters for you

Three features enter deprecation under the new lifecycle policy. They are Roots (replaced by tool parameters or config), Sampling (call the LLM provider API directly), and Logging (use stderr or OpenTelemetry). A formal deprecation policy guarantees a minimum twelve-month window between deprecation and removal, so none of these break on July 28.
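For the Logging replacement, the stderr route needs nothing beyond the standard library. The sketch below is one way to set it up in Python; the logger name and format string are arbitrary choices, and make_stderr_logger is a hypothetical helper, not an SDK function.

```python
import logging
import sys
from typing import Optional, TextIO


def make_stderr_logger(name: str, stream: Optional[TextIO] = None) -> logging.Logger:
    """Return a logger that writes to stderr (or to a supplied stream)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:  # do not stack handlers on repeated calls
        handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


log = make_stderr_logger("example-mcp-server")
log.info("tool finished: name=%s duration_ms=%d", "search", 42)
```

Writing to stderr rather than stdout matters for servers that also speak the stdio transport, where stdout is reserved for protocol messages.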

Roots and Logging are relatively easy to route around. Sampling is the one worth thinking about. Sampling lets MCP server tools piggyback on the client's LLM for completions - a lightweight way to add inference without calling a separate API. The spec maintainers want servers to make direct LLM provider API calls instead. The argument is cleaner separation of concerns. Developers who built lightweight inference flows using Sampling are now looking at refactoring work.

Whether that refactor is annoying or trivial depends on how deeply Sampling is threaded through your server's logic. If you used it as a one-off convenience for a single tool, the migration is an afternoon. If Sampling is how your server does multi-step reasoning across tool calls, start the migration plan now.
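One way to keep that refactor contained is to put a single completion function between your tools and the model, so swapping Sampling for a direct provider call touches one place. The sketch below shows the shape; the Completer signature, the summarize tool, and fake_complete are all hypothetical, and no real provider SDK is called.

```python
from typing import Callable

# A completion function the server owns. In production this would wrap your
# LLM provider's SDK; the signature here is an assumption of the sketch.
Completer = Callable[[str], str]


def make_summarize_tool(complete: Completer) -> Callable[[str], dict]:
    """Build a tool that gets completions from the injected function."""

    def summarize(text: str) -> dict:
        prompt = f"Summarize in one sentence:\n\n{text}"
        return {"summary": complete(prompt).strip()}

    return summarize


def fake_complete(prompt: str) -> str:
    # Stand-in for tests and local runs; it never calls a real model.
    return " stub summary "


summarize = make_summarize_tool(fake_complete)
output = summarize("A long document about protocol changes.")
```

If Sampling is currently called from many tools, introducing this seam first, with Sampling still behind it, lets the provider switch happen later as a one-line change.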

Tasks also move in this release. Tasks shipped as an experimental core feature in 2025-11-25. Production use surfaced enough need for redesign that its home is now an extension rather than the core specification. The Tasks extension reshapes the lifecycle around the stateless model: a server can answer tools/call with a task handle, and the client drives it with tasks/get, tasks/update, and tasks/cancel.

Anyone who shipped against the 2025-11-25 experimental Tasks API will need to migrate to the new lifecycle.
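The handle-driven lifecycle can be sketched as an in-memory toy. Only the four method names come from the spec description above; the payload shapes, the status strings, and the meaning given to tasks/update here (merging extra client input into a running task) are assumptions for illustration. TASKS stands in for shared storage that every replica can reach.

```python
import uuid

TASKS: dict[str, dict] = {}  # stand-in for storage shared by all replicas
TASK_METHODS = {"tasks/get", "tasks/update", "tasks/cancel"}


def dispatch(method: str, params: dict) -> dict:
    """Handle one request; nothing depends on which replica runs it."""
    if method == "tools/call":
        task_id = f"task_{uuid.uuid4().hex[:8]}"
        TASKS[task_id] = {
            "status": "working",
            "input": dict(params.get("arguments", {})),
            "result": None,
        }
        return {"task": {"id": task_id, "status": "working", "result": None}}

    if method not in TASK_METHODS:
        raise ValueError(f"unknown method: {method}")

    task = TASKS[params["id"]]
    if method == "tasks/update" and task["status"] == "working":
        # Hypothetical semantics: merge client-supplied input into the task.
        task["input"].update(params.get("input", {}))
    elif method == "tasks/cancel" and task["status"] == "working":
        task["status"] = "cancelled"
    # tasks/get falls through and just reports the current state.
    return {
        "task": {"id": params["id"], "status": task["status"], "result": task["result"]}
    }


def finish(task_id: str, result: dict) -> None:
    """Simulates a background worker completing the task."""
    task = TASKS[task_id]
    if task["status"] == "working":
        task["status"] = "completed"
        task["result"] = result
```

A client would call tools/call once, keep the returned id, and poll tasks/get until the status leaves "working". Because the id is the only link between calls, each poll can land on a different instance.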

The extensions framework is the less-discussed change that matters most long-term

Ecosystems become brittle when every new capability has to land in the core. Either the core grows too fast and becomes hard to stabilise, or innovation slows down. MCP's answer is to make extensions official, versioned, and independently governed.

Extensions are identified by reverse-DNS IDs, negotiated through an extensions map on client and server capabilities, live in their own ext-* repositories with delegated maintainers, and version independently of the specification.
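Negotiation through an extensions map reduces to a set intersection: an extension is active only when both sides list it. The sketch below assumes the map lives under an "extensions" key in each side's capabilities and that its values are opaque settings objects; the extension IDs shown are made up.

```python
import re

# Loose shape check for reverse-DNS style IDs such as "com.example.tasks".
REVERSE_DNS = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+$")


def negotiate_extensions(client_caps: dict, server_caps: dict) -> list[str]:
    """Return the extension IDs that both sides declare, sorted."""
    client_ext = client_caps.get("extensions", {})
    server_ext = server_caps.get("extensions", {})
    shared = set(client_ext) & set(server_ext)
    return sorted(ext_id for ext_id in shared if REVERSE_DNS.match(ext_id))


# Hypothetical capability objects; the IDs are placeholders.
client_capabilities = {
    "extensions": {"com.example.tasks": {}, "com.example.apps": {}},
}
server_capabilities = {
    "extensions": {"com.example.tasks": {}, "org.example.audit": {}},
}
active = negotiate_extensions(client_capabilities, server_capabilities)
```

Since capabilities travel with each request under the stateless model, this check can run per request rather than once per connection.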

This matters because the protocol now has a stable place to put things that are genuinely useful but not yet stable enough for core. Tasks and MCP Apps are the first two inhabitants of that space. Future capabilities - long-running workflows, richer auth flows, observability hooks - can ship on their own timelines without forcing a core spec revision.

One of the official extensions in this release is MCP Apps, which allows servers to ship server-rendered HTML interfaces that hosts render inside sandboxed iframes. MCP Apps was already available; what this release adds is a formal governance home for it.

W3C Trace Context propagation in _meta is now documented, locking down the traceparent, tracestate, and baggage key names so distributed traces correlate across SDKs and gateways. A trace that starts in a host application can follow a tool call through the client SDK, the MCP server, and whatever the server calls downstream, and show up as a single span tree in an OpenTelemetry-compatible backend.
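The propagation step can be sketched in a few lines. The traceparent layout used here (version, 32-hex trace ID, 16-hex parent ID, 2-hex flags) follows the W3C Trace Context format, and the _meta key names are the ones listed above; placing _meta under params and the helper names are assumptions of this sketch. In practice an OpenTelemetry SDK would generate and parse these values for you.

```python
import re
import secrets
from typing import Optional

TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Start a new trace: random trace ID and span ID, version 00."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def child_traceparent(parent: str) -> str:
    """Keep the trace ID and flags, mint a new span ID for the next hop."""
    match = TRACEPARENT_RE.match(parent)
    if match is None:
        raise ValueError(f"malformed traceparent: {parent!r}")
    trace_id, _parent_span_id, flags = match.groups()
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


def with_trace(
    params: dict,
    traceparent: str,
    tracestate: Optional[str] = None,
    baggage: Optional[str] = None,
) -> dict:
    """Return a copy of params with trace context added under _meta."""
    meta = dict(params.get("_meta", {}))
    meta["traceparent"] = traceparent
    if tracestate:
        meta["tracestate"] = tracestate
    if baggage:
        meta["baggage"] = baggage
    return {**params, "_meta": meta}


# Host starts a trace; the client SDK forwards a child context to the server.
root = new_traceparent()
outgoing = with_trace({"name": "search"}, child_traceparent(root), baggage="tenant=acme")
```

Every hop shares the trace ID and contributes its own span ID, which is what lets a backend assemble the calls into one tree.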

What your migration checklist actually looks like

The final specification will be published on July 28, 2026. The ten-week release candidate window before that date is for SDK maintainers and client implementers to validate the changes against real workloads; under the SDK tier system, Tier 1 SDKs are expected to ship support within this window.

For most server maintainers, the work breaks into a short list:

  • Upgrade the SDK - Tier 1 SDKs will ship updated versions during the RC window
  • Remove session assumptions - find every place your server stores state between calls without an explicit handle; give that state a handle the client can pass back
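  • Handle the new headers - clients send Mcp-Method and Mcp-Name on all Streamable HTTP requests; make sure your server and gateways accept them, and emit them from any client code you maintain
  • Add ttlMs and cacheScope to list responses - lets clients cache tool lists correctly and know whether they are safe to share across users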
  • Migrate Tasks - if you built against the 2025-11-25 experimental API, switch to the new tasks/get / tasks/update / tasks/cancel lifecycle
  • Harden auth - validate iss, handle dynamic client registration correctly, follow the updated OAuth 2.0 / OIDC guidance

Can your requests move across server instances without losing important context? If they can't, find the hidden session dependency now. Look at every place your server remembers something between tool calls. A repository path, browser session, task, or deployment environment should have an explicit handle or scope that the client can see, log, and pass back safely.

The spec change itself is not the risky part. The risky part is the hidden state you forgot you had. A teammate like Beagle, operating inside Slack as part of an agentic workflow, depends on exactly this kind of session-free, stateless call pattern - so when the plumbing underneath changes, it is worth auditing every assumption your MCP servers carry between turns.

A spec change is exactly the kind of thing that looks fine in unit tests and falls over in a live agent loop. Test against a real client, not just wire-format validation, before July 28.
