Most protocol changes are invisible until something breaks. This one is different. The MCP release candidate locked on May 21, 2026, with a final specification due July 28. The headline change is deceptively technical: MCP is now stateless at the protocol layer. No more handshake. No more session ID. Every request stands on its own.
For teams actually running agents in production, that sentence ends about six months of infrastructure pain.
The problem nobody wanted to admit
The previous version of Streamable HTTP required the client to send an initialize request, the server to allocate a session, and every subsequent request to reach that same server instance. That clashes with load balancers, which spread traffic across instances, and with autoscaling, which spins up new instances that hold no existing sessions.
The workarounds were ugly. Teams patched around it with sticky sessions, external session stores like Redis, or single-instance deployments. Each one either adds operational overhead or gives up the scaling the load balancer was there to provide.
Evolving Streamable HTTP to run statelessly across multiple server instances was the single most requested change from teams running MCP in production. The maintainers knew it. They said so in their March roadmap. The release candidate delivers it.
What actually changed
The Mcp-Session-Id header and the protocol-level session that came with it are removed. With both gone, any MCP request can land on any server instance, and the sticky routing and shared session stores that horizontal deployments needed before are no longer required at the protocol layer.
That is not the same as saying your agent can no longer hold state. Removing the protocol-level session does not mean your application has to be stateless. Servers that need to carry state across calls can mint an explicit handle from a tool and have the model pass it back as an ordinary argument on later calls.
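A minimal sketch of that handle pattern, under stated assumptions: the tool names open_workspace and read_file, the shared secret, and the signed-handle encoding are all hypothetical and do not come from the MCP specification. The point is only that the state travels in an ordinary argument, so any instance holding the secret can serve the follow-up call.

```python
import base64
import hashlib
import hmac
import json

# Assumption: every server instance is configured with the same secret.
SECRET = b"demo-secret-shared-by-all-instances"


def mint_handle(state: dict) -> str:
    """Pack application state into a signed, opaque handle string."""
    payload = base64.urlsafe_b64encode(json.dumps(state, sort_keys=True).encode())
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return f"{payload.decode()}.{sig}"


def read_handle(handle: str) -> dict:
    """Verify the signature and recover the state on any instance."""
    payload, _, sig = handle.rpartition(".")
    expected = hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("handle was tampered with or signed by another key")
    return json.loads(base64.urlsafe_b64decode(payload.encode()))


# Hypothetical tool 1: returns a handle as part of its result.
def open_workspace(repo_path: str) -> dict:
    return {"workspace": mint_handle({"repo_path": repo_path})}


# Hypothetical tool 2: receives the handle back as an ordinary argument.
def read_file(workspace: str, name: str) -> str:
    state = read_handle(workspace)
    return f"{state['repo_path']}/{name}"


result = open_workspace("/srv/repos/demo")
path = read_file(result["workspace"], "README.md")
```

A signed handle suits small state. For larger state, the handle could instead be a key into a store the server owns; either way the dependency is explicit and visible in logs rather than hidden in a session.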
The Streamable HTTP transport now requires Mcp-Method and Mcp-Name headers so load balancers, gateways, and rate-limiters can route on the operation without inspecting the body — and servers reject requests where the headers and body disagree. That detail matters: it means your gateway can finally do its job without needing to understand MCP-specific JSON-RPC bodies.
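The consistency check can be sketched roughly as follows. The mapping used here is an assumption: Mcp-Method is taken to mirror the JSON-RPC "method" field and Mcp-Name to mirror params["name"]. The function name, tool name, and request shape are illustrative; the final specification defines the exact rules.

```python
import json


def headers_match_body(headers: dict, raw_body: bytes) -> bool:
    """Return True only if the routing headers agree with the JSON-RPC body.

    Assumption: Mcp-Method mirrors the body's "method" and Mcp-Name mirrors
    params["name"]; the real mapping is defined by the specification.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {k.lower(): v for k, v in headers.items()}
    body = json.loads(raw_body)

    if normalized.get("mcp-method") != body.get("method"):
        return False

    # Only compare the name when the body actually carries one.
    body_name = (body.get("params") or {}).get("name")
    if body_name is not None and normalized.get("mcp-name") != body_name:
        return False
    return True


request_body = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "search_tickets", "arguments": {"query": "login"}},
}).encode()

ok = headers_match_body(
    {"Mcp-Method": "tools/call", "Mcp-Name": "search_tickets"}, request_body
)
spoofed = headers_match_body(
    {"Mcp-Method": "tools/list", "Mcp-Name": "search_tickets"}, request_body
)
```

Rejecting the mismatch on the server is what lets a gateway trust the headers: a client cannot claim a cheap operation in the header while sending an expensive one in the body.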
List and resource read results now carry ttlMs and cacheScope, modeled on HTTP Cache-Control. Clients know exactly how long a tools/list response is fresh and whether it is safe to share across users — a long-lived SSE stream is no longer the only way to learn that a list changed.
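A rough client-side sketch of how those two fields might be honored. The ListCache class is hypothetical, and the cacheScope values "shared" and "private" are assumptions made for illustration; the specification defines the real vocabulary. The clock is injectable so the expiry logic can be exercised without sleeping.

```python
import time
from typing import Callable, Optional


class ListCache:
    """Caches list results, honoring ttlMs and cacheScope on each result."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._entries: dict = {}

    def _key(self, method: str, user: str, scope: str) -> tuple:
        # Assumption: a "shared" result may be reused across users; any other
        # scope value is treated as per-user.
        return (method, None if scope == "shared" else user)

    def put(self, method: str, user: str, result: dict) -> None:
        ttl_ms = result.get("ttlMs", 0)
        scope = result.get("cacheScope", "private")
        expires = self._clock() + ttl_ms / 1000.0
        self._entries[self._key(method, user, scope)] = (expires, result)

    def get(self, method: str, user: str) -> Optional[dict]:
        # Check the shared entry first, then this user's private entry.
        for key in ((method, None), (method, user)):
            entry = self._entries.get(key)
            if entry and self._clock() < entry[0]:
                return entry[1]
        return None


now = [0.0]  # fake clock, in seconds
cache = ListCache(clock=lambda: now[0])
cache.put("tools/list", "alice", {"tools": [], "ttlMs": 5000, "cacheScope": "shared"})
hit_for_bob = cache.get("tools/list", "bob")     # served from the shared entry
now[0] = 6.0                                     # advance past the 5 s TTL
miss_after_ttl = cache.get("tools/list", "bob")  # entry has expired
```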
The invisible cost of the old model was not the session itself. It was the infrastructure contortion required to keep that session alive.
The other shoe dropped the same week
Two days before the RC locked, Anthropic shipped a related move for teams not yet building their own MCP infrastructure: it expanded Claude Managed Agents with self-hosted sandboxes and MCP tunnels, letting companies move their AI agents' tool execution into their own infrastructure.
The tunnels piece is worth understanding precisely. With MCP tunnels, agents reach MCP servers inside a private network without exposing them to the public internet. Internal databases, private APIs, knowledge bases, and ticketing systems become tools the agent can call. A lightweight gateway makes a single outbound connection — no inbound firewall rules, no public endpoints, traffic encrypted end to end.
Agent orchestration (context management, error handling, and the actual agent loop) stays on Anthropic's infrastructure. A fully on-premises deployment of the agents is not yet possible. That limitation will matter to some regulated organizations. But for the majority of teams whose blocker has been "we can't expose our internal Jira or Postgres to the open internet just so an agent can call it," tunnels dissolve the constraint.
Self-hosted sandboxes are available as a public beta. MCP tunnels are only a research preview, and companies need to request access.
What this means for teams wiring agents into their work
These two announcements are solving the same underlying problem from two different angles. The RC fixes the protocol layer so MCP servers can run at scale on ordinary infrastructure without heroic workarounds. The tunnels feature handles the access problem — your agent can reach the thing it needs without your security team having to open inbound ports to the world.
Enterprises deploying MCP have been running into a predictable set of problems: audit trails, SSO-integrated auth, gateway behavior, and configuration portability.
The RC does not solve all of that, but it does make progress on one piece.
W3C Trace Context propagation in _meta is now documented, locking down the traceparent, tracestate, and baggage key names so distributed traces correlate across SDKs and gateways.
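As an illustration, attaching trace context to a request might look like the sketch below. The traceparent layout (version, 32-hex trace ID, 16-hex parent ID, 2-hex flags) follows the W3C Trace Context format. The helper names and the tool name are hypothetical, and placing _meta inside the request params is an assumption about where the metadata lives.

```python
import secrets


def new_traceparent(sampled: bool = True) -> str:
    """Build a W3C traceparent value: version-traceid-parentid-flags."""
    trace_id = secrets.token_hex(16)   # 16 random bytes -> 32 hex characters
    parent_id = secrets.token_hex(8)   # 8 random bytes -> 16 hex characters
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{parent_id}-{flags}"


def with_trace_meta(params: dict, traceparent: str, tracestate: str = "") -> dict:
    """Return a copy of params with trace context added under _meta."""
    meta = dict(params.get("_meta", {}))
    meta["traceparent"] = traceparent
    if tracestate:
        meta["tracestate"] = tracestate
    return {**params, "_meta": meta}


params = with_trace_meta(
    {"name": "search_tickets", "arguments": {}},  # hypothetical tool call
    new_traceparent(),
)
```

In a real deployment the traceparent would usually come from the tracing library's current span rather than being generated by hand; the sketch only shows the shape of the keys.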
That is real progress on observability. Auth and config portability are still on the roadmap rather than in the spec.
Teams building with an AI teammate in channels like Slack or Teams will feel this indirectly. A teammate like Beagle relies on MCP-connected systems — calendars, tickets, knowledge bases — being reliably reachable. When those servers require sticky sessions and single-instance deployments, they become fragile dependencies. The RC removes that fragility from the protocol itself.
The practical question right now
If your agentic system depends on MCP servers, this is not a spec-reading exercise. Test the parts that may fail first. Can your requests move across server instances without losing important context? If they cannot, find the hidden session dependency now.
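One way to run that check is to replay a call sequence round-robin across instances and see whether it still succeeds. The sketch below simulates this in-process. StickyServer and its tools are hypothetical stand-ins for a server that hides state in memory; a real test would point at separate deployed instances instead.

```python
from itertools import cycle


class StickyServer:
    """Hypothetical server that keeps per-conversation state in process memory."""

    def __init__(self):
        self._repo = None

    def call(self, tool: str, args: dict) -> dict:
        if tool == "set_repo":
            self._repo = args["path"]
            return {"ok": True}
        if tool == "list_files":
            if self._repo is None:
                raise RuntimeError("no repo selected on this instance")
            return {"dir": self._repo}
        raise ValueError(f"unknown tool {tool}")


def survives_round_robin(instances: list, calls: list) -> bool:
    """Send each call to the next instance in turn; report whether all succeed."""
    targets = cycle(instances)
    try:
        for tool, args in calls:
            next(targets).call(tool, args)
    except RuntimeError:
        return False
    return True


calls = [("set_repo", {"path": "/srv/repos/demo"}), ("list_files", {})]
one_instance = survives_round_robin([StickyServer()], calls)
two_instances = survives_round_robin([StickyServer(), StickyServer()], calls)
```

The sequence passes on one instance and fails on two, which is the signature of a hidden session dependency.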
Migrate at the SDK's pace; nothing in this release forces a hot cut-over.
Roots, Sampling, and Logging are deprecated rather than removed and continue to work for at least 12 months. Existing tools, resources, and prompts keep working.
The ten-week validation window before July 28 is genuinely useful. Look at every place your server remembers something between tool calls. A repository path, browser session, task, or deployment environment should have an explicit handle or scope that the client can see, log, and pass back safely.
The MCP project has been making practical bets all year: keep the transport small, push enterprise concerns to extensions, let production failures teach the spec what to fix. The stateless RC is the clearest evidence yet that the feedback loop is working.