What is the S2SP Protocol?

The Server-to-Server Protocol (S2SP) is an open extension for the Model Context Protocol (MCP) that enables direct data exchange between MCP servers while keeping the agent in the loop for decision-making only. An S2SP server is an MCP server — the S2SPServer class embeds a FastMCP instance, so every S2SP server exposes standard MCP tools. The @server.s2sp_resource_tool() decorator adds a transparent layer that separates the control plane (abstract fields the agent reasons over) from the data plane (full data that flows server-to-server over HTTP), so agents never need to see bulk payloads.

The Problem

Today, when an AI agent needs to move data between two MCP servers, every byte must flow through the agent itself. Consider a simple scenario: "Get weather alerts from the weather server and chart them on the analytics server." Here is what happens without S2SP:

The agent calls the weather server's get_alerts tool and receives hundreds of full alert objects in its context.
The agent picks the alerts it cares about, then calls the analytics server's draw_chart tool, passing full alert data as a tool argument.
The LLM processes every token of that data twice — once on read, once on write — even though it only needed a few fields to make its decision.

This approach has three compounding costs:

Token waste: Large payloads consume thousands of input and output tokens. Full alert objects with descriptions, coordinates, and metadata burn tokens the agent never reasons over.
Latency: Data must serialize into the LLM's context window and back out again. Transfers that could complete in seconds over HTTP can take minutes when routed through the agent.
Context saturation: Bulk data crowds out the reasoning tokens the model needs for its actual job — deciding what to do, not shuttling bytes.

The Solution

S2SP solves this by introducing a clean separation between the control plane (abstract fields the agent sees) and the data plane (full data that flows server-to-server):

  Without S2SP                         With S2SP (abstract_domains)

  Agent                               Agent
    |                                   |
    |  1. get_alerts(area="CA")         |  1. get_alerts(area="CA",
    v                                   |       abstract_domains="event,severity")
  Weather Server                        v
    |                                 Weather Server
    |  2. full alert objects            |
    |     in context                    |  2. only event+severity+_row_id
    v                                   |     + resource_url
  Agent                                 v
    |                                 Agent (filters by event)
    |  3. draw_chart(full data)         |
    v                                   |  3. draw_chart(abstract_data,
  Stats Server                          |       resource_url)
                                        v
                                      Stats Server ---POST /s2sp/data---> Weather Server
                                        |          (fetches full data directly)
                                        v
                                      Chart generated. Agent never saw full data.

The agent stays in the loop for reasoning and orchestration — it sees only the abstract fields it needs to make decisions. When another server needs the full data, it fetches it directly from the resource server's data plane endpoint, bypassing the agent entirely.

Key Concepts

S2SPServer and @s2sp_resource_tool()

S2SPServer embeds a FastMCP instance, so every S2SP server is an MCP server. You write normal MCP tools that return list[dict], then decorate them with @server.s2sp_resource_tool(). Without any special arguments from the caller, the tool behaves exactly like a standard MCP tool. But when the agent passes abstract_domains (a comma-separated list of column names), S2SP activates: only those columns plus a _row_id integer index are returned to the agent (the abstract). The remaining columns become the body_domains and are cached on the server's data plane. A resource_url is returned so that another server can fetch the body data later. An optional mode parameter controls whether the body is returned inline ("sync") or only via a later data-plane fetch ("async", the default).
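
The abstract/body split described above can be sketched in plain Python. This is only an illustration of the idea, not the SDK's implementation: split_rows is a hypothetical helper, the sample alerts are made up, and storing body rows by list position is an assumption.

```python
from typing import Any


def split_rows(
    rows: list[dict[str, Any]], abstract_domains: str
) -> tuple[list[dict[str, Any]], list[dict[str, Any]]]:
    """Split full rows into (abstract, body) by column name.

    Hypothetical helper for illustration; not part of the mcp_s2sp API.
    """
    wanted = [name.strip() for name in abstract_domains.split(",") if name.strip()]
    abstract, body = [], []
    for row_id, row in enumerate(rows):
        # Abstract: the integer row index plus the requested columns.
        abstract.append({"_row_id": row_id, **{k: row[k] for k in wanted if k in row}})
        # Body: every remaining column; list position doubles as the row index.
        body.append({k: v for k, v in row.items() if k not in wanted})
    return abstract, body


# Made-up sample alerts, not real NWS data.
alerts = [
    {"event": "Wind Advisory", "severity": "Moderate", "description": "Gusts up to 50 mph."},
    {"event": "Flood Watch", "severity": "Severe", "description": "Heavy rain expected."},
]
abstract, body = split_rows(alerts, "event,severity")
print(abstract[0])  # {'_row_id': 0, 'event': 'Wind Advisory', 'severity': 'Moderate'}
```

In this sketch the agent would receive only abstract, while body stays behind on the server for a later data-plane fetch.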

Control Plane

The control plane is the agent's view of the data: standard MCP tool calls that return only the abstract fields specified by abstract_domains. The agent uses these lightweight results to reason, filter, and decide which records matter — without ever seeing full payloads. No bulk data crosses the control plane.
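
To get a rough feel for how much smaller the control-plane view can be, the sketch below compares the serialized size of full rows with their abstract. The records are made up for illustration, and character count is only a crude proxy for tokens; actual savings depend on the data and the tokenizer.

```python
import json

# Made-up sample records, not real NWS data.
full_rows = [
    {
        "event": "Wind Advisory",
        "severity": "Moderate",
        "description": "Northwest winds 25 to 35 mph with gusts up to 50 mph. " * 5,
        "instruction": "Secure loose outdoor objects and use caution when driving.",
    }
    for _ in range(100)
]

# The abstract keeps only the fields the agent reasons over, plus _row_id.
abstract_rows = [
    {"_row_id": i, "event": row["event"], "severity": row["severity"]}
    for i, row in enumerate(full_rows)
]

full_chars = len(json.dumps(full_rows))
abstract_chars = len(json.dumps(abstract_rows))
print(f"full: {full_chars} chars, abstract: {abstract_chars} chars")
```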

Data Plane

The data plane carries body domains — the full data that the LLM never needs to see. It operates in two modes:

Async mode: Body data is cached on the resource server. A downstream server (e.g., Stats Server) fetches it directly via POST <resource_url> with {"row_ids": [...], "columns": [...]}. The agent has no visibility into this channel — data flows server-to-server only.
Sync mode: Body data is returned inline alongside the abstract in a single tool response. The body passes through the agent's data channel (SDK layer) but is not sent to the LLM — only the abstract enters the LLM context. No server-to-server fetch is needed.
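
As a sketch of what the resource server might do with an async-mode request body of the form {"row_ids": [...], "columns": [...]}, the function below selects rows and columns from an in-memory cache. The function name, the cache layout, the response shape, and the handling of unknown row IDs or a missing column list are all assumptions made for illustration; this overview does not specify them.

```python
from typing import Any, Optional


def select_body(
    cache: list[dict[str, Any]],
    row_ids: list[int],
    columns: Optional[list[str]] = None,
) -> list[dict[str, Any]]:
    """Return the requested columns of the requested cached rows.

    Hypothetical handler logic; not the SDK's actual endpoint code.
    """
    selected = []
    for row_id in row_ids:
        if not 0 <= row_id < len(cache):
            continue  # Assumption: silently skip unknown row IDs.
        row = cache[row_id]
        # Assumption: no column list means "all cached columns".
        keep = list(row) if columns is None else columns
        selected.append({"_row_id": row_id, **{k: row[k] for k in keep if k in row}})
    return selected


# Made-up cached body rows; list position doubles as _row_id.
cache = [
    {"description": "Gusts up to 50 mph.", "area": "Coastal zone"},
    {"description": "Heavy rain expected.", "area": "Valley"},
    {"description": "Gusts up to 60 mph.", "area": "Mountain passes"},
]
request_body = {"row_ids": [0, 2], "columns": ["description"]}
rows = select_body(cache, request_body["row_ids"], request_body["columns"])
print(rows)
```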

The Complete Flow

A typical S2SP interaction works like this:

1. Agent calls get_alerts(area="CA", abstract_domains="event,severity,urgency,status") on Weather Server.
2. Weather Server returns only those fields + _row_id for each alert, plus a resource_url. Full data is cached on the server.
3. Agent examines the abstract results and filters: picks the wind alert _row_id values.
4. Agent tells Stats Server: draw_chart(resource_url="http://...", row_ids="0,1,5").
5. Stats Server POSTs to the presigned resource_url directly (data plane), passing the selected row IDs.
6. Stats Server receives full alert data, generates the chart. The agent never saw the full data.
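
The filtering step in this flow, where the agent picks the wind alerts and passes their row IDs on, might look like the following sketch. The abstract rows are made-up sample data, and the case-insensitive substring match on event is just one possible filter, not something S2SP prescribes.

```python
# Made-up abstract rows as the agent might receive them on the control plane.
abstract_rows = [
    {"_row_id": 0, "event": "Wind Advisory", "severity": "Moderate"},
    {"_row_id": 1, "event": "High Wind Warning", "severity": "Severe"},
    {"_row_id": 2, "event": "Flood Watch", "severity": "Severe"},
    {"_row_id": 3, "event": "Heat Advisory", "severity": "Moderate"},
    {"_row_id": 4, "event": "Dense Fog Advisory", "severity": "Minor"},
    {"_row_id": 5, "event": "Wind Advisory", "severity": "Moderate"},
]

# Keep only wind-related alerts, using nothing but abstract fields.
wind_ids = [row["_row_id"] for row in abstract_rows if "wind" in row["event"].lower()]

# Format the selection as the comma-separated string used in the flow above.
row_ids = ",".join(str(i) for i in wind_ids)
print(row_ids)  # 0,1,5
```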

The Analogy

Think of S2SP like a manager reading executive summaries. Without S2SP, the manager reads every page of every report before deciding what to forward. With S2SP, the manager sees only a summary (control plane) — event name, severity, status — and says "send the full wind reports to the analyst." The analyst then pulls the complete files directly from the filing cabinet (data plane). The manager still decides what matters and who gets it, but never handles the raw data.

Who Benefits?

Developers

Drop-in decorator — add @server.s2sp_resource_tool() to an existing MCP tool and you are done.
Standard HTTP data plane — POST <resource_url>, no proprietary transport.
Each server is a normal MCP server — mcp dev works for debugging independently.

Agent Builders

Dramatically reduce token costs — the agent only sees abstract fields, not full payloads.
Keep the agent's context window free for reasoning and decision-making.
Standard MCP tool interface — no new protocol to learn, just pass abstract_domains.

End Users

Faster results — transfers complete in seconds instead of minutes.
Lower costs — no more paying for tokens that just shuttle data.
Better accuracy — the agent's context is not polluted with raw data, so its reasoning is sharper.

Quick Start

Install the S2SP Python SDK:

$ pip install mcp-s2sp

Create an S2SP server with a tool — it is just a normal MCP tool with a decorator:

from mcp_s2sp import S2SPServer

server = S2SPServer("weather-server")

@server.s2sp_resource_tool()
async def get_alerts(area: str) -> list[dict]:
    # Just return the full data; S2SP handles the rest.
    # fetch_from_nws is a placeholder for your own NWS client (not defined here).
    data = await fetch_from_nws(area)
    return [f["properties"] for f in data["features"]]

server.run()

Without abstract_domains, the tool behaves like any standard MCP tool and returns all fields. But when the agent wants only a summary for reasoning:

# Agent calls with abstract_domains to get only the fields it needs
get_alerts(area="CA", abstract_domains="event,severity,headline")

# Returns: only event, severity, headline, and _row_id for each alert
#          + resource_url for later data-plane fetch
# Full data is cached on the Weather Server

The agent filters the abstract results, then tells another server to fetch the full data directly from the resource server:

# Agent filters the abstract rows (wind_rows), then passes them + the resource_url to Stats Server:
draw_chart(
    abstract_data=json.dumps(wind_rows),
    resource_url="http://weather:9001/s2sp/data/dK7x_...")

# Stats Server calls POST /s2sp/data/dK7x_... on Weather Server
# fetches full body data directly — agent never sees it
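
On the consuming side, the data-plane fetch is an ordinary HTTP POST. The sketch below builds such a request with Python's standard library without sending it. build_fetch_request is a hypothetical helper, the URL token is a placeholder, and the response format is not covered here, so treat this as an illustration rather than the SDK's own client code.

```python
import json
import urllib.request


def build_fetch_request(
    resource_url: str, row_ids: list[int], columns: list[str]
) -> urllib.request.Request:
    """Build (but do not send) a data-plane POST for the selected rows.

    Hypothetical helper for illustration; not part of the mcp_s2sp API.
    """
    payload = json.dumps({"row_ids": row_ids, "columns": columns}).encode("utf-8")
    return urllib.request.Request(
        resource_url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# EXAMPLE_TOKEN is a placeholder; use the resource_url returned by the tool call.
req = build_fetch_request(
    "http://weather:9001/s2sp/data/EXAMPLE_TOKEN", [0, 1, 5], ["description"]
)
print(req.get_method(), req.full_url)
# To send it, pass req to urllib.request.urlopen() and decode the response.
```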

For a complete walkthrough, see the Python SDK documentation. For protocol internals, see Protocol Design.