What is S2SP Protocol?
The Server-to-Server Protocol (S2SP) is an open extension for the
Model Context Protocol (MCP) that enables
direct data exchange between MCP servers while keeping the agent in the loop for
decision-making only. An S2SP server is an MCP server — the S2SPServer
class embeds a FastMCP instance, so every S2SP server exposes standard MCP tools. The
@server.s2sp_resource_tool() decorator adds a transparent layer that separates the
control plane (abstract fields the agent reasons over) from the data plane
(full data that flows server-to-server over HTTP), so agents never need to see bulk payloads.
The Problem
Today, when an AI agent needs to move data between two MCP servers, every byte must flow through the agent itself. Consider a simple scenario: "Get weather alerts from the weather server and chart them on the analytics server." Here is what happens without S2SP:
- The agent calls the weather server's get_alerts tool and receives hundreds of full alert objects in its context.
- The agent picks the alerts it cares about, then calls the analytics server's draw_chart tool, passing full alert data as a tool argument.
- The LLM processes every token of that data twice, once on read and once on write, even though it only needed a few fields to make its decision.
This approach has three compounding costs:
- Token waste: Large payloads consume thousands of input and output tokens. Full alert objects with descriptions, coordinates, and metadata burn tokens the agent never reasons over.
- Latency: Data must serialize into the LLM's context window and back out again. Transfers that could complete in seconds over HTTP take minutes through the agent.
- Context saturation: Bulk data crowds out the reasoning tokens the model needs for its actual job — deciding what to do, not shuttling bytes.
The Solution
S2SP solves this by introducing a clean separation between the control plane (abstract fields the agent sees) and the data plane (full data that flows server-to-server):
```
Without S2SP

  Agent
    |  1. get_alerts(area="CA")
    v
  Weather Server
    |  2. full alert objects in context
    v
  Agent
    |  3. draw_chart(full data)
    v
  Stats Server

With S2SP (abstract_domains)

  Agent
    |  1. get_alerts(area="CA", abstract_domains="event,severity")
    v
  Weather Server
    |  2. only event+severity+_row_id + resource_url
    v
  Agent (filters by event)
    |  3. draw_chart(abstract_data, resource_url)
    v
  Stats Server ---POST /s2sp/data---> Weather Server
    |  (fetches full data directly)
    v
  Chart generated. Agent never saw full data.
```
The agent stays in the loop for reasoning and orchestration — it sees only the abstract fields it needs to make decisions. When another server needs the full data, it fetches it directly from the resource server's data plane endpoint, bypassing the agent entirely.
Key Concepts
S2SPServer and @s2sp_resource_tool()
S2SPServer embeds a FastMCP instance — every S2SP server is an MCP
server. You write normal MCP tools that return list[dict], then decorate them with
@server.s2sp_resource_tool(). Without any special arguments from the caller, the tool
behaves exactly like a standard MCP tool. But when the agent passes
abstract_domains (a comma-separated list of column names), S2SP activates: only
those columns plus a _row_id integer index are returned to the agent (the abstract).
The remaining columns become the body_domains and are cached on the server's
data plane. A resource_url is returned so another
server can fetch the body data later. An optional mode parameter controls
whether the body is returned inline ("sync") or only via later data-plane fetch
("async", the default).
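The split between abstract and body domains can be pictured with a small sketch. This is not the SDK's implementation: `split_domains` is a hypothetical helper, the alert rows are made-up sample data, and the code assumes only what the text states, namely that the requested columns plus an integer `_row_id` go to the agent while the remaining columns stay behind as the body.

```python
def split_domains(rows, abstract_domains):
    """Split rows into agent-facing abstract rows and cached body rows.

    Hypothetical helper for illustration only; not part of the mcp_s2sp API.
    """
    # abstract_domains arrives as a comma-separated string of column names.
    wanted = [name.strip() for name in abstract_domains.split(",") if name.strip()]

    abstract, body = [], {}
    for row_id, row in enumerate(rows):
        # Abstract: the requested columns that exist in the row, plus _row_id.
        summary = {name: row[name] for name in wanted if name in row}
        summary["_row_id"] = row_id
        abstract.append(summary)
        # Body: every remaining column, keyed by _row_id for later lookup.
        body[row_id] = {k: v for k, v in row.items() if k not in wanted}
    return abstract, body


# Made-up sample rows, not real alert data.
alerts = [
    {"event": "High Wind Warning", "severity": "Severe", "description": "Gusts to 60 mph"},
    {"event": "Flood Watch", "severity": "Moderate", "description": "Heavy rain expected"},
]
abstract, body = split_domains(alerts, "event,severity")
print(abstract)  # what the agent would reason over
print(body)      # what would stay cached for the data plane
```

Keying the body by `_row_id` is one simple way to let a later request name exactly the rows the agent selected.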
Control Plane
The control plane is the agent's view of the data: standard MCP tool calls that return
only the abstract fields specified by abstract_domains. The agent uses these
lightweight results to reason, filter, and decide which records matter — without ever
seeing full payloads. No bulk data crosses the control plane.
Data Plane
The data plane carries body domains — the full data that the LLM never needs to see. It operates in two modes:
- Async mode: Body data is cached on the resource server. A downstream server (e.g., Stats Server) fetches it directly via POST <resource_url> with {"row_ids": [...], "columns": [...]}. The agent has no visibility into this channel; data flows server-to-server only.
- Sync mode: Body data is returned inline alongside the abstract in a single tool response. The body passes through the agent's data channel (SDK layer) but is not sent to the LLM; only the abstract enters the LLM context. No server-to-server fetch is needed.
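To make the async-mode request concrete, here is a minimal sketch of how a resource server might answer a `{"row_ids": [...], "columns": [...]}` body from its cache. The function name `handle_data_request`, the response shape, and the choice to skip unknown row IDs are all assumptions made for illustration; the actual endpoint behavior is defined by the SDK and the protocol design, not by this snippet.

```python
import json


def handle_data_request(cache, request_body):
    """Answer a data-plane request from cached rows (illustrative only)."""
    request = json.loads(request_body)
    row_ids = request.get("row_ids", [])
    columns = request.get("columns")  # assumed: missing or empty means all columns

    result = []
    for row_id in row_ids:
        row = cache.get(row_id)
        if row is None:
            continue  # assumed behavior: skip unknown IDs rather than fail
        if columns:
            row = {name: row[name] for name in columns if name in row}
        result.append({"_row_id": row_id, **row})
    return json.dumps(result)


# Made-up cache contents keyed by _row_id.
cache = {
    0: {"description": "Gusts to 60 mph", "areaDesc": "Coastal zone"},
    5: {"description": "Gusts to 45 mph", "areaDesc": "Inland valleys"},
}
reply = handle_data_request(
    cache, json.dumps({"row_ids": [0, 5, 9], "columns": ["description"]})
)
rows = json.loads(reply)
print(rows)
```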
The Complete Flow
A typical S2SP interaction works like this:
- Agent calls get_alerts(area="CA", abstract_domains="event,severity,urgency,status") on Weather Server.
- Weather Server returns only those fields + _row_id for each alert, plus a resource_url. Full data is cached on the server.
- Agent examines the abstract results and filters: picks the wind alert _row_id values.
- Agent tells Stats Server: draw_chart(resource_url="http://...", row_ids="0,1,5").
- Stats Server POSTs to the presigned resource_url directly (data plane), passing the selected row IDs.
- Stats Server receives full alert data, generates the chart. The agent never saw the full data.
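The data-plane step in this flow could look roughly like the sketch below, written with only the Python standard library. `build_fetch_request` is a hypothetical helper, the URL is a placeholder, and treating `columns` as optional is an assumption; the sketch only builds the request so it runs without a live Weather Server.

```python
import json
import urllib.request


def build_fetch_request(resource_url, row_ids, columns=None):
    """Build the POST a downstream server might send (illustrative only)."""
    # row_ids arrives from the agent as a comma-separated string such as "0,1,5".
    ids = [int(part) for part in row_ids.split(",") if part.strip()]
    payload = {"row_ids": ids}
    if columns:
        payload["columns"] = columns  # assumed optional column filter
    return urllib.request.Request(
        resource_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; a real resource_url comes from the resource server's response.
request = build_fetch_request("http://weather.example/s2sp/data/TOKEN", "0,1,5")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# Sending it would be urllib.request.urlopen(request); omitted so the sketch runs offline.
```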
The Analogy
Think of S2SP like a manager reading executive summaries. Without S2SP, the manager reads every page of every report before deciding what to forward. With S2SP, the manager sees only a summary (control plane) — event name, severity, status — and says "send the full wind reports to the analyst." The analyst then pulls the complete files directly from the filing cabinet (data plane). The manager still decides what matters and who gets it, but never handles the raw data.
Who Benefits?
Developers
- Drop-in decorator: add @server.s2sp_resource_tool() to an existing MCP tool and you are done.
- Standard HTTP data plane: POST <resource_url>, no proprietary transport.
- Each server is a normal MCP server: mcp dev works for debugging independently.
Agent Builders
- Dramatically reduce token costs — the agent only sees abstract fields, not full payloads.
- Keep the agent's context window free for reasoning and decision-making.
- Standard MCP tool interface — no new protocol to learn, just pass
abstract_domains.
End Users
- Faster results — transfers complete in seconds instead of minutes.
- Lower costs — no more paying for tokens that just shuttle data.
- Better accuracy — the agent's context is not polluted with raw data, so its reasoning is sharper.
Quick Start
Install the S2SP Python SDK:
pip install mcp-s2sp
Create an S2SP server with a tool — it is just a normal MCP tool with a decorator:
```python
from mcp_s2sp import S2SPServer

server = S2SPServer("weather-server")


@server.s2sp_resource_tool()
async def get_alerts(area: str) -> list[dict]:
    # Just return full data; S2SP handles the rest.
    # fetch_from_nws is a placeholder for your own fetch helper (not shown).
    data = await fetch_from_nws(area)
    return [f["properties"] for f in data["features"]]


server.run()
```
Without abstract_domains, the tool behaves like any standard MCP tool and
returns all fields. But when the agent wants only a summary for reasoning:
```python
# Agent calls with abstract_domains to get only the fields it needs
get_alerts(area="CA", abstract_domains="event,severity,headline")

# Returns: only event, severity, headline, and _row_id for each alert
#          + resource_url for later data-plane fetch
# Full data is cached on the Weather Server
```
The agent filters the abstract results, then tells another server to fetch the full data directly from the resource server:
```python
# Agent filters abstract rows, passes them + resource ref to Stats Server:
draw_chart(
    abstract_data=json.dumps(wind_rows),
    resource_url="http://weather:9001/s2sp/data/dK7x_...")

# Stats Server calls POST /s2sp/data/dK7x_... on Weather Server
# fetches full body data directly; agent never sees it
```
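On the receiving side, a server that holds both the agent-supplied abstract rows and the fetched body rows may want to recombine them before charting. The sketch below shows one way to do that by joining on `_row_id`. `merge_rows` is a hypothetical helper, the rows are made-up sample data, and the assumption that fetched body rows carry `_row_id` is not confirmed by the text above.

```python
def merge_rows(abstract_rows, body_rows):
    """Join abstract and body rows on _row_id (illustrative helper)."""
    # Assumption: each fetched body row includes its _row_id.
    body_by_id = {row["_row_id"]: row for row in body_rows}
    merged = []
    for row in abstract_rows:
        # Rows with no fetched body keep only their abstract fields.
        merged.append({**row, **body_by_id.get(row["_row_id"], {})})
    return merged


# Made-up sample rows for illustration.
wind_rows = [
    {"_row_id": 0, "event": "High Wind Warning", "severity": "Severe"},
    {"_row_id": 5, "event": "Wind Advisory", "severity": "Minor"},
]
fetched = [
    {"_row_id": 0, "description": "Gusts to 60 mph"},
    {"_row_id": 5, "description": "Gusts to 45 mph"},
]
full_rows = merge_rows(wind_rows, fetched)
print(full_rows)
```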
For a complete walkthrough, see the Python SDK documentation. For protocol internals, see Protocol Design.