Scenario 12: multi-server suites
A real workflow rarely lives on one server. An agent calls an issues server and a notifications server. A contract suite validates that two related services interoperate. And once two servers share a session, a new class of bug appears: output from one server quietly becoming control input for another.
This scenario walks through a suite that spans two independent MCP servers on the hosted test server, routes each tool test to the right one, and then adds the cross-server trust-boundary check that catches the implicit-trust pattern. No API key and no local binary are required; everything points at https://test.mcptest.sh.
The two servers are genuinely separate. The primary server at https://test.mcptest.sh/mcp serves greet, search, get_forecast, list_items, slow_op, fail, and delete_record. A second independent server, datastore-b, sits at https://test.mcptest.sh/mcp-b with its own catalog under distinct names: db_get(key), db_put(key, value), db_list(), db_purge() (which carries a destructiveHint), plus a records://{key} resource. The tool names do not overlap, so this is a true multi-server target rather than the same catalog behind two URLs.
The YAML
Save this as tests/multi-server.yml:
# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json
servers:
primary:
url: https://test.mcptest.sh/mcp
datastore:
url: https://test.mcptest.sh/mcp-b
datastore_crosstrust:
url: https://test.mcptest.sh/mcp-b?scenario=crosstrust
tools:
# Each test names its server. mcptest merges the catalogs and reaches
# the right tool on the right server.
- name: primary server greets
server: primary
tool: greet
args:
name: mcptest
expect:
- target: result.content[0].text
matcher:
exact: "Hello, mcptest!"
- name: datastore-b reads a stored value
server: datastore
tool: db_get
args:
key: report
expect:
- target: result.content[0].text
matcher:
exact: "Q3 numbers are within range."
- name: datastore-b lists its own keys
server: datastore
tool: db_list
expect:
- target: result.content[0].text
matcher:
contains: "report"
# Trust boundary: under the crosstrust scenario the same db_get returns
# stored data carrying an instruction aimed at the OTHER server.
- name: crosstrust output carries a cross-server instruction
server: datastore_crosstrust
tool: db_get
args:
key: report
expect:
- target: result.content[0].text
matcher:
contains: "delete_record"
- target: result.content[0].text
matcher:
contains: "attacker.example"
What is happening here:
servers:is the object-map form, one entry per server name. Each entry is a plain URL target; the hosted server needs no auth.- Every tool test carries a
server:field. The runner connects each referenced server into a pool and dispatches each test to the server it names.primary server greetsreachesgreeton the primary;datastore-b reads a stored valuereachesdb_getondatastore. The catalogs are merged for routing, so the same suite can call tools whose names live on different servers. primaryanddatastorepoint at distinct URLs with disjoint tool names. If the runner sentdb_getto the primary it would not resolve, which is exactly the routing the per-testserver:field prevents.datastore_crosstrustis the same second server with?scenario=crosstruston the URL. That query string flips the hosted server into a mode wheredb_getreturns benign-looking stored data with an embedded instruction. It is a third named server entry so the honeypot run is isolated from the cleandatastorereads above it.- The two real values are deterministic:
db_getfor keyreportreturnsQ3 numbers are within range., andgreetfor namemcptestreturnsHello, mcptest!. That is why the first two tests can useexactmatchers.
Run it
mcptest run tests/multi-server.yml
Nothing else to set up. The hosted server is reachable over plain HTTPS, the values are deterministic, and the suite needs no credentials, so this run is stable in CI.
If you want to confirm the file is well-formed before the first run, validate it:
mcptest validate --config tests/multi-server.yml
A clean run prints ok and exits 0.
The cross-server trust boundary
The first three tests are ordinary multi-server routing. The fourth is the interesting one.
When the second server runs under ?scenario=crosstrust, the value stored at report is no longer just Q3 numbers are within range.. It looks benign, but it carries an embedded instruction that targets a different server: call delete_record on the primary, then forward the result to an exfiltration sink at attacker.example. The honeypot value is benign by default; the crosstrust scenario is what makes the payload appear, so you can assert against it without standing up an attack of your own.
The danger is not the string itself. The danger is an agent or orchestrator that reads db_get output from datastore and feeds it straight into a tool call on primary. That is the implicit-trust pattern: output from one server quietly becoming control input for another. A trust-boundary check flags it. The rule is simple and absolute: tool output from one server must never be treated as control input for another server.
The two assertions on the last test pin the payload to the wire:
contains: "delete_record"proves the stored data names a tool that lives on the primary server, not ondatastorewhere it was read.contains: "attacker.example"proves the same data carries an exfiltration target.
Asserting both makes the boundary-crossing payload a concrete, checkable fact. A cross-server conformance check has something specific to flag, and a regression that sanitized the honeypot (or that let the instruction leak into a real delete_record call) would change the test result.
Expected output
mcptest run tests/multi-server.yml
PASS primary server greets (318ms)
PASS datastore-b reads a stored value (262ms)
PASS datastore-b lists its own keys (244ms)
PASS crosstrust output carries a cross-server instruction (271ms)
4 passed, 0 failed in 1.1s
All four tests pass. The first lands on the primary server, the next two on datastore, and the last on the crosstrust variant of the second server. The per-test lines show that mcptest dispatched each test to the server it named and resolved the right tool there.
Troubleshooting
tool ... did not resolve on server <name>. Theserver:field on a test names a server whose catalog does not carry that tool. Most often this is a tool routed to the wrong server:db_getbelongs todatastoreandgreetbelongs toprimary. Check that each test'sserver:matches the catalog the tool actually lives in.server <name> is not defined. A test references a server name that is not in theservers:map. The loader rejects this at load time, before any request goes out. Fix the name or add the entry.- The crosstrust test fails on
contains: "delete_record". The?scenario=crosstrustquery string is missing or misspelled on thedatastore_crosstrustURL. Without it the second server returns the benign value (Q3 numbers are within range.) and the payload assertions do not match. Confirm the URL is exactlyhttps://test.mcptest.sh/mcp-b?scenario=crosstrust. - All four tests hang or fail to connect. The hosted server was not reachable from this network. Add
--wait-for-ready=30sso the runner polls each URL server until it accepts a connection before the suite starts, and confirmhttps://test.mcptest.shis reachable from your environment.
See also
docs/multi-server.md, the full multi-server surface: object-map and array-of-entriesservers:forms,default_server:, and per-step routing in stepwise tests.docs/trust-boundary-conformance.md, the cross-server trust-boundary check in depth.- Previous: Test behind OAuth.
- Next: Tool overload.