Setup, teardown, and per-test fixtures
Status: maturing; schema committed. The schema additions documented here land in schemas/v1.json today, so configs that opt in early are valid and will not need editing once the runtime catches up. For shell-style setup and teardown that runs now, use the beforeAll: / afterAll: hooks. The richer setup: / teardown: / setup_per_test: fixture surface on this page is parsed but not yet executed by the runner.
Many MCP servers wrap stateful systems (databases, queues, file systems, external APIs). Tests for those servers need to seed state before the suite runs and clean up after. The fixture surface declares that orchestration inside the YAML file so the test definition lives in one place.
At a glance
- setup: [SetupStep] runs once before any test in the file.
- setup_per_test: [SetupStep] runs before every test.
- teardown: [SetupStep] runs once after all tests finish.
Each SetupStep performs one of:
- run: [argv]: a shell command, or
- call: { tool, args? }: an MCP tool invocation against the file's default server.
Optional knobs per step: background, wait_for, timeout, capture_failure, always (teardown only).
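As a rough illustration of the two step forms, the fragment below puts a run: step and a call: step side by side, using only the fields listed above. The tool name seed_issues and its count argument are hypothetical and stand in for whatever your server exposes; the fragment also assumes a servers: block is defined elsewhere in the file.

```yaml
setup:
  # Shell form: an argv array that the runner spawns.
  - run: ["./scripts/seed-issues.sh", "5"]
    timeout: "30s"

  # Tool form: an MCP tool invocation against the file's default server.
  # `seed_issues` and `count` are hypothetical names for illustration.
  - call:
      tool: seed_issues
      args:
        count: 5
    capture_failure: "seed_issues tool failed; cannot run tests"
```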
Example 1: database seeding
The server wraps a Postgres-backed issue tracker. Tests need five issues pre-loaded; teardown wipes them.
# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json
servers:
  issues:
    url: "http://localhost:8080/mcp"
setup:
  - run: ["./scripts/seed-issues.sh", "5"]
    capture_failure: "Seed script failed; cannot run tests"
    timeout: "30s"
teardown:
  - run: ["./scripts/cleanup-issues.sh"]
    always: true # runs even if tests fail
tools:
  - name: "list returns seeded issues"
    server: issues
    tool: list_issues
    expect:
      - target: "result.content"
        matcher:
          schema:
            type: array
            minItems: 5
capture_failure: hides the underlying shell error behind an operator-friendly message in the reporter output. always: true on the teardown step guarantees cleanup even when tests fail.
Example 2: background API stub
The server depends on a remote API. Tests start a local stub server, point the MCP server at it, then run.
# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json
servers:
  proxy:
    command: ["node", "./mcp-proxy.js"]
    env:
      API_URL: "http://localhost:9999"
setup:
  - run: ["node", "./test/stub-server.js"]
    background: true
    wait_for: "tcp://localhost:9999"
    timeout: "30s"
teardown:
  - run: ["pkill", "-f", "stub-server.js"]
    always: true
tools:
  - name: "fetches stub response"
    server: proxy
    tool: fetch
background: true launches the process without waiting for exit. The runner tracks the PID and cleans it up on exit, but the explicit teardown gives the operator a chance to surface failures from the shutdown. wait_for: polls a probe (TCP URL or HTTP path) until it succeeds before the next step runs.
Example 3: fresh state per test
Some tests must run against a freshly-initialized server. Today that takes manual style: stepwise ceremony; setup_per_test: makes it declarative.
# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json
servers:
  workspace:
    command: ["./workspace-mcp"]
setup_per_test:
  - call:
      tool: reset_state
    capture_failure: "reset_state failed; aborting test"
tools:
  - name: "first write succeeds"
    server: workspace
    tool: write_file
    args:
      path: "/tmp/foo"
      contents: "hello"
  - name: "second write goes through"
    server: workspace
    tool: write_file
    args:
      path: "/tmp/foo"
      contents: "world"
Each test sees a freshly reset workspace because setup_per_test: runs before each tool test. This pairs with run_options.restart_policy for process-level isolation.
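The three blocks can also appear together in one file. The sketch below annotates the order in which the steps would run once the runtime executes them, based on the definitions in "At a glance". It assumes tests run in file order, which this page does not state, and the prepare-workspace.sh and cleanup-workspace.sh scripts are hypothetical.

```yaml
# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json
servers:
  workspace:
    command: ["./workspace-mcp"]

setup:                                    # 1. once, before any test
  - run: ["./scripts/prepare-workspace.sh"]   # hypothetical script
    timeout: "30s"

setup_per_test:                           # 2 and 4. before every test
  - call:
      tool: reset_state
    capture_failure: "reset_state failed; aborting test"

teardown:                                 # 6. once, after all tests finish
  - run: ["./scripts/cleanup-workspace.sh"]   # hypothetical script
    always: true

tools:
  - name: "first write succeeds"          # 3
    server: workspace
    tool: write_file
    args:
      path: "/tmp/foo"
      contents: "hello"
  - name: "second write goes through"     # 5
    server: workspace
    tool: write_file
    args:
      path: "/tmp/foo"
      contents: "world"
```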
Failure semantics
The committed failure semantics that the runtime will match:
- Setup fails: tests are skipped, exit code 2 (configuration error), teardown still runs.
- Test fails: tests fail normally, teardown still runs.
- Teardown fails: warning logged, exit code unchanged (we never mask a real test result behind a cleanup glitch).
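To make the three cases concrete, here is the setup and teardown from Example 1 again, with comments describing what the committed semantics say should happen at each point. This is a reading of the rules above, not observed runner output, since the runner does not execute these blocks yet.

```yaml
setup:
  # If this step fails: every test is skipped, the run exits with
  # code 2 (configuration error), and the teardown below still runs.
  - run: ["./scripts/seed-issues.sh", "5"]
    capture_failure: "Seed script failed; cannot run tests"
    timeout: "30s"

teardown:
  # If this step fails: a warning is logged and the exit code already
  # determined by setup and the tests is left unchanged.
  - run: ["./scripts/cleanup-issues.sh"]
    always: true
```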
Field reference
| Field | Type | Where | Meaning |
|---|---|---|---|
| run | array of strings | step | Argv to spawn. Mutually exclusive with call. |
| call.tool | string | step | MCP tool name. Mutually exclusive with run. |
| call.args | object | step | JSON-serializable arguments. |
| background | bool | step | Launch without waiting for exit; cleaned up on exit. |
| wait_for | string | step | Readiness probe (TCP URL, HTTP path). |
| timeout | duration string | step | Overall timeout for the step (30s). |
| capture_failure | string | step | Reporter-friendly error message on failure. |
| always | bool | step (teardown) | Run even when tests failed. |
The schema enforces exactly one of run: and call: per step. Setting both, or neither, raises a deserialization error.
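A short sketch of the exactly-one rule follows. The valid steps reuse names from the examples on this page. The invalid shapes are commented out so the fragment itself still loads; uncommenting either one should produce the deserialization error described above.

```yaml
setup:
  # Valid: exactly one of run: / call: per step.
  - run: ["./scripts/seed-issues.sh", "5"]
  - call:
      tool: reset_state

  # Invalid: both run: and call: on the same step.
  # - run: ["./scripts/seed-issues.sh", "5"]
  #   call:
  #     tool: reset_state

  # Invalid: neither run: nor call:, only optional knobs.
  # - timeout: "30s"
```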
Today: what the runner does
Hook-based setup and teardown (the beforeAll: / afterAll: / beforeEach: / afterEach: blocks) work today. The richer setup: / teardown: / setup_per_test: surface on this page parses correctly, but the runner currently treats those blocks as a no-op. Tracked for the runtime milestone.
The schema is committed, so YAML you author today against this page remains valid through the future runtime release.
Roadmap
Runtime work still pending:
- Subprocess management for run: steps with proper signal handling.
- Background-process tracking and shutdown.
- Readiness-probe polling for wait_for:.
- Cassettes that do not record setup or teardown shell commands (only protocol exchanges).
- Reporter integration showing setup and teardown as separate phases.
These are planned for a future release.
References
- docs/test-isolation.md, the run_options.restart_policy story that composes with setup_per_test:
- Docker compose pattern, for containerized fixtures