
Setup, teardown, and per-test fixtures

Status: maturing; schema committed. The schema additions documented here are already in schemas/v1.json, so configs that opt in early are valid and will not need editing once the runtime catches up. For shell-style setup and teardown that runs now, use the beforeAll: / afterAll: hooks. The richer setup: / teardown: / setup_per_test: fixture surface on this page is parsed today but not yet executed by the runner.

Many MCP servers wrap stateful systems (databases, queues, file systems, external APIs). Tests for those servers need to seed state before the suite runs and clean up after. The fixture surface declares that orchestration inside the YAML file so the test definition lives in one place.

At a glance

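Each SetupStep performs exactly one of two actions: run: (spawn a process from an argv array) or call: (invoke an MCP tool by name, optionally with args).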

Optional knobs per step: background, wait_for, timeout, capture_failure, always (teardown only).
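
As a rough sketch of how these pieces fit together, the fragment below shows one step of each kind with the optional knobs attached. Only the field names come from the field reference on this page; the script paths, port, tool name, and argument are hypothetical placeholders.

```yaml
setup:
  # run form: spawn a process from an argv array
  - run: ["./scripts/start-fixture.sh"]   # hypothetical script
    background: true                      # launch without waiting for exit
    wait_for: "tcp://localhost:9999"      # readiness probe; port is a placeholder
    timeout: "30s"
    capture_failure: "Fixture failed to start"

  # call form: invoke an MCP tool instead of spawning a process
  - call:
      tool: seed_data                     # hypothetical tool name
      args:
        count: 5                          # hypothetical argument
    capture_failure: "seed_data failed"

teardown:
  - run: ["./scripts/stop-fixture.sh"]    # hypothetical script
    always: true                          # valid on teardown steps only
```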

Example 1: database seeding

The server wraps a Postgres-backed issue tracker. Tests need five issues pre-loaded; teardown wipes them.

# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json

servers:
  issues:
    url: "http://localhost:8080/mcp"

setup:
  - run: ["./scripts/seed-issues.sh", "5"]
    capture_failure: "Seed script failed; cannot run tests"
    timeout: "30s"

teardown:
  - run: ["./scripts/cleanup-issues.sh"]
    always: true  # runs even if tests fail

tools:
  - name: "list returns seeded issues"
    server: issues
    tool: list_issues
    expect:
      - target: "result.content"
        matcher:
          schema:
            type: array
            minItems: 5

capture_failure: hides the underlying shell error behind an operator-friendly message in the reporter output. always: true on the teardown step guarantees cleanup even when tests fail.

Example 2: background API stub

The server depends on a remote API. Tests start a local stub server, point the MCP server at it, then run.

# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json

servers:
  proxy:
    command: ["node", "./mcp-proxy.js"]
    env:
      API_URL: "http://localhost:9999"

setup:
  - run: ["node", "./test/stub-server.js"]
    background: true
    wait_for: "tcp://localhost:9999"
    timeout: "30s"

teardown:
  - run: ["pkill", "-f", "stub-server.js"]
    always: true

tools:
  - name: "fetches stub response"
    server: proxy
    tool: fetch

background: true launches the process without waiting for exit. The runner tracks the PID and cleans it up on exit, but the explicit teardown gives the operator a chance to surface failures from the shutdown. wait_for: polls a probe (TCP URL or HTTP path) until it succeeds before the next step runs.
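
Because wait_for: gates the next step, a later step can rely on the stub already being reachable. A minimal sketch, assuming a hypothetical ./test/load-stub-routes.sh script that talks to the stub on port 9999:

```yaml
setup:
  # Step 1: start the stub in the background and block until the port accepts connections
  - run: ["node", "./test/stub-server.js"]
    background: true
    wait_for: "tcp://localhost:9999"
    timeout: "30s"
    capture_failure: "Stub server did not become ready"

  # Step 2: runs only after the probe above has succeeded
  - run: ["./test/load-stub-routes.sh"]   # hypothetical script
    timeout: "30s"
    capture_failure: "Could not load stub routes"
```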

Example 3: fresh state per test

Some tests must run against a freshly-initialized server. Today that takes manual style: stepwise ceremony; setup_per_test: makes it declarative.

# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json

servers:
  workspace:
    command: ["./workspace-mcp"]

setup_per_test:
  - call:
      tool: reset_state
    capture_failure: "reset_state failed; aborting test"

tools:
  - name: "first write succeeds"
    server: workspace
    tool: write_file
    args:
      path: "/tmp/foo"
      contents: "hello"
  - name: "second write goes through"
    server: workspace
    tool: write_file
    args:
      path: "/tmp/foo"
      contents: "world"

Each test sees a freshly reset workspace because setup_per_test: runs before each tool test. This pairs with run_options.restart_policy for process-level isolation.

Failure semantics

The committed failure semantics that the runtime will match:

Field reference

| Field | Type | Where | Meaning |
| --- | --- | --- | --- |
| run | array of strings | step | Argv to spawn. Mutually exclusive with call. |
| call.tool | string | step | MCP tool name. Mutually exclusive with run. |
| call.args | object | step | JSON-serializable arguments. |
| background | bool | step | Launch without waiting for exit; cleaned up on exit. |
| wait_for | string | step | Readiness probe (TCP URL, HTTP path). |
| timeout | duration string | step | Overall timeout for the step (30s). |
| capture_failure | string | step | Reporter-friendly error message on failure. |
| always | bool | step (teardown) | Run even when tests failed. |

The schema enforces exactly one of run: and call: per step. Setting both, or neither, raises a deserialization error.
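
To illustrate the exactly-one rule, the fragment below shows one step that satisfies it and, commented out, the two shapes the rule rejects. The args key and value are hypothetical and shown only to demonstrate call.args.

```yaml
setup:
  # Valid: call only, with arguments
  - call:
      tool: reset_state
      args:
        seed: 42            # hypothetical argument

  # Invalid (both run and call): rejected at deserialization
  # - run: ["./scripts/seed-issues.sh", "5"]
  #   call:
  #     tool: reset_state

  # Invalid (neither run nor call): rejected at deserialization
  # - timeout: "30s"
  #   capture_failure: "nothing to run"
```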

Today: what the runner does

Hook-based setup and teardown (the beforeAll: / afterAll: / beforeEach: / afterEach: blocks) work today. The richer setup: / teardown: / setup_per_test: surface on this page parses correctly, but the runner currently treats those blocks as a no-op. Execution is tracked for the runtime milestone.

The schema is committed, so YAML you author today against this page remains valid through the future runtime release.

Roadmap

Runtime work still pending:

These are planned for a future release.
