Getting started with mcptest

Install mcptest and get your first passing test in under five minutes.

This guide gives you three install tracks. Pick the one that matches how you already work, run the canonical five-step workflow at the end of the track, and stop reading. Every track ends at the same place: a passing mcptest run against the example suite that mcptest init scaffolds.

File issues on GitHub when you find gaps.

Prefer a real server to poke at? examples/reference-server/ is a small, runnable MCP server (five tools, two resources) with a smoke suite. Clone the repo, install it, and run the bundled tests:

cd examples/reference-server && npm install && cd -
mcptest run --config examples/reference-server/tests/smoke.yml

Or let an LLM write the first suite. Hand a model your tool list and the prompt at Generate tests with an LLM, then run mcptest validate on what it produces.

Pick a track

Track 1: direct install on macOS, Linux, or Windows via Homebrew, Cargo, or the install script. Pick this if you run mcptest from a developer laptop or a build box.
Track 2: Docker via docker run soapbucket/mcptest:latest. Pick this if you want a hermetic image, or you already build everything in a container.
Track 3: GitHub Actions via the install script in a workflow step. Pick this if you are wiring mcptest into a pull-request gate today.

Every track leaves you with a working mcptest command. From there, jump to the canonical workflow and then to the What's next pointers.

Track 1: direct install (macOS, Linux, Windows)

Pick whichever package manager you already trust.

Homebrew

brew install soapbucket/tap/mcptest
mcptest --version

The tap publishes a signed bottle for Intel and Apple Silicon Macs and an x86_64 Linux build. The formula has no native dependencies.

Cargo

cargo install mcptest
mcptest --version

Cargo compiles from source. Use this when you want to track a specific git ref (cargo install --git https://github.com/soapbucket/mcptest --tag v1.0.0) or you need a binary on an architecture the prebuilt releases do not cover.

Install script (curl)

curl -fsSL https://download.mcptest.sh/install.sh | sh
mcptest --version

The script downloads the matching prebuilt release for your platform, verifies the SHA256 against the signed manifest, and drops the mcptest binary on your $PATH. macOS, Linux, and Windows (under Git Bash or WSL) are all covered. The script asks for root only when the install directory on your $PATH requires it.

Direct binary download

Every release tag publishes signed tarballs and a SHA256SUMS file at github.com/soapbucket/mcptest/releases. Pick the archive for your platform, extract mcptest onto your $PATH, and you are done. The release pipeline cross-compiles for Linux x86_64 and aarch64, macOS x86_64 and aarch64, and Windows x86_64.
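
If you download an archive by hand, you can check it against SHA256SUMS before extracting. The following is a minimal sketch, not the project's own verification tooling. It assumes GNU sha256sum is installed (on macOS, shasum -a 256 prints the same format) and that SHA256SUMS uses the usual "digest, then file name" layout. The function name verify_archive is hypothetical.

```shell
# verify_archive ARCHIVE SUMS_FILE
# Succeeds only if ARCHIVE's SHA-256 digest matches its entry in SUMS_FILE.
# Assumption: each SUMS_FILE line is "<hex digest>  <file name>".
verify_archive() {
  archive=$1
  sums=$2
  name=$(basename "$archive")

  # Look up the expected digest; binary-mode entries are written as "*name".
  expected=$(awk -v f="$name" '$2 == f || $2 == ("*" f) { print $1; exit }' "$sums")
  if [ -z "$expected" ]; then
    echo "verify_archive: no entry for $name in $sums" >&2
    return 1
  fi

  # Hash the local file and compare the two digests.
  actual=$(sha256sum "$archive" | awk '{ print $1 }')
  if [ "$expected" != "$actual" ]; then
    echo "verify_archive: checksum mismatch for $name" >&2
    return 1
  fi
  return 0
}
```

Call it with the archive you downloaded and the SHA256SUMS file from the same release, and extract only if it returns zero. This checks integrity against the sums file; it does not verify the release signature.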

Verify the install

mcptest --version
mcptest --help

mcptest --version should print the release tag. mcptest --help lists every subcommand. Per-subcommand detail lives at mcptest <subcommand> --help.

Now run the canonical workflow.

Track 2: Docker

docker run --rm \
  -v "$PWD:/work" -w /work \
  soapbucket/mcptest:latest --version

The image is a static binary on gcr.io/distroless/static. It is small, it has no shell, and it expects your test YAML to be mounted at the working directory. Replace --version with any other subcommand, exactly as you would on the host. Mount ~/.config or ~/.aws read-only if your tests need host credentials.

For day-to-day use, alias the long incantation:

alias mcptest='docker run --rm -v "$PWD:/work" -w /work soapbucket/mcptest:latest'
mcptest --version

With the alias in place, the rest of this guide reads the same as Track 1.
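
An alias defined in your interactive shell is not visible to shell scripts you launch from it. If you want the same shortcut inside a script, a function is one option. This is a minimal sketch that reuses the docker run line from the alias unchanged; the only addition is "$@", which forwards your arguments.

```shell
# Function form of the Docker shortcut, for use inside scripts.
# Every argument is forwarded to the mcptest image unchanged.
mcptest() {
  docker run --rm -v "$PWD:/work" -w /work soapbucket/mcptest:latest "$@"
}
```

Define it near the top of the script, then call mcptest --version or any other subcommand as usual.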

Now run the canonical workflow.

Track 3: GitHub Actions (CI)

Install the CLI with the install script in a workflow step, then run it. Add .github/workflows/mcptest.yml to your repository:

name: mcptest
on:
  pull_request:
  push:
    branches: [main]

jobs:
  mcptest:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install mcptest
        run: curl -fsSL https://download.mcptest.sh/install.sh | MCPTEST_VERSION=v1.0.0 sh
      - run: mcptest doctor
      - run: mcptest run --reporter junit --output mcptest.junit.xml
      - if: always()
        uses: actions/upload-artifact@v4
        with:
          name: mcptest-results
          path: mcptest.junit.xml

The install step puts the pinned mcptest binary on $PATH. The remaining steps are the same calls you would run locally. Because the upload step runs under if: always(), the JUnit file is saved as a workflow artifact even when a test fails.

For GitLab CI or CircleCI, run the Docker image directly:

# .gitlab-ci.yml
mcptest:
  image: soapbucket/mcptest:latest
  script:
    - mcptest doctor
    - mcptest run --reporter junit --output mcptest.junit.xml
  artifacts:
    when: always
    reports:
      junit: mcptest.junit.xml

Now run the canonical workflow on your laptop to build the test suite the CI job will execute.

The canonical workflow

Every track ends with the same five-step loop. Run it once after install and you should see a passing run in well under five minutes.

1. Install

You already did this. Verify with mcptest --version.

2. `mcptest doctor`

mcptest doctor

doctor checks your environment and prints a one-screen report:

Which mcptest binary is on the path, and its version.
Whether any MCP servers are auto-discoverable from your Claude Desktop, Claude Code, or Cursor configuration files.
Whether the JSON Schema at https://mcptest.sh/schema/v1.json is reachable.
Whether your shell environment has the variables a typical test suite reads (MCPTEST_* plus any *_API_KEY your discovered servers reference).

The report exits non-zero if a hard prerequisite is missing. Fix what it asks for before moving on.
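
Because doctor signals a missing prerequisite through its exit status, a script can use it as a gate. The following is a minimal sketch under that assumption only. The function name preflight_and_run is hypothetical, and the suite path is whatever file you pass in.

```shell
# preflight_and_run SUITE
# Runs the environment check first and starts the suite only if it succeeds.
# On a failed check, returns doctor's own exit status without running tests.
preflight_and_run() {
  suite=$1
  mcptest doctor || {
    status=$?
    echo "mcptest doctor reported a missing prerequisite; not running tests" >&2
    return "$status"
  }
  mcptest run --config "$suite"
}
```

When the check passes, the function's exit status is the status of mcptest run, so it can stand in for a plain run in a script.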

3. `mcptest init`

mcptest init

init scaffolds a new test suite in the current directory:

.
├── mcptest.yml
└── tests/
    └── example.yaml

mcptest.yml holds project defaults (reporter, parallelism, timeout). It is not a test suite, so you do not point mcptest run at it.

tests/example.yaml is the hello-world suite. It exercises the well-known @modelcontextprotocol/server-filesystem stdio server, so mcptest run --config tests/example.yaml passes out of the box if npx is on your $PATH. When you are ready, replace the server's command: value with the command for your own server.

Two flags adjust the scaffold:

mcptest init --url https://mcp.example.com/v1 --auth oauth2 emits a URL-target suite with the auth recipe wired in (see URL targets).
mcptest init --with-jury appends a commented evals: block that shows the v1.1 LLM-jury YAML shape.

init refuses to overwrite existing files; pass --force to replace them.

4. `mcptest run`

Point run at the suite init scaffolded:

mcptest run --config tests/example.yaml

Sample output from the pretty reporter:

mcptest 1.0.0 run

  [PASS] search returns at least one result  (124 ms)

Summary: 1 passed, 0 failed, 0 skipped in 124 ms

The default pretty reporter prints one line per test plus a summary, and on a failure prints the expectation, the actual value, and a hint inline under the failing test. For a compact one-line count (ran 1 tests: 1 passed, ...), pass --reporter minimal.

With no --config, mcptest run looks for mcptest.yaml (note the .yaml spelling) in the current directory and prints "no config" if it does not find one. The mcptest.yml that init writes is the project-defaults file, not a suite, so pass the suite path explicitly until you rename your suite to mcptest.yaml.
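
The lookup rule above can be wrapped in a small helper so a script never trips over a missing default suite. This is a sketch, not part of mcptest. The function name run_suite is hypothetical, and the fallback to tests/example.yaml assumes you kept the scaffolded layout.

```shell
# run_suite [SUITE]
# With an explicit SUITE, pass it via --config. Without one, rely on the
# default lookup only when mcptest.yaml exists in the current directory;
# otherwise fall back to the scaffolded tests/example.yaml.
run_suite() {
  if [ -n "${1:-}" ]; then
    mcptest run --config "$1"
  elif [ -f mcptest.yaml ]; then
    mcptest run
  else
    mcptest run --config tests/example.yaml
  fi
}
```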

Useful flags for the first run, plus one companion command (all use the same --config tests/example.yaml):

--reporter pretty (the default, one line per test plus a summary; pass --reporter minimal for a one-line count, or --reporter json / --reporter junit to swap formats).
--verbose (prints the full JSON-RPC frames mcptest sends and receives, useful when a test fails for reasons your server logs do not explain).
--filter search (run only tests whose name matches the substring).
mcptest validate --config tests/example.yaml (load the file, validate against the schema, exit non-zero on the first error without spawning a server).
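
Since validate never spawns a server, it is cheap to run over every suite before a commit. The following is a minimal sketch that assumes your suites are *.yaml files directly under one directory, as in the scaffold. The function name validate_suites is hypothetical.

```shell
# validate_suites [DIR]
# Runs `mcptest validate` on every *.yaml file directly under DIR (default:
# tests) and stops at the first failure, returning that non-zero status.
validate_suites() {
  dir=${1:-tests}
  for suite in "$dir"/*.yaml; do
    # An unmatched glob stays literal; skip anything that is not a file.
    [ -f "$suite" ] || continue
    echo "validating $suite"
    mcptest validate --config "$suite" || return $?
  done
  return 0
}
```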

Exit codes:

0 everything passed.
1 at least one test failed.
2 configuration or schema error, or invalid arguments.

run can also exit 5 (cost cap exceeded, or --update-snapshots refused under CI=true), 6 (coverage below --coverage-threshold), and 7 (no tests selected). The full table is in the CLI reference.
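
In a wrapper script it can help to turn the exit status into a readable message. This sketch covers only the codes listed above and defers to the CLI reference for anything else. The function name describe_exit is hypothetical.

```shell
# describe_exit CODE
# Prints a one-line meaning for the `mcptest run` exit codes listed above.
describe_exit() {
  case "$1" in
    0) echo "all tests passed" ;;
    1) echo "at least one test failed" ;;
    2) echo "configuration, schema, or argument error" ;;
    5) echo "cost cap exceeded, or --update-snapshots refused under CI=true" ;;
    6) echo "coverage below --coverage-threshold" ;;
    7) echo "no tests selected" ;;
    *) echo "exit code $1: see the CLI reference" ;;
  esac
}
```

For example, run mcptest run --config tests/example.yaml and call describe_exit "$?" on the next line.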

5. Read the output

The pretty reporter is the default because it is what you read most often. The summary line at the bottom is the one you scan in CI. For each failing test, the reporter prints:

the test name and where it lives in the YAML,
the expectation that did not match,
the actual value the server returned,
a hint pointing at the most likely cause (server not started, schema drift, expired auth, missing env var).

Use --reporter junit --output run.xml in CI to produce a machine-readable artifact your CI system renders as a test report; the run still prints progress to stderr. For the saved-run formats (sarif, markdown, html), re-render the record with mcptest report --format <fmt>.

What's next

You have a passing suite. The reading order from here depends on what you want to do next.

Learn how mcptest thinks: Concepts walks through servers, tools, matchers, cassettes, and the test lifecycle in one pass.
Wire it into CI properly: CI integration guide covers GitHub Actions, GitLab CI, CircleCI, reporter selection, and exit-code handling.
See real test suites: github.com/soapbucket/mcptest/tree/main/examples ships annotated suites against the fixture server, a sample HTTP server, and the named-errors scenarios. For complete end-to-end suites against ten popular MCP servers (filesystem, fetch, git, SQLite, GitHub, Notion, Brave Search, and more), each with a README and CI workflow, see the mcptest-examples repo.
Look up every flag: CLI reference, the YAML reference, and the Configuration reference are the three pages to bookmark.
Add LLM evaluations: LLM evals guide and jury consensus methods cover when to reach for a judge or jury, prompt engineering tips, and cost budgeting.
Gate your next model upgrade: mcptest model-compat (a v1.1 feature) catches the case where a model update breaks a server that did not change. Read model compatibility for the workflow and the model-compatibility post for the background.

If something here is wrong, ambiguous, or missing, open an issue on GitHub and we will pick it up.