
Negative-path conformance

Status: implemented behind the preview schema flag. Tracked as epic WOR-1236 and child WOR-1239.

Most tests check that good input produces a good result. A robust server also has to reject bad input cleanly, with a well-formed JSON-RPC error rather than silent acceptance, a crash, or a hang. The MCP runtime-fault taxonomies ("A Taxonomy of Runtime Faults in MCP Servers", arXiv:2606.05339; "Real Faults in MCP Software", arXiv:2603.05637) catalog the ways servers get this wrong. The negative_path: block runs a small, taxonomy-keyed probe set against a tool and gates on whether each probe met its contract.

The probes

Each probe maps to a fault-taxonomy id so a failure points back to the literature.

| Probe | Taxonomy id | Bad request | Contract |
| --- | --- | --- | --- |
| unknown_tool | FAULT-PROTO-UNKNOWN-METHOD (2606.05339) | call a tool that does not exist | a method-not-found-class error |
| missing_required | FAULT-SCHEMA-MISSING-REQUIRED (2603.05637) | omit a required argument | an invalid-params-class error |
| wrong_type | FAULT-SCHEMA-TYPE-MISMATCH (2603.05637) | send a wrong-typed argument | an error |
| extra_field | FAULT-SCHEMA-UNEXPECTED-FIELD (2603.05637) | send an unexpected field | rejection when additionalProperties is false |
| oversized | FAULT-INPUT-OVERSIZED (2606.05339) | send an oversized argument | an error or a result, never a hang |
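
To make the "Bad request" column concrete, here is a minimal Python sketch of how each probe's request could be derived from a known-good call. It is an illustration under stated assumptions, not mcptest's implementation: the function name build_probe_request, the placeholder tool-name suffix, the injected field name, and the 1,000,000-character payload are all hypothetical. Only the JSON-RPC 2.0 envelope and the MCP tools/call method are taken from the protocols themselves.

```python
import copy
import json


def build_probe_request(probe, tool, args, schema, request_id=1):
    """Return a JSON-RPC tools/call request mutated for the given probe.

    Hypothetical helper for illustration only; not part of mcptest.
    `args` is a known-good argument dict, `schema` the tool's input schema.
    """
    if not args:
        raise ValueError("need at least one known-good argument to mutate")
    bad_args = copy.deepcopy(args)
    name = tool
    first_key = next(iter(bad_args))

    if probe == "unknown_tool":
        # Assumed suffix; any name the server does not expose would do.
        name = tool + "__does_not_exist"
    elif probe == "missing_required":
        required = schema.get("required", [])
        if not required:
            raise ValueError("schema declares no required arguments")
        bad_args.pop(required[0], None)
    elif probe == "wrong_type":
        # Replace the value with one of a different JSON type.
        is_string = isinstance(bad_args[first_key], str)
        bad_args[first_key] = 12345 if is_string else "not-the-right-type"
    elif probe == "extra_field":
        bad_args["__unexpected_field"] = True  # assumed field name
    elif probe == "oversized":
        bad_args[first_key] = "x" * 1_000_000  # assumed size
    else:
        raise ValueError(f"unknown probe: {probe}")

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": bad_args},
    }


if __name__ == "__main__":
    schema = {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    }
    request = build_probe_request(
        "missing_required", "search", {"query": "anthropic"}, schema
    )
    print(json.dumps(request, indent=2))
```

Each mutation changes exactly one thing relative to the good call, so a failed probe points at a single fault class.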

A probe passes when the server rejects the request, either with a JSON-RPC error response or with a tool-level error result (isError: true). The oversized probe is softer, since accepting a large input is legitimate: it requires only that the call return a response, whether an error or a result, rather than hang or crash the server.
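
The pass rule can be sketched in a few lines of Python. This is an illustration of the rule as described here, not mcptest's code: the names is_rejection and probe_passed are hypothetical, and a hang or crash is modeled, by assumption, as a response of None. The sketch does not model the extra_field condition on additionalProperties, and it deliberately ignores the error code and message text.

```python
def is_rejection(response):
    """True for a JSON-RPC error response or a tool-level error result."""
    error = response.get("error")
    if isinstance(error, dict):
        # A well-formed JSON-RPC 2.0 error object carries a code and a message.
        return isinstance(error.get("code"), int) and isinstance(
            error.get("message"), str
        )
    result = response.get("result")
    return isinstance(result, dict) and result.get("isError") is True


def probe_passed(probe, response):
    """Apply the pass rule to a decoded reply.

    `response` is the decoded JSON-RPC reply, or None when the call hung
    or the server crashed (an assumption of this sketch).
    """
    if response is None:
        return False
    if probe == "oversized":
        # Any reply is acceptable: an error or a normal result.
        return True
    return is_rejection(response)
```

For example, a reply of {"jsonrpc": "2.0", "id": 1, "result": {"content": []}} to the wrong_type probe counts as a failure, because the server accepted the bad argument without signalling an error.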

Targets and the gate

| Target | Meaning |
| --- | --- |
| negative_path.checks_run | Number of probes that ran. |
| negative_path.failures | Number of probes that did not meet the contract. |
| negative_path.gate_passed | 1 when every probe passed, 0 otherwise. |

tools:
  - name: search rejects bad requests
    server: api
    tool: search
    args: { query: "anthropic" }
    negative_path:
      checks: [unknown_tool, missing_required, wrong_type]

Omit checks: to run the full set, and omit expect: to apply the default gate, which fails if any probe did not meet its contract.
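
As a rough illustration of how the three targets relate to individual probe outcomes, the following Python sketch derives them from a mapping of probe name to pass/fail. The function name summarize and the outcomes structure are hypothetical, and treating a run with zero probes as a passing gate is an assumption of this sketch, not documented behavior.

```python
# The five probes listed in "The probes".
FULL_SET = [
    "unknown_tool",
    "missing_required",
    "wrong_type",
    "extra_field",
    "oversized",
]


def summarize(outcomes, checks=None):
    """Compute the negative_path targets from per-probe outcomes.

    `outcomes` maps a probe name to True when the probe met its contract.
    `checks=None` mirrors omitting checks: and selects the full set.
    """
    selected = FULL_SET if checks is None else checks
    ran = [probe for probe in selected if probe in outcomes]
    failures = sum(1 for probe in ran if not outcomes[probe])
    return {
        "negative_path.checks_run": len(ran),
        "negative_path.failures": failures,
        # Default gate: pass only when no probe missed its contract.
        "negative_path.gate_passed": 1 if failures == 0 else 0,
    }
```

With the three checks from the configuration above and a server that accepts a wrong-typed argument, summarize({"unknown_tool": True, "missing_required": True, "wrong_type": False}, checks=["unknown_tool", "missing_required", "wrong_type"]) reports three checks run, one failure, and a gate value of 0.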

A note on lenient servers

Many servers do not validate arguments against the tool's input schema, so the wrong_type and extra_field probes will report a finding against them. That is the point: a server that accepts a wrong-typed argument exhibits the type-validation fault the taxonomy describes. Run those probes against a server you expect to validate its arguments, and select only the universal unknown_tool and missing_required probes for one you do not.

What it does not do

The probes check the error contract, not the error message wording or the exact code. Pair them with the input fuzzer, which sweeps a broader space of malformed input and checks the same crash-and-hang safety, and with ordinary assertion tests for the happy path.