Scenario 2: performance and token budgets
Once your tests pass, you usually want them to keep passing under two real-world constraints: latency and response size. A tool that returns the right answer in 30 seconds is a regression even if the assertion still matches; a tool that returns the right answer plus twelve kilobytes of prose drains a downstream model's context. This scenario shows how to lock both budgets in.
The budgets sit at the step level, alongside the matcher assertions, not inside them. They are documented in the YAML reference under Budgets and headers (not matchers).
The YAML
Save this as tests/budgets.yml:
# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json
servers:
  local:
    command: ["./target/debug/my-mcp-server"]
tools:
  - name: "lookup is fast and small"
    server: local
    tool: "lookup_record"
    args:
      id: "rec_abc123"
    expect:
      assertions:
        - target: "result.content[0].text"
          matcher:
            contains: "rec_abc123"
      max_duration_ms: 500
      max_response_tokens: 1024
  - name: "list is bounded under load"
    server: local
    tool: "list_records"
    args:
      page_size: 50
    expect:
      assertions:
        - target: "result.records"
          matcher:
            schema:
              type: array
              maxItems: 50
      max_duration_ms: 2000
      max_response_tokens:
        budget: 4096
        tokenizer: "cl100k_base"
        mode: "warn"
Three things to notice:
- The expect: block is in its long form (object with assertions:), not the short form (array of assertions). The budget fields live alongside assertions:, not inside it.
- max_duration_ms is a hard budget. Exceeding it fails the step.
- max_response_tokens accepts a short form (integer, hard budget) and a long form (object with budget, tokenizer, mode, and image_cost). The long form on the second test uses mode: "warn", which logs a warning instead of failing. That is useful when you want to watch the trend without flipping CI red. The default mode is fail.
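The short and long forms of max_response_tokens can be read as two spellings of one setting. The Python sketch below illustrates that reading. It is not mcptest's implementation: TokenBudget, normalize, and check are hypothetical names, and only the field names (budget, tokenizer, mode) and the default mode fail come from this page. image_cost is left out because its shape is not described here.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class TokenBudget:
    budget: int
    tokenizer: Optional[str] = None  # this page does not state a default tokenizer
    mode: str = "fail"               # the documented default mode


def normalize(value: Union[int, dict]) -> TokenBudget:
    """Turn either spelling of max_response_tokens into one TokenBudget."""
    if isinstance(value, int):
        # Short form: a bare integer is a hard budget.
        return TokenBudget(budget=value)
    # Long form: an object; mode falls back to "fail" when omitted.
    return TokenBudget(
        budget=value["budget"],
        tokenizer=value.get("tokenizer"),
        mode=value.get("mode", "fail"),
    )


def check(tokens_used: int, cfg: TokenBudget) -> str:
    """Return 'pass', 'warn', or 'fail' for an observed token count."""
    if tokens_used <= cfg.budget:
        return "pass"
    return "warn" if cfg.mode == "warn" else "fail"


short = normalize(1024)
long_form = normalize({"budget": 4096, "tokenizer": "cl100k_base", "mode": "warn"})
print(check(87, short))        # within budget
print(check(5000, short))      # over a hard budget
print(check(5000, long_form))  # over a warn-mode budget
```

Treating the integer as shorthand for an object with default fields keeps a single evaluation path, so the only behavioral difference between the two tests above is the mode value.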
How to run it
mcptest run tests/budgets.yml
Expected output
mcptest run tests/budgets.yml
PASS lookup is fast and small (143ms, 87 tokens)
PASS list is bounded under load (1.2s, 2150 tokens)
2 passed, 0 failed in 1.4s
When a budget fails, the reporter shows both the assertion result and the budget breach:
FAIL lookup is fast and small (612ms, 87 tokens)
duration budget exceeded: 612ms > 500ms (max_duration_ms)
assertion: result.content[0].text contains "rec_abc123" -> PASS
The test fails as a whole, but the diagnostic lines tell you that the budget was breached while the assertion itself still passed.
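One way to picture why the assertion line still reads PASS: the budget and the assertions are evaluated independently, and the step fails if either one does. A minimal Python sketch of that rule follows. step_report is a hypothetical helper, and its line format only imitates the sample output above; it does not reproduce the reporter's real formatting code.

```python
from typing import List, Tuple


def step_report(name: str, duration_ms: int, max_duration_ms: int,
                assertions: List[Tuple[str, bool]]) -> Tuple[bool, List[str]]:
    """Evaluate the duration budget and the assertions separately, then combine."""
    budget_ok = duration_ms <= max_duration_ms
    assertions_ok = all(ok for _, ok in assertions)
    passed = budget_ok and assertions_ok

    lines = [f"{'PASS' if passed else 'FAIL'} {name} ({duration_ms}ms)"]
    if not budget_ok:
        lines.append(
            f"duration budget exceeded: {duration_ms}ms > {max_duration_ms}ms (max_duration_ms)"
        )
    if not passed:
        # On failure, list every assertion so the reader can see which part broke.
        for label, ok in assertions:
            lines.append(f"assertion: {label} -> {'PASS' if ok else 'FAIL'}")
    return passed, lines


passed, lines = step_report(
    "lookup is fast and small", 612, 500,
    [('result.content[0].text contains "rec_abc123"', True)],
)
print("\n".join(lines))
```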
Pairing with performance for suite-wide defaults
If most of your tests share the same budget, set a suite-wide default in the performance: block and override per test:
# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json
servers:
  local:
    command: ["./target/debug/my-mcp-server"]
performance:
  default_timeout_ms: 5000
  p95_latency_ms: 1500
tools:
  - name: "fast path"
    server: local
    tool: "ping"
    expect:
      assertions:
        - target: "result.isError"
          matcher: { exact: false }
      max_duration_ms: 200 # tighter than the suite default
default_timeout_ms is a hard ceiling per test. p95_latency_ms is a soft suite-level signal the reporter surfaces; it never fails the run on its own.
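To make the hard-versus-soft distinction concrete, the sketch below computes a 95th-percentile latency with the nearest-rank method and treats it as a warning only, while the per-test ceiling produces failures. This page does not say how mcptest computes p95, so the nearest-rank choice, the p95 and evaluate helpers, and the sample durations are all assumptions for illustration.

```python
from typing import Dict, List


def p95(durations_ms: List[int]) -> int:
    """Nearest-rank 95th percentile of a non-empty list of durations."""
    if not durations_ms:
        raise ValueError("need at least one duration")
    ordered = sorted(durations_ms)
    # 1-based rank = ceil(0.95 * n), computed in integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def evaluate(durations_ms: List[int], default_timeout_ms: int,
             p95_latency_ms: int) -> Dict[str, object]:
    # Hard ceiling: any single test over the timeout counts as a failure.
    over_ceiling = [d for d in durations_ms if d > default_timeout_ms]
    # Soft signal: a high p95 is reported but does not fail the run.
    observed = p95(durations_ms)
    return {
        "failed": len(over_ceiling) > 0,
        "p95_ms": observed,
        "p95_warning": observed > p95_latency_ms,
    }


sample = [120, 143, 180, 200, 210, 250, 300, 900, 1200, 1800]  # made-up durations
result = evaluate(sample, default_timeout_ms=5000, p95_latency_ms=1500)
print(result)
```

With these made-up numbers no test crosses the 5000 ms ceiling, so the run is not marked failed even though the observed p95 is above the 1500 ms target.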
See also
- docs/yaml-reference.md#budgets-and-headers-not-matchers, the budget field reference.
- Previous: First test.
- Next: Snapshot tests.