Scenario 14: rate limiting and backoff
You want to know what your test run does when the server pushes back with an HTTP 429. A real MCP server fronted by a gateway, an API quota, or a load shedder will eventually answer "too many requests" with a Retry-After header, and you would rather see that behavior in a controlled run than discover it for the first time in CI at 2am.
The hosted test server makes this easy to exercise. Point a URL target at https://test.mcptest.sh/mcp?scenario=ratelimit and every request comes back as HTTP 429 with a Retry-After: 1 header, before any JSON-RPC is handled. There is no happy path to fall through to; the endpoint exists only to push back. That gives mcptest's transport-level backoff something real to chew on and lets you watch how the failure finally surfaces.
What mcptest actually does here is worth being precise about. The Streamable HTTP transport retries a 429 (and 503 and other 5xx) a small, fixed number of times, honoring Retry-After when the server sends a whole-number-of-seconds value. There is no --retry-style knob for this: the backoff is built into the transport. The per-test --retry flag is a different thing (it re-runs a failing test, for flaky third-party services), and it does not come into play here because the 429 lands during connect, before any test step runs. So this scenario is about observing the built-in backoff and the clean transport error it ends with, not about configuring a retry policy.
The YAML
Save this as tests/ratelimit.yml:
# yaml-language-server: $schema=https://mcptest.sh/schema/v1.json
servers:
  ratelimit:
    url: "https://test.mcptest.sh/mcp?scenario=ratelimit"
    http:
      timeout: 30s
      connect_timeout: 5s
tools:
  - name: "lists tools"
    server: ratelimit
    tool: "tools/list"
    args: {}
    expect:
      - target: "error"
        matcher:
          exact: null
        message: "tools/list should not return an error"
What is happening here:
- The ratelimit server points at the hosted endpoint with the scenario=ratelimit query parameter. Every request to it, including the MCP initialize handshake, comes back 429 with Retry-After: 1.
- The transport retries the 429 a small, fixed number of times with exponential backoff. When the response carries a numeric Retry-After, the transport waits that many seconds instead of the default backoff step. With Retry-After: 1 you get roughly one second between attempts.
- Because the endpoint answers 429 to every request, the retries are always exhausted. The transport gives up, the stream closes, and the initialize handshake fails. The run never reaches the tools/list step, so the tools: block above is effectively the thing the run was trying to get to, not the thing that fails.
- http.timeout and http.connect_timeout are the per-request and TCP connect budgets. They bound how long a single attempt can hang; the retry loop sits on top of them. Neither field turns the 429 into a pass; they only cap how long each attempt waits for bytes.
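The retry behavior described in the list above can be sketched in a few lines of Python. This is an illustration only, not mcptest's implementation: the names Response and connect_with_backoff, the retry count of 4, and the 0.1 s first backoff step are all assumptions made for the example, and for brevity it treats every error status as retryable.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Response:
    status: int
    retry_after: Optional[int] = None  # whole seconds, if the server sent one


def connect_with_backoff(
    send: Callable[[], Response],
    max_retries: int = 4,      # hypothetical; the real count is fixed inside the transport
    base_delay: float = 0.1,   # hypothetical first backoff step, in seconds
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send` until it succeeds or the retry budget is spent."""
    for attempt in range(max_retries + 1):
        resp = send()
        if resp.status < 400:
            return resp
        if attempt == max_retries:
            break
        # A numeric Retry-After replaces the exponential backoff step.
        if resp.retry_after is not None:
            delay = float(resp.retry_after)
        else:
            delay = base_delay * (2 ** attempt)
        sleep(delay)
    raise ConnectionError(f"retries exhausted, last HTTP status: {resp.status}")


# Simulate the always-429 endpoint; record the waits instead of sleeping.
waits: List[float] = []
try:
    connect_with_backoff(lambda: Response(429, retry_after=1), sleep=waits.append)
except ConnectionError as err:
    print(err, waits)
# prints: retries exhausted, last HTTP status: 429 [1.0, 1.0, 1.0, 1.0]
```

Because the simulated server never stops answering 429, the loop always ends in the error branch, which mirrors the failed connect in this scenario.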
A note on the Retry-After value: mcptest parses it as a whole number of seconds. The HTTP spec also allows an HTTP-date form of Retry-After; the transport does not parse that form and falls back to its built-in backoff step when the value is not a plain integer. The hosted endpoint sends the integer form (1) so the wait is honored.
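To make the integer-only rule concrete, here is a minimal Python sketch of a parser that accepts the whole-seconds form and returns None for anything else, so a caller would fall back to its own backoff step. The function name parse_retry_after_seconds is hypothetical, and the sketch shows the rule as described here rather than mcptest's actual code.

```python
from typing import Optional


def parse_retry_after_seconds(value: Optional[str]) -> Optional[int]:
    """Return the delay for an integer Retry-After, else None (use default backoff)."""
    if value is None:
        return None
    text = value.strip()
    # Only the delay-seconds form (plain ASCII digits) is accepted.
    if text.isascii() and text.isdigit():
        return int(text)
    return None


print(parse_retry_after_seconds("1"))                              # 1
print(parse_retry_after_seconds("Wed, 21 Oct 2015 07:28:00 GMT"))  # None
print(parse_retry_after_seconds(None))                             # None
```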
Run it
mcptest run --config tests/ratelimit.yml
If you want to watch the retries happen, turn on transport debug logging. The Streamable HTTP transport logs each non-2xx response and each retry decision:
mcptest --log-level "mcptest_core::transport::streamable_http=debug" \
run --config tests/ratelimit.yml
To probe the endpoint on its own before wiring it into a suite, point the layered network diagnostic at it:
mcptest doctor --url "https://test.mcptest.sh/mcp?scenario=ratelimit"
The [AUTH] / [MCP-INIT] rows will not reach OK here, because the server answers 429 to the initialize probe too. That is the expected result for this scenario, not a misconfiguration on your side.
Expected output
A run against the always-429 endpoint exhausts the transport's retries and then fails the connect, since the initialize handshake never completes:
mcptest run --config tests/ratelimit.yml
FAIL ratelimit connect failed
initialize handshake failed: transport closed
last HTTP status: 429 Too Many Requests (Retry-After: 1)
0 passed, 1 failed in 4.3s
exit code: 1
With transport debug logging on, the retries are visible before the final failure:
mcptest --log-level "mcptest_core::transport::streamable_http=debug" run --config tests/ratelimit.yml
WARN mcptest_core::transport::streamable_http: non-2xx HTTP response status=429 Too Many Requests
DEBUG mcptest_core::transport::streamable_http: post attempt failed transient=true
WARN mcptest_core::transport::streamable_http: non-2xx HTTP response status=429 Too Many Requests
DEBUG mcptest_core::transport::streamable_http: post attempt failed transient=true
WARN mcptest_core::transport::streamable_http: non-2xx HTTP response status=429 Too Many Requests
DEBUG mcptest_core::transport::streamable_http: post attempt failed transient=true
...
FAIL ratelimit connect failed
The wall-clock time (about four seconds in the example) reflects the Retry-After: 1 waits stacked across the retry attempts. A server that sent no Retry-After would fail faster, because the default backoff steps are sub-second.
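The arithmetic behind that wall-clock figure is simple enough to sketch. The retry count of 4 and the 0.1 s first backoff step below are assumptions chosen only to line up with the roughly four-second example; the real values are fixed inside the transport and are not documented here. Per-request latency is left out.

```python
from typing import Optional


def total_wait(retries: int, retry_after: Optional[int], base_delay: float = 0.1) -> float:
    """Sum the waits between attempts, excluding per-request latency."""
    if retry_after is not None:
        # The server's value is used for every wait.
        return float(retries * retry_after)
    # Otherwise the waits double from the (hypothetical) base step.
    return sum(base_delay * (2 ** i) for i in range(retries))


RETRIES = 4  # hypothetical retry count
print(total_wait(RETRIES, 1))               # 4.0
print(total_wait(RETRIES, 30))              # 120.0
print(round(total_wait(RETRIES, None), 3))  # 1.5
```

Under these assumptions a Retry-After: 30 server holds the run for about two minutes before the connect fails.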
Other transport endpoints on the same host are useful for the same kind of "see how a status surfaces" check:
- GET https://test.mcptest.sh/status/503 returns the requested status code. 503 is also retried by the transport; a 404 is not (only 429, 503, and other 5xx are treated as transient).
- GET https://test.mcptest.sh/error returns 500.
- GET https://test.mcptest.sh/health returns {"ok": true}, a clean 2xx, handy as a wait_for_ready target or a sanity check that the host is reachable at all.
- GET https://test.mcptest.sh/slow?ms=N waits N milliseconds and then responds, which is the endpoint to use when you want to exercise http.timeout rather than a 429.
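The transient-versus-permanent split mentioned for the status endpoint can be written as a one-line predicate. This is a sketch of the rule as stated in this document, with a hypothetical function name, not code taken from mcptest.

```python
def is_transient(status: int) -> bool:
    """True for statuses described as retried: 429 and any 5xx."""
    return status == 429 or 500 <= status <= 599


for code in (429, 503, 500, 404, 400, 200):
    print(code, is_transient(code))
```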
Troubleshooting
- The run hangs much longer than expected. A large Retry-After value multiplied across the retry attempts adds up. The transport honors the server's Retry-After (in whole seconds) over its own backoff, so a run against a server advertising Retry-After: 30 waits far longer than one against the always-429 demo. Lower the value the server sends, or accept that a genuinely rate-limited server is telling you to slow down.
- I expected --retry to make this pass. The per-test --retry N flag re-runs a test that failed its assertions; it does not change transport behavior, and it does not help here because the 429 fails the connect before any test step runs. There is no separate transport-retry flag; the backoff is built in and not configurable in v1.0.
- A 429 against my real server fails the whole suite. That is the point of this scenario: if your server returns 429 during connect, every test behind it fails because the handshake never completes. Fix it on the server side (raise the quota, slow the client) or reduce concurrency with --parallel 1 so the suite stops tripping the limit.
- I want to assert on the rate-limit behavior itself. mcptest surfaces the 429 as a transport-level connect failure, not as a JSON-RPC result you can match with an expect: block, so there is nothing in the response body to assert against. Treat the non-zero exit code as the signal, and use --log-level "mcptest_core::transport::streamable_http=debug" to confirm the status and the retry path in the log.
- The 429 is not retried at all. Only 429, 503, and other 5xx statuses are treated as transient. A 4xx other than 429 (a 400 or a 404) fails on the first attempt with no backoff, which is intentional: those are not worth retrying.
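If you want a script to treat the non-zero exit code as the signal, a small wrapper is enough. The sketch below is one possible shape, assuming mcptest is on PATH and reusing the command from the "Run it" section; the helper name failed_as_expected is hypothetical and not part of mcptest.

```python
import shutil
import subprocess
import sys
from typing import List


def failed_as_expected(cmd: List[str]) -> bool:
    """Run `cmd` and report whether it exited non-zero."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode != 0


# Command from the "Run it" section; skipped when mcptest is not installed.
CMD = ["mcptest", "run", "--config", "tests/ratelimit.yml"]
if shutil.which(CMD[0]) is not None:
    print("failed as expected:", failed_as_expected(CMD))
```

The wrapper checks only the exit status; confirming that the failure really was a 429 still means reading the debug log.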
See also
- docs/url-targets.md: URL target shapes, the http: transport block, and the layered doctor diagnostics.
- docs/troubleshooting.md: connection and transport failures, including the "server unreachable" family this scenario lands in.
- Previous: Tool overload.
- Next: Migration doctor.