---
title: Chaos Testing & Fault Injection
shortTitle: Chaos Testing
description: Simulate latency, errors, outages and network faults with MockServer to test service resilience, retries, backoff and circuit breakers.
layout: page
pageOrder: 1
section: 'Chaos Testing'
subsection: true
sitemap:
  priority: 0.9
  changefreq: 'monthly'
  lastmod: 2026-05-30T08:00:00+00:00
keywords: chaos testing, fault injection, resilience testing, latency injection, connection drop, simulate outage, outage window, time-based fault injection, SRE testing, retry and backoff testing, mock 429 rate limit, chaos testing in kubernetes
schema_faq:
  - question: "How do I simulate a 429 rate limit with MockServer?"
    answer: "Add a chaos profile to your expectation with errorStatus set to 429, errorProbability set to 1.0 for every request (or a fraction like 0.3 for 30% of requests), and an optional retryAfter value. The chaos profile works on both mocked responses and forwarded/proxied upstream calls."
  - question: "How do I inject latency into API responses?"
    answer: "Add a chaos profile with a latency field specifying the delay duration and time unit (e.g. 2000 milliseconds). Latency is injected into every matched response regardless of whether an error is also injected. This works on both mocked and forwarded responses."
  - question: "Can I inject faults into a real upstream or proxy?"
    answer: "Yes. Chaos profiles apply to forwarded and proxied responses, not just mocked ones. Set up a forward expectation pointing at your real upstream service and attach a chaos profile to inject errors or latency into the responses MockServer returns to the caller."
  - question: "How do I make chaos test results reproducible?"
    answer: "Set the seed field on your chaos profile to a fixed value. With the same seed, a given errorProbability always produces the same inject-or-skip decision, making fractional-probability chaos deterministic across test runs."
  - question: "Does MockServer chaos testing work in Kubernetes?"
    answer: "Yes. Deploy MockServer as a sidecar proxy, egress proxy, or reverse proxy in your Kubernetes cluster and attach chaos profiles to inject faults into the traffic flowing through it. MockServer operates at the HTTP layer (L7) and requires explicit routing (e.g. HTTP_PROXY environment variable or Service rewrite)."
  - question: "How do I make a mock fail the first few requests then recover?"
    answer: "Use the succeedFirst and failRequestCount fields on the chaos profile. Set succeedFirst to 0 (or omit it) and failRequestCount to the number of requests that should fail. For example, failRequestCount of 2 with errorStatus 503 and errorProbability 1.0 makes the first 2 matching requests return 503, and all subsequent requests return the normal response. This is useful for testing retry logic and backoff strategies."
  - question: "How do I simulate a GraphQL partial error?"
    answer: "Set graphqlErrors to true and graphqlNullifyData to false on your chaos profile. MockServer will parse the original response body as JSON and embed it as the data value in the error envelope, producing {\"data\":{...},\"errors\":[{\"message\":\"...\"}]}. Use graphqlErrorMessage and graphqlErrorCode to customise the error entry. This works on both expectation-level chaos and service-scoped chaos (PUT /mockserver/serviceChaos)."
  - question: "How do I simulate a broken gRPC server that omits the status trailer?"
    answer: "Set omitGrpcStatus to true on a gRPC chaos profile registered via PUT /mockserver/grpcChaos. MockServer will send the HTTP 200 response without a grpc-status trailer, which most gRPC clients treat as a protocol error or stream reset. Use this to test that your client does not silently accept an incomplete RPC as a success."
---
MockServer's chaos testing feature lets you inject realistic faults into HTTP responses — both mocked responses and forwarded/proxied upstream responses — so you can verify that your application handles errors, latency, and outages gracefully.
Attach a declarative chaos profile to any expectation to control:
- the HTTP error status to inject, with an optional Retry-After header
- how often faults fire, from "every request" (1.0) to "10% of requests" (0.1)
- a fixed seed to make probabilistic outcomes identical across test runs

Because chaos profiles work on both mocked and forwarded responses, MockServer can act as a chaos proxy: place it in front of a real service (upstream, third-party API, or internal dependency) and inject faults into the responses it relays. This makes it useful for SRE resilience testing, not just unit/integration test mocking.
A chaos profile is a JSON object (or HttpChaosProfile in the Java client) with the following fields. All fields are optional — omit any you don't need.
| Field | Type | Description | Valid range |
|---|---|---|---|
| errorStatus | integer | The HTTP status code to return when an error is injected (e.g. 500, 503, 429). When an error is injected the response body is a JSON object: {"error":{"type":"chaos_injected","message":"injected HTTP chaos error"}}. | 100 – 599 |
| errorProbability | number | The probability (0.0 to 1.0) that a matched request will receive the error instead of the normal response. 0.0 or omitted means errors are never injected; 1.0 means every request gets the error (deterministic). Fractional values (e.g. 0.3) inject errors on approximately 30% of requests. | 0.0 – 1.0 |
| dropConnectionProbability | number | The probability (0.0 to 1.0) that a matched request will have its TCP connection dropped without any response being sent, simulating a hard connection failure or network blip. When a connection drop fires it takes priority over error and latency injection. Uses a derived seed for independent but reproducible draws. | 0.0 – 1.0 |
| retryAfter | string | Value for the Retry-After HTTP header on injected error responses. Typically a number of seconds (e.g. "30") or an HTTP-date. Only included when an error is actually injected. | any string (max 100 chars) |
| latency | object | Artificial delay added to every matched response (both normal and error responses). Specified as a Delay object with timeUnit (e.g. MILLISECONDS, SECONDS) and value. Latency is applied in addition to any delay on the action itself and the global response delay. | valid Delay object |
| seed | integer | A fixed seed for the random number generator used by errorProbability. When set, the same seed + probability always yields the same inject/skip outcome, making tests reproducible. Note: a fixed seed produces the same decision on every request (always inject or always skip for a given probability), so it is most useful for making a known-fractional probability deterministic in a specific test. | any long integer |
| succeedFirst | integer | The first N matching requests bypass chaos (normal response). Requests 1..N succeed; chaos becomes eligible from request N+1. Combine with failRequestCount for a finite fault window. | ≥ 0 (default: omitted = 0) |
| failRequestCount | integer | After the succeedFirst window, the next M matching requests receive chaos; after succeedFirst + M matches the expectation recovers (normal responses). Omit for unlimited faults after the succeed window. | ≥ 1 (default: omitted = unlimited) |
| outageAfterMillis | integer | Time-based outage window: chaos becomes active this many milliseconds after the expectation's first matched request. Before this point requests behave normally. Combine with outageDurationMillis for a self-healing outage. | ≥ 0 (default: omitted = 0) |
| outageDurationMillis | integer | Time-based outage window: once the outage has started, chaos stays active for this many milliseconds, then the expectation self-heals and serves normal responses again. Omit for an outage that never ends. | ≥ 1 (default: omitted = unbounded) |
| truncateBodyAtFraction | number | Corrupt the response body by keeping only this leading fraction of its bytes (e.g. 0.5 keeps the first half, 0.0 empties the body). Tests how a client copes with a partial / cut-off payload. Applies to the real (non-error) response only and is skipped for streaming bodies. | 0.0 – 1.0 (default: omitted = no truncation) |
| malformedBody | boolean | Corrupt the response body by appending a broken-JSON fragment so it fails to parse. Tests client-side body-parsing resilience. Applies to the real (non-error) response only and is skipped for streaming bodies. | true / false (default: omitted = false) |
| slowResponseChunkSize | integer | Dribble the response body in chunks of this many bytes (chunked transfer-encoding). Combine with slowResponseChunkDelay to trickle the body slowly and test read timeouts. Applies to the real (non-error) response only and is skipped for streaming bodies. | ≥ 1 (default: omitted = no dribble) |
| slowResponseChunkDelay | delay | The delay between dribbled chunks (a { "timeUnit": ..., "value": ... } object). Required alongside slowResponseChunkSize for the slow response to take effect. | (default: omitted = no dribble) |
| quotaName | string | Stateful rate-limit counter key. Expectations sharing the same quotaName share one counter (model an upstream account limit). Required (with quotaLimit and quotaWindowMillis) to enable the quota. | (default: omitted = no quota) |
| quotaLimit | integer | Maximum number of requests allowed within the window before requests are rejected. | ≥ 1 (default: omitted = no quota) |
| quotaWindowMillis | integer | Fixed-window length in milliseconds. The first request starts the window; it resets after this duration elapses. | ≥ 1 (default: omitted = no quota) |
| quotaErrorStatus | integer | The HTTP status returned when the quota is exceeded. | 100–599 (default: 429) |
| degradationRampMillis | integer | Gradual degradation: ramp errorProbability and dropConnectionProbability linearly from 0 up to their configured values over this many milliseconds from the expectation's first match, modelling a dependency that deteriorates over time. Measured with the controllable clock. | ≥ 1 (default: omitted = no ramp) |
| graphqlErrors | boolean | Rewrite the response as a GraphQL error envelope: HTTP 200 with a JSON body of the form {"data":null,"errors":[{"message":"...","extensions":{"code":"..."}}]}, Content-Type: application/json, and Content-Length stripped. Takes precedence over truncateBodyAtFraction and malformedBody. Metered as fault_type=graphql. | true / false (default: omitted = false) |
| graphqlErrorMessage | string | The message placed in errors[0].message. Only used when graphqlErrors is true. Defaults to "simulated GraphQL error" when omitted. | any string (default: "simulated GraphQL error") |
| graphqlErrorCode | string | Optional value placed in errors[0].extensions.code (for example "INTERNAL_SERVER_ERROR" or "UNAUTHENTICATED"). The extensions object is omitted entirely when this field is not set. | any string (default: omitted = no extensions) |
| graphqlNullifyData | boolean | Controls the data field in the GraphQL error envelope. When true (the default), data is null (a full error). When false, MockServer tries to parse the original response body as JSON and embed it as the data value, simulating a partial success (data plus errors). Falls back to data: null if the original body is not valid JSON. | true / false (default: omitted = true, i.e. data:null) |
The outageAfterMillis and outageDurationMillis fields define a
self-healing outage window measured relative to the expectation's first matched request: chaos is active only from
outageAfterMillis until outageAfterMillis +
outageDurationMillis have elapsed, after which the service recovers automatically. This models a
dependency that is healthy for a while, degrades for a bounded period, then comes back — ideal for testing how a service
behaves across a transient downstream outage. The time window composes with the count window and the probability fields: a
fault fires only when the request falls inside the time window and the count window and the probability draw
passes.
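The time window described above can be modelled in a few lines. This is a minimal sketch, not MockServer's implementation: the function name is hypothetical, and treating the window as half-open (start inclusive, end exclusive) is an assumption.

```python
from typing import Optional


def in_outage_window(elapsed_ms: int,
                     outage_after_ms: int = 0,
                     outage_duration_ms: Optional[int] = None) -> bool:
    """Return True when chaos is active elapsed_ms after the first match."""
    if elapsed_ms < outage_after_ms:
        return False  # still healthy: the outage has not started yet
    if outage_duration_ms is None:
        return True   # no duration configured: the outage never ends
    # Assumption: the end of the window is exclusive.
    return elapsed_ms < outage_after_ms + outage_duration_ms


# Healthy for 10 s, degraded for 30 s, then recovered.
for t in (5_000, 10_000, 39_999, 40_000):
    print(t, in_outage_window(t, 10_000, 30_000))
```

A fault would additionally need to pass the count window and the probability draw, since the three conditions compose with a logical AND.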
Outage windows are measured with MockServer's controllable clock, so tests do not have to wait in real time. Freeze and
advance the clock with the PUT /mockserver/clock
control-plane endpoint to step deterministically from "before the outage" to "during the outage" to "recovered" without any
sleep calls.
Chaos profiles apply to most expectation action types:
| Action type | Chaos supported |
|---|---|
| Mocked response (httpResponse) | Yes |
| Response template (httpResponseTemplate) | Yes |
| Response class callback | Yes |
| Response object callback | Not yet (uses its own write path) |
| Forward (httpForward) | Yes |
| Forward template | Yes |
| Forward class callback | Yes |
| Forward with override (httpOverrideForwardedRequest) | Yes |
| Forward with validation | Yes |
| Forward object callback | Not yet (uses its own write path) |
| Unmatched proxy pass-through | Not yet |
| Error (httpError) | Not applicable (already a fault action) |
Connection drop injection closes the TCP connection without sending any response, simulating a hard network failure or connection reset. This is the most severe fault type and takes priority over error and latency injection when multiple fault types are configured.
How it works: on each matched request, MockServer first draws against dropConnectionProbability. If the draw says "drop," the TCP connection is closed immediately with no response written. When the draw says "skip," the chaos profile falls through to error injection (if configured) and then latency injection. This ordering ensures that connection drops, errors, and latency are evaluated as independent faults with a clear priority: drop > error > latency.
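The drop > error > latency ordering can be illustrated with a small model. This is a sketch under stated assumptions: the function name and return shape are hypothetical, and it uses one random generator for both draws, whereas the text above says MockServer uses a derived seed for the drop draw.

```python
import random


def evaluate(profile: dict, rng: random.Random) -> dict:
    """Model the fault decision for one matched request."""
    if rng.random() < profile.get("dropConnectionProbability", 0.0):
        # Connection closed with no response, so neither an error
        # nor latency is applied.
        return {"outcome": "drop", "latency": False}
    injected = rng.random() < profile.get("errorProbability", 0.0)
    return {
        "outcome": "error" if injected else "normal",
        # The configured delay applies to normal and error responses alike.
        "latency": "latency" in profile,
    }


rng = random.Random(7)
print(evaluate({"dropConnectionProbability": 1.0, "errorProbability": 1.0}, rng))
print(evaluate({"errorProbability": 1.0,
                "latency": {"timeUnit": "MILLISECONDS", "value": 500}}, rng))
```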
Common use cases:
Error injection replaces the normal response with a synthetic HTTP error. This is the primary mechanism for simulating downstream service failures.
How it works: on each matched request, MockServer draws against errorProbability. If the draw says "inject," the configured errorStatus is returned instead of the real response. The response body is a JSON error object and, if retryAfter is set, a Retry-After header is included. When the draw says "skip," the normal response (or forwarded upstream response) is returned unchanged.
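As an illustration, the snippet below builds an expectation payload with an error-injection chaos profile. It is a sketch: the /orders path and the response body are hypothetical, and the profile is placed under a top-level chaos field, as described for the dashboard-generated payload later on this page.

```python
import json

# Hypothetical expectation: GET /orders normally returns an empty list,
# but roughly 30% of matched requests receive a 429 with Retry-After: 30.
expectation = {
    "httpRequest": {"method": "GET", "path": "/orders"},
    "httpResponse": {"statusCode": 200, "body": "{\"orders\":[]}"},
    "chaos": {
        "errorStatus": 429,
        "errorProbability": 0.3,
        "retryAfter": "30",
    },
}

print(json.dumps(expectation, indent=2))
```

The printed JSON can then be sent as the body of a PUT /mockserver/expectation request with any HTTP client.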
Common use cases:
- Return a 429 with a Retry-After header and verify your client respects the wait period

Latency injection adds artificial delay to the response without changing its content or status code. This is useful for testing how your application handles slow dependencies.
How it works: the latency delay is applied to every matched response — whether or not an error is also injected. It is added on top of any delay configured on the action itself and any global response delay (mockserver.globalResponseDelayMillis).
Common use cases:
For more complex latency patterns (uniform distribution, log-normal, gaussian), use the delay field on the response action itself. See Creating Expectations and Scalability & Latency for details.
Body corruption damages the payload of an otherwise-successful response so you can test how robustly your client parses what it receives — independently of the status code. Two fields are available and can be combined:
- truncateBodyAtFraction: keeps only a leading fraction of the body bytes (for example 0.5 returns the first half of the body, 0.0 returns an empty body). Simulates a connection that delivered a partial / cut-off payload.
- malformedBody: appends a broken-JSON fragment to the body so a JSON parser fails. Simulates a corrupted or truncated-then-garbled payload.

How it works: body corruption is deterministic; it is not subject to a probability draw. It applies to the real (mocked or forwarded) response whenever the request is inside the active count window and time-based outage window. When both fields are set, the body is truncated first and the malformed fragment is then appended. To keep the response well-framed, MockServer removes any stale Content-Length header so the response encoder sets the correct length for the corrupted body, and preserves the original Content-Type.
Priority: connection-drop and error injection take precedence — when an error status is injected, its synthetic error body is returned uncorrupted. Body corruption only affects the real response that would otherwise have been returned. Streaming response bodies are not corrupted (the LLM response path has its own mid-stream truncation).
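The truncate-then-append order can be sketched as follows. The function name, the rounding (floor of length times fraction) and the appended fragment are all assumptions for illustration; the actual fragment MockServer appends is not specified here.

```python
import json
from typing import Optional


def corrupt_body(body: bytes,
                 truncate_at_fraction: Optional[float] = None,
                 malformed: bool = False,
                 fragment: bytes = b'{"broken":') -> bytes:
    """Model body corruption: truncate first, then append a broken fragment."""
    if truncate_at_fraction is not None:
        # Assumption: the kept length is rounded down.
        body = body[: int(len(body) * truncate_at_fraction)]
    if malformed:
        body += fragment  # hypothetical fragment that breaks JSON parsing
    return body


original = b'{"id": 1, "name": "widget"}'
corrupted = corrupt_body(original, truncate_at_fraction=0.5, malformed=True)
try:
    json.loads(corrupted)
except json.JSONDecodeError:
    print("client-side parse failed, as intended")
```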
Common use cases:
- Set malformedBody: true and verify your client surfaces a clear parse error rather than crashing.
- Set truncateBodyAtFraction: 0.5 and verify your client detects the short / incomplete payload.

GraphQL APIs conventionally return HTTP 200 even for errors, with the error details carried in a JSON errors array in the response body. Standard HTTP error injection (which changes the status code to 500 or 503) does not reproduce this pattern. The graphqlErrors flag rewrites the response into a spec-compliant GraphQL error envelope so your GraphQL client's error-handling logic is exercised correctly.
How it works: when graphqlErrors: true is set, MockServer replaces the response body with a JSON envelope of the form:
{"data":null,"errors":[{"message":"simulated GraphQL error","extensions":{"code":"INTERNAL_SERVER_ERROR"}}]}
The response is sent as HTTP 200 with Content-Type: application/json. Any stale Content-Length header is removed. Set graphqlNullifyData: false to embed the original response body JSON as the data value instead — this simulates a partial success where the server returns some data alongside errors (a common pattern in GraphQL APIs that use partial responses).
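The envelope rules (default message, optional extensions, and the graphqlNullifyData fallback) can be modelled like this. It is a sketch with a hypothetical function name, not MockServer's code.

```python
import json
from typing import Optional


def graphql_envelope(original_body: str,
                     message: str = "simulated GraphQL error",
                     code: Optional[str] = None,
                     nullify_data: bool = True) -> str:
    """Build a GraphQL error envelope around (or instead of) a response body."""
    data = None
    if not nullify_data:
        try:
            data = json.loads(original_body)  # partial success: keep the data
        except ValueError:
            data = None  # body is not valid JSON: fall back to data:null
    error = {"message": message}
    if code is not None:
        error["extensions"] = {"code": code}  # omitted entirely when unset
    return json.dumps({"data": data, "errors": [error]}, separators=(",", ":"))


print(graphql_envelope('{"user":{"id":1}}', code="INTERNAL_SERVER_ERROR"))
print(graphql_envelope('{"user":{"id":1}}', nullify_data=False))
```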
Priority: graphqlErrors takes precedence over truncateBodyAtFraction and malformedBody — when GraphQL injection is active, body corruption is skipped because the envelope is the intended body. The slow-response dribble (slowResponseChunkSize + slowResponseChunkDelay) still applies to trickle the envelope. Like body corruption, GraphQL injection is deterministic (no probability draw) and respects the count window (succeedFirst / failRequestCount). Error and connection-drop injection (which change the HTTP status) still take priority over everything — GraphQL injection only fires on the real, non-error response path.
Works with service-scoped chaos too: graphqlErrors can be set on a service-scoped profile (PUT /mockserver/serviceChaos), making it easy to inject GraphQL errors into all matching forwards to a GraphQL upstream without touching individual expectations.
// Simulate a full GraphQL error on all calls to a GraphQL service
PUT /mockserver/serviceChaos
{
"host": "graphql.api.svc",
"chaos": {
"graphqlErrors": true,
"graphqlErrorMessage": "upstream database unavailable",
"graphqlErrorCode": "INTERNAL_SERVER_ERROR"
}
}
// Simulate a partial success (data present alongside errors)
PUT /mockserver/serviceChaos
{
"host": "graphql.api.svc",
"chaos": {
"graphqlErrors": true,
"graphqlNullifyData": false,
"graphqlErrorMessage": "partial result: one field failed",
"graphqlErrorCode": "DOWNSTREAM_ERROR"
}
}
Common use cases:
- Verify your client reads errors[0].message and surfaces it to the user instead of silently treating the HTTP 200 as a success.
- Set graphqlErrorCode to a code your client acts on (e.g. "UNAUTHENTICATED") and verify the client triggers a re-authentication flow.
- Set graphqlNullifyData: false and verify your client renders the partial data while also surfacing the error.

A slow response trickles the response body to the client in small chunks with a delay between each, instead of sending it all at once. This is useful for testing read timeouts and slow-network behaviour, and is distinct from latency (which delays the whole response by a fixed amount before sending it).
How it works: set slowResponseChunkSize (bytes per chunk) and slowResponseChunkDelay (the delay between chunks). MockServer then sends the body using chunked transfer-encoding, writing one chunk at a time with the configured delay between them. Both fields are required — a chunk size with no delay simply chunks the body without slowing it down. Like body corruption, the slow response is deterministic, applies to the real (mocked or forwarded) response within the active count and outage windows, and is skipped for streaming bodies. It is metered as fault_type=slow.
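To choose values that actually exceed a client's read timeout, it helps to estimate how long the dribble takes. The sketch below uses hypothetical helper names and assumes one delay between each pair of consecutive chunks (so n chunks incur n - 1 delays).

```python
from typing import Iterator


def chunks(body: bytes, chunk_size: int) -> Iterator[bytes]:
    """Split a body into consecutive chunks of at most chunk_size bytes."""
    for start in range(0, len(body), chunk_size):
        yield body[start:start + chunk_size]


def dribble_time_ms(body_len: int, chunk_size: int, delay_ms: int) -> int:
    """Estimate the added transfer time for a dribbled body."""
    chunk_count = -(-body_len // chunk_size)  # ceiling division
    return max(chunk_count - 1, 0) * delay_ms


body = b"x" * 1000
print(len(list(chunks(body, 100))))          # number of chunks
print(dribble_time_ms(len(body), 100, 500))  # added milliseconds
```

Under these assumptions a 1000-byte body in 100-byte chunks with a 500 ms delay takes about 4.5 seconds longer to arrive, enough to trip a 3-second read timeout.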
Common use cases:
The request quota is a deterministic, stateful fixed-window rate limit — the counterpart to the probabilistic 429 (which fires randomly per request). It lets you drive an application into a hard rate limit: "the 5th call within 60 seconds gets 429."
How it works: set quotaName, quotaLimit and quotaWindowMillis. MockServer counts matched requests against the named quota; once the count exceeds quotaLimit within the current window, further requests are rejected with quotaErrorStatus (default 429) and the retryAfter header. The window is fixed: the first request starts it and it resets quotaWindowMillis later. Expectations that share a quotaName share one counter, so you can model a single upstream account limit spread across several mocks. The counter is process-wide and is cleared on server reset.
The quota gate takes priority over the probabilistic error and the body/slow faults (it is evaluated right after the connection-drop check), so a rate-limited request always returns the quota status. A misconfigured quota (any of the three fields missing) is ignored.
When the quota is combined with a count window (succeedFirst / failRequestCount), only requests inside that window count against the quota — requests in the "succeed" or "recovered" phases are not counted. Most setups use the quota on its own, where every matched request counts.
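The fixed-window behaviour can be modelled with a small counter. This is a sketch, not MockServer's implementation: the class name is hypothetical, and it assumes a new window starts with the first request that arrives after the previous window has elapsed.

```python
from typing import Optional


class FixedWindowQuota:
    """Model of a fixed-window request quota."""

    def __init__(self, limit: int, window_ms: int) -> None:
        self.limit = limit
        self.window_ms = window_ms
        self.window_start: Optional[int] = None
        self.count = 0

    def allow(self, now_ms: int) -> bool:
        """Count one request; return False when it exceeds the quota."""
        expired = (self.window_start is not None
                   and now_ms - self.window_start >= self.window_ms)
        if self.window_start is None or expired:
            self.window_start = now_ms  # this request starts a new window
            self.count = 0
        self.count += 1
        return self.count <= self.limit


# quotaLimit 4, quotaWindowMillis 60000: the 5th call in the window is rejected.
quota = FixedWindowQuota(limit=4, window_ms=60_000)
print([quota.allow(t) for t in (0, 1_000, 2_000, 3_000, 4_000)])
```

Sharing one such object between several expectations, keyed by quotaName, would model the shared counter described above.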
Instead of attaching a chaos block to every forwarding expectation, you can register one chaos profile for an entire upstream host and have it applied to all matched forwards to that host — the ergonomic "break service X" control. This is useful when running MockServer as a chaos proxy in front of one or more upstreams.
Register, read and clear service-scoped chaos through a control-plane endpoint (protected by control-plane authentication when configured):
// register a profile for a host (replaces any existing one for that host)
PUT /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "errorStatus": 503, "errorProbability": 0.3, "latency": { "timeUnit": "MILLISECONDS", "value": 500 } } }
// one call = "break payments.svc for 5 minutes, then auto-heal" (time-boxed chaos)
PUT /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "errorStatus": 503, "errorProbability": 0.3 }, "ttlMillis": 300000 }
// remove the profile for a host
PUT /mockserver/serviceChaos
{ "host": "payments.svc", "remove": true }
// clear all service-scoped chaos
PUT /mockserver/serviceChaos
{ "clear": true }
// read the current host -> profile registrations
GET /mockserver/serviceChaos
How it works: on a matched forward expectation that has no chaos of its own, MockServer looks up the profile registered for the request's Host header (matched case-insensitively, ignoring any :port) and applies it to the forwarded response. An expectation that defines its own chaos always takes precedence. The anonymous, unmatched proxy fall-through path is not affected, and the registrations are cleared on server reset. Because a service-scoped profile has no single owning expectation, the per-expectation count window, outage window and degradation ramp are not anchored — service-scoped profiles are best used for the steady-state faults (errors, drops, latency, body corruption, slow response, and the host-independent quota).
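The lookup and precedence rules can be sketched as below. Function names are hypothetical, and IPv6 literals in the Host header are deliberately not handled in this simplified model.

```python
from typing import Optional


def normalise_host(host_header: str) -> str:
    """Lower-case the Host header and drop any :port suffix."""
    return host_header.strip().lower().split(":", 1)[0]


def effective_chaos(expectation_chaos: Optional[dict],
                    host_header: str,
                    registrations: dict) -> Optional[dict]:
    """Pick the chaos profile that applies to a matched forward."""
    if expectation_chaos is not None:
        return expectation_chaos  # the expectation's own chaos always wins
    return registrations.get(normalise_host(host_header))


registrations = {"payments.svc": {"errorStatus": 503, "errorProbability": 0.3}}
print(effective_chaos(None, "Payments.SVC:8443", registrations))
print(effective_chaos({"errorStatus": 500}, "payments.svc", registrations))
```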
Time-to-live (auto-revert): add an optional ttlMillis to a registration and the chaos automatically reverts that many milliseconds later — a "dead-man's switch" so the fault self-heals even if the matching clear is never sent (for example, an external chaos orchestrator crashes mid-experiment). It is the time-boxed one-shot form: a single call breaks a host for a bounded window. Expiry is measured with the controllable clock, so it tracks real time by default but is deterministic under PUT /mockserver/clock freeze/advance. Without ttlMillis a registration persists until explicitly cleared or the server is reset. GET /mockserver/serviceChaos reports a ttlRemainingMillis map alongside the active profiles, so you can see the countdown for each time-boxed registration.
Besides the REST endpoint, convenience wrappers are available in the client libraries — Java (mockServerClient.setServiceChaos(host, chaos) / removeServiceChaos(host) / clearServiceChaos() / serviceChaosStatus()), Node (setServiceChaos / removeServiceChaos / clearServiceChaos / serviceChaosStatus), Python (set_service_chaos / remove_service_chaos / clear_service_chaos / service_chaos_status), and Ruby (the same snake-case names) — and via the manage_service_chaos MCP tool (action of register / remove / clear) for AI assistants.
The dashboard UI also has a Chaos tab for managing service-scoped chaos interactively: register a host with an error status / error probability / drop probability / latency (and an optional TTL), see every active registration with a summary of its faults, watch the live TTL auto-revert countdown, and remove a single host or clear them all.
Because the control plane is a single HTTP call with a built-in TTL safety timer, it is also the integration point for external chaos orchestrators — register the fault at the start of an experiment, verify the application copes, and let the TTL revert it even if the orchestrator never sends the clear. See Driving MockServer from Chaos Orchestrators for Chaos Toolkit, AWS FIS, Azure Chaos Studio and LitmusChaos recipes.
Once a service-scoped chaos profile is registered, you can update individual fields without replacing the whole profile and without restarting MockServer. Use PATCH /mockserver/serviceChaos to apply JSON Merge Patch semantics: only non-null fields in the request body are updated; all other fields and the current TTL are preserved.
This is useful when you want to adjust a live experiment's fault rates mid-run — for example, ramp error probability up or down while the application is running — without having to recreate the profile or lose the TTL countdown on an active time-boxed registration.
// Increase the error probability on an already-active profile without touching latency or TTL
PATCH /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "errorProbability": 0.9 } }
// Switch from error injection to connection-drop injection mid-experiment
PATCH /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "errorProbability": 0.0, "dropConnectionProbability": 0.5 } }
// Add latency to an existing error-injection profile
PATCH /mockserver/serviceChaos
{ "host": "payments.svc", "chaos": { "latency": { "timeUnit": "MILLISECONDS", "value": 2000 } } }
Semantics: the PATCH request requires a host field and a chaos object containing at least one field to update. Only the fields you supply are changed — unspecified fields in the existing profile are left unchanged. If no profile exists yet for the host, the partial is registered as a new profile with no TTL. The TTL of an existing timed registration is always preserved; PATCH cannot change or extend it. To replace a profile entirely, use PUT /mockserver/serviceChaos.
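The merge rule (only non-null fields are updated, everything else is preserved) can be modelled as follows. This is a sketch with a hypothetical function name; treating nested objects such as latency as whole-value replacements is an assumption.

```python
def merge_chaos(current: dict, patch: dict) -> dict:
    """Apply a partial chaos profile on top of the registered one."""
    merged = dict(current)  # copy so the registered profile is not mutated
    for field, value in patch.items():
        if value is not None:  # null fields in the patch are ignored
            merged[field] = value
    return merged


current = {
    "errorStatus": 503,
    "errorProbability": 0.3,
    "latency": {"timeUnit": "MILLISECONDS", "value": 500},
}
print(merge_chaos(current, {"errorProbability": 0.9}))
```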
The response body echoes the host and the resulting merged chaos profile:
{ "status": "patched", "host": "payments.svc", "chaos": { "errorStatus": 503, "errorProbability": 0.9, "latency": { ... } } }
Gradual degradation models a dependency that deteriorates over time rather than failing all at once — useful for testing alerting thresholds and SLO error-budget burn.
How it works: set degradationRampMillis. The probabilistic fault rates — errorProbability and dropConnectionProbability — are scaled by a factor that climbs linearly from 0.0 at the expectation's first matched request to 1.0 once the ramp duration has elapsed, then stays at full strength. So an expectation with errorProbability: 1.0 and degradationRampMillis: 600000 injects no errors at first, ~50% of requests at the 5-minute mark, and 100% after 10 minutes. The ramp is measured with MockServer's controllable clock, so it is deterministic under clock freeze/advance (PUT /mockserver/clock) with no real-time waiting. Only the probabilistic rates ramp — the deterministic faults (latency, body corruption, slow response, quota) are unaffected.
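The linear ramp amounts to multiplying the configured probability by a factor clamped between 0 and 1. A minimal sketch, with hypothetical function names:

```python
def ramp_factor(elapsed_ms: int, ramp_ms: int) -> float:
    """Linear factor from 0.0 at the first match to 1.0 after ramp_ms."""
    return min(max(elapsed_ms / ramp_ms, 0.0), 1.0)


def effective_probability(configured: float, elapsed_ms: int, ramp_ms: int) -> float:
    """Scale a configured probability by the current ramp factor."""
    return configured * ramp_factor(elapsed_ms, ramp_ms)


# errorProbability 1.0 with degradationRampMillis 600000 (10 minutes)
for minutes in (0, 5, 10, 15):
    print(minutes, effective_probability(1.0, minutes * 60_000, 600_000))
```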
When errorProbability is fractional (not 0.0 or 1.0), the inject/skip decision is random by default. Set the seed field to a fixed value to make this decision deterministic and reproducible across test runs.
Determinism rules:
- errorProbability of 0.0 (or omitted): errors are never injected, regardless of seed
- errorProbability of 1.0: errors are always injected, regardless of seed
- fractional errorProbability with a fixed seed: the same draw is made every time (same result on every request)

Note: a fixed seed with a fractional probability yields the same decision on every request (always inject or always skip), because the seed resets the random state for each evaluation. This is by design, since it ensures test reproducibility. For probabilistic variation within a single test run, omit the seed.
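The effect of re-seeding on every evaluation can be demonstrated with Python's own generator. This is only a model of the behaviour described above: MockServer runs on the JVM, so the actual decision for a given seed will differ from what Python produces.

```python
import random
from typing import Optional


def should_inject(probability: float, seed: Optional[int] = None) -> bool:
    """Draw once from a generator that is re-created for every evaluation."""
    rng = random.Random(seed)  # seed=None falls back to system entropy
    return rng.random() < probability


# With a fixed seed, 100 evaluations all reach the same decision.
seeded = {should_inject(0.3, seed=42) for _ in range(100)}
print(len(seeded))

# Probabilities of 0.0 and 1.0 are deterministic regardless of seed.
print(should_inject(0.0), should_inject(1.0))
```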
One of the most powerful uses of chaos profiles is on forwarded/proxied responses. Instead of mocking a service, you can forward requests to the real upstream and inject faults into the responses MockServer relays back to the caller.
This turns MockServer into a chaos proxy — sit it between your application and a dependency (internal service or external API) and test what happens when that dependency becomes unreliable.
Deployment patterns:
- Set the HTTP_PROXY / HTTPS_PROXY environment variables to point at MockServer, then create forward expectations with chaos profiles for specific hosts/paths
- Point MockServer at the upstream (proxyRemoteHost / proxyRemotePort configuration) and add chaos profiles to inject faults

MockServer operates at the HTTP layer (L7) and requires explicit routing; it does not transparently intercept traffic at the network layer. See Isolating Single Service for detailed proxy deployment patterns.
The succeedFirst and failRequestCount fields let you define a window of request numbers where chaos is active. This enables deterministic, count-based fault patterns without writing custom test logic.
How the window works: MockServer tracks the 1-based match count for each expectation. For every matched request, the chaos profile checks whether the current match count falls within the eligible window:
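- Requests 1..succeedFirst bypass chaos and receive the normal response
- The next failRequestCount requests are eligible for chaos (subject to errorProbability)
- Any later requests receive the normal response again

When both fields are omitted, every request is eligible for chaos (backward compatible with the original probabilistic-only behaviour).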
Canonical patterns:
- succeedFirst: 0, failRequestCount: N. The first N requests receive the chaos error; subsequent requests succeed normally.
- succeedFirst: N, failRequestCount omitted. Requests 1..N succeed; every request after N receives the chaos error.
- succeedFirst: N-1, failRequestCount: 1. Only request N gets the error; requests before and after succeed normally.

The count window composes with errorProbability: a request must be within the window and pass the probability check to receive a fault. Latency injection follows the same window: chaos latency is only applied to requests within the eligible window.
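The three canonical patterns follow from a single eligibility check on the 1-based match count. A minimal sketch, with a hypothetical function name:

```python
from typing import Optional


def in_count_window(match_count: int,
                    succeed_first: int = 0,
                    fail_request_count: Optional[int] = None) -> bool:
    """Return True when the 1-based match count is eligible for chaos."""
    if match_count <= succeed_first:
        return False  # still inside the "succeed" phase
    if fail_request_count is None:
        return True   # unlimited faults after the succeed window
    return match_count <= succeed_first + fail_request_count


requests = range(1, 5)
print([in_count_window(n, 0, 2) for n in requests])  # fail the first 2
print([in_count_window(n, 2) for n in requests])     # succeed 2, then fail
print([in_count_window(n, 2, 1) for n in requests])  # only request 3 fails
```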
- The percentage field controls how often a request matches the expectation; errorProbability controls how often a matched request gets an error. These compose multiplicatively: an expectation with percentage: 50 and errorProbability: 0.5 injects errors on roughly 25% of structurally matching requests.
- mockserver.globalResponseDelayMillis is added on top of any chaos latency.
- A response delayed beyond maxSocketTimeout will be cut off by the socket timeout.

When metrics are enabled, MockServer exposes Prometheus metrics for chaos: a counter that tracks every fault injected, and a gauge for the number of hosts with currently-active service-scoped chaos:
| Metric | Type | Labels | Description |
|---|---|---|---|
| mock_server_http_chaos_injected_total | Counter | fault_type = drop, error, latency, truncate, malformed, slow, graphql | Cumulative count of HTTP chaos faults injected, split by fault type |
| mock_server_active_service_chaos | Gauge | fault_type = drop, error, latency, truncate, malformed, slow, quota, graphql | Number of currently-active service-scoped chaos profiles configured with each fault type (a profile with several faults counts under each); drops to 0 as profiles are cleared or their TTLs lapse |
Both metrics are also surfaced in the dashboard UI Metrics view as an "HTTP Chaos Faults" section — a stat per fault type the server emits (drop, error, latency, truncate, malformed, slow, quota, graphql), a per-fault-type chart of cumulative injections, and a per-fault-type chart of the active service-scoped chaos gauge (visible only when a chaos metric has non-zero data).
Both metrics are also mirrored over OpenTelemetry OTLP when OTLP metrics export is enabled, so OTLP-only consumers can observe them without a Prometheus scrape.
Example PromQL queries:
# Rate of error faults injected over the last 5 minutes
rate(mock_server_http_chaos_injected_total{fault_type="error"}[5m])
# Total latency faults injected
mock_server_http_chaos_injected_total{fault_type="latency"}
# Alert while any service-scoped chaos is still live (across all fault types)
sum(mock_server_active_service_chaos) > 0
# Active service-scoped chaos injecting connection drops
mock_server_active_service_chaos{fault_type="drop"}
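Outside Prometheus, the same checks can be run against scraped metrics text, for example in a test teardown that verifies no service-scoped chaos was left active. The sketch below assumes the metrics are available in the standard Prometheus text exposition format; the sample values are made up for illustration, the helper name is hypothetical, and the parser handles only simple label values without escaped quotes.

```python
import re
from typing import Dict

# Made-up sample in Prometheus text exposition format (illustrative values only).
SAMPLE = """\
# TYPE mock_server_http_chaos_injected_total counter
mock_server_http_chaos_injected_total{fault_type="error"} 12.0
mock_server_http_chaos_injected_total{fault_type="latency"} 30.0
# TYPE mock_server_active_service_chaos gauge
mock_server_active_service_chaos{fault_type="drop"} 1.0
"""

_LINE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)'
)
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def by_fault_type(exposition: str, metric: str) -> Dict[str, float]:
    """Sum the samples of one metric, grouped by the fault_type label."""
    totals: Dict[str, float] = {}
    for line in exposition.splitlines():
        match = _LINE.match(line)  # comment lines starting with '#' never match
        if not match or match.group("name") != metric:
            continue
        labels = dict(_LABEL.findall(match.group("labels")))
        fault = labels.get("fault_type")
        if fault is not None:
            totals[fault] = totals.get(fault, 0.0) + float(match.group("value"))
    return totals


injected = by_fault_type(SAMPLE, "mock_server_http_chaos_injected_total")
active = by_fault_type(SAMPLE, "mock_server_active_service_chaos")
print(injected)                  # {'error': 12.0, 'latency': 30.0}
print(sum(active.values()) > 0)  # True: some service-scoped chaos is still live
```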
Chaos profiles can be configured directly from the MockServer dashboard when composing a standard HTTP expectation. Toggle the Inject fault / chaos switch in the expectation composer to reveal fields for all seven chaos profile properties. The dashboard generates the correct JSON payload (with chaos as a top-level expectation field) and the equivalent Java client and curl snippets. Active expectations that have a chaos profile display a Chaos summary chip in the expectations panel.
MockServer can inject gRPC-level faults — error status codes, latency, and rate-limit exhaustion — on matched gRPC RPC calls. This is distinct from the gRPC health-check chaos (which controls the grpc.health.v1.Health/Check serving-status response): gRPC fault injection fires before normal gRPC request conversion in GrpcToHttpRequestHandler and applies to any RPC method on any loaded service, not just health probes.
Register one chaos profile per gRPC service name. The profile is applied to every matched call to that service. An empty string ("") registers a default profile that applies to all services without a more-specific override.
Why use this: gRPC clients have their own retry, deadline, and circuit-breaker logic that is separate from HTTP clients. Injecting UNAVAILABLE, DEADLINE_EXCEEDED, or RESOURCE_EXHAUSTED statuses at the RPC layer tests that gRPC-native error handling, backoff, and deadline propagation work correctly — without having to modify the real service.
The gRPC chaos profile covers: status injection, latency, request quota, count windows, trailer manipulation (omitGrpcStatus, corruptGrpcStatus, customTrailers), and client-streaming abort (abortAfterMessages). Drop or truncation of individual stream messages mid-stream is planned for a future release.
A gRPC chaos profile is a JSON object (or GrpcChaosProfile in the Java model) with the following fields. All fields are optional — omit any you don't need.
| Field | Type | Description | Valid values / range |
|---|---|---|---|
| errorStatusCode | string | The gRPC status code name to return when a fault fires. When set, the handler writes an HTTP 200 response with the corresponding grpc-status trailer. Must be one of the 17 canonical gRPC status code names. | See table below |
| errorMessage | string | The grpc-message trailer value sent alongside the error status code. Optional: omit to send no message. | any string |
| errorProbability | number | Probability (0.0 to 1.0) that a matched call receives the error instead of normal processing. 0.0 or omitted means no error injection; 1.0 means every call gets the error. | 0.0 – 1.0 |
| seed | integer | Fixed seed for the random number generator used by errorProbability. When set, the same seed + probability always produces the same inject/skip outcome, making tests reproducible. | any long integer |
| latencyMs | integer | Artificial delay in milliseconds added before the response (whether an error is injected or not). | ≥ 0 |
| succeedFirst | integer | The first N calls to this service bypass fault injection (succeed normally). Combine with failRequestCount for a finite fault window. | ≥ 0 (default: omitted = 0) |
| failRequestCount | integer | After the succeedFirst window, the next M calls receive the fault; after that the service recovers. Omit for unlimited faults after the succeed window. | ≥ 1 (default: omitted = unlimited) |
| quotaName | string | Stateful rate-limit counter key. Required (with quotaLimit and quotaWindowMillis) to enable quota enforcement. Calls over the limit return RESOURCE_EXHAUSTED. | (default: omitted = no quota) |
| quotaLimit | integer | Maximum number of calls allowed within the window before quota is exceeded. | ≥ 1 (default: omitted = no quota) |
| quotaWindowMillis | integer | Fixed-window length in milliseconds. The first call starts the window; it resets after this duration elapses. Calls beyond quotaLimit within the window return RESOURCE_EXHAUSTED. | ≥ 1 (default: omitted = no quota) |
| omitGrpcStatus | boolean | When true, the fault response is sent with no grpc-status trailer at all. This simulates an incomplete or broken RPC stream: the server starts a response but never terminates it correctly, causing gRPC clients to raise a stream-reset or missing-trailer error. Takes precedence over corruptGrpcStatus. | true / false (default: omitted = false) |
| corruptGrpcStatus | boolean | When true (and omitGrpcStatus is false), the fault response sets grpc-status to the non-numeric value malformed, which violates the gRPC spec (grpc-status must be a decimal integer). Tests how clients cope with an unparseable status trailer, a genuine protocol violation rather than merely an unrecognised numeric code. | true / false (default: omitted = false) |
| customTrailers | object | A JSON object of arbitrary trailer key/value pairs injected on the fault response. These are written in addition to (or instead of, when omitGrpcStatus is set) the normal status trailers. Useful for injecting vendor-specific error metadata that your client reads (e.g. x-ratelimit-remaining: 0). | any string-to-string map (default: omitted = no custom trailers) |
| abortAfterMessages | integer | For client-streaming RPCs: when the number of decoded gRPC messages in the request body reaches this threshold, MockServer immediately responds with ABORTED (or the profile's errorMessage if set). The message count is determined by decoding the 5-byte gRPC length-prefixed frames in the request body. Use this to test how a streaming client handles mid-stream server abort, for example a server that rejects an oversized batch. | ≥ 1 (default: omitted = no abort) |
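The fixed-window behaviour behind quotaLimit and quotaWindowMillis can be pictured with a small counter. This is a hedged sketch rather than MockServer's code: the class name is hypothetical, and it assumes that a new window starts on the first call after the previous one has elapsed. The clock is injectable so the example runs without sleeping.

```python
import time
from typing import Callable, Optional


class FixedWindowQuota:
    """Illustrative fixed-window counter: at most `limit` calls per window."""

    def __init__(self, limit: int, window_millis: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_millis / 1000.0
        self._clock = clock
        self._window_start: Optional[float] = None
        self._count = 0

    def allow(self) -> bool:
        """Return True if the call is within quota, False if it would be rejected."""
        now = self._clock()
        expired = (self._window_start is not None
                   and now - self._window_start >= self.window_seconds)
        if self._window_start is None or expired:
            self._window_start = now  # the first call (re)starts the window
            self._count = 0
        self._count += 1
        return self._count <= self.limit


# Fake clock so the example is deterministic.
now = [0.0]
quota = FixedWindowQuota(limit=2, window_millis=1000, clock=lambda: now[0])
print([quota.allow() for _ in range(3)])  # [True, True, False]
now[0] = 1.0                              # one full window later
print(quota.allow())                      # True
```

In this sketch a False result corresponds to the call that would receive RESOURCE_EXHAUSTED.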
The errorStatusCode field accepts any of the 17 canonical gRPC status code names. The most commonly injected for resilience testing are:
| Code | Name | Typical test scenario |
|---|---|---|
| 4 | DEADLINE_EXCEEDED | Client deadline propagation, timeout handling |
| 8 | RESOURCE_EXHAUSTED | Rate-limit / quota exhaustion handling (also the quota fault code) |
| 13 | INTERNAL | Unexpected server-side failure handling |
| 14 | UNAVAILABLE | Service outage, retry and circuit-breaker testing |
| 16 | UNAUTHENTICATED | Auth-token expiry and re-authentication flows |
| 10 | ABORTED | Optimistic concurrency, transaction conflict handling |
All 17 canonical codes are accepted: OK, CANCELLED, UNKNOWN, INVALID_ARGUMENT, DEADLINE_EXCEEDED, NOT_FOUND, ALREADY_EXISTS, PERMISSION_DENIED, RESOURCE_EXHAUSTED, FAILED_PRECONDITION, ABORTED, OUT_OF_RANGE, UNIMPLEMENTED, INTERNAL, UNAVAILABLE, DATA_LOSS, UNAUTHENTICATED.
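For client-side checks before registering a profile, the names above map onto the numeric gRPC status codes 0 to 16 in the order listed. The following sketch builds that mapping and lints a profile against the ranges in the field table. The function `validate_grpc_chaos` is a hypothetical helper for illustration and does not claim to reproduce the validation MockServer itself performs.

```python
from typing import Dict, List

# Canonical gRPC status code names, in numeric order (OK = 0 ... UNAUTHENTICATED = 16).
GRPC_STATUS_CODES: Dict[str, int] = {
    name: code for code, name in enumerate([
        "OK", "CANCELLED", "UNKNOWN", "INVALID_ARGUMENT", "DEADLINE_EXCEEDED",
        "NOT_FOUND", "ALREADY_EXISTS", "PERMISSION_DENIED", "RESOURCE_EXHAUSTED",
        "FAILED_PRECONDITION", "ABORTED", "OUT_OF_RANGE", "UNIMPLEMENTED",
        "INTERNAL", "UNAVAILABLE", "DATA_LOSS", "UNAUTHENTICATED",
    ])
}

# Minimum values taken from the field table above.
_MINIMUMS = {
    "latencyMs": 0, "succeedFirst": 0, "failRequestCount": 1,
    "quotaLimit": 1, "quotaWindowMillis": 1, "abortAfterMessages": 1,
}
_QUOTA_FIELDS = ("quotaName", "quotaLimit", "quotaWindowMillis")


def validate_grpc_chaos(profile: dict) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    code = profile.get("errorStatusCode")
    if code is not None and code not in GRPC_STATUS_CODES:
        problems.append(f"errorStatusCode {code!r} is not a canonical gRPC status name")
    probability = profile.get("errorProbability")
    if probability is not None and not 0.0 <= probability <= 1.0:
        problems.append("errorProbability must be between 0.0 and 1.0")
    for field, minimum in _MINIMUMS.items():
        value = profile.get(field)
        if value is not None and value < minimum:
            problems.append(f"{field} must be >= {minimum}")
    present = [f for f in _QUOTA_FIELDS if f in profile]
    if present and len(present) != len(_QUOTA_FIELDS):
        problems.append("quotaName, quotaLimit and quotaWindowMillis must be set together")
    return problems


print(GRPC_STATUS_CODES["UNAVAILABLE"])  # 14
print(validate_grpc_chaos({"errorStatusCode": "TIMEOUT", "errorProbability": 1.5}))
```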
// register a fault profile for a specific gRPC service (replaces any existing one)
PUT /mockserver/grpcChaos
{ "service": "com.example.payments.PaymentService", "chaos": { "errorStatusCode": "UNAVAILABLE", "errorProbability": 0.3, "latencyMs": 200 } }
// register with a TTL (auto-revert after 5 minutes)
PUT /mockserver/grpcChaos
{ "service": "com.example.payments.PaymentService", "chaos": { "errorStatusCode": "UNAVAILABLE", "errorProbability": 1.0 }, "ttlMillis": 300000 }
// register a default profile that applies to ALL gRPC services (empty string key)
PUT /mockserver/grpcChaos
{ "service": "", "chaos": { "errorStatusCode": "DEADLINE_EXCEEDED", "errorProbability": 0.2 } }
// remove the profile for a service
PUT /mockserver/grpcChaos
{ "service": "com.example.payments.PaymentService", "remove": true }
// clear all gRPC chaos profiles
PUT /mockserver/grpcChaos
{ "clear": true }
// read all active gRPC chaos profiles and TTL countdowns
GET /mockserver/grpcChaos
// merge-patch — update only the specified fields, preserve the rest and the TTL
PATCH /mockserver/grpcChaos
{ "service": "com.example.payments.PaymentService", "chaos": { "errorProbability": 0.9 } }
// simulate a server that omits the grpc-status trailer (broken stream)
PUT /mockserver/grpcChaos
{ "service": "com.example.orders.OrderService", "chaos": { "omitGrpcStatus": true } }
// simulate a server that sends a non-numeric grpc-status (genuine protocol violation)
PUT /mockserver/grpcChaos
{ "service": "com.example.orders.OrderService", "chaos": { "corruptGrpcStatus": true } }
// inject custom trailers alongside the error (e.g. rate-limit metadata)
PUT /mockserver/grpcChaos
{
"service": "com.example.orders.OrderService",
"chaos": {
"errorStatusCode": "RESOURCE_EXHAUSTED",
"errorProbability": 1.0,
"customTrailers": { "x-ratelimit-remaining": "0", "x-ratelimit-reset": "1748700000" }
}
}
// abort a client-streaming RPC after 5 messages
PUT /mockserver/grpcChaos
{ "service": "com.example.upload.UploadService", "chaos": { "abortAfterMessages": 5 } }
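To reason about when abortAfterMessages fires, it helps to see how messages are counted. In the gRPC wire format each message is preceded by a 5-byte prefix: a 1-byte compressed flag followed by a 4-byte big-endian payload length. The sketch below counts complete frames in a request body; the function names are hypothetical and the code illustrates the framing only, not MockServer's decoder.

```python
import struct


def frame(payload: bytes, compressed: bool = False) -> bytes:
    """Wrap a payload in the 5-byte gRPC length prefix."""
    return struct.pack(">BI", 1 if compressed else 0, len(payload)) + payload


def count_grpc_messages(body: bytes) -> int:
    """Count complete length-prefixed gRPC messages in a request body."""
    count = 0
    offset = 0
    while offset + 5 <= len(body):
        _compressed, length = struct.unpack_from(">BI", body, offset)
        offset += 5 + length
        if offset > len(body):
            break  # the final frame is truncated, so it is not counted
        count += 1
    return count


def would_abort(body: bytes, abort_after_messages: int) -> bool:
    """True once the decoded message count reaches the configured threshold."""
    return count_grpc_messages(body) >= abort_after_messages


body = b"".join(frame(f"chunk-{i}".encode()) for i in range(5))
print(count_grpc_messages(body))   # 5
print(would_abort(body, 5))        # True
print(would_abort(body[:-1], 5))   # False: the last frame is incomplete
```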
The GET response reports active profiles under a services map (service name → profile) and, when any registrations carry a TTL, a ttlRemainingMillis map alongside it.
The PATCH endpoint applies JSON Merge Patch semantics: only the fields you supply are updated; unspecified fields in the existing profile and the current TTL are left unchanged. If no profile exists yet for the service, the partial is registered as a new profile with no TTL.
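The effect of a merge-patch on a stored profile can be previewed locally. The sketch below follows the JSON Merge Patch algorithm from RFC 7396, in which a null value removes a field; whether MockServer treats null the same way is an assumption, and the function is illustrative rather than the server's implementation.

```python
import copy
from typing import Any


def merge_patch(target: Any, patch: Any) -> Any:
    """Apply a JSON Merge Patch (RFC 7396) and return the patched value."""
    if not isinstance(patch, dict):
        return copy.deepcopy(patch)       # non-object patches replace the target
    result = dict(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)         # null removes the field
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


existing = {"errorStatusCode": "UNAVAILABLE", "errorProbability": 0.3, "latencyMs": 200}
patched = merge_patch(existing, {"errorProbability": 0.9})
print(patched)
# {'errorStatusCode': 'UNAVAILABLE', 'errorProbability': 0.9, 'latencyMs': 200}
```

Only errorProbability changes; errorStatusCode and latencyMs are carried over, which matches the behaviour described for the PATCH endpoint.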
Like service-scoped HTTP chaos and TCP chaos, registrations support an optional ttlMillis for automatic expiry and are cleared on server reset.
Relationship to gRPC health-check chaos: the gRPC health-check feature (PUT /mockserver/grpc/health) controls the grpc.health.v1.Health/Check serving-status response — changing it makes a Kubernetes readiness probe fail. gRPC fault injection (this section, PUT /mockserver/grpcChaos) injects status errors into any RPC method on your application services. The two mechanisms are independent.
In addition to HTTP-level chaos (which operates on decoded HTTP requests and responses), MockServer supports TCP-layer chaos that operates on raw bytes before HTTP decoding. This enables transport-layer fault injection that mirrors Toxiproxy's named toxics.
TCP-layer chaos is managed separately from HTTP chaos profiles. It is registered against a host and applied to all connections from that host at the raw byte level.
| Fault type | Field | Type | Description |
|---|---|---|---|
| latency | latencyMs | long | Delays all inbound data by the specified milliseconds before it reaches the HTTP decoder |
| down | down | boolean | Silently drops all inbound data so the service appears completely down |
| bandwidth | bandwidthBytesPerSec | long | Throttles inbound data to the specified bytes per second |
| slow_close | slowClose | boolean | Delays the TCP FIN by 2 seconds on close, simulating a slow connection teardown |
| timeout | timeout | boolean | Never sends FIN on close; the connection hangs indefinitely |
| reset_peer | resetPeer | boolean | Sends a TCP RST and closes the connection immediately on first data |
| slicer | slicerChunkSize | integer | Fragments inbound data into chunks of the specified size (bytes) |
| limit_data | limitDataBytes | long | Closes the connection after the specified total bytes have been received |
When multiple fault types are configured on the same profile, they are evaluated in priority order: down > reset_peer > limit_data > slicer > bandwidth > latency.
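As a quick way to check which fault will be considered first for a given profile, the priority order can be encoded as a list. This is a sketch built only from the table and the ordering stated above; the helper name is hypothetical, and slow_close and timeout are left out because the stated order does not include them.

```python
from typing import List

# (fault type, profile field) pairs in the documented evaluation order.
TCP_FAULT_PRIORITY = [
    ("down", "down"),
    ("reset_peer", "resetPeer"),
    ("limit_data", "limitDataBytes"),
    ("slicer", "slicerChunkSize"),
    ("bandwidth", "bandwidthBytesPerSec"),
    ("latency", "latencyMs"),
]


def configured_faults(profile: dict) -> List[str]:
    """List the faults set on a profile, highest priority first.

    A field counts as configured when it is present and truthy, so
    False, 0 and missing fields are all treated as not set.
    """
    return [fault for fault, field in TCP_FAULT_PRIORITY if profile.get(field)]


print(configured_faults({"latencyMs": 500, "slicerChunkSize": 64}))
# ['slicer', 'latency']
print(configured_faults({"latencyMs": 500, "down": True}))
# ['down', 'latency']
```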
// register a TCP chaos profile for a host
PUT /mockserver/tcpChaos
{ "host": "upstream.svc", "chaos": { "latencyMs": 500, "slicerChunkSize": 64 } }
// register with a TTL (auto-revert after 5 minutes)
PUT /mockserver/tcpChaos
{ "host": "upstream.svc", "chaos": { "down": true }, "ttlMillis": 300000 }
// remove the profile for a host
PUT /mockserver/tcpChaos
{ "host": "upstream.svc", "remove": true }
// clear all TCP chaos profiles
PUT /mockserver/tcpChaos
{ "clear": true }
// read all active TCP chaos profiles
GET /mockserver/tcpChaos
// merge-patch an existing profile (only update specified fields)
PATCH /mockserver/tcpChaos
{ "host": "upstream.svc", "chaos": { "down": true } }
The API follows the same patterns as service-scoped HTTP chaos: host matching is case-insensitive and ignores port suffixes, registrations support optional TTL-based auto-expiry, and all registrations are cleared on server reset.
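The host-matching rule (case-insensitive, port suffix ignored) can be approximated as a normalisation step applied to both the registered host and the incoming host. The function below is a hypothetical illustration of that rule, not MockServer's matcher, and it makes its own assumption about how bracketed IPv6 literals should be treated.

```python
def normalize_host(host: str) -> str:
    """Lower-case a host and strip a trailing :port, if one is present."""
    host = host.strip().lower()
    if host.startswith("["):
        # Bracketed IPv6 literal such as [::1]:8080 (assumed handling).
        end = host.find("]")
        return host[: end + 1] if end != -1 else host
    name, sep, port = host.rpartition(":")
    if sep and port.isdigit() and ":" not in name:
        return name  # plain hostname:port
    return host      # no port, or an unbracketed IPv6 address


def same_host(registered: str, incoming: str) -> bool:
    """Compare two hosts under the normalisation above."""
    return normalize_host(registered) == normalize_host(incoming)


print(normalize_host("Upstream.SVC:8080"))            # upstream.svc
print(same_host("upstream.svc", "UPSTREAM.svc:443"))  # True
```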
Difference from HTTP chaos: TCP chaos operates at the raw byte level before HTTP decoding. This means it can simulate faults that are impossible to reproduce with HTTP-level chaos, such as TCP RST, connection timeouts (never sending FIN), bandwidth throttling, and data fragmentation. HTTP chaos operates on decoded HTTP requests/responses and can inject application-level faults like error status codes, body corruption, and rate limiting.
MockServer can retrieve recorded proxy traffic and replay it with an optional HTTP chaos overlay so you can compare how a service behaves under fault conditions against its known baseline. This is useful for a "record once, stress under chaos" workflow: record normal traffic through MockServer in proxy mode, then replay that traffic with errors or latency injected and inspect the per-request comparison report.
Note: replay is a foundational capability. The report tracking, rate limiting, and chaos overlay infrastructure are complete, but the underlying network sender is a placeholder pending lifecycle wiring — replayed requests currently complete immediately with an error result. The API contract and report shape are stable and described here so you can plan integrations.
PUT /mockserver/replay
{
"ratePerSecond": 10,
"chaosProfile": {
"errorStatus": 503,
"errorProbability": 0.3
}
}
Fields:
- ratePerSecond: maximum requests per second to send; 0 means unlimited (all requests sent as fast as possible)
- chaosProfile: optional chaos profile applied as a service-scoped overlay for all hosts seen in the recorded traffic; uses a 5-minute TTL that auto-clears after the replay window

MockServer reads all previously-recorded FORWARDED_REQUEST log entries as the traffic source. The request returns immediately with a 201 Created response containing a replayId and the total count of requests to replay:
{ "replayId": "a3f2...", "totalRequests": 42, "status": "RUNNING" }
GET /mockserver/replay/{replayId}
Returns a ReplayReport object:
{
"replayId": "a3f2...",
"status": "RUNNING", // RUNNING | COMPLETED | FAILED
"totalRequests": 42,
"completedRequests": 17,
"successCount": 14,
"failureCount": 3,
"results": [
{
"method": "GET",
"path": "/api/orders",
"baselineStatusCode": 200,
"replayStatusCode": 503,
"baselineLatencyMs": 45,
"replayLatencyMs": 38,
"latencyDeltaMs": -7,
"statusMatch": false
}
]
}
A result entry is added for each completed request. statusMatch is true when the replayed HTTP status code equals the baseline. latencyDeltaMs is positive when the replay was slower than baseline, negative when faster. When the replay finishes, status changes to COMPLETED and the full results array is available. An unknown replayId returns 404.
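The per-request comparison fields can be reproduced from the raw numbers, which is handy when post-processing a report. The sketch below uses the field names from the report shape above; the helper functions are hypothetical and not part of any MockServer client.

```python
from typing import List


def compare(baseline_status: int, replay_status: int,
            baseline_latency_ms: int, replay_latency_ms: int) -> dict:
    """Derive statusMatch and latencyDeltaMs the way the report describes them."""
    return {
        "statusMatch": baseline_status == replay_status,
        # positive when the replay was slower than baseline, negative when faster
        "latencyDeltaMs": replay_latency_ms - baseline_latency_ms,
    }


def status_mismatches(report: dict) -> List[dict]:
    """Return the result entries whose replayed status differs from the baseline."""
    return [r for r in report.get("results", []) if not r["statusMatch"]]


# Values taken from the example result entry above.
print(compare(200, 503, 45, 38))
# {'statusMatch': False, 'latencyDeltaMs': -7}
```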
The following chaos features are planned for future releases:
How do I simulate a 429 rate limit with MockServer?
Add a chaos profile to your expectation with errorStatus set to 429, errorProbability set to 1.0 (for every request) or a fraction like 0.3 (for 30% of requests), and an optional retryAfter value. The chaos profile works on both mocked responses and forwarded/proxied upstream calls. See the rate limiting example above.
How do I inject latency into API responses?
Add a chaos profile with a latency field specifying the delay duration and time unit (e.g. 2000 milliseconds). Latency is injected into every matched response regardless of whether an error is also injected. This works on both mocked and forwarded responses. See the latency injection example above.
Can I inject faults into a real upstream or proxy?
Yes. Chaos profiles apply to forwarded and proxied responses, not just mocked ones. Set up a forward expectation pointing at your real upstream service and attach a chaos profile to inject errors or latency into the responses MockServer returns to the caller. See the forward/proxy example and the Chaos Proxy section above.
How do I make chaos test results reproducible?
Set the seed field on your chaos profile to a fixed value (e.g. 42). With the same seed, a given errorProbability always produces the same inject-or-skip decision, making fractional-probability chaos deterministic across test runs. See the Reproducibility section above.
Does MockServer chaos testing work in Kubernetes?
Yes. Deploy MockServer as a sidecar proxy, egress proxy, or reverse proxy in your Kubernetes cluster and attach chaos profiles to inject faults into the traffic flowing through it. MockServer operates at the HTTP layer (L7) and requires explicit routing (e.g. HTTP_PROXY environment variable or Service rewrite). See Chaos Proxy in Kubernetes for sidecar/egress/reverse-proxy deployment patterns, and Isolating Single Service for general proxy setup.
How do I make a mock fail the first few requests then recover?
Use the succeedFirst and failRequestCount fields on the chaos profile. Set succeedFirst to 0 (or omit it) and failRequestCount to the number of requests that should fail. For example, failRequestCount: 2 with errorStatus: 503 and errorProbability: 1.0 makes the first 2 matching requests return 503, and all subsequent requests return the normal response. This is useful for testing retry logic and backoff strategies. See the fail-then-recover example and the Stateful / Count-Based Faults section above.
How do I simulate a GraphQL partial error?
Set graphqlErrors: true and graphqlNullifyData: false on your chaos profile. MockServer will try to parse the original response body as JSON and embed it as the data value in the GraphQL error envelope, resulting in a response like {"data":{...},"errors":[{"message":"..."}]}. If the original body is not valid JSON, data falls back to null. Use graphqlErrorMessage and graphqlErrorCode to customise the error entry. This works on both expectation-level chaos and service-scoped chaos (PUT /mockserver/serviceChaos). See the GraphQL Error Injection section above.
How do I simulate a broken gRPC server that omits the status trailer?
Set omitGrpcStatus: true on a gRPC chaos profile. MockServer will send the HTTP 200 response with a content-type: application/grpc header but deliberately omit the grpc-status trailer. Most gRPC clients treat this as a protocol error or stream reset, which is exactly the condition to test so that the client does not silently accept an incomplete RPC as a success. Register the profile via PUT /mockserver/grpcChaos with {"service": "...", "chaos": {"omitGrpcStatus": true}}. See the gRPC Fault Injection section above.