Memory Tuning Guide

MockServer stores expectations and log entries in memory using ring buffers. The two most important settings that affect memory usage are maxExpectations and maxLogEntries. Each HTTP request processed by MockServer generates 2-3 log entries (the request itself, the expectation match result, and the response).

Both settings have automatic defaults based on available JVM heap space. The table below provides recommended values if you want to override the defaults for different heap sizes:

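| JVM Heap Size | maxExpectations | maxLogEntries | Approx. HTTP Requests Retained |
|---------------|-----------------|---------------|--------------------------------|
| 256 MB        | 1,000           | 5,000         | ~1,500 - 2,500                 |
| 512 MB        | 2,000           | 15,000        | ~5,000 - 7,500                 |
| 1 GB          | 5,000           | 40,000        | ~13,000 - 20,000               |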

These are conservative estimates assuming typical request/response sizes of 1-5 KB. Large request or response bodies will consume more memory per entry.
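The "requests retained" column follows from the 2-3 log entries per request mentioned above. A minimal Java sketch of that arithmetic is shown here; the class and method names are illustrative and are not part of MockServer.

```java
final class RetainedRequests {
    // Log entries generated per HTTP request, as stated in this guide.
    static final int MIN_ENTRIES_PER_REQUEST = 2;
    static final int MAX_ENTRIES_PER_REQUEST = 3;

    // Fewest requests retained: every request produced 3 log entries.
    static int lowerBound(int maxLogEntries) {
        return maxLogEntries / MAX_ENTRIES_PER_REQUEST;
    }

    // Most requests retained: every request produced 2 log entries.
    static int upperBound(int maxLogEntries) {
        return maxLogEntries / MIN_ENTRIES_PER_REQUEST;
    }

    public static void main(String[] args) {
        for (int maxLogEntries : new int[] {5_000, 15_000, 40_000}) {
            System.out.printf("maxLogEntries=%d retains roughly %d to %d requests%n",
                maxLogEntries, lowerBound(maxLogEntries), upperBound(maxLogEntries));
        }
    }
}
```

For maxLogEntries=5000 this gives 1666 to 2500 requests, which the table rounds to ~1,500 - 2,500.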

Tips for reducing memory usage:

Docker example with memory-constrained settings:

docker run -d --rm -p 1080:1080 \
  --env MOCKSERVER_MAX_EXPECTATIONS=1000 \
  --env MOCKSERVER_MAX_LOG_ENTRIES=5000 \
  mockserver/mockserver

Introduces a delay (in milliseconds) before protocol detection on new TCP connections. This can be used to simulate slow connection establishment, such as when testing client timeout handling or connection pooling behaviour under latency.

Type: long Default: 0

Java Code:

ConfigurationProperties.connectionDelayMillis(long millis)
Configuration.connectionDelay(Delay delay)

System Property:

-Dmockserver.connectionDelayMillis=...

Environment Variable:

MOCKSERVER_CONNECTION_DELAY_MILLIS=...

Property File:

mockserver.connectionDelayMillis=...

Example:

-Dmockserver.connectionDelayMillis="500"
 

Troubleshooting: MockServer Becomes Slow or Unresponsive

If MockServer appears to freeze, hang, or become progressively slower under sustained load, the most likely cause is memory pressure from log entry accumulation. This section explains why it happens and how to fix it.

Why Does MockServer Slow Down?

Every HTTP request that MockServer processes generates 2-3 log entries that are stored in memory regardless of the configured log level. These entries record the received request, expectation match result, and response — they are always stored to support request verification. Each log entry consumes approximately 4-10 KB of heap for small request/response bodies, scaling proportionally for larger bodies. Under sustained high-throughput load, log entry allocation drives significant GC pressure:

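| Request Rate | Response Body Size | Log Data Generated Per Minute |
|--------------|--------------------|-------------------------------|
| 1 req/s      | 1 KB               | ~1 MB                         |
| 10 req/s     | 1 KB               | ~12 MB                        |
| 10 req/s     | 100 KB             | ~120 MB                       |
| 1 req/s      | 1 MB+              | ~120 MB                       |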

Log entries are stored in a bounded circular queue (maxLogEntries), so total memory usage does not grow indefinitely. However, the constant allocation and eviction of log entries creates GC pressure. When the JVM heap fills up, the garbage collector runs more frequently and for longer, causing pauses that make MockServer appear to freeze. In extreme cases, the JVM may spend almost all of its time in garbage collection, effectively halting request processing.

Large response bodies amplify this problem significantly. A single expectation returning a 10 MB response at 1 request per second generates over 600 MB of log data per minute — far more than a default heap can handle. Even with ring buffer eviction, the JVM must allocate and then garbage-collect these large objects continuously.
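The 10 MB figure can be checked with simple arithmetic. The sketch below assumes only one stored copy of the response body per request and ignores per-entry overhead, so it yields a lower bound rather than an exact measurement; the names are illustrative and not part of MockServer.

```java
final class LogDataRate {
    static final long BYTES_PER_MB = 1024L * 1024L;

    // Lower bound on log data per minute: one copy of the body per request,
    // with no allowance for headers or per-entry object overhead.
    static long minBytesPerMinute(long requestsPerSecond, long bodyBytes) {
        return requestsPerSecond * 60L * bodyBytes;
    }

    public static void main(String[] args) {
        long bytes = minBytesPerMinute(1, 10L * BYTES_PER_MB);
        // 1 req/s * 60 s * 10 MB = 600 MB
        System.out.println("At least " + (bytes / BYTES_PER_MB) + " MB per minute");
    }
}
```

Because each request produces more than one log entry, the real figure sits above this bound, which is consistent with "over 600 MB".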

Note: Expectations with large response bodies also consume heap proportionally (e.g., a 50 KB response body results in ~55-75 KB per stored expectation). If you have many expectations with large bodies, reduce maxExpectations as well.

How To Fix It

Apply one or more of the following, depending on your use case:

| Fix | When To Use | Trade-off |
|-----|-------------|-----------|
| Increase JVM heap size | Always recommended for large responses or high request rates | Uses more container/host memory |
| Reduce maxLogEntries | The single most effective fix: fewer entries means less memory and less GC pressure | Fewer requests available for verification |
| Reduce maxExpectations | When expectations contain large response bodies | Fewer expectations can be stored simultaneously |
| Switch to ZGC (-XX:+UseZGC) | Heap ≥ 4 GB and matcher latency matters; typically gives single-digit-millisecond GC pauses where G1 commonly sits in the 50–200 ms range under sustained allocation | Fixed memory overhead (less attractive below ~4 GB); set -Xms and -Xmx to the same value (e.g. -Xms4g -Xmx4g) so the heap is pre-committed |

Note on GC selection: On Java 17, ZGC is production-ready in every JDK distribution MockServer supports. For deployments with deep ring buffers (high maxLogEntries) or large heaps, -XX:+UseZGC can reduce p99 matcher latency by holding stop-the-world pauses to the single-digit-millisecond range (typically 1–5 ms in Java 17's non-generational ZGC). For small-fixture deployments (a sidecar in a test pipeline), the default GC is fine; ZGC's fixed overhead isn't worth it below ~4 GB heap. In containerised deployments using ZGC, size the container memory limit at ~1.5× your -Xmx value to leave headroom for JVM overhead (code cache, metaspace, thread stacks) and Netty's direct buffer pool. Note that ZGC multi-maps the same physical pages for its coloured-pointer scheme; under some cgroup RSS-accounting modes those pages are counted multiple times against the container limit, so the process can be OOM-killed even though the actual physical footprint fits (e.g. -Xmx4g with --memory=6g). Shenandoah is not recommended: it is not available in Oracle JDK 17 and not universally available across JDK distributions, so ZGC is the simpler choice.
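The ~1.5× container sizing rule in the note above is plain arithmetic. A minimal sketch, with illustrative names that are not part of MockServer:

```java
final class ContainerLimit {
    // Suggested container memory limit for a given -Xmx, using the ~1.5x rule
    // from the note above. Both values are in MiB.
    static long suggestedLimitMiB(long xmxMiB) {
        return xmxMiB * 3 / 2;
    }

    public static void main(String[] args) {
        // -Xmx4g is 4096 MiB, which suggests a 6144 MiB (6 GiB) container limit.
        System.out.println(suggestedLimitMiB(4096) + " MiB");
    }
}
```

Treat the result as a starting point only; as the note explains, some cgroup accounting modes can still count ZGC's multi-mapped pages more than once.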

Note on logLevel and disableLogging: Setting logLevel to WARN reduces diagnostic TRACE/DEBUG log entries but does not prevent request/response recording — the memory-intensive log entries (received requests, matched expectations, and responses) are always stored regardless of log level, as they are required for verification. Similarly, disableLogging only suppresses system-out output and does not reduce memory usage. To reduce memory, lower maxLogEntries or increase heap size.

Note on logLevel and matching throughput: while stored log entries are independent of log level (above), the transient per-request allocation on the matching hot path is not. At INFO, every request that scans expectations builds and logs a diagnostic “matched”/“did not match because…” message per matcher; below INFO that work — the per-matcher log entry and the human-readable “because” string assembly — is skipped entirely. For a deployment with many expectations under sustained load this is the single largest matching-path allocation, so running at WARN noticeably cuts allocation churn and GC pressure (it does not change which requests match, only whether the diagnostic narrative is produced). This is purely a throughput/GC lever; it is independent of the steady-state memory footprint controlled by maxLogEntries.

Recommended Configurations

High-throughput with small responses (e.g., API mocking at >10 req/s with <10 KB bodies):

docker run -d --rm -p 1080:1080 \
  -e MOCKSERVER_MAX_LOG_ENTRIES=5000 \
  mockserver/mockserver

Large response bodies (e.g., responses >100 KB):

docker run -d --rm -p 1080:1080 \
  -e JAVA_TOOL_OPTIONS="-Xmx1g" \
  -e MOCKSERVER_MAX_LOG_ENTRIES=1000 \
  -e MOCKSERVER_MAX_EXPECTATIONS=100 \
  mockserver/mockserver

Maximum throughput, minimal memory (verification limited to most recent requests):

docker run -d --rm -p 1080:1080 \
  -e JAVA_TOOL_OPTIONS="-Xmx512m" \
  -e MOCKSERVER_MAX_LOG_ENTRIES=100 \
  -e MOCKSERVER_LOG_LEVEL=WARN \
  mockserver/mockserver

MOCKSERVER_LOG_LEVEL=WARN drops the per-matcher diagnostic logging on the matching hot path (see the note above), which is the largest matching-path allocation when many expectations are registered. If your matchers rely heavily on regular expressions and your patterns and inputs are trusted, also add -e MOCKSERVER_REGEX_MATCHING_TIMEOUT_MILLIS=0 to evaluate regexes inline and skip the per-regex thread hand-off (this removes the catastrophic-backtracking guard — see Regex Matching Timeout).

Configuring JVM Heap Size

The default JVM heap in the MockServer Docker image is determined by the JVM's container-aware defaults (typically 25% of the container's memory limit). To set an explicit heap size, use the JAVA_TOOL_OPTIONS environment variable:

docker run -d --rm -p 1080:1080 \
  -e JAVA_TOOL_OPTIONS="-Xmx512m" \
  mockserver/mockserver

When running with docker compose:

services:
  mockServer:
    image: mockserver/mockserver
    ports:
      - "1080:1080"
    environment:
      JAVA_TOOL_OPTIONS: "-Xmx512m"
      MOCKSERVER_MAX_LOG_ENTRIES: "5000"

When running as a standalone JAR:

java -Xmx512m -jar mockserver-netty.jar -serverPort 1080
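To confirm which heap limit a JVM actually selected, whether from the container-aware default or from an explicit -Xmx, a small standalone program can print Runtime.getRuntime().maxMemory(). This is a generic JVM check rather than a MockServer feature, and the class name is illustrative. It only reflects MockServer's heap if it is run with the same JAVA_TOOL_OPTIONS and the same container memory limit.

```java
final class HeapCheck {
    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

The reported value is usually slightly below the -Xmx figure, because some collectors exclude part of the heap from this number.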

Monitoring Memory Usage

To diagnose memory issues, enable CSV memory tracking:

docker run -d --rm -p 1080:1080 \
  -e MOCKSERVER_OUTPUT_MEMORY_USAGE_CSV=true \
  -e MOCKSERVER_MEMORY_USAGE_CSV_DIRECTORY=/config \
  -v "$(pwd)":/config \
  mockserver/mockserver

This creates a memoryUsage_<date>.csv file that records heap usage, log entry count, and expectation count over time. If you see heap usage consistently near the maximum, increase -Xmx or reduce maxLogEntries.

Scalability Configuration:

Number of threads for main event loop

These threads are used for fast non-blocking activities such as:

Expectation actions are handled in a separate thread pool to ensure slow object or class callbacks and response / forward delays do not impact the main event loop.

Type: int Default: 5

Java Code:

ConfigurationProperties.nioEventLoopThreadCount(int count)

System Property:

-Dmockserver.nioEventLoopThreadCount=...

Environment Variable:

MOCKSERVER_NIO_EVENT_LOOP_THREAD_COUNT=...

Property File:

mockserver.nioEventLoopThreadCount=...

Example:

-Dmockserver.nioEventLoopThreadCount="5"

Number of threads for the action handler thread pool

These threads are used for handling actions such as:

Type: int Default: maximum of 5 or available processors count

Java Code:

ConfigurationProperties.actionHandlerThreadCount(int count)

System Property:

-Dmockserver.actionHandlerThreadCount=...

Environment Variable:

MOCKSERVER_ACTION_HANDLER_THREAD_COUNT=...

Property File:

mockserver.actionHandlerThreadCount=...

Example:

-Dmockserver.actionHandlerThreadCount="5"
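The documented default ("maximum of 5 or available processors count") can be read as the following rule. This is a sketch of the rule as stated in this guide, not MockServer's source code, and the class and method names are illustrative.

```java
final class ActionHandlerThreadDefault {
    // The larger of 5 and the number of available processors.
    static int defaultThreadCount(int availableProcessors) {
        return Math.max(5, availableProcessors);
    }

    public static void main(String[] args) {
        int processors = Runtime.getRuntime().availableProcessors();
        System.out.println("processors=" + processors
            + " default actionHandlerThreadCount=" + defaultThreadCount(processors));
    }
}
```

On a 2-core container this gives 5 threads, and on a 16-core host it gives 16, so an explicit value mainly matters when the processor count seen by the JVM differs from what the workload needs.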

Number of threads for client event loop when calling downstream

These threads are used for fast non-blocking activities, such as reading and deserialising all requests and responses.

Type: int Default: 5

Java Code:

ConfigurationProperties.clientNioEventLoopThreadCount(int count)

System Property:

-Dmockserver.clientNioEventLoopThreadCount=...

Environment Variable:

MOCKSERVER_CLIENT_NIO_EVENT_LOOP_THREAD_COUNT=...

Property File:

mockserver.clientNioEventLoopThreadCount=...

Example:

-Dmockserver.clientNioEventLoopThreadCount="5"

Number of threads for each expectation with a method / closure callback (i.e. web socket client) in the org.mockserver.client.MockServerClient

This setting only affects the Java client and how many requests each method / closure callback can handle. The default is 5, which should be suitable except in extreme cases.

Type: int Default: 5

Java Code:

ConfigurationProperties.webSocketClientEventLoopThreadCount(int count)

System Property:

-Dmockserver.webSocketClientEventLoopThreadCount=...

Environment Variable:

MOCKSERVER_WEB_SOCKET_CLIENT_EVENT_LOOP_THREAD_COUNT=...

Property File:

mockserver.webSocketClientEventLoopThreadCount=...

Example:

-Dmockserver.webSocketClientEventLoopThreadCount="5"

Maximum time allowed in milliseconds for any future to wait, for example when waiting for a response over a web socket callback.

Type: long Default: 90000

Java Code:

ConfigurationProperties.maxFutureTimeout(long milliseconds)

System Property:

-Dmockserver.maxFutureTimeout=...

Environment Variable:

MOCKSERVER_MAX_FUTURE_TIMEOUT=...

Property File:

mockserver.maxFutureTimeout=...

Example:

-Dmockserver.maxFutureTimeout="90000"
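What a bounded wait on a future means in practice can be illustrated with the JDK's own CompletableFuture. The sketch below is a generic illustration and does not show MockServer's internals; the class name, method name, and fallback value are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class BoundedFutureWait {
    // Waits for the future for at most timeoutMillis and returns the fallback on timeout.
    static String awaitOrFallback(CompletableFuture<String> future, long timeoutMillis, String fallback) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return fallback;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        CompletableFuture<String> neverCompletes = new CompletableFuture<>();
        System.out.println(awaitOrFallback(neverCompletes, 50, "timed out"));   // timed out
        System.out.println(awaitOrFallback(
            CompletableFuture.completedFuture("response"), 50, "timed out"));   // response
    }
}
```

A lower maxFutureTimeout makes a stuck callback surface sooner, while a higher one tolerates slower callbacks at the cost of holding resources for longer.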

If true (the default), request matchers fail on the first non-matching field. If false, request matchers compare all fields.

Set to false when debugging matching issues to see all mismatching fields in a single log entry. See Troubleshooting Matching for a step-by-step guide.

Type: boolean Default: true

Java Code:

ConfigurationProperties.matchersFailFast(boolean enable)

System Property:

-Dmockserver.matchersFailFast=...

Environment Variable:

MOCKSERVER_MATCHERS_FAIL_FAST=...

Property File:

mockserver.matchersFailFast=...

Example:

-Dmockserver.matchersFailFast="false"
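The difference between the two modes can be illustrated with a simplified field comparison. This is a conceptual sketch only: it treats a request as a flat map of field names to values, which is an assumption made for illustration and not how MockServer represents or matches requests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

final class FailFastDemo {
    // Returns the names of expected fields whose actual value differs.
    // With failFast=true the scan stops at the first mismatch.
    static List<String> mismatches(Map<String, String> expected,
                                   Map<String, String> actual,
                                   boolean failFast) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> field : expected.entrySet()) {
            if (!Objects.equals(field.getValue(), actual.get(field.getKey()))) {
                result.add(field.getKey());
                if (failFast) {
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> expected = new LinkedHashMap<>();
        expected.put("method", "POST");
        expected.put("path", "/orders");
        expected.put("header:Accept", "application/json");

        Map<String, String> actual = new LinkedHashMap<>();
        actual.put("method", "GET");
        actual.put("path", "/orders");
        actual.put("header:Accept", "text/plain");

        System.out.println(mismatches(expected, actual, true));  // [method]
        System.out.println(mismatches(expected, actual, false)); // [method, header:Accept]
    }
}
```

With fail-fast enabled only the first mismatch is reported, which is cheaper but can hide further problems. Disabling it reports every mismatching field at once, which is why it helps when debugging.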

The minimum level of logs to record in the event log and to output to system out (if system out log output is not disabled). The lower the log level, the more log entries will be captured, particularly at TRACE level.

Type: string Default: INFO

Java Code:

ConfigurationProperties.logLevel(String level)

System Property:

-Dmockserver.logLevel=...

Environment Variable:

MOCKSERVER_LOG_LEVEL=...

Property File:

mockserver.logLevel=...

The log level can be TRACE, DEBUG, INFO, WARN, ERROR or OFF, or one of the java.util.logging level names FINEST, FINE, INFO, WARNING or SEVERE.

Example:

-Dmockserver.logLevel="DEBUG"

Disable logging to the system output

Type: boolean Default: false

Java Code:

ConfigurationProperties.disableSystemOut(boolean disableSystemOut)

System Property:

-Dmockserver.disableSystemOut=...

Environment Variable:

MOCKSERVER_DISABLE_SYSTEM_OUT=...

Property File:

mockserver.disableSystemOut=...

Example:

-Dmockserver.disableSystemOut="true"

Disable logging output to system out. Request/response log entries are still recorded in memory for verification.

Type: boolean Default: false

Java Code:

ConfigurationProperties.disableLogging(boolean disableLogging)

System Property:

-Dmockserver.disableLogging=...

Environment Variable:

MOCKSERVER_DISABLE_LOGGING=...

Property File:

mockserver.disableLogging=...

Example:

-Dmockserver.disableLogging="true"

Maximum request body size in bytes that conversation-aware LLM matchers will parse. LLM conversation matchers (whenLatestMessageContains, whenContainsToolResultFor, etc.) parse the inbound request body as JSON to extract the message history. For deep conversation histories or large tool results, this parse step is proportional to body size.

Requests whose body exceeds this cap skip conversation-aware matching and are treated as a no-match for conversation predicates (the scenario state machine is unaffected). Increase this value only when your LLM conversations regularly include very large tool results or long message histories. Reduce it in memory-constrained environments to bound the maximum allocation per matching attempt.

Type: int Default: 1048576 (1 MiB) Range: 16384 (16 KiB) — 67108864 (64 MiB)

Java Code:

ConfigurationProperties.maxLlmConversationBodySize(int size)
Configuration.maxLlmConversationBodySize(Integer size)

System Property:

-Dmockserver.maxLlmConversationBodySize=...

Environment Variable:

MOCKSERVER_MAX_LLM_CONVERSATION_BODY_SIZE=...

Property File:

mockserver.maxLlmConversationBodySize=...

Example:

-Dmockserver.maxLlmConversationBodySize="4194304"
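The example value above is 4 MiB expressed in bytes. A minimal sketch for converting a size to bytes and checking it against the documented range follows. How MockServer treats out-of-range values is not stated here, so the sketch only reports whether a value lies inside the range; the class and method names are illustrative.

```java
final class LlmBodySizeCheck {
    static final int MIN_BYTES = 16_384;        // 16 KiB, documented lower bound
    static final int MAX_BYTES = 67_108_864;    // 64 MiB, documented upper bound
    static final int DEFAULT_BYTES = 1_048_576; // 1 MiB, documented default

    // True when the value lies within the documented range (inclusive).
    static boolean inDocumentedRange(int bytes) {
        return bytes >= MIN_BYTES && bytes <= MAX_BYTES;
    }

    // Converts MiB to bytes, throwing ArithmeticException on int overflow.
    static int mebibytesToBytes(int mebibytes) {
        return Math.multiplyExact(mebibytes, 1024 * 1024);
    }

    public static void main(String[] args) {
        int fourMiB = mebibytesToBytes(4);
        System.out.println(fourMiB + " in range: " + inDocumentedRange(fourMiB)); // 4194304 in range: true
    }
}
```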