---
title: Scalability & Latency
description: "MockServer performance benchmarks: sub-2ms average latency at 95k requests/sec, tuning guidance for memory, thread pools, and log entry limits."
layout: page
pageOrder: 6
section: 'General'
subsection: true
sitemap:
  priority: 0.8
  changefreq: 'monthly'
  lastmod: 2019-11-10T08:00:00+01:00
---
Average of 1.58ms and p99 of 4ms for 150 parallel clients sending 95,228 requests per second.
MockServer is built to support massive scale from a single instance:
Both performance testing frameworks show similar results up to 2,000 parallel clients, at which point Locust reports warnings and its figures are no longer consistent with those from Apache Benchmark.
The following frameworks & techniques are used to maximise scalability:
MockServer has been performance tested using Apache Benchmark and Locust with the following scenario:
During the test MockServer was run on a Java 13 JVM, with the following command:
java -Xmx500m -Dmockserver.logLevel=WARN -Dmockserver.disableLogging=true -jar ~/.m2/repository/org/mock-server/mockserver-netty/{{ site.mockserver_version }}/mockserver-netty-{{ site.mockserver_version }}-no-dependencies.jar -serverPort 1080
Note: The benchmark uses -Dmockserver.disableLogging=true to suppress system-out log output and -Dmockserver.logLevel=WARN to reduce diagnostic log entries, minimising I/O and formatting overhead. Request/response log entries are still stored in memory for verification. For sustained high-throughput operation, the most effective tuning is to reduce maxLogEntries — see the troubleshooting section below for guidance.
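As a minimal sketch of that tuning, the following Python snippet assembles the benchmark launch command with an added -Dmockserver.maxLogEntries system property. The helper name mockserver_command, the jar path, and the value 1000 are illustrative assumptions, not recommended settings; choose a limit that still retains enough log entries for your verifications.

```python
import shlex
from typing import List


def mockserver_command(jar_path: str, max_log_entries: int, port: int = 1080) -> List[str]:
    """Build the java command line used in the benchmark, plus a log entry limit."""
    return [
        "java",
        "-Xmx500m",
        "-Dmockserver.logLevel=WARN",
        "-Dmockserver.disableLogging=true",
        # Assumed tuning value: a smaller limit keeps fewer log entries in memory
        f"-Dmockserver.maxLogEntries={max_log_entries}",
        "-jar",
        jar_path,
        "-serverPort",
        str(port),
    ]


# Hypothetical jar location; substitute the path to your own no-dependencies jar
command = mockserver_command("path/to/mockserver-netty-no-dependencies.jar", max_log_entries=1000)
print(shlex.join(command))
```

The resulting list can be passed directly to subprocess.Popen, which avoids shell quoting issues.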
The following graph shows how the p99 latency increases as the number of parallel clients increases.
Apache Benchmark was executed as follows:
ab -k -n 10000000 -c <parallel clients> http://127.0.0.1:1080/simple
The test results (latency percentiles and mean in milliseconds) are:
| parallel clients | 50% | 66% | 75% | 80% | 90% | 95% | 98% | 99% | requests/s | mean |
| 10 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 77,122 | 0.13 |
| 50 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 85,765 | 0.58 |
| 100 | 1 | 1 | 1 | 1 | 1 | 2 | 2 | 3 | 92,846 | 1.08 |
| 150 | 1 | 2 | 2 | 2 | 2 | 2 | 3 | 4 | 95,228 | 1.58 |
| 250 | 3 | 3 | 3 | 3 | 4 | 5 | 7 | 8 | 86,470 | 2.89 |
| 500 | 6 | 6 | 6 | 6 | 7 | 7 | 8 | 9 | 83,209 | 6.01 |
| 750 | 9 | 9 | 10 | 10 | 11 | 11 | 12 | 15 | 75,554 | 9.93 |
| 1000 | 11 | 12 | 13 | 13 | 14 | 16 | 17 | 21 | 75,423 | 13.26 |
| 2000 | 24 | 24 | 25 | 26 | 27 | 29 | 31 | 35 | 82,191 | 24.33 |
| 3000 | 37 | 39 | 40 | 40 | 43 | 46 | 51 | 58 | 78,171 | 38.38 |
| 4000 | 52 | 55 | 57 | 59 | 64 | 70 | 82 | 91 | 73,552 | 54.38 |
| 5000 | 65 | 67 | 70 | 71 | 75 | 79 | 90 | 102 | 74,065 | 67.51 |
| 6000 | 80 | 84 | 88 | 90 | 97 | 104 | 122 | 137 | 70,432 | 85.19 |
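The mean column in the table above is consistent with Little's Law for a closed-loop load test: when each of N clients keeps one request in flight, the mean latency is roughly N divided by the throughput. The following sketch recomputes the mean for a few rows as a sanity check. The helper name expected_mean_ms is hypothetical, and the calculation assumes every client is continuously busy for the whole run.

```python
def expected_mean_ms(parallel_clients: int, requests_per_second: float) -> float:
    """Little's Law: mean latency = requests in flight / throughput, converted to ms."""
    return parallel_clients / requests_per_second * 1000.0


# (parallel clients, requests/s, reported mean in ms), copied from the table above
rows = [
    (10, 77_122, 0.13),
    (150, 95_228, 1.58),
    (1000, 75_423, 13.26),
    (6000, 70_432, 85.19),
]

for clients, rps, reported in rows:
    computed = expected_mean_ms(clients, rps)
    print(f"{clients:>5} clients: computed {computed:.2f} ms, reported {reported} ms")
```

This also shows why latency grows almost linearly with the client count once throughput plateaus at roughly 70,000 to 95,000 requests per second: additional clients mostly add queueing time.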
Locust was executed as follows:
locust --loglevel=WARNING --headless --only-summary -u <parallel clients> -r 100 -t 180 --host=http://127.0.0.1:1080
The test results (latency percentiles and mean in milliseconds) are:
| parallel clients | 50% | 66% | 75% | 80% | 90% | 95% | 98% | 99% | 99.90% | 99.99% | requests/s | mean |
| 10 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 2 | 5 | 5 | 11 | 0 |
| 50 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 2 | 3 | 5 | 50 | 0 |
| 100 | 0 | 1 | 1 | 1 | 1 | 2 | 2 | 3 | 4 | 8 | 100 | 0 |
| 150 | 1 | 1 | 1 | 1 | 2 | 3 | 3 | 4 | 5 | 6 | 149 | 0 |
| 250 | 2 | 3 | 3 | 4 | 5 | 6 | 7 | 8 | 15 | 46 | 245 | 2 |
| 500 | 2 | 2 | 3 | 3 | 4 | 5 | 6 | 7 | 9 | 46 | 479 | 2 |
| 750 | 3 | 4 | 5 | 6 | 8 | 10 | 12 | 14 | 29 | 34 | 699 | 3 |
| 1000 | 3 | 4 | 6 | 6 | 8 | 10 | 13 | 16 | 36 | 52 | 909 | 3 |
| 2000 | 4 | 7 | 10 | 12 | 22 | 34 | 49 | 59 | 87 | 110 | 1626.14 | 8 |
| 3000 | 51 | 78 | 99 | 110 | 160 | 180 | 220 | 240 | 290 | 310 | 2629.92 | 54 |
The following locustfile.py was used:
import locust.stats
from locust import task, between
from locust.contrib.fasthttp import FastHttpUser

# Print console statistics once per minute
locust.stats.CONSOLE_STATS_INTERVAL_SEC = 60


class UserBehavior(FastHttpUser):
    # Each simulated user waits exactly one second between requests
    wait_time = between(1, 1)

    @task
    def request(self):
        self.client.get("/simple", verify=False)
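Because each simulated user waits one second between requests, the Locust run is bounded at roughly one request per second per user, which is why its requests/s column tracks the client count rather than approaching the Apache Benchmark figures. The sketch below estimates that upper bound; the helper name max_requests_per_second is hypothetical, and the model ignores ramp-up time and load-generator overhead, both of which lower the measured value.

```python
def max_requests_per_second(users: int, wait_seconds: float, mean_latency_ms: float) -> float:
    """Upper bound on throughput when every user loops: send request, then wait."""
    cycle_seconds = wait_seconds + mean_latency_ms / 1000.0
    return users / cycle_seconds


# (parallel clients, mean latency in ms, reported requests/s), from the Locust results
rows = [
    (150, 0, 149),
    (1000, 3, 909),
    (3000, 54, 2629.92),
]

for users, mean_ms, reported in rows:
    bound = max_requests_per_second(users, wait_seconds=1.0, mean_latency_ms=mean_ms)
    print(f"{users:>5} users: upper bound {bound:.0f} req/s, reported {reported} req/s")
```

The gap between the bound and the reported figure widens at higher user counts, which fits the observation that Locust itself becomes the limiting factor at around 2,000 parallel clients.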
{% include_subpage _includes/clustering.html %}
{% include_subpage _includes/performance_configuration.html %}