Document how to benchmark Dragonfly #101
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,102 @@ | ||
--- | ||
sidebar_position: 3 | ||
--- | ||
|
||
# Benchmarking | ||
|
||
Do you have an existing Redis environment and would like to see if Dragonfly could be a better | ||
replacement? <br/> | ||
Are you developing a service and would like to determine which cloud instance type to | ||
allocate for Dragonfly? <br/> | ||
Do you wonder how many replicas you need to support your workload? | ||
|
||
If so, read on, because this page is for you! | ||
|
||
## Squeezing the Best Performance | ||
|
||
A benchmark is done to assess the performance aspects of a system. In the case of Dragonfly, a | ||
benchmark is commonly used to assess the CPU and memory performance & utilization. | ||
|
||
Depending on the goals of your benchmark, you should choose the machine size accordingly. For a | ||
production-mimicking benchmark, you should use a machine size and traffic load similar to that of | ||
your busiest production period, or even higher to allow for some cushion. | ||
|
||
### `io_uring` | ||
|
||
Dragonfly supports both `epoll` and [`io_uring`](https://en.wikipedia.org/wiki/Io_uring) Linux APIs. | ||
`io_uring` is a newer API, which is faster. Dragonfly runs best with `io_uring`, but it is only | ||
available with Linux kernels >= 5.1. | ||
Review comment: Dragonfly requires kernel 5.10 or later for io_uring. Before that, the io_uring API was partial and not reliable. |
||
|
||
`io_uring` is available in Debian Bullseye (11) or later, Ubuntu 21.04 or later, Red Hat | ||
Enterprise Linux 9.3 or later, and Fedora 37 or later. | ||
|
||
To check whether your machine supports `io_uring`, you could run the following: | ||
Review comment: since we require io_uring starting from a specific version, the most straightforward approach would be to use |
||
|
||
```shell | ||
grep io_uring_setup /proc/kallsyms | ||
``` | ||
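Given the review note that Dragonfly needs kernel 5.10 or later for `io_uring`, comparing the running kernel release against that minimum may be a simpler check. The following is a minimal sketch, assuming `uname -r` reports a release that starts with `MAJOR.MINOR`; the helper name `kernel_at_least` is hypothetical and not part of Dragonfly.

```shell
# Hypothetical helper: succeeds when kernel release $1 is at least $2.$3.
kernel_at_least() {
    release=$1; want_major=$2; want_minor=$3
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "${minor:-0}" -ge "$want_minor" ]; }
}

# 5.10 is the minimum quoted in the review comment above.
if kernel_at_least "$(uname -r)" 5 10; then
    echo "kernel $(uname -r): new enough for io_uring"
else
    echo "kernel $(uname -r): older than 5.10, consider epoll instead"
fi
```

This only compares version numbers; it does not prove that `io_uring` is enabled in the running kernel or permitted inside a container.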
|
||
### Choosing Instance Type | ||
|
||
Cloud providers, such as AWS, provide different types and sizes of virtual machines. When in | ||
doubt, you could always opt for a bigger instance (for both Dragonfly and the client that sends the | ||
benchmarking traffic) so that you'll know what the upper limit is. | ||
|
||
### Choosing Thread Count | ||
|
||
By default, Dragonfly will create a thread for each available CPU on the machine. You can modify | ||
this behavior with the `--proactor_threads` flag. Generally you should not use this flag for a | ||
machine dedicated to running Dragonfly. You can specify a lower number if you only want Dragonfly to | ||
utilize some of the machine, but don't specify a higher number (i.e. more than CPUs) as it would | ||
degrade performance. | ||
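As an illustration of the rule above, the sketch below caps a requested thread count at the number of online CPUs before passing it to `--proactor_threads`. The helper `clamp_threads` is hypothetical, the binary name `dragonfly` is an assumption, and the command is printed rather than executed.

```shell
# Hypothetical helper: never return more threads than there are CPUs.
clamp_threads() {
    requested=$1; cpus=$2
    if [ "$requested" -gt "$cpus" ]; then
        echo "$cpus"
    else
        echo "$requested"
    fi
}

cpus=$(getconf _NPROCESSORS_ONLN)      # CPUs currently online
threads=$(clamp_threads 8 "$cpus")     # 8 is an arbitrary example request

# Printed for review only; the binary name `dragonfly` is an assumption.
echo "dragonfly --proactor_threads=$threads"
```

On a machine dedicated to Dragonfly the flag can simply be omitted, since the default already uses every available CPU.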
|
||
## Setting Up Dragonfly | ||
|
||
Dragonfly can run in [Docker](/getting-started/docker) or directly installed as a | ||
[binary](/getting-started/binary) on your machine. See the [Getting Started](/getting-started) page | ||
for other options and the latest documentation. | ||
|
||
## Reducing Noise | ||
|
||
Ideally, a benchmark should be run in an environment as similar as possible to the production setup. | ||
|
||
In busy production deployments, it is common to run Dragonfly on its own machine (virtual or | ||
dedicated). If you plan to do so in your production setup as well (which we highly recommend), | ||
consider running the benchmark in a similar way. | ||
|
||
In practice, this means that any other systems in your setup (like other services & databases) should | ||
run on other machines. Importantly, **the software that sends the traffic should also run on another | ||
machine.** | ||
|
||
## Sending Traffic | ||
|
||
If your service already has existing benchmarking tools, or ways to record and replay production | ||
traffic, you should definitely use them. That would be the closest estimation to what a real | ||
production deployment with a backing Dragonfly would look like. | ||
|
||
If, like many others, you do not (yet) have such a tool, you could either write your own tool to | ||
Review comment: Let's be opinionated and say: we usually use memtier, and this is how we do it.
Review comment: We can also mention redis-benchmark. From my experience (I have not been using it for the last two years) it's less efficient than memtier, but it has more predefined load test options specific to Redis.
Review comment: I would actually suggest that users write their own load tests, as an end-to-end kind of thing, but that's really beside the point here :) |
||
simulate production traffic or use an existing tool like `memtier_benchmark`. | ||
|
||
When writing your own tool, try to recreate the production traffic as closely as possible. Use the | ||
same commands (like `SET`, `GET`, `SADD`, etc.), with the expected ratio between them, and the | ||
expected key and value sizes. | ||
|
||
If you choose to use an existing benchmarking tool, a popular and mature one is | ||
[`memtier_benchmark`](https://github.com/RedisLabs/memtier_benchmark). It's an open-source tool for | ||
generic load generation and benchmarking with many features. We use it for benchmarking constantly. | ||
Check out their [documentation | ||
page](https://redis.com/blog/memtier_benchmark-a-high-throughput-benchmarking-tool-for-redis-memcached/) | ||
for more details, but as a quick reference you could use: | ||
|
||
```shell | ||
memtier_benchmark \ | ||
--server=<IP / Host> \ | ||
--threads=<thread count> \ | ||
--clients=<clients per thread> \ | ||
--requests=<requests per client> | ||
``` | ||
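To make the template above concrete, the sketch below fills in the placeholders and, following the review discussion, sets `--threads` to the vCPU count of the client machine instead of exceeding it. The helper `memtier_cmd` is hypothetical, `10.0.0.5` is a placeholder address, and the `--ratio` (SET:GET) and `--data-size` (bytes) values are illustrative assumptions that should be replaced with your own workload's numbers. The command is printed rather than executed.

```shell
# Hypothetical helper: print a memtier_benchmark command line for review.
memtier_cmd() {
    host=$1; threads=$2; clients=$3; requests=$4
    # --ratio is SET:GET and --data-size is in bytes; both values are examples.
    printf 'memtier_benchmark --server=%s --threads=%s --clients=%s --requests=%s --ratio=1:3 --data-size=256\n' \
        "$host" "$threads" "$clients" "$requests"
}

# Use one memtier thread per vCPU of the client machine, and no more.
cpus=$(getconf _NPROCESSORS_ONLN)

# 10.0.0.5 is a placeholder for the Dragonfly host.
memtier_cmd 10.0.0.5 "$cpus" 20 100000
```

Printing the command first makes it easy to check the thread and client counts before sending real traffic from the load-generating machine.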
|
||
## Having Troubles? Anything Unclear? | ||
Review comment: btw, every docs page has an "edit page" button at the end...
Review comment: Shall I remove this then? |
||
|
||
Improving our documentation and helping the community is always of the highest priority for us, so | ||
please feel free to reach out! |
Review comment:
Was this paragraph generated by ChatGPT? :)
I thought about providing specific requirements of how to, say, reach 1M qps on an m5 family instance.
`--threads` should not be higher than the number of vCPUs on that machine.
I.e. keep everything very technical and specific.
Review comment:
or any other instance family and interesting target goal.
Review comment:
It was not generated by ChatGPT, but I've been called worse :)
re/ 1: done
re/ 2: what is indeed the minimal instance?
re/ 3: I already talk about it below
re/ 4: done
re/ 5: done (but I think you meant `--proactor_threads`?)
re/ 6+7: do you have these? or would you like me to run them until I figure it out?
Review comment:
Sorry 😞, I was joking.
re/5 - I meant `-t` on the memtier side. Dragonfly actually spans all the CPUs automatically. memtier always uses 4 by default.