Introduce Valkey Over RDMA transport (experimental) #477
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## unstable #477 +/- ##
=========================================
Coverage 70.23% 70.24%
=========================================
Files 112 112
Lines 60602 60602
=========================================
+ Hits 42566 42568 +2
+ Misses 18036 18034 -2 |
This PR can be tested with a client. To build the client with RDMA:
To test with commands:
|
Many cloud providers offer RDMA acceleration on their platforms, so I think there is a solid basis for deploying Valkey over RDMA. We ran performance tests on this PR using 8th-generation Alibaba Cloud ECS instances (g8ae.4xlarge, 16 vCPUs, 64 GB DDR). The results show that RDMA significantly improves performance compared to TCP sockets. Server-side test command:
Client-side test command:
The performance test results are shown in the following table. Apart from LRANGE_100, which improves only modestly, throughput in the other scenarios (PING, SET, GET) increases by at least 76%, and the average (AVG) and P99 latencies drop by at least 40%.
|
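For anyone who wants to derive the same kind of percentages from their own benchmark runs, here is a minimal sketch of the arithmetic. The function names are mine, and the figures in the usage lines are made-up placeholders, not the measurements from this test.

```python
def throughput_gain_pct(tcp_qps: float, rdma_qps: float) -> float:
    """Relative throughput increase of RDMA over TCP, in percent."""
    if tcp_qps <= 0:
        raise ValueError("tcp_qps must be positive")
    return (rdma_qps - tcp_qps) / tcp_qps * 100.0


def latency_reduction_pct(tcp_ms: float, rdma_ms: float) -> float:
    """Relative latency reduction of RDMA versus TCP, in percent."""
    if tcp_ms <= 0:
        raise ValueError("tcp_ms must be positive")
    return (tcp_ms - rdma_ms) / tcp_ms * 100.0


if __name__ == "__main__":
    # Placeholder figures for illustration only (not from the table above).
    print(f"throughput gain: {throughput_gain_pct(100_000, 176_000):.1f}%")
    print(f"latency reduction: {latency_reduction_pct(0.50, 0.30):.1f}%")
```

Both helpers take the TCP figure as the baseline, which is how "increased by at least 76%" and "reduced by at least 40%" are normally read.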
Hi @hz-cheng, I notice that you are the author of the Alibaba Cloud erdma driver for both the Linux kernel and rdma-core. Cooooooooool! |
Also, if necessary, I could try reaching out to relevant colleagues to see if we can offer some Alibaba Cloud ECS instances to the community for free, so that the community can use and test Valkey over RDMA, as well as for future CI/CD purposes. |
Is there a corresponding client that enables RDMA? |
See this comment please. |
Hi @madolson , |
Almost doubled throughput is impressive. I don't know much about RDMA. It's many lines of code, but all of it is in the module. That's great, but what are the risks of breaking it if we change something in the connection abstractions? We need to be aware that when we merge this, we will have to keep maintaining this. Is it possible to use TLS with RDMA? |
@pizhenwei The numbers do look great. I haven't gotten a chance to look at it yet, hopefully some time this week. |
Hi, Because valkey-rdma.so (if built as a module) uses the To avoid the risks from the mismatched
Whenever the core connection abstraction changes, every connection type has to be adapted, and this rule applies to RDMA as well. I volunteer to maintain this RDMA support. PS: I have experience in open source communities such as the Linux kernel, QEMU, Redis, SPDK, libiscsi, tgt, atop, util-linux and procps-ng.
As far as I can see, we can't use TLS with RDMA currently. I read the documentation of the OpenSSL Abstract Record Layer, and TLS with RDMA is workable in theory, but it would be a considerable amount of work. |
@pizhenwei Thanks for your contribution, and @hz-cheng thanks for your impressive numbers.
Let the core team members discuss this important feature; we will send you feedback ASAP. Thanks |
There are two parts of this PR:
I have no experience with Windows RDMA. I read the documentation and found that Windows does support RDMA, but not through the Linux-style Verbs API. This means that we would need a Windows version in the future (I imagine an rdma-windows.c would be needed).
It would be quite easy to build RDMA support into Valkey itself with a change of a few lines. In that case, valkey-server would have to link against libibverbs.so and librdmacm.so. Let's look at the dynamic shared libraries of the module version:
If a user starts valkey-server with the RDMA module, valkey-server loads the additional shared libraries on demand. If building RDMA into Valkey is necessary, please let me know.
Currently, valkey-server supports 3 connection types:
Run valkey-server with the command: Once RDMA is loaded: RDMA performs better on a good network, as in @hz-cheng's test report and mine. However, when I tested mlx5 with a packet drop rate of 0.001, TCP performance was only slightly affected while RDMA performance dropped a lot. I imagine a topology like:
This makes it possible to use RDMA over short distances and TCP over long distances. |
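The short-distance/long-distance split could look roughly like the following on the client side. This is only a sketch under my own assumptions: the `Endpoint` type, its fields, and the "same datacenter" rule are hypothetical and are not part of Valkey or of this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical description of a host; not a Valkey structure."""
    datacenter: str
    has_rdma_nic: bool


def pick_transport(client: Endpoint, server: Endpoint) -> str:
    """Return "rdma" for RDMA-capable peers in one datacenter, else "tcp"."""
    same_site = client.datacenter == server.datacenter
    if same_site and client.has_rdma_nic and server.has_rdma_nic:
        return "rdma"
    # Anything that crosses sites, or lacks an RDMA NIC, stays on TCP.
    return "tcp"
```

A server that listens on both transports would let nearby clients take the RDMA path while remote clients keep using TCP against the same instance.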
I'd like to get this merged. It's a good contribution. I like that it's a module. When it's merged, we can let clients implement it and test it.
The RDMA port is neither a TCP port nor a UDP port? We've been talking about the possibility of adding QUIC in the future (optional dependency, maybe as a module too) and that can be on the same port too, right, since it's UDP?
Actually, the client side (C only) is ready (as you can see, several people and I have already produced test reports). Once the server side gets merged, I'll create a PR for the client as soon as possible.
Right.
Right. |
@pizhenwei Actually, the latest RDMA technology, such as Alibaba Cloud Elastic RDMA, doesn't suffer this performance drop at a packet drop rate of 0.001, because it widely supports SACK optimization for lossy networks. |
Hi, let's focus on 'why does Valkey need to enable both TCP/IP and RDMA together' or 'is enabling both TCP/IP and RDMA useful in real scenarios', rather than extending the topic to 'the latest RDMA technology' here. |
Hi @zuiderkwast, I created a new PR for the documentation part and force-pushed a new version here. |
Looks good now. Just some minor comments.
RDMA is the abbreviation of remote direct memory access. It is a technology that enables computers in a network to exchange data in the main memory without involving the processor, cache, or operating system of either computer. This means RDMA has a better performance than TCP, the test results show Valkey Over RDMA has a ~2.5X QPS and lower latency.

In recent years, RDMA gets popular in the data center, especially RoCE(RDMA over Converged Ethernet) architecture has been widely used. Cloud Vendors also start to support RDMA instance in order to accelerate networking performance. End-user would enjoy the improvement easily.

Introduce Valkey Over RDMA protocol as a new transport for Valkey. For now, we defined 4 commands:
- GetServerFeature & SetClientFeature: the two commands are used to negotiate features for further extension. There is no feature definition in this version. Flow control and multi-buffer may be supported in the future, this needs feature negotiation.
- Keepalive
- RegisterXferMemory: the heart to transfer the real payload.

The 'TX buffer' and 'RX buffer' are designed by RDMA remote memory with RDMA write/write with imm, it's similar to several mechanisms introduced by papers(but not same):
- Socksdirect: datacenter sockets can be fast and compatible <https://dl.acm.org/doi/10.1145/3341302.3342071>
- LITE Kernel RDMA Support for Datacenter Applications <https://dl.acm.org/doi/abs/10.1145/3132747.3132762>
- FaRM: Fast Remote Memory <https://www.usenix.org/system/files/conference/nsdi14/nsdi14-paper-dragojevic.pdf>

Link: valkey-io/valkey#477
Co-authored-by: Xinhao Kong <[email protected]>
Co-authored-by: Huaping Zhou <[email protected]>
Co-authored-by: zhuo jiang <[email protected]>
Co-authored-by: Yiming Zhang <[email protected]>
Co-authored-by: Jianxi Ye <[email protected]>
Signed-off-by: zhenwei pi <[email protected]>
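To make the role of RegisterXferMemory a little more concrete, here is a toy, in-process model of the idea as I read the commit message: a side advertises a receive buffer, the peer writes payload into it, and once the advertised space is used up the writer waits until memory is registered again. That last step is my assumption about the flow. All class and function names are hypothetical, no RDMA verbs are involved, and the real message layout and buffer handling are defined by the PR itself.

```python
class AdvertisedRxBuffer:
    """Toy stand-in for a receive buffer advertised to the peer.

    Hypothetical model: a bytearray replaces RDMA-registered memory and
    write() replaces an RDMA write issued by the remote side.
    """

    def __init__(self, size: int) -> None:
        self.register(size)

    def register(self, size: int) -> None:
        """Model of (re-)advertising memory: fresh buffer, offset reset."""
        if size <= 0:
            raise ValueError("size must be positive")
        self.buf = bytearray(size)
        self.offset = 0

    def write(self, payload: bytes) -> int:
        """Copy as much of payload as fits; return the number of bytes taken."""
        room = len(self.buf) - self.offset
        n = min(room, len(payload))
        self.buf[self.offset:self.offset + n] = payload[:n]
        self.offset += n
        return n


def send_all(rx: AdvertisedRxBuffer, payload: bytes) -> int:
    """Push payload through rx, re-registering whenever it is full.

    Returns how many times the buffer had to be registered again. In this
    toy the receiver consumes the data instantly when it re-registers.
    """
    reregistrations = 0
    sent = 0
    while sent < len(payload):
        n = rx.write(payload[sent:])
        sent += n
        if n == 0:
            # No room left: wait for the receiver to advertise memory again.
            rx.register(len(rx.buf))
            reregistrations += 1
    return reregistrations
```

With an 8-byte buffer and a 20-byte payload, the writer fills the buffer twice and finishes in the third registration, which is the back-pressure behaviour the model is meant to show.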
This script is used to:
- auto-detect already existing RDMA device
- create RXE device
- launch Valkey Over RDMA server, and launch rdma-test client

Example:
./runtest-rdma
Valkey Over RDMA build rdma-test program [OK]
Valkey Over RDMA test prepare rxe_virbr1 <192.168.123.1> [OK]
Valkey Over RDMA valkey-server start [OK]
Valkey Over RDMA test in 4.92s [OK]
Valkey Over RDMA test thread[1476602] PING/PONG [OK]
Valkey Over RDMA test thread[1476603] PING/PONG [OK]
Valkey Over RDMA test thread[1476605] PING/PONG [OK]
Valkey Over RDMA test thread[1476606] PING/PONG [OK]
Valkey Over RDMA test thread[1476607] PING/PONG [OK]
Valkey Over RDMA test thread[1476600] PING/PONG [OK]
Valkey Over RDMA test thread[1476604] PING/PONG [OK]
Valkey Over RDMA test thread[1476601] PING/PONG [OK]
Valkey Over RDMA test thread[1476602] prepare 640 KVs [OK]
Valkey Over RDMA test thread[1476602] SET 640 KVs [OK]
Valkey Over RDMA test thread[1476602] BGSAVE [OK]
Valkey Over RDMA test thread[1476602] GET 640 KVs [OK]
Valkey Over RDMA test thread[1476601] prepare 5032 KVs [OK]
Valkey Over RDMA test thread[1476601] SET 5032 KVs [OK]
Valkey Over RDMA test thread[1476600] prepare 5067 KVs [OK]
Valkey Over RDMA test thread[1476601] GET 5032 KVs [OK]
Valkey Over RDMA test thread[1476600] SET 5067 KVs [OK]
Valkey Over RDMA test thread[1476600] GET 5067 KVs [OK]
Valkey Over RDMA test thread[1476607] prepare 5781 KVs [OK]
Valkey Over RDMA test thread[1476607] SET 5781 KVs [OK]
Valkey Over RDMA test thread[1476605] prepare 5803 KVs [OK]
Valkey Over RDMA test thread[1476607] GET 5781 KVs [OK]
Valkey Over RDMA test thread[1476605] SET 5803 KVs [OK]
Valkey Over RDMA test thread[1476605] GET 5803 KVs [OK]
Valkey Over RDMA test thread[1476606] prepare 7305 KVs [OK]
Valkey Over RDMA test thread[1476606] SET 7305 KVs [OK]
Valkey Over RDMA test thread[1476604] prepare 7803 KVs [OK]
Valkey Over RDMA test thread[1476603] prepare 8107 KVs [OK]
Valkey Over RDMA test thread[1476606] GET 7305 KVs [OK]
Valkey Over RDMA test thread[1476604] SET 7803 KVs [OK]
Valkey Over RDMA test thread[1476603] SET 8107 KVs [OK]
Valkey Over RDMA test thread[1476604] GET 7803 KVs [OK]
Valkey Over RDMA test thread[1476603] GET 8107 KVs [OK]
Valkey Over RDMA test [OK]
Valkey Over RDMA test over 192.168.123.1 [OK]

Thanks to Viktor Söderqvist for review suggestions!
Signed-off-by: zhenwei pi <[email protected]>
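The first two steps of the script (detect an existing RDMA device, otherwise create an RXE one) can be sketched as below. This is not the script from the PR. It assumes a Linux host where RDMA devices appear under `/sys/class/infiniband` and where Soft-RoCE links are added with the iproute2 `rdma` tool; the helper names are hypothetical, and the command is only built, never executed.

```python
import os


def list_rdma_devices(sysfs_dir: str = "/sys/class/infiniband") -> list[str]:
    """Return the RDMA device names found under sysfs_dir (empty if none)."""
    try:
        return sorted(os.listdir(sysfs_dir))
    except FileNotFoundError:
        # No RDMA stack loaded, or not a Linux host.
        return []


def rxe_add_command(netdev: str) -> list[str]:
    """Build (but do not run) the iproute2 command that adds a Soft-RoCE link."""
    return ["rdma", "link", "add", f"rxe_{netdev}", "type", "rxe", "netdev", netdev]


if __name__ == "__main__":
    devices = list_rdma_devices()
    if devices:
        print("existing RDMA devices:", ", ".join(devices))
    else:
        # Needs root and the rdma_rxe kernel module; shown for illustration.
        print("would run:", " ".join(rxe_add_command("eth0")))
```

The `rxe_<netdev>` naming mirrors the `rxe_virbr1` device in the example output, although the real script may name its device differently.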
Once install/remove RXE in separated steps of github workflow, hit failure on unknown reasons. So merge these steps into one.
Signed-off-by: zhenwei pi <[email protected]>
Thanks for your suggestions! I've applied all of them. |
We normally squash-merge. Should we keep the individual commits in this case? @valkey-io/core-team |
Squash-merge is fine with me too. :) |
Great, it finally got merged. Although I did not participate in the review (I'm not familiar with it and don't have enough time to dive in), thank you for your time and the work. |
@enjoy-binbin Yeah, I merged it. :) I'm not very familiar with RDMA either, but all of the code is in separate files and it is not compiled by default, so I'm confident it doesn't break anything. In the future, if we want it to be officially supported (not experimental) we should probably...
|
Thanks to all the folks (@zuiderkwast @enjoy-binbin @PingXie @madolson @hwware @daniel-house @coderyanghang @zvi-code @hz-cheng @baronwangr) on this long and interesting journey! Please feel free to contact me at [email protected] with any issues or feedback. |
Thanks @pizhenwei ! It's a cool feature, and hopefully we can get other folks to use it in production soon. |
Sure, I'm working on libvalkey to support RDMA ASAP. |
Nice work @pizhenwei , glad to see it finally got merged! |
I'm happy to see this commit merged. @pizhenwei, I opened an issue to support this as part of valkey-glide: valkey-io/valkey-glide#1963. Would you be interested in collaborating? The main advantage of valkey-glide is that we can implement it once and it will be available in all supported languages: currently Java, Python and Node.js, with more to come in the future. Feel free to share your ideas for the client implementation at valkey-glide valkey-io/valkey-glide#1963 |
Hi, |
Valkey Over RDMA[1] has been supported as experimental feature since Valkey 8.0. Support RDMA transport for the client side.

RDMA is not a builtin feature, supported as module only, so we have to run test.sh with more argument @VALKEY_RDMA_MODULE and @VALKEY_RDMA_ADDR.

An example to run test.sh:
VALKEY_RDMA_MODULE=/path/to/valkey-rdma.so VALKEY_RDMA_ADDR=192.168.122.1 TEST_RDMA=1 ./test.sh
...
Testing against RDMA connection (192.168.122.1:56379):
#138 Is able to deliver commands: PASSED
#139 Is a able to send commands verbatim: PASSED
#140 %s String interpolation works: PASSED
#141 %b String interpolation works: PASSED
#142 Binary reply length is correct: PASSED
#143 Can parse nil replies: PASSED
#144 Can parse integer replies: PASSED
#145 Can parse multi bulk replies: PASSED
#146 Can handle nested multi bulk replies: PASSED
#147 Send command by passing argc/argv: PASSED
#148 Can pass NULL to valkeyGetReply: PASSED
#149 RESP3 PUSH messages are handled out of band by default: PASSED
#150 We can set a custom RESP3 PUSH handler: PASSED
#151 We properly handle a NIL invalidation payload: PASSED
#152 With no handler, PUSH replies come in-band: PASSED
#153 With no PUSH handler, no replies are lost: PASSED
#154 We set a default RESP3 handler for valkeyContext: PASSED
#155 We don't set a default RESP3 push handler for valkeyAsyncContext: PASSED
#156 Our VALKEY_OPT_NO_PUSH_AUTOFREE flag works: PASSED
#157 We can use valkeyOptions to set a custom PUSH handler for valkeyContext: PASSED
#158 We can use valkeyOptions to set a custom PUSH handler for valkeyAsyncContext: PASSED
#159 We can use valkeyOptions to set privdata: PASSED
#160 Our privdata destructor fires when we free the context: PASSED
#161 Successfully completes a command when the timeout is not exceeded: PASSED
#162 Does not return a reply when the command times out: SKIPPED
#163 Reconnect properly reconnects after a timeout: PASSED
#164 Reconnect properly uses owned parameters: PASSED
#165 Returns I/O error when the connection is lost: PASSED
#166 Returns I/O error on socket timeout: PASSED
#167 Set error when an invalid timeout usec value is used during connect: PASSED
#168 Set error when an invalid timeout sec value is used during connect: PASSED
#169 Append format command: PASSED
#170 Throughput:
(1000x PING: 0.010s)
(1000x LRANGE with 500 elements: 0.060s)
(1000x INCRBY: 0.012s)
(10000x PING (pipelined): 0.066s)
(10000x LRANGE with 500 elements (pipelined): 0.523s)
(10000x INCRBY (pipelined): 0.024s)
...

Link[1]: valkey-io/valkey#477
Signed-off-by: zhenwei pi <[email protected]>
Valkey Over RDMA[1] has been supported as experimental feature since Valkey 8.0. Support RDMA transport for the client side.

RDMA is not a builtin feature, supported as module only, so we have to run test.sh with more argument @VALKEY_RDMA_MODULE and @VALKEY_RDMA_ADDR.

An example to run test.sh:
VALKEY_RDMA_MODULE=/path/to/valkey-rdma.so VALKEY_RDMA_ADDR=192.168.122.1 TEST_RDMA=1 ./test.sh
...
Testing against RDMA connection (192.168.122.1:56379):
#138 Is able to deliver commands: PASSED
#139 Is a able to send commands verbatim: PASSED
#140 %s String interpolation works: PASSED
#141 %b String interpolation works: PASSED
#142 Binary reply length is correct: PASSED
#143 Can parse nil replies: PASSED
#144 Can parse integer replies: PASSED
#145 Can parse multi bulk replies: PASSED
#146 Can handle nested multi bulk replies: PASSED
#147 Send command by passing argc/argv: PASSED
#148 Can pass NULL to valkeyGetReply: PASSED
#149 RESP3 PUSH messages are handled out of band by default: PASSED
#150 We can set a custom RESP3 PUSH handler: PASSED
#151 We properly handle a NIL invalidation payload: PASSED
#152 With no handler, PUSH replies come in-band: PASSED
#153 With no PUSH handler, no replies are lost: PASSED
#154 We set a default RESP3 handler for valkeyContext: PASSED
#155 We don't set a default RESP3 push handler for valkeyAsyncContext: PASSED
#156 Our VALKEY_OPT_NO_PUSH_AUTOFREE flag works: PASSED
#157 We can use valkeyOptions to set a custom PUSH handler for valkeyContext: PASSED
#158 We can use valkeyOptions to set a custom PUSH handler for valkeyAsyncContext: PASSED
#159 We can use valkeyOptions to set privdata: PASSED
#160 Our privdata destructor fires when we free the context: PASSED
#161 Successfully completes a command when the timeout is not exceeded: PASSED
#162 Does not return a reply when the command times out: SKIPPED
#163 Reconnect properly reconnects after a timeout: PASSED
#164 Reconnect properly uses owned parameters: PASSED
#165 Returns I/O error when the connection is lost: PASSED
#166 Returns I/O error on socket timeout: PASSED
#167 Set error when an invalid timeout usec value is used during connect: PASSED
#168 Set error when an invalid timeout sec value is used during connect: PASSED
#169 Append format command: PASSED
#170 Throughput:
(1000x PING: 0.010s)
(1000x LRANGE with 500 elements: 0.060s)
(1000x INCRBY: 0.012s)
(10000x PING (pipelined): 0.066s)
(10000x LRANGE with 500 elements (pipelined): 0.523s)
(10000x INCRBY (pipelined): 0.024s)
...

Thanks to Michael Grunder for lots of review suggestions!

Link[1]: valkey-io/valkey#477
Signed-off-by: zhenwei pi <[email protected]>
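When comparing a run like the one above against a TCP run, it can help to tally the outcomes mechanically. The following is a minimal sketch that assumes result lines look exactly like the ones quoted in the commit message (`#<n> <description>: <STATUS>`). The `FAILED` status and the absence of terminal color codes are my assumptions, and `summarize` is a hypothetical helper, not part of test.sh.

```python
import re
from collections import Counter

# Matches lines such as "#138 Is able to deliver commands: PASSED".
_RESULT_RE = re.compile(r"^#(\d+) (.*): (PASSED|SKIPPED|FAILED)$")


def summarize(lines: list[str]) -> Counter:
    """Count test outcomes; lines that do not look like results are ignored."""
    counts: Counter = Counter()
    for line in lines:
        m = _RESULT_RE.match(line.strip())
        if m:
            counts[m.group(3)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "#161 Successfully completes a command when the timeout is not exceeded: PASSED",
        "#162 Does not return a reply when the command times out: SKIPPED",
        "#170 Throughput:",  # not a result line, ignored
    ]
    print(dict(summarize(sample)))
```

Lines without a trailing status, such as the throughput block, are skipped rather than counted, so the tally only reflects individual test cases.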
RDMA is the abbreviation of remote direct memory access. It is a technology that enables computers in a network to exchange data in main memory without involving the processor, cache, or operating system of either computer. As a result, RDMA performs better than TCP: the test results show that Valkey Over RDMA achieves ~2.5X the QPS and lower latency.

In recent years, RDMA has become popular in the data center; in particular, the RoCE (RDMA over Converged Ethernet) architecture is widely used. Cloud vendors have also started to offer RDMA instances to accelerate networking performance, so end users can benefit from the improvement easily.

Introduce the Valkey Over RDMA protocol as a new transport for Valkey. For now, we define 4 commands:
- GetServerFeature & SetClientFeature: these two commands are used to negotiate features for future extension. There is no feature definition in this version. Flow control and multi-buffer may be supported in the future; this needs feature negotiation.
- Keepalive
- RegisterXferMemory: the heart of transferring the real payload.

The 'TX buffer' and 'RX buffer' are built on RDMA remote memory using RDMA write/write with imm. This is similar (but not identical) to several mechanisms introduced by papers:
- Socksdirect: datacenter sockets can be fast and compatible <https://dl.acm.org/doi/10.1145/3341302.3342071>
- LITE Kernel RDMA Support for Datacenter Applications <https://dl.acm.org/doi/abs/10.1145/3132747.3132762>
- FaRM: Fast Remote Memory <https://www.usenix.org/system/files/conference/nsdi14/nsdi14-paper-dragojevic.pdf>

Thanks to Daniel House for review suggestions!

Link: valkey-io/valkey#477
Co-authored-by: Xinhao Kong <[email protected]>
Co-authored-by: Huaping Zhou <[email protected]>
Co-authored-by: zhuo jiang <[email protected]>
Co-authored-by: Yiming Zhang <[email protected]>
Co-authored-by: Jianxi Ye <[email protected]>
Signed-off-by: zhenwei pi <[email protected]>
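The four commands named in the commit message above could be carried as small fixed-size control messages. The following is a minimal C sketch of encoding and decoding a RegisterXferMemory command. Only the command names come from the text; the opcode values, the 32-byte size, the field layout (`addr`, `length`, `rkey`), the big-endian byte order, and all identifiers are illustrative assumptions, not the wire format defined by the Valkey Over RDMA specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Command names come from the text; the numeric values are assumptions. */
enum rdma_cmd_opcode {
    CMD_GET_SERVER_FEATURE = 0,
    CMD_SET_CLIENT_FEATURE = 1,
    CMD_KEEPALIVE = 2,
    CMD_REGISTER_XFER_MEMORY = 3,
};

/* Hypothetical fixed size of every control command, in bytes. */
#define RDMA_CMD_SIZE 32

/* Big-endian helpers so both peers agree on byte order regardless of host. */
static void put_be16(uint8_t *p, uint16_t v) {
    p[0] = (uint8_t)(v >> 8);
    p[1] = (uint8_t)(v & 0xff);
}
static void put_be32(uint8_t *p, uint32_t v) {
    put_be16(p, (uint16_t)(v >> 16));
    put_be16(p + 2, (uint16_t)(v & 0xffff));
}
static void put_be64(uint8_t *p, uint64_t v) {
    put_be32(p, (uint32_t)(v >> 32));
    put_be32(p + 4, (uint32_t)(v & 0xffffffffu));
}
static uint16_t get_be16(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}
static uint32_t get_be32(const uint8_t *p) {
    return ((uint32_t)get_be16(p) << 16) | get_be16(p + 2);
}
static uint64_t get_be64(const uint8_t *p) {
    return ((uint64_t)get_be32(p) << 32) | get_be32(p + 4);
}

/* Description of a remote buffer the peer is allowed to RDMA-write into. */
struct xfer_memory {
    uint64_t addr;   /* remote virtual address of the buffer */
    uint32_t length; /* buffer length in bytes */
    uint32_t rkey;   /* remote key of the registered memory region */
};

/* Assumed layout: opcode(2) | reserved(14) | addr(8) | length(4) | rkey(4). */
void encode_register_xfer_memory(uint8_t out[RDMA_CMD_SIZE],
                                 const struct xfer_memory *mem) {
    assert(out != NULL && mem != NULL);
    memset(out, 0, RDMA_CMD_SIZE); /* reserved bytes stay zero */
    put_be16(out, CMD_REGISTER_XFER_MEMORY);
    put_be64(out + 16, mem->addr);
    put_be32(out + 24, mem->length);
    put_be32(out + 28, mem->rkey);
}

/* Returns 0 on success, -1 if the buffer holds a different command. */
int decode_register_xfer_memory(const uint8_t in[RDMA_CMD_SIZE],
                                struct xfer_memory *mem) {
    assert(in != NULL && mem != NULL);
    if (get_be16(in) != CMD_REGISTER_XFER_MEMORY)
        return -1;
    mem->addr = get_be64(in + 16);
    mem->length = get_be32(in + 24);
    mem->rkey = get_be32(in + 28);
    return 0;
}
```

In this sketch, a peer would call `encode_register_xfer_memory()` with the address, length, and remote key of its registered receive buffer, and the other side would call `decode_register_xfer_memory()` to learn where it may write the payload. A fixed-size message keeps the control path trivial to parse and leaves reserved bytes for the feature negotiation the text anticipates.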
Adds an option to build RDMA support as a module:
To start valkey-server with RDMA, use a command line like the following:
- Implement the server side as a connection module only; this means RDMA
  support can NOT be compiled as built-in.
- Add the necessary information to README.md.
- Support 'CONFIG SET/GET'. For example, run 'CONFIG SET rdma.port 6380', then
  verify the change with 'rdma res show cm_id' and with valkey-cli (RDMA
  support in valkey-cli is not implemented in this patch).
- The full list of listeners looks like:
Because TCL lacks RDMA support, a simple C program (under tests/rdma/) is used
to test Valkey Over RDMA. This is a fairly raw version with basic library
dependencies: libpthread, libibverbs, librdmacm. Run using the script:
To run RDMA in GitHub Actions, RXE, a kernel module for emulated soft RDMA,
needs to be installed. The kernel module source code is fetched from a repo
containing only the RXE kernel driver from the Linux kernel; it is stored in a
separate repo to avoid cloning the whole Linux kernel repo.
In 2021/06, I created a PR proposing Redis Over RDMA. I then did some work to fully abstract the connection layer and make TLS dynamically loadable; since Redis 7.2.0, a new connection type can be built into Redis statically or as a separate shared library (loaded by Redis on startup).
Based on the new connection framework, I created a new PR, which several people (@xiezhq-hermann @zhangyiming1201 @JSpewock @uvletter @FujiZ) noticed, tried out, and tested. However, because the maintainers lacked the time and knowledge, that PR has been pending for about 2 years.
Related doc: Introduce Valkey Over RDMA specification (the same as the Redis one; the two should stay the same).
Changes in this PR:
Finally, if this feature is accepted for merging, I volunteer to maintain it.