cargo build returns error and perf improvement #29
Thanks for your report. I will look into it first thing tomorrow!
We hit a known issue in cargo: rust-lang/cargo#4544 At the moment it seems like there is no way around the problem other than removing the dependency. The last commit 0b65cb2 should make it compile on your machine as well. Can you confirm? Cheers,
It works now! Thanks! On CentOS a libclang version > 3.9 is needed, so I had to get that there since the default version is 3.4. I'm trying to run the performance test: ➜ performance git:(master) ✗ python rediSQL_bench.py The Python redis package installed is https://pypi.org/project/redis/ version 2.10.6. Looks like there is some protocol/version mismatch. Could you please also shed some light on this? Thanks,
Hi Xin, actually it is my fault. The performance test should not be there. It is quite hard to get sensible measurements out of a Python script like the one I tried to write. If you want to stress test RediSQL, the best thing to do is to run the scripts that follow (I wrote them by heart, there may be some minor mistakes). What I usually do is something like this. Inside redis-cli we create the environment.
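(The original snippet was lost in formatting; below is a sketch of the setup from memory, as the author says. `REDISQL.CREATE_DB` and `REDISQL.EXEC` are the module's commands; the database name `DB` and the table schema are illustrative assumptions.)

```shell
# Create a database named DB and a simple two-column table inside it.
redis-cli REDISQL.CREATE_DB DB
redis-cli REDISQL.EXEC DB "CREATE TABLE test(a INT, b INT);"
```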
Then we stress test it with redis-benchmark.
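(Again a reconstruction, since the command was lost. redis-benchmark can send an arbitrary command; the flags mirror the $n_client/$total_time defaults discussed in this thread, and the table name matches the assumed setup above.)

```shell
# $n_client=50 parallel connections, $total_time=100000 requests in total.
redis-benchmark -c 50 -n 100000 REDISQL.EXEC DB "INSERT INTO test VALUES(1, 2);"
```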
With this test you are inserting the values 1 and 2 into the table $total_time times (100_000 by default) using $n_client clients (50 by default). What you are really testing here is the number of transactions you are executing. I expect it to be well over 10k/sec, but it definitely depends on your machine. On my machines setting $n_client to ~200 gives the best results. Again, you are testing the number of write transactions, not the number of inserts; doing something like this:
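(A sketch of the multi-row variant, same assumed table as above:)

```shell
redis-benchmark -c 50 -n 100000 REDISQL.EXEC DB "INSERT INTO test VALUES(1, 2), (1, 2), (1, 2);"
```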
will insert three times as much data in the same number of transactions, so it will be pretty similar in throughput and latency. Similarly you could test the reads, but be aware that if you are reading a lot of data you need to consider both the time to transfer the data and the time to find the values in the database. The write transaction is definitely the simplest thing to test. If you need to, you could also test multiple databases:
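(Reconstructed setup for multiple databases; the names DB1 and DB2 are illustrative:)

```shell
redis-cli REDISQL.CREATE_DB DB1
redis-cli REDISQL.EXEC DB1 "CREATE TABLE test(a INT, b INT);"
redis-cli REDISQL.CREATE_DB DB2
redis-cli REDISQL.EXEC DB2 "CREATE TABLE test(a INT, b INT);"
```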
And then run multiple instances of redis-benchmark:
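(A sketch; one benchmark process per database, run concurrently:)

```shell
redis-benchmark -c 50 -n 100000 REDISQL.EXEC DB1 "INSERT INTO test VALUES(1, 2);" &
redis-benchmark -c 50 -n 100000 REDISQL.EXEC DB2 "INSERT INTO test VALUES(1, 2);" &
wait
```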
I am expecting this to scale sublinearly with the number of databases. I would love it if you could share the results of your benchmarks.
Thanks for the detailed response. I'm evaluating RediSQL for a project with a lot of writes and a reasonable amount of reads. I'll try more and let you know. Thanks!
I figured out that I should provide at least some form of benchmark so I could get some data. These measurements are from an AWS EC2 c4.8xlarge, which is a beast of a machine, and the module leaves it pretty idle. After all Redis is single-threaded and SQLite is single-threaded on writes, so getting 180% of a CPU was already quite good IMHO. Anyway, here are the raw results: https://gist.github.com/siscia/b8a960a1af68cf3c6d27a2513cd44046 I will write an article about them in the next few days. Spoiler: I got north of 50k write transactions per second. I don't see many ways to get orders of magnitude better than this; maybe we can shave something off, but nothing too big. Cheers,
Thanks for the results. Have you tried with bigger text columns? I tried an experiment with about 20 columns: 10 int, 10 text, each text column around 10k in size (yes, it's big), so one row is around 100k. Benchmarking shows around 2k requests/s. With pure Redis SET and a 100k value size I get around 10k requests/s. Of course pure SET is easy, while RediSQL does a lot of indexing work in the background. This is with redis-benchmark; do you think the bulk input pipe would help? I guess not, given redis-benchmark already pipelines. So it looks like to get even higher throughput one has to do sharding...
Wow, you are really stressing it! When you talk about 10k, do you mean 10k bytes? Do you mind sharing your test? I am talking about the table structure and the redis-benchmark command you used. I hadn't thought about such an extreme use case, but let's see if we can improve it.
Sure. I'm just doing something like this: where cmd_short.txt is like: Each ASDF... is a big string with a size around 10k. This gives me about 2k requests/s. I also tried pure SET: So there is about a 5x difference, and I'm assuming this is because pure SET is really simple while RediSQL does a lot more work.
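(The actual command and file contents were lost in formatting; a hedged reconstruction of what such a test could look like. The `--pipe` usage, the table name, and the SET flags are assumptions based on the description; `ASDF...` stands in for the ~10k strings.)

```shell
# Feed pre-built insert commands to the server; cmd_short.txt holds lines like
#   REDISQL.EXEC DB "INSERT INTO big VALUES(1, ..., 'ASDF...', ...);"
cat cmd_short.txt | redis-cli --pipe

# Pure SET with a comparable ~100k value size, for comparison.
redis-benchmark -t set -n 100000 -d 100000
```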
Ahahaha, alright, it was simpler than I thought. Thank you so much! There are some places where we are doing allocations and data movement that we should be able to avoid. I guess this is the reason for the huge difference in performance. I will try some experiments and come back to you as soon as possible (it may take a few days). Or, if you know Rust and want to take a shot at this, I can provide guidance; beware it could become quite complex quite fast, since there are going to be quite a few lifetimes to manage.
I don't know Rust... but if you could provide a few pointers to where those allocations happen, I can probably take a stab and see how far I can go. Thanks!
It would be quite complex in Rust, especially if you don't know the language. If you are interested in learning it and want to try, I would start looking here: https://github.com/RedBeardLab/rediSQL/blob/master/src/lib.rs#L154 and see if we can remove that. I believe it is not going to be easy, and I am quite sure it is not the most friendly way to learn Rust. Anyway, what I would suggest first is making sure that SQLite itself can provide better performance. I ran this TCL script.
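(The script itself was lost in formatting; a sketch in TCL of what a raw-SQLite insert benchmark could look like. The schema, row count, and string size are assumptions chosen to resemble the workload discussed in this thread, not the author's actual script.)

```tcl
package require sqlite3

sqlite3 db bench.db
db eval {CREATE TABLE IF NOT EXISTS test(a INT, b TEXT);}

# One big text value, roughly 10k characters.
set big [string repeat A 10000]
set nrows 1000000

set start [clock milliseconds]
# A single transaction, so we measure raw insert speed rather than fsync latency.
db transaction {
    for {set i 0} {$i < $nrows} {incr i} {
        db eval {INSERT INTO test VALUES($i, $big);}
    }
}
set elapsed [expr {([clock milliseconds] - $start) / 1000.0}]
puts "inserted $nrows rows in ${elapsed}s"
db close
```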
And it seems to have quite reasonable performance. If you could replicate it and confirm that this is what we want, that would be great!
This is from the same host where I tried Redis: ➜ redis tclsh sqlite_test.tcl Looks pretty good. Is it 300k rows per second? I see 10m rows added in 30s :--) And each row is around 100k in size.
As you can see, single-thread performance is not the issue. Unfortunately we will never get to 300k/s, since Redis tops out around 10k/s; something more realistic would be 5k/s, but even that is not easy... I will try my best and keep you updated :)
Thanks! I have a few quick questions regarding redisql and hope you could help:
Sure!
https://sqlite.org/queryplanner.html
Hi Xin, would you mind running your performance tests again? I would like you to compare the results of the master branch with the results from the branch. In my tests the performance for very heavy inserts doubled from one to the other, and I would like some independent confirmation from you as well. As always, you should be able to compile with cargo build --release. Thanks, Simone
Sure. I just pulled it and got a compilation error. Is there something not committed?
error[E0583]: file not found for module
error: aborting due to previous error
The following warnings were emitted during compilation:
warning: src/CDeps/SQLite/sqlite3.c: In function ‘exprAnalyze’:
error: Could not compile
Sorry, you should be good now. Thanks for catching this error, really appreciated.
I get around 8.2k requests/s with around a 100k payload :--) Much much better!
By the way, I just tried pure Redis SET with a few combinations of benchmark arguments; it turns out I can get to 27k/s: ➜ redis redis-benchmark -t set -n 1000000 -c 1 -d 114000 -P 100 I'll try RediSQL with a few different combinations and see how it goes.
Looks like I can get to a bit more than half of what pure SET can get: 14710.21 requests per second Mon Apr 23 07:15:56 PDT 2018
Wow, personally I am quite satisfied by 8k requests/s (which is the same figure I was getting); if you can get that up to 14k requests/s, even better. Do you need it to go even faster? That would require some significant work, and I am not even sure what makes it slower than pure Redis. Between today and tomorrow I will make an official release with these changes :) Thanks so much for your feedback, it really helped. If you are satisfied, I would suggest you close the issue 😃
Edited the title to reflect more of what is discussed, and closing. I'll also try with some more indexing in RediSQL :--) we'll see. Thanks!
Wonderful! Thank you so much, your feedback has been fundamental. Also, I am looking for beta testers for the PRO version (the one with replication); if you are interested let me know (you should be able to see my email) and I will provide you with the executable. Thanks again!
➜ rediSQL git:(master) cargo build --release
error: failed to load source for a dependency on engine_pro
Caused by: Unable to update file:///home/tetter/redis/rediSQL/engine_pro
Caused by: failed to read /home/tetter/redis/rediSQL/engine_pro/Cargo.toml
Caused by: No such file or directory (os error 2)
Looks like the Cargo.toml file is missing in engine_pro?
Thanks,
Xin