Benchmark latency and storage of new events DB design #191
Doc to track all benchmarking datapoints: https://docs.google.com/document/d/14PTIC1dTNZBbuAtI91baFSPv9se4rO9sqcphV-81Yo8/edit
Crux of benchmarking code: Total events = 1 million * 3 (different types of events)
// Iterations:
Entire code here in gist. We will most likely throw this code away, since we do not want to ingest this much data as part of running tests.
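For readers without access to the gist, here is a minimal sketch of how such a latency/allocation measurement could be set up with the standard `testing.Benchmark` helper. It is not the code from the gist: `event`, `fetchEvents`, and `benchmarkFetch` are hypothetical stand-ins, and the in-memory scan only imitates the shape of a ledger-range query.

```go
package main

import "testing"

// event is a hypothetical stand-in for a stored event row.
type event struct {
	ledger     uint32
	contractID string
}

// fetchEvents is a hypothetical stand-in for the getEvents query path:
// it scans all events and keeps those in the ledger window [start, start+n).
func fetchEvents(events []event, start, n uint32) []event {
	var out []event
	for _, e := range events {
		if e.ledger >= start && e.ledger < start+n {
			out = append(out, e)
		}
	}
	return out
}

// benchmarkFetch measures time and allocations per fetchEvents call
// for a given ledger window n.
func benchmarkFetch(events []event, n uint32) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			fetchEvents(events, 0, n)
		}
	})
}
```

Calling `benchmarkFetch(events, n)` for several values of `n` and reading `NsPerOp()` and `AllocsPerOp()` from the result gives one latency/allocs datapoint per ledger window.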
Some initial benchmarks: Command: Here N = ledger restriction number, meaning how many ledgers we look at for a given fetch query.
As we can see here, latency goes up as N goes up. Ideally we want a sweet spot where N is neither too high nor too low and latency stays under an upper bound in the 20-50 ms range. Note: the current average latency of getEvents is around 4-6 ms, but it will go up with the new DB design.
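To make the role of N concrete, here is a small sketch of how a ledger restriction could be applied to a requested range before querying. The function name and the half-open `[start, end)` convention are assumptions for illustration, not the actual implementation.

```go
package main

// clampLedgerRange limits a requested half-open ledger range [start, end)
// to at most maxLedgers ledgers, keeping the start fixed.
// An inverted range (end < start) is treated as empty.
func clampLedgerRange(start, end, maxLedgers uint32) (uint32, uint32) {
	if end < start {
		end = start
	}
	if end-start > maxLedgers {
		// Safe from overflow: start+maxLedgers < end here.
		end = start + maxLedgers
	}
	return start, end
}
```

With N = 2000, a request for ledgers 100 to 5000 would be narrowed to 100 to 2100, which bounds the number of rows a single fetch can touch.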
Uhm, I am sure the allocations can be reduced.
@2opremio What would be an acceptable latency for getEvents? Right now it is super fast (6 ms) because of the in-memory design. Wanted to get some ballpark numbers. Does 50-70 ms sound okay?
I think we should try to get it down as much as possible (within reason, by timeboxing it). My guess is that the allocations can be reduced by quite a bit, which should help a lot.
Thanks for the input. I need to do some detailed memory profiling to analyze the allocs here. N = 2000 and events = 1 million events * 3
FYI, we will not go lower than a 2000-ledger limit. pprof mem.prof:
Profile of getEvents:
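A file like `mem.prof` is typically produced by `go test -bench . -memprofile mem.prof` and inspected with `go tool pprof mem.prof`. As a sketch of the same idea done programmatically, the helper below captures a heap profile with `runtime/pprof`; the function name `heapProfileBytes` is an assumption for illustration and is not taken from the benchmark code.

```go
package main

import (
	"bytes"
	"runtime"
	"runtime/pprof"
)

// heapProfileBytes returns a heap profile in the format read by
// `go tool pprof`. Running a GC first makes the statistics up to date.
func heapProfileBytes() ([]byte, error) {
	runtime.GC()
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}
```

Writing the returned bytes to a file right after the fetch under test gives a profile that can be opened with `go tool pprof -sample_index=alloc_space` to see which call sites allocate the most.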
Much faster with contract_ids in the filter: [Indexed] Reduction in allocs as well. Case: Requested Also removed the ingestion part from the benchmarking test and got the following --> 0.17 ms N = 2000 and events = 1 million events * 3
Updated benchmark doc: https://docs.google.com/document/d/14PTIC1dTNZBbuAtI91baFSPv9se4rO9sqcphV-81Yo8/edit IMO it is safe to start with a 4000-ledger restriction limit. Just FYI, we also have the following things in place to protect us from intensive fetch requests:
Allocs are still high for the worst-case query. One possible reason is that we load all the rows in the (x to x + N) range. I don't know if there is a more efficient way of doing it to reduce allocs. cc @2opremio
Maybe you can stream the rows into a smaller slice to build the replay instead of allocating all of them at once?
You may even be able to use a
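A rough sketch of the streaming suggestion, under stated assumptions: `eventRow`, `rowIter`, `sliceIter`, and `collectMatching` are all hypothetical names, and the iterator only imitates the `Next`/`Scan` shape of a database cursor. The point is that rows are scanned one at a time into a reused scratch value, and only matching rows up to the limit are kept.

```go
package main

// eventRow is a hypothetical stand-in for one row of the events table.
type eventRow struct {
	ledger     uint32
	contractID string
}

// rowIter is a hypothetical minimal cursor, similar in shape to a DB cursor.
type rowIter interface {
	Next() bool
	Scan(dst *eventRow) error
}

// sliceIter adapts an in-memory slice to rowIter for demonstration.
type sliceIter struct {
	rows []eventRow
	pos  int
}

func (s *sliceIter) Next() bool {
	if s.pos >= len(s.rows) {
		return false
	}
	s.pos++
	return true
}

func (s *sliceIter) Scan(dst *eventRow) error {
	*dst = s.rows[s.pos-1]
	return nil
}

// collectMatching streams rows one at a time into a single reused scratch
// value, keeps only rows accepted by match, and stops once limit is reached.
func collectMatching(it rowIter, match func(*eventRow) bool, limit int) ([]eventRow, error) {
	if limit < 0 {
		limit = 0
	}
	out := make([]eventRow, 0, limit)
	var scratch eventRow
	for len(out) < limit && it.Next() {
		if err := it.Scan(&scratch); err != nil {
			return nil, err
		}
		if match(&scratch) {
			out = append(out, scratch)
		}
	}
	return out, nil
}
```

Compared with loading every row in the (x to x + N) range first, the result slice here is bounded by the response limit rather than by N, and iteration stops as soon as enough matches are found.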
TODO:
Ideally we should run the benchmarks on a standard EC2 instance, or find a way to limit cores and memory while running the benchmark (e.g. passing cpu_limit to benchmark commands)
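One possible way to approximate this from inside the process, sketched under assumptions: `go test` already accepts `-cpu 1` to run benchmarks with a given GOMAXPROCS, and the helper below does the same programmatically while also setting the Go runtime's soft memory limit (`debug.SetMemoryLimit`, Go 1.19+). The names `limitResources` and `effectiveCPUs` are hypothetical. This is not a substitute for a real instance or container limit, since the memory limit is only a soft target for the garbage collector.

```go
package main

import (
	"runtime"
	"runtime/debug"
)

// limitResources caps the number of OS threads running Go code and sets a
// soft memory limit for the runtime. It returns a function that restores
// the previous settings.
func limitResources(cpus int, memBytes int64) (restore func()) {
	prevProcs := runtime.GOMAXPROCS(cpus)
	prevMem := debug.SetMemoryLimit(memBytes)
	return func() {
		runtime.GOMAXPROCS(prevProcs)
		debug.SetMemoryLimit(prevMem)
	}
}

// effectiveCPUs reports the current GOMAXPROCS value without changing it.
func effectiveCPUs() int {
	return runtime.GOMAXPROCS(0)
}
```

A benchmark could call `restore := limitResources(1, 1<<30)` at the start and `defer restore()` so that results from different machines are at least constrained in the same way.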