Cache Real Time Streamer Message in Redis #262
darunrs changed the title from "Store Real Time Blocks in Redis" to "Store Real Time Streamer Message in Redis" on Sep 29, 2023.
darunrs changed the title from "Store Real Time Streamer Message in Redis" to "Cache Real Time Streamer Message in Redis" on Oct 2, 2023.
darunrs added a commit that referenced this issue on Oct 5, 2023:
The streamer message is used by both the coordinator and the runner, but both currently poll the message from S3, which carries a large latency cost. To improve this, the streamer message will now be cached in Redis with a TTL and pulled by the runner from Redis; only on a cache miss will the runner fall back to S3.

Pulling from S3 currently takes 200-500ms, roughly 80-85% of a function's overall execution time in the runner. With caching, a cache hit loads the data in 1-3ms in local testing, which corresponds to about 3-5% of the execution time, or a 1100% improvement in latency. Cutting network activity to a much smaller share of execution time also greatly reduces the variability of a function's execution time. Cache hits and misses will be logged so the TTL can be tuned to reduce misses.

In addition, processing the block takes around 1-3ms. This processing has been moved to before caching, saving an extra 1-3ms each time a block is read from cache. That improvement will matter for historical backfill, which is planned to be optimized soon.

Tracking Issue: #262
Parent Issue: #204
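The read path described above is a standard cache-aside pattern. The sketch below illustrates it under stated assumptions: an in-memory `Map` stands in for Redis, `fetchFromS3` is a placeholder for the real S3 call, and all names (`getStreamerMessage`, the key format, the TTL value) are hypothetical, not the project's actual API.

```typescript
type StreamerMessage = { blockHeight: number; shards: unknown[] };

const CACHE_TTL_MS = 60_000; // assumed TTL; the real value would be tuned via hit/miss metrics

// In-memory stand-in for Redis; a real implementation would use a Redis
// client and set the TTL on write (e.g. SET key value EX seconds).
const cache = new Map<string, { value: StreamerMessage; expiresAt: number }>();

function cacheSet(key: string, value: StreamerMessage): void {
  cache.set(key, { value, expiresAt: Date.now() + CACHE_TTL_MS });
}

async function fetchFromS3(blockHeight: number): Promise<StreamerMessage> {
  // Placeholder for the real S3 GetObject call (200-500ms in practice).
  return { blockHeight, shards: [] };
}

// Runner-side read path: try the cache first, fall back to S3 on a miss.
async function getStreamerMessage(blockHeight: number): Promise<StreamerMessage> {
  const key = `streamer:block:${blockHeight}`;
  const entry = cache.get(key);
  if (entry && entry.expiresAt > Date.now()) {
    console.log('cache hit', key); // hits and misses are logged for TTL tuning
    return entry.value;
  }
  console.log('cache miss', key);
  const message = await fetchFromS3(blockHeight);
  cacheSet(key, message); // repopulate so subsequent reads hit
  return message;
}
```

In the actual design the coordinator writes the cache and the runner reads it; the fallback-and-repopulate step shown here covers the cache-miss case the issue calls out.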
darunrs added a commit that referenced this issue on Oct 30, 2023, with the same commit message as the Oct 5 commit above.
- Migrate AWS SDK from v2 to v3.
- Cache streamer message in Redis in Coordinator.
- Pull cached message from Redis in Runner if not historical.
- Migrate camel case processing to Coordinator.
- Log metrics for cache hits and misses.
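The camel-case step moved to the coordinator converts the snake_case keys of the raw streamer message before caching, so readers pay that 1-3ms only once. A minimal sketch of such a conversion, with hypothetical helper names (the project's actual implementation may differ):

```typescript
// Convert a single snake_case identifier to camelCase.
function toCamelCase(s: string): string {
  return s.replace(/_([a-z])/g, (_, c: string) => c.toUpperCase());
}

// Recursively rename the keys of a parsed JSON value; arrays and
// primitives pass through, objects get their keys converted.
function camelCaseKeys(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(camelCaseKeys);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>).map(
        ([k, v]) => [toCamelCase(k), camelCaseKeys(v)],
      ),
    );
  }
  return value;
}
```

Running this once in the coordinator before the Redis write means every cache hit in the runner receives an already-processed message.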