Support fiber.yield on storages for crud.pairs() with new option #312
DifferentialOrange added the teamE, bug, and question labels on Aug 3, 2022.
DifferentialOrange added commits that referenced this issue on Mar 21, 2023:

Yield on `select`/`pairs` tuple lookup on storage.

The fiber yields every `yield_every` records (same as in `count`). `yield_every` should be a positive integer; the default is 1000.

crud code contains several `while true` loops. Three of them are related to the retry strategy. Since the retries wrap net.box calls, which yield, they shouldn't be dangerous. The other two are in the storage select procedure: scrolling to the `after` tuple and records filtering. If there are a lot of records that do not satisfy the conditions, the storage gets stuck at 100% CPU load, as in #312. This patch covers these two cases.

There are also pairs loops without any yields on the router, in fields extraction. Since the number of records is not expected to be that big there (we already work with the final set that will be sent to the user), this patch doesn't change the behavior on the router.

Closes #312
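The yielding behavior described in the commit message can be modeled roughly as follows. This is a minimal sketch in Python, not crud's actual Lua code: `scan_with_yields`, `matches`, and `give_up_cpu` are hypothetical names, and `give_up_cpu` merely stands in for a call such as `fiber.yield()` on the storage. Only `yield_every`, its positive-integer requirement, and its default of 1000 come from the commit message.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def scan_with_yields(
    records: Iterable[T],
    matches: Callable[[T], bool],
    give_up_cpu: Callable[[], None],
    yield_every: int = 1000,
) -> List[T]:
    """Filter records, handing control back after every `yield_every` lookups.

    `give_up_cpu` is a hypothetical stand-in for a cooperative yield
    (fiber.yield() in Tarantool); it is an assumption of this sketch.
    """
    if not isinstance(yield_every, int) or yield_every <= 0:
        raise ValueError("yield_every should be a positive integer")

    result: List[T] = []
    looked_up = 0
    for record in records:
        looked_up += 1
        if matches(record):
            result.append(record)
        # Count every record looked up, not only the matching ones:
        # a long run of non-matching records is exactly the case that
        # would otherwise keep the CPU busy without any yield.
        if looked_up % yield_every == 0:
            give_up_cpu()
    return result


if __name__ == "__main__":
    yields = []
    hits = scan_with_yields(
        range(10_000),
        matches=lambda r: r < 5,          # matches sit at the start of the scan
        give_up_cpu=lambda: yields.append(1),
        yield_every=1000,
    )
    print(hits, len(yields))
```

The counter tracks lookups rather than matches, which is the point of the fix: in the scenario from this issue, the matching tuples are at the beginning of the index and the remainder of the scan finds nothing, so a counter based on matches would never trigger a yield.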
DifferentialOrange added two more commits that referenced this issue on Mar 22, 2023, with the same commit message.
DifferentialOrange added a commit that referenced this issue on Mar 24, 2023:

Overview

This release fixes a critical bug that resulted in 100% storage CPU load and fixes a couple of issues related to the development pipeline.

Changes

* Rename `DEV` environment variable to `TARANTOOL_CRUD_ENABLE_INTERNAL_CHECKS` (#250).

Bugfixes

* Yield on select/pairs storage tuple lookup (#312).
* Fix loaded functions misleading coverage (#249).
Our aim is to get data from the Tarantool cluster using crud.pairs(), filtering by the second field of the PK index and fetching in batches.
We have a significant amount of data in the space we need to process (10Gb per replica set, 4 replica sets), and most of the tuples we need are located at the beginning of the index.
In this situation, we faced 100% CPU load when CRUD tries to return the last batch of data from each storage instance.
In this case, the storage instance is locked until the index scan finishes.
This is a bad situation, and it would be great to have an option in crud.pairs() that allows us to enable fiber.yield on the storage instance if we can sacrifice data consistency.

Space definition:

Script: