Support fiber.yield on storages for crud.pairs() with new option #312

Closed
dkasimovskiy opened this issue Jul 20, 2022 · 0 comments · Fixed by #351

@dkasimovskiy

Our goal is to fetch data from a Tarantool cluster using crud.pairs() with batching, filtering by the second field of the primary key index.
The space holds a significant amount of data to process (10 GB per replica set, 4 replica sets), and most of the tuples we need are located at the beginning of the index.
In this setup we see 100% CPU load when CRUD tries to return the last batch of data from each storage instance.
The storage instance stays blocked until the index scan finishes.
This is a bad situation, and it would be great to have an option in crud.pairs() that enables fiber.yield on the storage instance for cases where we can sacrifice data consistency.
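
To illustrate why the last batch is the expensive one, here is a minimal plain-Lua model of the scan. It is only a sketch: the `index` table, the `next_batch` helper, and the sizes are hypothetical and stand in for the real storage-side lookup, which is not shown in this issue.

```lua
-- Hypothetical model: the primary index is a list sorted by id.
-- Only the first 10 tuples carry the code we are looking for.
local index = {}
for id = 1, 1000 do
    index[id] = { id = id, code = id <= 10 and 'TEST' or 'OTHER' }
end

-- Collect up to batch_size matching tuples, starting after position `after`.
-- Returns the batch, the last visited position, and the number of tuples examined.
local function next_batch(after, batch_size)
    local batch, examined = {}, 0
    local pos = after
    while pos < #index and #batch < batch_size do
        pos = pos + 1
        examined = examined + 1
        if index[pos].code == 'TEST' then
            batch[#batch + 1] = index[pos]
        end
    end
    return batch, pos, examined
end

local first, pos1, examined1 = next_batch(0, 5)
local second, pos2, examined2 = next_batch(pos1, 5)
local last, pos3, examined3 = next_batch(pos2, 5)

print('first batch:', #first, 'examined', examined1)
print('second batch:', #second, 'examined', examined2)
print('last batch:', #last, 'examined', examined3)
```

In this model the first two batches each examine only as many tuples as they return, while the last one walks the whole remainder of the index without finding a match. If such a loop never yields, nothing else can run on the instance until it ends.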

Space definition:

return {
    up = function()
        local utils = require('migrator.utils')
        
        local example_space = box.schema.space.create('example', { if_not_exists = true })
        example_space:format({
            { name = "id", type = "integer" },
            { name = "code", type = "string" },
            { name = "channel", type = "string" },
            { name = "content", type = "string" },
            { name = "bucket_id", type = "unsigned" }
        })
        example_space:create_index("primary", { parts = { { field = "id" }, { field = "code" }, { field = "channel" } },
                                                  if_not_exists = true })
        example_space:create_index("bucket_id", { parts = { { field = "bucket_id" } },
                                                    unique = false,
                                                    if_not_exists = true })
        utils.register_sharding_key('example', {'id'})                                                  

        return true
    end
}

Script:

local crud = require('crud')
local json = require('json')
local log = require('log')
local vshard = require('vshard')

local function example()
    -- Map each replica set UUID to one of its bucket ids,
    -- so that every replica set can be addressed through a single bucket.
    local replica_sets = {}
    for k, v in ipairs(vshard.router.buckets_info()) do
        if v.uuid ~= nil then
            replica_sets[v.uuid] = k
        end
    end
    log.info('REPLICA SETS: %s', json.encode(replica_sets))

    for rs_uuid, bucket_id in pairs(replica_sets) do
        log.info('Begin replica set processing: %s, %s', rs_uuid, bucket_id)
        -- 'example' is the space from the definition above;
        -- 'code' is the second part of its primary key.
        for _, object in crud.pairs('example',
            { { '==', 'code', 'TEST' } },
            { batch_size = 1000, bucket_id = bucket_id, prefer_replica = true, timeout = 60 }) do
            -- do some work
        end
        log.info('End replica set processing: %s, %s', rs_uuid, bucket_id)
    end
end

return {
    example = example
}
@dkasimovskiy dkasimovskiy added the feature A new functionality label Jul 20, 2022
@DifferentialOrange DifferentialOrange added teamE bug Something isn't working question Further information is requested labels Aug 3, 2022
@DifferentialOrange DifferentialOrange self-assigned this Mar 21, 2023
DifferentialOrange added a commit that referenced this issue Mar 21, 2023
DifferentialOrange added a commit that referenced this issue Mar 21, 2023
Yield on `select`/`pairs` tuple lookup on storage. The fiber yields
every `yield_every` records (same as in `count`). `yield_every` should
be a positive integer; the default is 1000.

crud code contains several `while true` loops. Three of them are related
to the retry strategy. Since the retries wrap net.box calls, which
yield, they shouldn't be dangerous. The other two are in the storage
select procedure: the `after` tuple scroll and record filtering. If
many records do not satisfy the conditions, the storage gets stuck
with 100% CPU load, as in #312. This patch covers these two cases.

There are also pairs loops without any yields on the router: fields
extraction. Since the number of records is not expected to be that big
there (we already work with the final set that is sent to the user),
this patch doesn't change the behavior on the router.

Closes #312
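
The yielding pattern described in this commit message can be sketched roughly as follows. This is not the code from the patch: `filter_with_yield`, its arguments, and the no-op fallback are hypothetical names introduced for illustration. The only real call is `fiber.yield()` from Tarantool's `fiber` module.

```lua
-- `fiber` exists only inside Tarantool; fall back to a no-op
-- so that the sketch also runs in a plain Lua interpreter.
local ok, fiber = pcall(require, 'fiber')
local default_yield = ok and fiber.yield or function() end

-- Hypothetical helper: filter `tuples` with `predicate`, stop at `limit`
-- results, and give up control every `yield_every` examined tuples.
local function filter_with_yield(tuples, predicate, limit, yield_every, yield_fn)
    assert(type(yield_every) == 'number' and yield_every > 0
        and yield_every % 1 == 0, 'yield_every should be a positive integer')
    yield_fn = yield_fn or default_yield

    local result = {}
    local looked_up = 0
    for _, tuple in ipairs(tuples) do
        if #result >= limit then
            break
        end
        if predicate(tuple) then
            result[#result + 1] = tuple
        end
        -- Count every examined tuple, not only the matching ones:
        -- the long stretch of non-matching tuples is what blocks the storage.
        looked_up = looked_up + 1
        if looked_up % yield_every == 0 then
            yield_fn()
        end
    end
    return result, looked_up
end
```

Judging by the message above, the caller would pass the option next to the existing ones, for example `crud.pairs('example', conditions, { batch_size = 1000, yield_every = 100 })`. Treat the exact call shape as an assumption and check it against the crud documentation for your version.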
DifferentialOrange added a commit that referenced this issue Mar 24, 2023
Overview

  This release fixes a critical bug that resulted in 100% storage CPU
  load and fixes a couple of issues related to the development pipeline.

Changes
  * Rename `DEV` environment variable to
    `TARANTOOL_CRUD_ENABLE_INTERNAL_CHECKS` (#250).

Bugfixes
  * Yield on select/pairs storage tuple lookup (#312).
  * Fix loaded functions misleading coverage (#249).