Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dragonfly does not allow accessing undeclared keys in lua scripts #272

Closed
dranikpg opened this issue Sep 7, 2022 · 28 comments
Closed

Dragonfly does not allow accessing undeclared keys in lua scripts #272

dranikpg opened this issue Sep 7, 2022 · 28 comments
Assignees
Labels
bug Something isn't working

Comments

@dranikpg
Copy link
Contributor

dranikpg commented Sep 7, 2022

Describe the bug

Discovered when testing #182. The following script:

local conj_ttl = redis.call('ttl', conj_key)
if conj_ttl < timeout then
    -- some code
end

expects conj_ttl always to be a number. Dragonfly instead returns errors as lua tables (aka js-like objects) with a detailed cause.

To Reproduce

Dragonfly:

127.0.0.1:6379> EVAL "return redis.call('ttl', 'nonexistent')" 0
(error) ERR script tried accessing undeclared key

Redis:

127.0.0.1:6379> EVAL "return redis.call('ttl', 'nonexistent')" 0
(integer) -2

Expected behavior

Return sentinel errors in scripts as lua numbers

@dranikpg dranikpg added the bug Something isn't working label Sep 7, 2022
@dranikpg dranikpg changed the title Dragonfly propagates sentinel errors to tables in lua scripts Dragonfly propagates sentinel errors to tables with cause in lua scripts Sep 7, 2022
@romange
Copy link
Collaborator

romange commented Sep 8, 2022

Well, it's not a bug yet - it's by design.
DF implements lua scripting in strict mode, where if you access keys within a script you need to declare them first like this:
EVAL "return redis.call('ttl', 'nonexistent')" 1 nonexistent

@romange
Copy link
Collaborator

romange commented Sep 8, 2022

the question is whether https://github.com/Suor/django-cacheops tests pass - #184

@dranikpg
Copy link
Contributor Author

dranikpg commented Sep 8, 2022

the question is whether https://github.com/Suor/django-cacheops tests pass - #184

They don't, because the comparison in the snippet fails. It expects a number on the left

@romange
Copy link
Collaborator

romange commented Sep 8, 2022

please deep dive into this because when I look at their code:

https://github.com/Suor/django-cacheops/blob/917a0fbd92dccd2f64a2a28a245436057c3f1f27/cacheops/query.py#L53
i see that they do pass keys they access. So I wonder where the gap is.

you can run dragonfly with --vmodule=main_service=2 to see all the incoming commands.

@dranikpg
Copy link
Contributor Author

dranikpg commented Sep 8, 2022

conj_key is not passed anywhere - its generated in conj_cache_key

-- the keys that are passed to EVAL
local prefix = KEYS[1]
local key = KEYS[2]
local precall_key = KEYS[3]

-- conj depends on arguments
local conj_cache_key = function (db_table, conj)
    local parts = {}
    for field, val in pairs(conj) do
        table.insert(parts, field .. '=' .. tostring(val))
    end

    return prefix .. 'conj:' .. db_table .. ':' .. table.concat(parts, '&')
end

 -- conj_key is generated here
local conj_key = conj_cache_key(db_table, conj)
redis.call('sadd', conj_key, key)

So its programmatically generated

The Redis docs state the following:

Important: to ensure the correct execution of scripts, both in standalone and clustered deployments, all names of keys that a script accesses must be explicitly provided as input key arguments. The script should only access keys whose names are given as input arguments. Scripts should never access keys with programmatically-generated names or based on the contents of data structures stored in the database

I guess django-cacheops is using it incorrectly

@romange
Copy link
Collaborator

romange commented Sep 8, 2022

hmmm, ok...

then this task will be harder for you to fix. we can support undeclared keys as well, but it requires a really good understanding how transactions work in DF.

@romange romange changed the title Dragonfly propagates sentinel errors to tables with cause in lua scripts Dragonfly does not allow accessing undeclated keys in lua scripts Sep 9, 2022
@romange romange changed the title Dragonfly does not allow accessing undeclated keys in lua scripts Dragonfly does not allow accessing undeclared keys in lua scripts Sep 9, 2022
@tayler-king
Copy link

Is there a timeline on this?

@romange
Copy link
Collaborator

romange commented Jan 27, 2023

Can you describe a use case or the context?

@tayler-king
Copy link

Can you describe a use case or the context?

Yes, the use case is being able to access and work on keys that do not exist in the namespace during a transaction.

For example, incrementing the value of a key in a hash table that might not yet exist.

@romange
Copy link
Collaborator

romange commented Jan 28, 2023 via email

@tayler-king
Copy link

Ah, yes. I see. My apologies.

@dranikpg
Copy link
Contributor Author

dranikpg commented Jan 31, 2023

https://github.com/hibiken/asynq/blob/cc777ebdaa62b69bd6e985fa97117b854e7d1cd6/internal/rdb/rdb.go#L219

// Input:
// KEYS[1] -> asynq:{<qname>}:pending
// KEYS[2] -> asynq:{<qname>}:paused
// KEYS[3] -> asynq:{<qname>}:active
// KEYS[4] -> asynq:{<qname>}:lease
// --
// ARGV[1] -> initial lease expiration Unix time
// ARGV[2] -> task key prefix
//
// Output:
// Returns nil if no processable task is found in the given queue.
// Returns an encoded TaskMessage.
//
// Note: dequeueCmd checks whether a queue is paused first, before
// calling RPOPLPUSH to pop a task from the queue.
var dequeueCmd = redis.NewScript(`
if redis.call("EXISTS", KEYS[2]) == 0 then
	local id = redis.call("RPOPLPUSH", KEYS[1], KEYS[3])
	if id then
		local key = ARGV[2] .. id
		redis.call("HSET", key, "state", "active")
		redis.call("HDEL", key, "pending_since")
		redis.call("ZADD", KEYS[4], ARGV[1], id)
		return redis.call("HGET", key, "msg")
	end
end
return nil`)

Specifically this part generates a key

local id = redis.call("RPOPLPUSH", KEYS[1], KEYS[3])
if id then
	local key = ARGV[2] .. id

@romange
Copy link
Collaborator

romange commented Feb 22, 2023

I spent time some time to think what if we had chosen another locking scheme, i.e. not VLL - would it improve our option to solve the issue? The answer is no - any "non-determenistic transaction that accesses undeclared keys would require rollback, and that contradicts the nature of Redis and Dragonfly.

To summarize: To schedule distributed atomic transactions, we must either predeclare all the keys, or be able to rollback the transaction that was partially "applied".

@romange
Copy link
Collaborator

romange commented Feb 22, 2023

Does it mean, we are defeated? I do not think so. @dranikpg

  1. with our recent changes we can at least allow the option to run non-determenistic scripts under the critical section
  2. Secondly, if you look at the asyncq golang code, you can see it allows passing a prefix. A prefix could be {foobar} slot notation. If we can control the prefix and we can set it as {<qname>} thus making the whole script a single-sharded entity that can run atomically on {<qname>} shard.
  3. Finally, we can give up on the atomicity at run-time. i.e. when we recognize a key that we have not declared, we run it in the context of another transaction that is spawned ad-hoc. The good news, the queuing logic of asyncq script would still work correctly, thanks to atomic rpoplpush but the monitoring dashboard might have glitches by observing "state:active" and "pending_since" together. I do not consider this as an issue. The disadvantage of this approach - latency.

I think, that option 2 is preferrable since it solves the latency issue as well.

@dranikpg
Copy link
Contributor Author

dranikpg commented Feb 22, 2023

  1. We provide global and non atomic mode
  2. It will help with latency because we don't need to dispatch, but running on a single shard doesn't mean we can access undeclared keys. You can still have transactions running that are holding locks
  3. If we give up on atomicity we can run in non-atomic mode for simplicity

@romange
Copy link
Collaborator

romange commented Feb 22, 2023

2 - it is, if we support "slot" locks. For a lua script to contend on such undeclared key, say {q1}abc someone else should access {q1}abc before and lock it. If we recognize it has a slot inside, we could also lock {q1} lock. And since this lua script declares asynq:{<qname>}:pending etc, we could similarly lock {q1} for it. Hence, it will be scheduled in txq after the transaction that locks {q1}

@dranikpg
Copy link
Contributor Author

True.... Slot locks nice feature for DF in general

@dranikpg
Copy link
Contributor Author

Done with script flags

@silverwind
Copy link

Should allow-undeclared-keys be made the default?

Dragonfly is advertized as "drop-in" replacement to Redis, but this is a behavorial difference that breaks this compatibility.

@dranikpg
Copy link
Contributor Author

Hi @silverwind

The allow-undeclared-keys flag slows down script execution significantly (it makes it impossible to run them in parallel). Enabling it by default will add enormous overhead to scripts that actually don't need this feature.

Redis itself discourages working with undeclared keys. See their docs:

https://redis.io/commands/eval/#:~:text=Important%3A%20to,in%20the%20database.

@romange
Copy link
Collaborator

romange commented Aug 21, 2023

@silverwind we are working to allow undeclared keys when running the script with slots.
What's your use-case exactly? Do you use some sort of framework like BullMQ?

@silverwind
Copy link

silverwind commented Aug 21, 2023

A colleague was prototyping asynq with dragonfly and this was the first incompatibility they noticed. You can see some of the scripts they run here.

@melroy89
Copy link

I use Dragonfly with Symfony messenger (PHP):

==> messenger_00-stderr---supervisor-o3xj3mpn.log <==
16:37:59 WARNING   [cache] Failed to invalidate key ":\x01tags\x01entry_comment_1554187": ERR Error running script (call to 43d401bd2bd0ad864c3ca221512cda1b6215ec23): @user_script:22: script tried accessing undeclared key ["key" => ":\x01tags\x01entry_comment_1554187","exception" => RedisException { …}]

==> messenger_04-stderr---supervisor-2i2hhnil.log <==
16:37:59 WARNING   [cache] Failed to invalidate key ":\x01tags\x01entry_comment_1568177": ERR Error running script (call to 43d401bd2bd0ad864c3ca221512cda1b6215ec23): @user_script:22: script tried accessing undeclared key ["key" => ":\x01tags\x01entry_comment_1568177","exception" => RedisException { …}]
16:38:00 WARNING   [cache] Failed to invalidate key ":\x01tags\x01entry_comment_1568983": ERR Error running script (call to 43d401bd2bd0ad864c3ca221512cda1b6215ec23): @user_script:22: script tried accessing undeclared key ["key" => ":\x01tags\x01entry_comment_1568983","exception" => RedisException { …}]

==> messenger_00-stderr---supervisor-o3xj3mpn.log <==
16:38:00 WARNING   [cache] Failed to invalidate key ":\x01tags\x01entry_comment_": ERR Error running script (call to 43d401bd2bd0ad864c3ca221512cda1b6215ec23): @user_script:22: script tried accessing undeclared key ["key" => ":\x01tags\x01entry_comment_","exception" => RedisException { …}]

==> messenger_02-stderr---supervisor-w4qmn8mi.log <==
16:38:00 WARNING   [cache] Failed to invalidate key ":\x01tags\x01entry_comment_1560361": ERR Error running script (call to 43d401bd2bd0ad864c3ca221512cda1b6215ec23): @user_script:22: script tried accessing undeclared key ["key" => ":\x01tags\x01entry_comment_1560361","exception" => RedisException { …}]

==> messenger_03-stderr---supervisor-8vgiopva.log <==
16:38:00 WARNING   [cache] Failed to invalidate key ":\x01tags\x01entry_comment_": ERR Error running script (call to 43d401bd2bd0ad864c3ca221512cda1b6215ec23): @user_script:22: script tried accessing undeclared key ["key" => ":\x01tags\x01entry_comment_","exception" => RedisException { …}]

==> messenger_02-stderr---supervisor-w4qmn8mi.log <==
16:38:00 WARNING   [cache] Failed to invalidate key ":\x01tags\x01entry_comment_": ERR Error running script (call to 43d401bd2bd0ad864c3ca221512cda1b6215ec23): @user_script:22: script tried accessing undeclared key ["key" => ":\x01tags\x01entry_comment_","exception" => RedisException { …}]

I get these errors. I was hoping Dragonfly was a drop-in replacement for Redis, hence I moved away from Redis to Dragonfly.

@dranikpg
Copy link
Contributor Author

@melroy89
Copy link

melroy89 commented Mar 25, 2024

@melroy89 Please take a quick look at our docs https://www.dragonflydb.io/docs/managing-dragonfly/scripting#script-flags, specifically https://www.dragonflydb.io/docs/managing-dragonfly/scripting#:~:text=Allowing%20undeclared%20keys%E2%80%8B&text=To%20allow%20accessing%20any%20keys,when%20the%20script%20is%20running.

Thanks! Can these options be set via dragonfly.conf file as well? I think so, right?
Since I installed Dragonfly via deb file and I would like to have these configs in my /etc/dragonfly/dragonfly.conf.

Hopefully using: --default_lua_flags=allow-undeclared-keys,disable-atomicity in dragonfly.conf

@silverwind
Copy link

It still bothers the question: Why not make the default flags redis-compatible?

Dragonfly advertizes with "Dragonfly is fully compatible with Redis APIs" but that is clearly not the case with allow-undeclared-keys not being enabled by default.

@melroy89
Copy link

It still bothers the question: Why not make the default flags redis-compatible?

Dragonfly advertizes with "Dragonfly is fully compatible with Redis APIs" but that is clearly not the case with allow-undeclared-keys not being enabled by default.

That I can't agree more! This issue should be re-opened and fixed.

@romange
Copy link
Collaborator

romange commented Mar 26, 2024

While we strive to be as compatible as possible with Redis Core semantics (atomicity guarantees, Redis Protocol), some Redis use cases can heavily limit Dragonfly's performance. For example, the --default_lua_flags=allow-undeclared-keys flag necessitates the Global Interpreter Lock (GIL) on all EVAL calls. This makes multi-threaded Dragonfly less efficient than single-threaded Redis Core when calling EVALs at high throughput. Many users rely on Lua without using undeclared keys, so it wouldn't be fair to penalise them for this edge case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

5 participants