[CRASH] Cluster Mode Cross-Slot RANDOMKEY in MULTI-EXEC #1145
Comments
Will raise a PR soon to fix this issue.
Just to mention, this won't crash for most production users, since the failing check is a debug assert.
Not in the scope of this issue, but if we refactor the expiry API to optionally accept a slot (or create a new API), we can potentially use the slot cached in the client if a slot is not provided. Would need to take a closer look to see if that's possible. It would improve the look-up perf by a decent amount.
The behavior today is that we use the slot on the client as an optimization. The assertion we are hitting is that there was a mismatch between the cached slot and the real slot, which is only checked when debug assertions are enabled.
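To make the comment above concrete, here is a minimal sketch of a check that only exists in debug builds. The names `TOY_DEBUG_ASSERTIONS`, `toy_debug_assert`, and `toy_slot_for_lookup` are hypothetical and are not taken from the Valkey source. The sketch also assumes that the non-debug path simply trusts the cached value.

```c
#include <assert.h>

/* Hypothetical compile-time switch: define TOY_DEBUG_ASSERTIONS to enable the check. */
#ifdef TOY_DEBUG_ASSERTIONS
#define toy_debug_assert(cond) assert(cond)
#else
#define toy_debug_assert(cond) ((void)0)
#endif

/* Returns the slot used for the lookup. The cached slot is trusted as an
 * optimization; the comparison against the real slot is compiled in only
 * when TOY_DEBUG_ASSERTIONS is defined. */
static int toy_slot_for_lookup(int cached_slot, int real_slot) {
    toy_debug_assert(cached_slot == real_slot);
    (void)real_slot; /* unused when the assertion is compiled out */
    return cached_slot;
}
```

Built without `TOY_DEBUG_ASSERTIONS`, a mismatched call such as `toy_slot_for_lookup(7, 9)` returns 7 without complaint. Built with it, the same call aborts, which is the kind of crash this issue describes.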
@madolson In this case, it looks like we are returning the wrong slot: we return the cached slot (current->client->slot) and not the real slot. That has nothing to do with the debug assert.
We need @nadav-levanoni What's the fix you have in mind?
We have a debug assert that validates the cached slot is the same as the real slot. That caused the crash that Nadav is referring to. In this specific case, we are checking the wrong slot for an expire in randomKey, which is not ideal and is a correctness issue. (We might return a key that is logically expired.)
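The mismatch described above can be illustrated with a small, self-contained sketch. The slot computation follows the publicly documented cluster key hashing scheme (CRC16/XMODEM of the key, or of its hashtag, modulo 16384). The `toy_client` struct and `slot_from_cache` helper are hypothetical stand-ins for the cached client slot and are not the actual Valkey code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_SLOTS 16384

/* CRC16/XMODEM: polynomial 0x1021, initial value 0, computed bit by bit. */
static uint16_t crc16(const char *buf, size_t len) {
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((unsigned char)buf[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) crc = (uint16_t)((crc << 1) ^ 0x1021);
            else crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Real slot of a key. If the key contains a non-empty {hashtag},
 * only the bytes between the first '{' and the next '}' are hashed. */
static int key_hash_slot(const char *key, size_t keylen) {
    assert(key != NULL);
    size_t s, e;
    for (s = 0; s < keylen; s++)
        if (key[s] == '{') break;
    if (s == keylen) return crc16(key, keylen) & (NUM_SLOTS - 1);
    for (e = s + 1; e < keylen; e++)
        if (key[e] == '}') break;
    if (e == keylen || e == s + 1) return crc16(key, keylen) & (NUM_SLOTS - 1);
    return crc16(key + s + 1, e - s - 1) & (NUM_SLOTS - 1);
}

/* Hypothetical client that remembers the slot of the command being executed. */
typedef struct {
    int slot;
} toy_client;

/* What the thread describes as the problem: answer with the cached slot,
 * regardless of which key is actually being examined. */
static int slot_from_cache(const toy_client *c) {
    return c->slot;
}
```

If the client's cached slot was set from a command on `foo` (slot 12182), then a key such as `bar` picked by RANDOMKEY lives in a different slot. `slot_from_cache` would point the expiry check at slot 12182, while `key_hash_slot("bar", 3)` gives the slot where the key's TTL is actually stored.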
I've been looking more into this and I think the most correct solution would also come with a perf improvement (spoke with Dan about this). I'm still toying around with it locally, but I think that reworking most of the expiry API to have ex. keys command can call in the case of the randomkey command specifically we can call functions
Does this bug also apply to other commands like KEYS and SCAN when they are called within a transaction? I guess it does...
@nadav-levanoni In theory, I agree. I prefer a two-step approach in this case though:
On top of this, we have some other work in progress on restructuring how key-value objects and TTLs are stored, so this API may already change in the near future.
This is my first PR, so I'm a little slow
#1145 First part of a two-step effort to add `WithSlot` API for expiry. This PR is to fix a crash that occurs when a RANDOMKEY uses a different slot than the cached slot of a client during a multi-exec. The next part will be to utilize the new API as an optimization to prevent duplicate work when calculating the slot for a key. --------- Signed-off-by: Nadav Levanoni <[email protected]> Signed-off-by: Madelyn Olson <[email protected]> Co-authored-by: Nadav Levanoni <[email protected]> Co-authored-by: Madelyn Olson <[email protected]>
…o#1155) valkey-io#1145 First part of a two-step effort to add `WithSlot` API for expiry. This PR is to fix a crash that occurs when a RANDOMKEY uses a different slot than the cached slot of a client during a multi-exec. The next part will be to utilize the new API as an optimization to prevent duplicate work when calculating the slot for a key. --------- Signed-off-by: Nadav Levanoni <[email protected]> Signed-off-by: Madelyn Olson <[email protected]> Co-authored-by: Nadav Levanoni <[email protected]> Co-authored-by: Madelyn Olson <[email protected]>
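The commit message mentions a `WithSlot` style API for expiry. The following is only a sketch of that general pattern over a toy data structure. It is not the Valkey implementation, and every name in it (`toy_db`, `toy_slot_of`, `toy_is_expired_with_slot`, and so on) is hypothetical. The toy hash and the four-slot layout are chosen for brevity, and a key that is not found is treated as not expired, which is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_NUM_SLOTS 4 /* tiny on purpose; a real cluster has 16384 slots */
#define TOY_MAX_KEYS 8

typedef struct {
    const char *key;
    long long expire_at_ms; /* -1 means no TTL */
} toy_entry;

/* Expiry metadata is stored per slot, so a lookup must pick the right slot. */
typedef struct {
    toy_entry entries[TOY_NUM_SLOTS][TOY_MAX_KEYS];
    size_t count[TOY_NUM_SLOTS];
} toy_db;

/* Counts slot computations so the saved work is observable. */
static int toy_slot_computations = 0;

/* Toy hash, NOT the real cluster hash; it only spreads keys over the slots. */
static int toy_slot_of(const char *key) {
    unsigned h = 0;
    toy_slot_computations++;
    for (; *key; key++) h = h * 31u + (unsigned char)*key;
    return (int)(h % TOY_NUM_SLOTS);
}

/* Stores a key's expiry time in the bucket of its real slot. */
static void toy_set(toy_db *db, const char *key, long long expire_at_ms) {
    int slot = toy_slot_of(key);
    if (db->count[slot] < TOY_MAX_KEYS) {
        toy_entry *e = &db->entries[slot][db->count[slot]++];
        e->key = key;
        e->expire_at_ms = expire_at_ms;
    }
}

/* "WithSlot" variant: the caller supplies the slot, so nothing is recomputed.
 * A wrong slot makes the key look absent, and therefore not expired. */
static int toy_is_expired_with_slot(const toy_db *db, const char *key, int slot, long long now_ms) {
    assert(slot >= 0 && slot < TOY_NUM_SLOTS);
    for (size_t i = 0; i < db->count[slot]; i++) {
        const toy_entry *e = &db->entries[slot][i];
        if (strcmp(e->key, key) == 0)
            return e->expire_at_ms != -1 && e->expire_at_ms <= now_ms;
    }
    return 0;
}

/* Convenience wrapper: derives the slot from the key itself. */
static int toy_is_expired(const toy_db *db, const char *key, long long now_ms) {
    return toy_is_expired_with_slot(db, key, toy_slot_of(key), now_ms);
}
```

A caller that already knows the key's real slot can call `toy_is_expired_with_slot` and skip the hash. A caller that only has a possibly stale cached slot should use `toy_is_expired`, because passing the wrong slot makes an expired key look live, which is the correctness problem raised in the thread.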