
Is the 'Redlock' mechanism supported in this library? #24

Closed · wingunder opened this issue Oct 21, 2019 · 42 comments

@wingunder
Contributor

I am aware that a transaction mechanism is available and that it works with both clustered and non-clustered Redis instances. However, as far as I understand it, the transaction mechanism can't be used to synchronize across Redis instances, as that would require a DLM (Distributed Lock Manager). 'Redlock' is a specification for such a DLM.

According to the Distributed locks with Redis page, a C++ implementation of 'Redlock' exists here. However, it seems to lack cluster support, as it uses hiredis as its back-end.

A 'Redlock' implementation would be very handy within this library, as well as a nice supplement to it.
Are there any plans for implementing this?

@sewenew
Owner

sewenew commented Oct 22, 2019

Yes, there are plans for implementing some patterns / recipes for Redis, and Redlock is one of them. However, I'm busy with other stuff these days, and there's no firm date for the release of these patterns.

It seems the implementation of Redlock is not very complicated, and I might try it ASAP. If I have any progress, I'll let you know :)

@wingunder
Contributor Author

Hi @sewenew,
I took a first stab at adding Redlock support in #32.
Regards

@sewenew
Owner

sewenew commented Oct 30, 2019

Hi @wingunder

Thanks a lot for your great work!

In fact, I already tried to implement RedLock this week. However, the RedisCluster class currently doesn't have a good way to iterate over master nodes, so I didn't figure out an elegant way to implement RedLock with Redis Cluster, and haven't committed my code.

I reviewed your code, and I think the Redlock interface can be improved somewhat. For example, it would be better if the Redlock::lock method returned std::chrono::milliseconds to indicate how much time is still left on the lock, since acquiring the lock might take some time, especially with Redis Cluster, where you need to take the lock on several nodes.

The extend_lock method is great! I also added it to my implementation, thanks a lot! However, your implementation of extend_lock seems to have a bug: your script uses the EXPIRE command, which takes a TTL in seconds, to set a TTL given in std::chrono::milliseconds. I think you should use the PEXPIRE command instead.
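For illustration, an extend script using PEXPIRE could look like the following (a minimal sketch following the pattern from the Redis distributed-locks doc; the redis, resource, random_string and ttl names are assumptions, not the PR's actual code):

// Extend the lock only if we still own it. PEXPIRE takes the TTL in
// milliseconds, whereas EXPIRE takes it in seconds.
const auto extend_script = R"(
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("pexpire", KEYS[1], ARGV[2])
else
    return 0
end)";

std::chrono::milliseconds ttl{30000};
bool extended = redis.eval<long long>(extend_script,
        {resource}, {random_string, std::to_string(ttl.count())}) == 1;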

I created a new branch, i.e. recipes, so that I can experiment with some Redis recipes in the future, and I've committed my implementation of RedLock for single-instance Redis to this branch. It would be great if you could review the code and give me some feedback. However, I won't merge this branch into the master branch in the near future, because I'd like to get more feedback on the RedLock interface to ensure it's user-friendly. And I'll try the RedLock implementation for Redis Cluster tomorrow. The following is an example of how to use my version of RedLock, and any suggestion or pull request is welcome :)

auto redis = Redis("tcp://127.0.0.1");
RedMutex mut(redis, "resource");
RedLock<RedMutex> locker(mut, std::defer_lock);
auto time_left = locker.try_lock(std::chrono::seconds(30));
// do your work here
locker.unlock();
// when locker is destroyed, its destructor will call unlock automatically if it still holds the lock.

Also you're welcome to create another pull request to the recipes branch. If people prefer your interface, I'll merge your code into the master branch :)

Thanks again for your great work!

Regards

@wingunder
Contributor Author

Hi @sewenew,

Thanks a lot for looking at my PR #15, as well as your positive feedback.

In the meantime, I fixed the bug that you found and added a test case for it. While testing it I found a little quirk in the tests, which I also fixed. I also agreed that unlocking everything in the destructor is a good idea, and implemented it, together with a test case.

I haven't made Redlock::lock() return std::chrono::milliseconds yet, as:

  1. I'd really like to benchmark the lock-times, before investing time to change the return type. Taking into account that the connection(s) is/are already available when acquiring the lock, it's just the round-trip and EVALSHA time.
  2. The overhead (if any) of calculating the time spent getting the lock will be present, even if it's not used by the user. So, I don't know what and how many use-cases there are for this.
  3. My implementation of lock() doesn't retry, so I don't have a random sleep implemented. I'm not sure if this is desired inside the lock() member.

I had a quick browse through your redlock implementation. I saw that you are using the Redis WAIT command. I was contemplating that, but I used the method that @antirez documented, with the Lua script. I'm sure that if the WAIT-type approach were better, he would have updated his documented method. Anyway, I quickly scanned through a few implementations, listed here, and not one of the ones I looked at had a WAIT-type implementation. I didn't check them all.

BTW, I stumbled across the node-redlock implementation. They also use some kind of lock extension. So, someone had the idea before I did :) The node-redlock implementation also does some funky timing stuff, so this might be an implementation to look at.

As for the cluster stuff: The way I understand it, a key (the one that you're using to lock with) gets sharded to a specific node in the cluster, so you'll always be directed to that node, if you operate on the key. If this is so, you won't need to lock all nodes of the cluster. You simply view the cluster as a single instance, in this case, or am I wrong?

I'll have a closer look at your implementation tomorrow and give you more feedback.

Finally, I think it would be more efficient if we could find some common ground between our implementations. How about the following:

  1. Nail down a common interface.
  2. Merge and extend the existing test cases, as these should be common as well.
  3. Merge the implementations' common implementation methods.
  4. Use conditional compilation for implementations that are not compatible, e.g. 'WAIT'-type vs. Lua scripts.

How does this sound?
Regards

@wingunder
Contributor Author

Hi @sewenew,
I reviewed a part of your commit 856ed96.
Regards

@wingunder
Contributor Author

@wingunder wrote:

I haven't made Redlock::lock() return std::chrono::milliseconds yet, as:

  1. I'd really like to benchmark the lock-times, before investing time to change the return type. Taking into account that the connection(s) is/are already available when acquiring the lock, it's just the round-trip and EVALSHA time.

From my benchmarks, the times are as follows:

  • Redlock<RedisCluster>::lock(): avg 250 µs and max 360 µs on a more-or-less idle Redis cluster.
  • Redlock<Redis>::lock(): avg 40 µs and max 550 µs on a very busy Redis node (top claims that the redis-server process uses 35% CPU).

So, at the moment I'm not sure whether making Redlock::lock() return std::chrono::milliseconds is useful, as it would always return std::chrono::milliseconds(0).

@sewenew
Owner

sewenew commented Oct 31, 2019

Hi @wingunder

Thanks for your detailed replies, and your review 👍

I've replied to your comments on my implementation, please take a look.

I'll try to answer your question in as much detail as I can.

You simply view the cluster as a single instance, in this case, or am I wrong?

When I say RedLock with Redis Cluster, I don't mean locking on a single node of a Redis Cluster. Instead, I was trying to implement the distributed version of the RedLock algorithm. If you just want to lock on a single master of a Redis Cluster, you don't need another class, e.g. Redlock<RedisCluster>. Instead, you can use the Redlock<Redis> class, and get a Redis object with the RedisCluster::redis(const StringView &hash_tag) method:

auto redis = cluster.redis("resource_name");
Redlock<Redis> redlock(redis, random_buffer);
redlock.lock("resource_name", 10ms);

So you don't need to make your Redlock class a template class.

While writing this comment, I read the doc on the distributed version of the RedLock algorithm again, and I think I misunderstood the algorithm. It's NOT an algorithm working with Redis Cluster. Instead, it's an algorithm working with N independent Redis masters. How stupid of me!!! I'll try this tomorrow. I'm sorry, but I'm too busy with my job...

I saw that you are using the Redis WAIT command.

Yes, I didn't use a script; instead, I used a Redis transaction to do the work, because I thought it might not be a good idea to load scripts from the client side. Instead, Lua scripts should be loaded by the Redis administrator. Some administrators might disable the EVAL command, because an evil client could load a slow script into Redis, and block Redis for a long time.

However, as you mentioned, since most implementations use a script as the solution, it might not be a bad idea to load it from the client side. I'll reconsider it, and maybe give the user an option to choose how to do the lock, so that they can still use RedLock on a Redis server whose EVAL command has been disabled.
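For reference, the script-based unlock that @antirez documents is short; loaded from the client side with redis-plus-plus's Redis::eval it would look roughly like this (a sketch; the resource and random_string names are illustrative):

// Canonical unlock script: delete the key only if it still holds our
// random value, i.e. only if we still own the lock.
const auto unlock_script = R"(
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end)";

// 1 means the lock was released; 0 means it was no longer ours.
bool released = redis.eval<long long>(unlock_script,
        {resource}, {random_string}) == 1;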

at the moment I'm not sure if making Redlock::lock() return std::chrono::milliseconds is usefull, as it will always return std::chrono::milliseconds(0)

I think you misunderstand the meaning of the return value. It's NOT the time used for acquiring the lock. Instead, it's the time still left on the acquired lock, i.e. the validity time. For instance, a user wants to lock for 100 milliseconds with try_lock(resource, 100ms); however, it costs 20 milliseconds to acquire the lock. So there are only 100 - 20 = 80 milliseconds left for them to do the work, and they must extend or release the lock within 80 milliseconds. Otherwise, more than one user may hold the lock.

This might cause problems, especially in the distributed mode. Imagine there are 5 master nodes, and we need to lock at least 3 nodes for 100 milliseconds. If each lock operation costs 20 milliseconds, there will be only 100 - 20 * 3 = 40 milliseconds left. Please check step 4 of the RedLock algorithm for details.
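To make the arithmetic concrete, a caller of the proposed interface would use the return value like this (a sketch; locker and the 100ms TTL are illustrative):

using namespace std::chrono;

auto time_left = locker.try_lock(milliseconds(100));  // request a 100ms lock
if (time_left <= milliseconds(0)) {
    // failed to lock a quorum of nodes in time
} else {
    // acquiring the lock already consumed (100ms - time_left); the work
    // must finish, or the lock be extended, within time_left
}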

The node-redlock implementation also does some funky timing stuff, so this might be an implementation to look at.

Thanks for your suggestion! I'll take a closer look.

I think it would be more efficient if we could find some common ground between our implementations. How about the following:

It's a good idea! And I'd like to merge your test code :) But before that, we need to nail down the interface. For now, I still prefer my version of the RedLock interface. It's similar to STL's std::mutex and std::unique_lock<Mutex> interfaces. Also, the RedMutex class is thread-safe, so users can easily use RedMutex in a multithreaded environment. Any suggestions?

Regards

@sewenew
Owner

sewenew commented Oct 31, 2019

It's a little late. I'll review your changes tomorrow. Sorry for that...

@sewenew
Owner

sewenew commented Nov 1, 2019

Hi @wingunder

I committed the distributed version of RedLock, and refactored the single-node version, so that RedMutex can be created with either a single master or a list of masters.

// Single node version:
auto redis = Redis("tcp://127.0.0.1");
RedMutex mut(redis, "resource");

// Distributed version:
// auto redis0 = Redis("tcp://127.0.0.1:7000");
// auto redis1 = Redis("tcp://127.0.0.1:7001");
// auto redis2 = Redis("tcp://127.0.0.1:7002");
// RedMutex mut({std::ref(redis0), std::ref(redis1), std::ref(redis2)}, "resource");

RedLock locker(mut, std::defer_lock);
locker.try_lock(std::chrono::seconds(5));
locker.extend_lock(std::chrono::system_clock::now() + std::chrono::seconds(10));
locker.unlock();

I also updated the CMakeLists.txt, and you can fetch the latest code of the recipes branch to give it a try.

It seems that @antirez's ruby version does some retry and drift handling. Maybe we can add that to the C++ version.
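For reference, the validity computation in @antirez's ruby version amounts to something like this (a sketch, not this library's API; the 1% factor and the 2 ms constant are taken from redlock-rb):

// Validity left after locking the masters, per the Redlock algorithm:
// subtract the elapsed time and a clock-drift allowance from the TTL.
std::chrono::milliseconds validity_time(std::chrono::milliseconds ttl,
        std::chrono::milliseconds elapsed) {
    // drift = 1% of the TTL plus 2ms, as in redlock-rb
    auto drift = std::chrono::milliseconds(ttl.count() / 100 + 2);
    return ttl - elapsed - drift;
}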

Regards

@wingunder
Contributor Author

wingunder commented Nov 2, 2019

Hi @sewenew,

Thanks for your work on the RedMutex stuff and of course all the tips in your review.

I gave up on my original PR #32, as it just seemed better to branch from your recipes branch and continue from there. I will however still reply to your review comments, just for completeness, before I close PR #32.

I was almost finished with the Redlock rewrite, based on https://github.com/antirez/redlock-rb/blob/master/redlock.rb, when you committed 439ac20, so I had to refactor it a bit; it is now called RedLockMutex. This name can be changed. I realized that I wanted to return more than just a (lock) flag and a duration from RedLockMutex::lock and RedLockMutex::extend_lock, so I created a RedLockMutex::LockInfo struct for this. To me, it turned out to fit really well.

I took the liberty of creating the RedMutexInterface interface and letting RedLock and RedLockKeyMutex derive from it. The functionality of RedMutex and RedLockKeyMutex should be the same. I also moved RedMutex::_lock_id() and RedMutex::_ttl() into a new class called RedLockUtils and made them static. These functions are now used by RedLock, RedLockKeyMutex and the tests.

The test cases are updated, and cover close to everything in RedLockMutex. I also added a test to benchmark the simultaneous locking and unlocking of 1000 locks, using the following 3 classes:

  • RedMutex
  • RedLockMutex
  • RedLockKeyMutex

The results are printed to std::cout when the tests are run. Interesting, but exactly what I expected.

All of the suggestions in your review flowed into #33, except for the raw string literal one; I would appreciate it if you could help me with that.

I looked at your RedMutex class and made some comments in commit 439ac20.

Regards

@wingunder
Contributor Author

Hi @sewenew,
I've commented on all your reviews in PR #32.
Please confirm whether I should close PR #32.
Thanks & regards

@sewenew
Owner

sewenew commented Nov 4, 2019

Hi @wingunder

Thanks for your great work and the great comments, of course! I'm so sorry for the late reply...

I've commented on all your questions, and on the new issue you created. I'll review your changes for PR #33 tomorrow, ASAP. And I think you can close PR #32 now.

Sorry again for the late reply...

Regards

@wingunder
Contributor Author

Hi @sewenew,
Thanks a lot. There's no rush, we're all humans ;)
Regards

@sewenew
Owner

sewenew commented Nov 11, 2019

Hi @wingunder

I created a new commit: 10ca2b2, which includes the following changes:

  • Make RedLock::try_lock and RedLock::extend_lock return bool.
  • Make all XXXMutex::try_lock and XXXMutex::extend_lock return milliseconds; if an error happens, they return milliseconds(-1) instead of throwing an exception.
  • Fix some compile errors. One of them is interesting, i.e. RedLockMutexVessel({instance}); you can check this for an explanation.
  • Address some of the comments you gave me. Thanks again :)

Since these changes also modify your code, I'd like you to do a review, and confirm these changes.

Thanks in advance!

Regards

@wingunder
Contributor Author

Hi @sewenew,
Thanks for keeping me in the loop. I'll definitely look at the changes and give you feedback; however, I'll only be able to get to this in about two days' time.
Regards

@sewenew
Owner

sewenew commented Nov 12, 2019

Hi @wingunder

There's no hurry :)

Also, this commit adds a ttl method to RedLock, so that the end user can easily find out whether the lock has expired.

Regards

@wingunder
Contributor Author

Hi @sewenew,
I commented directly in 10ca2b2.

Also, this commit adds a ttl method to RedLock, so that the end user can easily find out whether the lock has expired.

Thanks. Are you planning to add some test cases for RedLock<>?

Regards

@sewenew
Owner

sewenew commented Nov 13, 2019

Hi @wingunder

Thanks for your comments! I created a new commit to address them. You can take a look.

Are you planning to add some test cases for RedLock<>?

Yes, of course. I'll add some tests for it.

Regards

@wingunder
Contributor Author

Hi @sewenew,

Thanks for your comments! I created a new commit to address them. You can take a look.

This looks OK to me.

One thing that I did notice, however, is that throughout the redis++ lib, both std::chrono::steady_clock::now() and std::chrono::system_clock::now() are used. I would say that it would be safer to go for std::chrono::steady_clock::now() only, especially in the case of the redlock stuff.

e.g.

auto cur = std::chrono::system_clock::now();

This should probably be something for a new PR, if relevant.
Regards

@sewenew
Copy link
Owner

sewenew commented Nov 14, 2019

Hi @wingunder

I'm glad you noticed the use of two kinds of clocks :)

steady_clock is suitable for measuring elapsed time, since it's monotonic. So I use it to record the last active time of a connection, and calculate the elapsed time for locking.

However, there are some Redis commands that take a UNIX timestamp (system time point) as a parameter, e.g. EXPIREAT, PEXPIREAT. So I think the corresponding Redis member functions, e.g. Redis::expireat, should take chrono::time_point<chrono::system_clock> as the parameter type.

In the case of RedLock, the API is designed to follow Redis' idiom: one method takes a TTL (chrono::milliseconds) as parameter (similar to EXPIRE), while the other takes a system time point (chrono::time_point<chrono::system_clock>) as parameter (similar to EXPIREAT).
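In code, the distinction looks roughly like this (a sketch of the convention, not the library's internals):

using namespace std::chrono;

// steady_clock: monotonic, right for measuring elapsed time,
// e.g. how long acquiring a lock took.
auto start = steady_clock::now();
// ... acquire the lock ...
auto elapsed = duration_cast<milliseconds>(steady_clock::now() - start);

// system_clock: wall-clock time, right for commands that take a UNIX
// timestamp, e.g. EXPIREAT / PEXPIREAT.
auto deadline = system_clock::now() + seconds(10);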

Hope this can answer your question :)

Regards

@wingunder
Contributor Author

Hi @sewenew,

Thanks for the explanation.

Here is some stuff that I picked up, while reviewing your last 2 commits (10ca2b2 and 10c333d):

From: 10ca2b2#r35934763

Yes, I mixed < and <=. I'll make the rest of the code use <=, since if time_left is 0ms, it, in fact, didn't acquire the lock.

If I understand correctly, try_lock() and extend_lock() are now assumed to have failed if they return a duration <= std::chrono::milliseconds(0).

However, RedMutex::try_lock(), RedMutex::extend_lock() and RedLockMutex::try_lock() now return std::chrono::milliseconds(-1). I think they should rather return std::chrono::milliseconds(0) in this case.

A more serious problem is that RedLockMutex::extend_lock() now has a bug: when the extension fails, it does not reset the returned duration, but simply returns result.time_remaining. That was not a problem before, as it returned a boolean.

std::chrono::milliseconds extend_lock(const std::string &random_string,
        const std::chrono::milliseconds &ttl)
{
    const RedLockMutexVessel::LockInfo lock_info =
        {true, std::chrono::steady_clock::now(), ttl, _resource, random_string};
    const auto result = _redlock_mutex.extend_lock(lock_info, ttl);
    return result.time_remaining;
}

It should actually look as follows:

std::chrono::milliseconds extend_lock(const std::string &random_string,
        const std::chrono::milliseconds &ttl)
{
    const RedLockMutexVessel::LockInfo lock_info =
        {true, std::chrono::steady_clock::now(), ttl, _resource, random_string};
    const auto result = _redlock_mutex.extend_lock(lock_info, ttl);
    if (!result.locked) {
        return std::chrono::milliseconds(-1);
    }
    else {
        return result.time_remaining;
    }
}

The patch will look as follows:

diff --git a/src/sw/redis++/recipes/redlock.h b/src/sw/redis++/recipes/redlock.h
index 53a88f5..c2ccc35 100644
--- a/src/sw/redis++/recipes/redlock.h
+++ b/src/sw/redis++/recipes/redlock.h
@@ -286,7 +286,12 @@ public:
         const RedLockMutexVessel::LockInfo lock_info =
             {true, std::chrono::steady_clock::now(), ttl, _resource, random_string};
         const auto result = _redlock_mutex.extend_lock(lock_info, ttl);
-        return result.time_remaining;
+        if (!result.locked) {
+            return std::chrono::milliseconds(0);
+        }
+        else {
+            return result.time_remaining;
+        }
     }
 
     std::chrono::milliseconds extend_lock(const std::string &random_string,

If you want, I can make a PR for this.
Just tell me if I should use return std::chrono::milliseconds(0); or return std::chrono::milliseconds(-1);
Also please tell me if I should fix the other 3 return std::chrono::milliseconds(-1); occurrences as well.

Regards

@sewenew
Owner

sewenew commented Nov 15, 2019

Hi @wingunder

Feel free to create a PR to fix it :) Please create the PR based on recipes-dev branch. When it's stable, I'll merge it into recipes branch.

I think they should rather return std::chrono::milliseconds(0) in this case.

I think either 0ms or -1ms should be OK, since we just need a way to indicate that we failed to acquire the lock on a quorum of nodes. Returning -1ms tries to follow the TTL command's idiom: a negative value to signal an error.

In fact, the best solution might be throwing an exception, because this failure is different from the run-out-of-time error. How about keeping it returning -1ms for now, and I'll rethink whether we should throw an exception.
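Under that convention, a caller treats any non-positive duration as failure (a sketch; mtx and its arguments are illustrative):

auto time_left = mtx.try_lock(random_string, std::chrono::milliseconds(100));
if (time_left <= std::chrono::milliseconds(0)) {
    // -1ms (error) or 0ms (no time left): the lock was not acquired
}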

Regards

@wingunder
Contributor Author

Hi @sewenew,
Thanks for your feedback.
OK, I'll make another PR for this and I'll keep returning -1ms.
Regards

@wingunder
Contributor Author

Hi @sewenew,

OK, I'll make another PR for this and I'll keep returning -1ms.

This is done in PR #39.

Regards

@sewenew
Owner

sewenew commented Nov 20, 2019

Hi @wingunder

I'm so sorry for the late reply. Too busy these days...

I've merged your commit. Thanks again for your great work!

Regards

@wingunder
Contributor Author

Hi @sewenew,
No problem. I'm glad PR #39 was usable.
Please let me know if I should close this issue.
However, I think it would be good to keep it open until you've reworked the details and added RedLock to master.
Regards

@sewenew
Owner

sewenew commented Nov 20, 2019

Hi @wingunder

I agree with you. Let's keep it open :)

Regards

@wingunder
Contributor Author

Hi @sewenew,

I would like to send you some PRs for the recipes branch, but I saw it's a bit behind master. Would it be possible for you to merge master into recipes?

$ git log --left-right --graph --cherry-pick --oneline origin/recipes..origin/master
> fc28b0e update README
> e29585f add comments for SORTED SET commands
> 4fd4cba add comments for SET commands
> 241f09f add comment for HASH commands
> f4f8781 add comments for constructors
> 65365db add comments for LIST commands
> 856fb77 add comments for STRING commands
> f761408 add comments for KEYS commands
> 91a325c add comments for connection commands
> 1056a42 add comments for keys commands
> 0b34392 add comments for Redis commands whose return value is of boolean type
> 25c7d49 update README and print c++ standard info in cmake output
> bcec1cc Add a CMAKE option for building tests (#43)
> 6301982 add std::string_view and std::optional support
> 26bd432 add sigle key overload for geohash, geopos and script-exists commands
> 9defc53 add test and doc for full URI support
> d74262a fix README problem
> cfc0afd fix README problems.
> 2581019 add full URI scheme support
> 1fdc56d add RedLock doc in README
> a5c7c00 Add missing header to enable build on FreeBSD. (#40)
> ce75d3e update README

Thanks & regards

@sewenew
Owner

sewenew commented Feb 12, 2020

Hi @wingunder

No problem! I've merged master into the recipes branch. Thanks for the reminder! I'll do the merge whenever the master branch has any changes in the future.

Regards

sewenew mentioned this issue Jul 19, 2020
@romanholidaypancakes

hi, is there any progress?

@sewenew
Owner

sewenew commented Sep 22, 2021

@inaryart I still haven't received any feedback on the API of this Redlock implementation. However, I'll merge mater branch to recipe branch from time to time. If you run into any problems or have any feedback, feel free to let me know.

Regards

@wingunder
Contributor Author

Hi @sewenew,

However, I'll merge mater branch to recipe branch from time to time.

You probably meant that you'll merge the master branch into the recipes branch (where the current RedLock implementation is) from time to time.

At the moment the state of the recipes branch is:

This branch is 12 commits ahead, 91 commits behind master. 

Regards

@sewenew
Owner

sewenew commented Sep 23, 2021

@wingunder Yes, I'll merge the master branch into recipes branch. I might do a merge this weekend.

Regards

@sewenew
Owner

sewenew commented Sep 25, 2021

Hi all, the master branch has been merged into the recipes branch. If you have any feedback on the Redlock API, feel free to let me know :)

Regards

@RobinLanglois

Hello, will this be merged into master soon?

@RobinLanglois

@sewenew by the way, can we use Redlock with a RedisCluster?
Thank you in advance

@sewenew
Owner

sewenew commented Apr 27, 2022

@RobinLanglois Still haven't received any feedback. However, since there are some use cases for RedLock, I'll try to clean up the code, and merge it into master in May.

can we use Redlock with a RedisCluster?

NO. The RedLock algorithm works with standalone Redis instances, NOT Redis Cluster.

Regards

@RobinLanglois

@sewenew oh really? Do you have any ideas for alternatives?

Thank you, regards

@RobinLanglois

Well, I don't understand, because according to the Redis website, Redlock is designed for Redis cluster (without replication).

@sewenew
Owner

sewenew commented Apr 29, 2022

@RobinLanglois No, you might have misunderstood. From the doc you mentioned:

In the distributed version of the algorithm we assume we have N Redis masters. Those nodes are totally independent, so we don’t use replication or any other implicit coordination system

These nodes are independent masters, which have nothing to do with Redis Cluster.

Regards

@RobinLanglois

Oh okay, thanks for explanation

@sewenew
Owner

sewenew commented Oct 19, 2022

Hi all,

The Redlock implementation has been merged into the master branch. I also added a high-level API to make it work like a std::mutex, automatically extending the lock. Please check the doc for details.
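Based on that description, usage should look roughly like the following (a sketch only; the constructor shown is an assumption, so check the Redlock section of the doc for the actual API):

auto redis = Redis("tcp://127.0.0.1");
RedMutex mtx(redis, "resource");   // assumed constructor; see the doc

// Behaves like a std::mutex, so standard RAII wrappers apply, and the
// lock is extended automatically while it's held.
std::lock_guard<RedMutex> guard(mtx);
// ... critical section ...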

So sorry for the really long delay...

Special thanks go to @wingunder! Thanks a lot!

Regards

sewenew closed this as completed Oct 19, 2022