Avoid expiration and eviction during data syncing #1185

lyq2333 · 2024-10-17T11:10:31Z

When we sync data from a source Valkey to a destination Valkey using a sync tool such as redis-shake, the destination Valkey thinks it is a primary and performs expiration and eviction, which may cause data corruption. This problem has been discussed in redis/redis#9760 (reply in thread), and Redis already has a solution. But in Valkey we haven't fixed it yet.

For example, we call set key 1 ex 1 on the source server and transfer this command to the destination server. Then we call incr key on the source server before the key expires, so the source server has a key with a value of 2. But when the incr command arrives at the destination server, the key may already have expired and been deleted there. So the destination server ends up with a key with a value of 1, which is inconsistent with the source server.
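The race above can be reproduced with a toy model. The following is only a sketch: `toy_server`, `toy_set_px`, `toy_incr` and `toy_replay` are hypothetical names invented for illustration, not Valkey code, and the model holds a single key with times in milliseconds.

```c
#include <stdbool.h>

/* Hypothetical one-key model of a server; not Valkey's real data structures. */
typedef struct {
    bool exists;
    long long value;
    long long expire_at_ms; /* absolute expiry time; -1 means no TTL */
    bool import_mode;       /* when true, the server never expires keys itself */
} toy_server;

/* Drop the key if its TTL has passed, unless the server is in import mode. */
void toy_expire_if_needed(toy_server *s, long long now_ms) {
    if (s->import_mode) return;
    if (s->exists && s->expire_at_ms >= 0 && now_ms >= s->expire_at_ms) {
        s->exists = false;
    }
}

/* Models "SET key <value> PX <ttl_ms>" executed at time now_ms. */
void toy_set_px(toy_server *s, long long now_ms, long long value, long long ttl_ms) {
    s->exists = true;
    s->value = value;
    s->expire_at_ms = now_ms + ttl_ms;
}

/* Models "INCR key": a missing or expired key is recreated from 0 without a TTL. */
long long toy_incr(toy_server *s, long long now_ms) {
    toy_expire_if_needed(s, now_ms);
    if (!s->exists) {
        s->exists = true;
        s->value = 0;
        s->expire_at_ms = -1;
    }
    return ++s->value;
}

/* Replays "SET key 1 EX 1" at t=0 and "INCR key" at incr_at_ms; returns the final value. */
long long toy_replay(bool import_mode, long long incr_at_ms) {
    toy_server s = {false, 0, -1, import_mode};
    toy_set_px(&s, 0, 1, 1000);
    return toy_incr(&s, incr_at_ms);
}
```

In this model, the source sees the INCR at 900 ms and ends with 2. A destination that receives the same INCR at 1100 ms ends with 1, unless expiration is suppressed, in which case it also ends with 2.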

In standalone mode, we can use a writable replica to simplify the sync process. However, in cluster mode, we still need a sync tool to transfer the source data to the destination. The sync tool usually works as a normal client, and the destination works as a primary, which keeps performing expiration and eviction.

In this PR, we add a new mode named 'import-mode'. In this mode, the server stops expiration and eviction, just like a replica. Note that this mode is meant to be enabled only while syncing, to avoid data inconsistency caused by expiration and eviction. A server in import mode can't be turned into a real replica by replicaof or cluster replicate, and vice versa. Sync tools can mark their clients as an import source with CLIENT IMPORT-SOURCE; such a client works like a client from the primary and can visit expired keys in lookupKey.
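The mutual exclusion between import mode and real replication can be sketched as two guarded state changes. This is an illustrative model only: `toy_node`, `toy_enable_import_mode` and `toy_replicaof` are hypothetical names, and the PR's actual checks live in the config, replication and cluster code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node state; the field names are assumptions for this sketch. */
typedef struct {
    bool import_mode;         /* models the 'import-mode' config */
    const char *primary_host; /* non-NULL when the node is a real replica */
} toy_node;

/* Enabling import mode is rejected on a real replica. Returns true on success. */
bool toy_enable_import_mode(toy_node *n) {
    if (n->primary_host != NULL) return false;
    n->import_mode = true;
    return true;
}

/* Becoming a replica is rejected while in import mode. Returns true on success. */
bool toy_replicaof(toy_node *n, const char *host) {
    if (n->import_mode) return false;
    n->primary_host = host;
    return true;
}
```

Whichever state is entered first blocks the other, which matches the "and vice versa" in the description.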

Any better idea?

codecov · 2024-10-17T11:27:15Z

Codecov Report

Attention: Patch coverage is 65.71429% with 12 lines in your changes missing coverage. Please review.

Project coverage is 70.69%. Comparing base (a62d1f1) to head (7883e9f).
Report is 4 commits behind head on unstable.

Files with missing lines	Patch %	Lines
src/config.c	42.85%	4 Missing ⚠️
src/networking.c	66.66%	4 Missing ⚠️
src/cluster_legacy.c	33.33%	2 Missing ⚠️
src/replication.c	33.33%	2 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #1185      +/-   ##
============================================
+ Coverage     70.65%   70.69%   +0.03%     
============================================
  Files           114      114              
  Lines         61799    63119    +1320     
============================================
+ Hits          43664    44621     +957     
- Misses        18135    18498     +363

Files with missing lines	Coverage Δ
src/commands.def	`100.00% <ø> (ø)`
src/db.c	`88.79% <100.00%> (+0.28%)`	⬆️
src/evict.c	`97.75% <100.00%> (+0.02%)`	⬆️
src/expire.c	`96.57% <100.00%> (+0.14%)`	⬆️
src/server.c	`88.75% <100.00%> (+0.10%)`	⬆️
src/server.h	`100.00% <ø> (ø)`
src/cluster_legacy.c	`86.19% <33.33%> (-0.13%)`	⬇️
src/replication.c	`87.24% <33.33%> (-0.17%)`	⬇️
src/config.c	`78.68% <42.85%> (-0.02%)`	⬇️
src/networking.c	`88.49% <66.66%> (-0.01%)`	⬇️

... and 84 files with indirect coverage changes

soloestoy · 2024-10-17T11:33:41Z

good work! @valkey-io/core-team please take a look

zuiderkwast

I like it. I think it's a little bit strange, but with some better documentation it's not too strange.

We need to have good support for migration tools, especially for users who want to migrate from proprietary software to open source software. 😄

> Redis already have a solution.

I remember this discussion from the Redis times. Do you know what Redis's solution is (the same REPLCONF PSEUDO-MASTER?) or is it secret?

> Any better idea?

Have you considered letting RediShake act as a primary and letting the target database replicate from RediShake? It could act as a replication proxy.

+-----------+   PSYNC  +-----------+   PSYNC  +--------+
| Source DB |<---------| RediShake |<---------| Valkey |
+-----------+          +-----------+          +--------+

zuiderkwast · 2024-10-17T12:33:30Z

src/replication.c

+ * - pseudo-master <0|1>
+ * Set this connection behaving like a master if server.pseudo_replica is true.
+ * Sync tools can set their connections into 'pseudo-master' state to visit expired keys.
 * */
 void replconfCommand(client *c) {


The sync tool sends REPLCONF PSEUDO-MASTER to say "I'm a pseudo-master, so ignore expire for this connection"?

Almost all other REPLCONF commands are sent by the replica to the primary before doing the PSYNC. This one is different. Can we add a more explicit comment about this difference, similar to REPLCONF GETACK which is also different:

 * - getack <dummy>
 * Unlike other subcommands, this is used by primary to get the replication
 * offset from a replica.

> The sync tool sends REPLCONF PSEUDO-MASTER to say "I'm a pseudo-master, so ignore expire for this connection"?

Yes. It does look a bit strange. I added a new command, CLIENT IMPORT-SOURCE <on|off>, as @soloestoy suggested. Maybe it looks better this way?

zuiderkwast · 2024-10-17T12:36:12Z

valkey.conf

+# Make the master behave like a replica, which forbids expiration and evcition.
+# This is useful for sync tools, because expiration and evcition may cause the data corruption.
+# Sync tools can set their connections into 'pseudo-master' state by REPLCONF PSEUDO-MASTER to
+# behave like a master(i.e. visit expired keys).
+#
+# pseudo-replica no
+


I think "pseudo-master" is a little bit confusing. :) I understand how it works now, but only after I read the test cases. Let's try to improve this documentation later.

In Valkey we are no longer using "master", so new commands and configs should use "primary".

Maybe we can use master if Redis has exactly the same REPLCONF or config, but otherwise let's use primary.

I agree with you. In the new mode the server is a master node, but the mode is named pseudo-replica, which will confuse most people. Let's give it a better name first.

import-mode?

Should this node forbid writes from all other clients? It makes it behave even more like a replica.

Btw, I'm thinking now that if we want a better implementation of slot migration, maybe it can use the same or a similar feature. The slot replication is also similar to replication but initiated from the source node. @enjoy-binbin how is the implementation you want to upstream?

Because clients may sync data to a working server, I think simply forbidding writes from all other clients is not very friendly to 24/7 businesses. But if we allow writes from other clients, we may face the same dual-write problem as a writable replica. I'll document it in valkey.conf.

Yes, slot replication is also similar to replication in its implementation. It is something like slot RDB + slot replication propagation + slot failover.

soloestoy · 2024-10-18T02:13:06Z

+-----------+   PSYNC  +-----------+   PSYNC  +--------+
| Source DB |<---------| RediShake |<---------| Valkey |
+-----------+          +-----------+          +--------+

Interesting. We have also considered such a method, but there are several issues.

First, as a cloud provider, we do not allow an instance to become a replica of an external instance. This is a very risky operation, especially since the primary connection uses a super user, which has excessive permissions. Additionally, the replica needs to establish an outbound connection, which is also not allowed. I believe these restrictions are not unique to cloud providers; many users' security control policies also prohibit such actions.

Another point is that, in a cluster mode, the source and target instances for migration typically have different slot distributions, and redisShake can help with correctly routing the data.

soloestoy

Moreover, I think we should only allow write commands from the pseudo-master client when the server is in pseudo-replica mode.

soloestoy · 2024-10-18T02:19:12Z

src/evict.c

@@ -546,8 +546,8 @@ int performEvictions(void) {
        goto update_metrics;
    }

-    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION) {
-        result = EVICT_FAIL; /* We need to free memory, but policy forbids. */
+    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION || server.pseudo_replica) {


It's better to place it in isSafeToPerformEvictions, together with the server.primary_host check.

If we place it in isSafeToPerformEvictions, the server will ignore maxmemory. I think it's better to return an OOM error to stop the syncing process.
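The difference between the two placements can be sketched as follows. This is a simplified model under stated assumptions: `toy_mem_state` and both functions are hypothetical, real eviction logic is far more involved, and the only point illustrated is which result each placement yields once memory is over the limit.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { TOY_EVICT_OK, TOY_EVICT_FAIL } toy_evict_result;

/* Hypothetical memory state; field names are assumptions for this sketch. */
typedef struct {
    size_t used_memory;
    size_t maxmemory; /* 0 means no limit */
    bool noeviction;  /* models maxmemory-policy noeviction */
    bool import_mode;
} toy_mem_state;

/* Placement A: import mode is treated as "not safe to evict", so the check is
 * skipped and the memory limit is effectively ignored. */
toy_evict_result toy_evict_skip_check(const toy_mem_state *m) {
    if (m->import_mode) return TOY_EVICT_OK;
    if (m->maxmemory == 0 || m->used_memory <= m->maxmemory) return TOY_EVICT_OK;
    if (m->noeviction) return TOY_EVICT_FAIL;
    return TOY_EVICT_OK; /* a real server would free keys here */
}

/* Placement B: import mode is handled next to the noeviction policy, so going
 * over the limit fails and the caller can answer writes with an OOM error. */
toy_evict_result toy_evict_fail_in_import(const toy_mem_state *m) {
    if (m->maxmemory == 0 || m->used_memory <= m->maxmemory) return TOY_EVICT_OK;
    if (m->noeviction || m->import_mode) return TOY_EVICT_FAIL;
    return TOY_EVICT_OK; /* a real server would free keys here */
}
```

With placement B, a sync tool that overfills the destination gets an error instead of silently pushing the server past maxmemory.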

lyq2333 · 2024-10-18T02:40:28Z

> Redis already have a solution.
>
> I remember this discussion from the Redis times. Do you know what Redis solution is (the same REPLCONF PSEUDO-MASTER?) or is it secret?

It's secret. I previously submitted this PR to the Redis community (#13077), and they said they would open a PR with their own implementation. But there has been no news since then.

> Any better idea?
>
> Have you considered the idea to let RediShake act as a primary and let the target database replicate from RediShake? It can act as a replication proxy?

+-----------+   PSYNC  +-----------+   PSYNC  +--------+
| Source DB |<---------| RediShake |<---------| Valkey |
+-----------+          +-----------+          +--------+

What @soloestoy said is a major limitation for us. BTW, the destination sometimes already has some data, and PSYNC will delete it.

enjoy-binbin · 2024-10-18T03:56:48Z

I did not read it carefully, but internally we simply pause expiration on both sides, I guess.
Redis has an issue that seems to be discussing this: redis/redis#13478

lyq2333 · 2024-10-18T08:15:58Z

> i did not read it carefully, internally we simply pause the expiration on both side i guess. Redis has an issue that seems to be discussing this: redis/redis#13478

Interesting. I think this PR can solve that problem if the server serves no other reads. If it does, one possible solution is to disable all expiration in lookupKey for all clients. But I don't want this PR to affect reads and writes from normal clients.

zuiderkwast · 2024-10-21T09:47:42Z

@lyq2333 Can you commit the changes to commands.def?

When you run make locally and you have python3 installed, make updates commands.def if there are any new or changed commands.

lyq2333 · 2024-10-21T11:02:03Z

> @lyq2333 Can you commit the changes to commands.def?
>
> When you run make locally and you have python3 installed, make updates commands.def if there are any new or changed commands.

@zuiderkwast Thanks. Sorry, I forgot it before. I also forgot to sign off and had to force push. No code was modified.

madolson · 2024-10-21T15:13:51Z

Weekly core meeting. No specific comments other than we should review this offline and make progress. Directionally seems like a good idea.

lyq2333 · 2024-10-22T09:38:01Z

The scenario we encountered is that some users want to migrate data from Server A to Server B. Both A and B work as primaries, and B may already hold some data before the migration. We found that expiration and eviction may lead to data inconsistency. So we came up with a simple method, but a few points are still open for discussion.

We plan to introduce a new config (import-mode in this PR) to mark that this server is importing data. As for expiration, in import mode, active expiration in cron is disabled, and passive expiration in commands is limited depending on the client state. Clients marked as import source work like server.primary, which means they do not trigger passive expiration for reads or writes. But there are several ways to handle other, normal clients. We thought of the following options:

1. Normal clients work as usual, which means they can perform passive expiration for both reads and writes. The drawback is that normal clients accessing the migrating data will also trigger passive expiration and may affect the migration.
2. Normal clients are prohibited from running write commands; meanwhile, reads don't trigger passive expiration and don't return expired keys. The server works like a read-only replica. The drawback is that clients need to stop writing during the data migration, which may affect users' business.
3. Normal clients don't trigger passive expiration on reads and don't read expired keys, but still trigger passive expiration on writes. We can't prohibit passive expiration in write commands, because the server will crash when we call incr/decr commands on expired keys: these commands must delete expired keys first, otherwise they hit the assert in dbAdd. The drawback is that normal clients calling write commands on the migrating data will trigger passive expiration. This is a trade-off between the two options above, and I think it's the better one because it doesn't affect normal clients and has less impact on the migration process.
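The third option can be read as a small decision table for what happens when a command touches an already expired key. The sketch below is an assumption-laden illustration: `toy_lookup_action` and `toy_lookup_expired` are hypothetical names, and the real logic would sit in the key lookup and expiration paths of the server.

```c
#include <stdbool.h>

typedef enum {
    TOY_KEY_VISIBLE, /* return the key as if it had not expired */
    TOY_KEY_HIDDEN,  /* report "not found" but keep the key in the dataset */
    TOY_KEY_DELETED  /* passive expiration: delete the key, then report "not found" */
} toy_lookup_action;

/* Decide how an expired key is treated, following the third option above.
 * import_mode:   the server has import mode enabled
 * import_source: the client marked itself as an import source
 * is_write:      the command being executed is a write command */
toy_lookup_action toy_lookup_expired(bool import_mode, bool import_source, bool is_write) {
    if (!import_mode) return TOY_KEY_DELETED;  /* ordinary primary behaviour */
    if (import_source) return TOY_KEY_VISIBLE; /* treated like a client from the primary */
    if (is_write) return TOY_KEY_DELETED;      /* writes must delete first (dbAdd assert) */
    return TOY_KEY_HIDDEN;                     /* normal reads neither see nor delete it */
}
```

Only the write path of normal clients can still remove migrating data, which is the residual risk noted in the discussion.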

As for eviction, should the server disable eviction automatically in import mode to ensure data consistency? Or should we let users choose the maxmemory-policy? I guess no one will choose an option other than noeviction.

@valkey-io/core-team WDYT? Any suggestions would be greatly appreciated.

soloestoy · 2024-10-23T03:36:00Z

The core issue is that an active target node of a data migration still needs to provide normal read and write service. However, we don't want to affect the data being migrated, and we can't identify which data is being migrated.

Method 3 seems to be a relatively suitable option, but of course, it requires users to ensure they do not perform write operations on the data being migrated.

zuiderkwast · 2024-10-23T10:32:51Z

I agree method 3 seems to be the most suitable. The node behaves like a replica for reads, and like a writable replica if write commands are used.

lyq2333 force-pushed the pseudo-master branch from 52be593 to b23b09b Compare October 17, 2024 11:13

lyq2333 requested a review from soloestoy October 17, 2024 11:14

lyq2333 requested a review from a team October 17, 2024 11:34

zuiderkwast reviewed Oct 17, 2024

hwware added the major-decision-pending Major decision pending by TSC team label Oct 17, 2024

soloestoy reviewed Oct 18, 2024

zuiderkwast reviewed Oct 18, 2024

zuiderkwast reviewed Oct 18, 2024

zuiderkwast reviewed Oct 18, 2024

zuiderkwast reviewed Oct 18, 2024

lyq2333 added 8 commits October 21, 2024 18:36

add pseudo-replica mode

f59c5e8

Signed-off-by: lvyanqi.lyq <[email protected]>

rename pseudo-replica to import-mode and add client import-source command

7226e55

Signed-off-by: lvyanqi.lyq <[email protected]>

revert the file permission change

c2e86bf

Signed-off-by: lvyanqi.lyq <[email protected]>

minor comments fix

937924c

Signed-off-by: lvyanqi.lyq <[email protected]>

minor comments fix

b3b5c46

Signed-off-by: lvyanqi.lyq <[email protected]>

fix typo

a79a51a

Signed-off-by: lvyanqi.lyq <[email protected]>

auto-generate commands.def

3e36548

Signed-off-by: lvyanqi.lyq <[email protected]>

auto-generate commands.def

7883e9f

Signed-off-by: lvyanqi.lyq <[email protected]>

lyq2333 force-pushed the pseudo-master branch from 0655bbc to 7883e9f Compare October 21, 2024 10:37
