
add snapshot step in etcdutl migrate #17911

Open
siyuanfoundation opened this issue Apr 29, 2024 · 17 comments
Labels
area/etcdutl help wanted priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. type/feature

Comments

@siyuanfoundation
Contributor

siyuanfoundation commented Apr 29, 2024

What would you like to be added?

Currently during downgrade, if there is a WAL entry with fields from a higher version, the offline migration tool etcdutl migrate aborts because of the version check. This blocks the offline downgrade migration even if the higher-version entries have already been committed to the backend.

In contrast, in the online migration, after downgrade is enabled, ClusterVersionSet downgrades the cluster version and forces a snapshot, so the old WAL entries are compacted and do not block the migration.

We need to make the offline migration consistent with the online migration flow by adding a snapshot step before calling schema.Migrate.
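
For illustration only, a minimal read-only sketch of why the snapshot unblocks things: WAL replay only hands back records written after the snapshot it is opened at, so once a snapshot exists at the consistent index, older higher-version entries never reach the migration code. Import paths assume the 3.6 layout (on 3.5 the wal packages live under server/v3/wal), and the index/term values are placeholders:

package main

import (
	"fmt"

	"go.uber.org/zap"

	"go.etcd.io/etcd/server/v3/storage/wal"
	"go.etcd.io/etcd/server/v3/storage/wal/walpb"
)

func main() {
	lg := zap.NewExample()
	// Open the WAL for reading, starting at a snapshot record
	// (index/term are placeholders for whatever the data dir holds).
	w, err := wal.OpenForRead(lg, "default.etcd/member/wal", walpb.Snapshot{Index: 34, Term: 8})
	if err != nil {
		panic(err)
	}
	defer w.Close()
	// ReadAll only returns entries after the snapshot record above; anything
	// older is effectively gone from the replay path.
	_, _, ents, err := w.ReadAll()
	if err != nil {
		panic(err)
	}
	fmt.Println("entries to replay:", len(ents))
}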

Why is this needed?

To make offline downgrade consistent with online downgrade.

#13168 Milestone 3

@siyuanfoundation
Contributor Author

cc @serathius @henrybear327

@jmhbnz jmhbnz added area/etcdctl area/etcdutl priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed area/etcdctl labels May 2, 2024
@ah8ad3
Contributor

ah8ad3 commented May 6, 2024

If you think this task is suitable for me, I would like to give it a try.

@siyuanfoundation
Contributor Author

@ah8ad3 yes, it would be great if you could give it a try. You can ping me on Slack if you have any questions. Thank you!

@ah8ad3
Contributor

ah8ad3 commented May 11, 2024

Sorry, do we have any implementation for getting a snapshot without needing a server? I couldn't find any; everything I found needs an Endpoint to fetch the snapshot from a server.
If I'm right, we either need to create an offline snapshotter or use snapshot.SaveWithVersion.
Am I right? @siyuanfoundation @serathius

@siyuanfoundation
Contributor Author

snapshot.SaveWithVersion fetches a snapshot from a remote etcd server, and while the data is connected to an etcd server, new entries could be added, which is not desirable. So it would need to be an offline snapshotter, or just something to clean up the old WAL entries. @ah8ad3 can you dive into the details of EtcdServer.snapshot to see if it is possible?
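
For context, this is roughly what the online path looks like, a sketch assuming the client/v3 snapshot package (the endpoint and file path are just examples), which is exactly why it cannot be used on an offline data dir: it needs a reachable member.

package main

import (
	"context"
	"fmt"

	"go.uber.org/zap"

	clientv3 "go.etcd.io/etcd/client/v3"
	"go.etcd.io/etcd/client/v3/snapshot"
)

func main() {
	lg := zap.NewExample()
	cfg := clientv3.Config{Endpoints: []string{"http://127.0.0.1:2379"}}
	// Streams the backend of a *running* member over gRPC into backup.db;
	// there is no way to point this at a data dir on disk.
	version, err := snapshot.SaveWithVersion(context.Background(), lg, cfg, "backup.db")
	if err != nil {
		panic(err)
	}
	fmt.Println("snapshot taken from server version:", version)
}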

@ah8ad3
Contributor

ah8ad3 commented May 13, 2024

Thanks for the answer. I'll try to understand the code and see what we can do. @siyuanfoundation

@ah8ad3
Contributor

ah8ad3 commented May 13, 2024

Also, can you please explain what is meant by

something to clean up the old WAL entries

Cleaning the version tags from them, or deleting the entries?

@ah8ad3
Contributor

ah8ad3 commented May 13, 2024

I think I got the answer to my last question.
I dove into the code and found that we have an example directory for raft that seems to do the job we want here, but I need confirmation that I'm on the right path. @siyuanfoundation @serathius

func (rc *raftNode) maybeTriggerSnapshot(applyDoneC <-chan struct{}) {
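
For reference, the core of what maybeTriggerSnapshot does to the in-memory raft storage boils down to a snapshot-then-compact pair. A self-contained sketch of just that mechanism (the module path assumes go.etcd.io/raft/v3; on the 3.5 branch it is go.etcd.io/etcd/raft/v3, and raftexample additionally persists the snapshot and retains some catch-up entries):

package main

import (
	"fmt"

	"go.etcd.io/raft/v3"
	"go.etcd.io/raft/v3/raftpb"
)

func main() {
	ms := raft.NewMemoryStorage()
	// Pretend indexes 1..10 have been appended and applied.
	var ents []raftpb.Entry
	for i := uint64(1); i <= 10; i++ {
		ents = append(ents, raftpb.Entry{Term: 1, Index: i})
	}
	if err := ms.Append(ents); err != nil {
		panic(err)
	}
	applied := uint64(10)
	cs := &raftpb.ConfState{Voters: []uint64{1}}
	// Take a snapshot at the applied index, then compact the log up to it;
	// this is what maybeTriggerSnapshot does with the in-memory raft storage.
	if _, err := ms.CreateSnapshot(applied, cs, []byte("state")); err != nil {
		panic(err)
	}
	if err := ms.Compact(applied); err != nil {
		panic(err)
	}
	first, _ := ms.FirstIndex()
	fmt.Println("first retained index:", first) // entries <= 10 are gone
}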

@ah8ad3
Contributor

ah8ad3 commented May 15, 2024

I ended up reading more about raft to understand the design; here is what stopped me from pursuing an offline snapshotter.
[Screenshot from 2024-05-15 09-03-46]

Based on this screenshot we can't decide offline which member to take the snapshot from; we should only take it from the online leader. So I think we should consider the other suggestion, which is

or just something to clean up the old WAL entries

which I'm looking into to see how to do it.

Update:
I didn't think about the migrate command when writing this; I was thinking about a whole-cluster snapshot. If we are given the data-dir and only check that member, we don't care about raft anymore.

@siyuanfoundation
Contributor Author

I think you can do ReadConsistentIndex directly from the backend, and Compact (i.e., remove) any entries before the consistent index. I don't think we need to keep catch-up entries in the migrate process, is that correct? @serathius
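
For reference, the consistent index is stored in the "meta" bucket of member/snap/db as an 8-byte big-endian value, so it can be read offline with bbolt alone. A minimal sketch (the db path below is just an example):

package main

import (
	"encoding/binary"
	"fmt"
	"os"

	bolt "go.etcd.io/bbolt"
)

func main() {
	db, err := bolt.Open("default.etcd/member/snap/db", 0600, &bolt.Options{ReadOnly: true})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer db.Close()

	err = db.View(func(tx *bolt.Tx) error {
		b := tx.Bucket([]byte("meta"))
		if b == nil {
			return fmt.Errorf("meta bucket not found")
		}
		v := b.Get([]byte("consistent_index"))
		if len(v) != 8 {
			return fmt.Errorf("consistent_index not found")
		}
		fmt.Println("consistent index:", binary.BigEndian.Uint64(v))
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}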

@ah8ad3
Contributor

ah8ad3 commented May 19, 2024

Thank you @siyuanfoundation. It seems ReadConsistentIndex will get the applied index and will help, but I have a bit of trouble understanding and implementing the downgrade, which I ask about below. If you think this needs more consideration, we can ask at the community meeting or on Slack.

I tested the current behavior of etcdutl migrate and etcdctl downgrade; it seems:

  1. etcdctl downgrade 3.5 will change the version in the db
    after running: etcd-dump-db iterate-bucket default.etcd/member/snap/db cluster --decode=true:
key="downgrade", value="{\"target-version\":\"3.5.0\",\"enabled\":true}"
key="clusterVersion", value="3.5.0"

and forces a snapshot, which removes all entries from the WAL file:
etcd-dump-logs default.etcd

Snapshot:
term=8 index=34 nodes=[8e9e05c52164694d] confstate={"voters":[10276657743932975437],"auto_leave":false}
Start dumping log entries from snapshot.
WAL metadata:
nodeID=8e9e05c52164694d clusterID=cdf818194e3a8c32 term=8 commitIndex=34 vote=8e9e05c52164694d
WAL entries: 0
term	     index	type	data

Entry types (Normal,ConfigChange) count is : 0
  2. running etcdutl migrate --data-dir default.etcd --target-version 3.5 --force successfully:
    after running: etcd-dump-db iterate-bucket default.etcd/member/snap/db cluster --decode=true:
key="downgrade", value="{\"target-version\":\"\",\"enabled\":false}"
key="clusterVersion", value="3.6.0"

which does not change the clusterVersion,

and the WAL:
etcd-dump-logs default.etcd

Snapshot:
term=8 index=34 nodes=[8e9e05c52164694d] confstate={"voters":[10276657743932975437],"auto_leave":false}
Start dumping log entries from snapshot.
WAL metadata:
nodeID=8e9e05c52164694d clusterID=cdf818194e3a8c32 term=9 commitIndex=38 vote=8e9e05c52164694d
WAL entries: 4
lastIndex=38
term	     index	type	data
   9	        35	norm	
   9	        36	norm	header:<ID:7587878799003708674 > cluster_member_attr_set:<member_ID:10276657743932975437 member_attributes:<name:"default" client_urls:"http://localhost:2379" > > 
   9	        37	norm	header:<ID:7587878799003708677 > downgrade_info_set:<> 
   9	        38	norm	header:<ID:7587878799003708678 > cluster_version_set:<ver:"3.6.0" > 

Entry types (Normal,ConfigChange) count is : 4

It seems to me that etcdctl downgrade did a generally better job of downgrading than etcdutl migrate. We need to remove the cluster_version_set:<ver:"3.6.0" > entry, maybe with compaction. But are you sure just compacting will be fine?

In addition, I suggest we remove the extra tools and add all the utilities for db and WAL files (etcd-dump-logs, etcd-dump-db) into etcdutl. Having this many tools is a bit hard to use and confusing.

@siyuanfoundation
Contributor Author

Thank you @ah8ad3 for digging into this.
etcdctl downgrade works perfectly fine for the online downgrade; you need to start the etcdctl downgrade process before shutting down any node.
etcdutl migrate is meant for migrating the files offline, say if you have the data files for 3.6 and you want to use them to start a 3.5 cluster. This issue is about making etcdutl migrate converge as much as possible with etcdctl downgrade.

When you run etcdutl migrate, it will abort if there are any WAL entries that are not compatible with the old version, as described in this issue.
When you run etcdutl migrate --force, it just changes the storage version and pretends the migration is complete. I think you also uncovered that "clusterVersion" should be changed too.

I think you make a really good point that there are too many tools. We can raise that in the community meeting.

@ah8ad3
Contributor

ah8ad3 commented Jun 8, 2024

Sorry it took me a while to respond; this issue was a bit hard for me since I'm not that confident in the etcd codebase. Anyway:
I read a lot about snapshotting and I still think I may not know everything about it.
In this function,
func (s *EtcdServer) snapshot(snapi uint64, confState raftpb.ConfState), it creates a v2 snapshot for compatibility,
but I don't think we need that for this issue, so I went with the standard v3 one, which is here, and the final code using it is something like this:

	partpath := "etcd-test-db"
	f, err := os.OpenFile(partpath, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, fileutil.PrivateFileMode)
	if err != nil {
		panic(fmt.Errorf("could not open %s (%v)", partpath, err))
	}
	defer f.Close()
	c.lg.Info("created temporary db file", zap.String("path", partpath))
	// Write a point-in-time copy of the bbolt backend to the file.
	snapshot := c.be.Snapshot()
	defer snapshot.Close()
	if _, err := snapshot.WriteTo(f); err != nil {
		panic(fmt.Errorf("could not write snapshot to %s (%v)", partpath, err))
	}

This will create a snapshot from the db, and I tested the restore with
etcdutl snapshot restore etcd-test-db --data-dir etcd-test-db-dir, which works, and etcd can start with it.
The only missing part in my mind is that we should somehow check the last consistent index against the snapshot, but I haven't thought about that much yet.

Also, I asked about this issue in the main Slack channel https://kubernetes.slack.com/archives/C3HD8ARJ5/p1716904436225239 , and Marek said

You don't need to remove any files (assuming that #17227 is fixed). You just need to create a snapshot based on the last commit index, assuming that there is newer version entries after last snapshot or database consistent index. (edited)

And also

If there is a incompatible WAL entry after commit index, then we cannot downgrade the storage. We should give up and return error.

So I think this issue can be fixed with the code I suggested, and we should fix the other issue, #17227.
WDYT @siyuanfoundation ?

@ahrtr
Member

ahrtr commented Dec 5, 2024

We can create a v2 snapshot with the consistent_index in the v3store (bbolt), and all WAL records older than the consistent_index can be skipped. I agree it's doable, but I wouldn't suggest spending a huge effort on this. etcdutl migrate will be good enough if we can resolve #17227.
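
A heavily simplified sketch of the moving parts that would be involved, assuming ci, term, and the conf state have already been read from the data dir; import paths follow the 3.6 layout (wal lives under server/v3/wal on 3.5), and persistSnapshotMarker is a hypothetical helper, not an existing API. A real v2 snapshot would also have to serialize the membership/v2store payload into the Data field.

package offline

import (
	"go.uber.org/zap"

	"go.etcd.io/etcd/server/v3/etcdserver/api/snap"
	"go.etcd.io/etcd/server/v3/storage/wal"       // server/v3/wal on the 3.5 branch
	"go.etcd.io/etcd/server/v3/storage/wal/walpb" // server/v3/wal/walpb on the 3.5 branch
	"go.etcd.io/raft/v3/raftpb"                   // go.etcd.io/etcd/raft/v3/raftpb on the 3.5 branch
)

// persistSnapshotMarker (hypothetical) writes a .snap file and the matching
// WAL snapshot record at the consistent index, so that older WAL records are
// skipped on replay.
func persistSnapshotMarker(lg *zap.Logger, snapDir, walDir string, ci, term uint64, cs raftpb.ConfState) error {
	s := raftpb.Snapshot{
		Metadata: raftpb.SnapshotMetadata{Index: ci, Term: term, ConfState: cs},
		// Data: serialized membership/v2store payload goes here in a real implementation.
	}
	if err := snap.New(lg, snapDir).SaveSnap(s); err != nil {
		return err
	}
	w, err := wal.Open(lg, walDir, walpb.Snapshot{})
	if err != nil {
		return err
	}
	defer w.Close()
	if _, _, _, err := w.ReadAll(); err != nil { // the WAL must be read to the end before appending
		return err
	}
	if err := w.SaveSnapshot(walpb.Snapshot{Index: ci, Term: term, ConfState: &cs}); err != nil {
		return err
	}
	return w.ReleaseLockTo(ci)
}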

@ahrtr
Member

ahrtr commented Dec 12, 2024

I agree that this feature is nice to have, although it isn't a blocker for 3.6. I am thinking we can probably break this feature down into two smaller ones:

  • add a command, something like etcdutl v2snapshot create, to support creating a v2 snapshot based on the consistent_index.
    • Since etcd 3.6 will need a v2 snapshot anyway, the offline command can be a manual last-resort solution for any potential issue.
  • add a snapshot step in etcdutl migrate, as this ticket describes.

The two tasks above may be able to share most of the source code/implementation.

@ahrtr
Member

ahrtr commented Dec 12, 2024

@ah8ad3 are you still working on this? Please feel free to reach out if you have any questions or comments. Thanks.

@ahrtr
Member

ahrtr commented Dec 16, 2024

Is anyone interested in working on this per #17911 (comment)?
