ETCD db size is extremely unbalanced among members #7116
Comments
After checking the cluster details, I noticed that etcd v3 needs a defragment operation because it uses boltdb.
@armstrongli etcd should not ever get into this state. can you easily reproduce this? do you still have that 11GB data around?
@xiang90 I'm trying to reproduce it. It has happened twice, so I'm raising it here. What I did on the cluster was upgrade it from 2.2.1 to 3.0.15 in one kubernetes cluster.
@armstrongli The defrag just fixes the symptom. The size should not be unbalanced (~10 MB difference is possible) in the first place. It would be great if you can reproduce it. Does the leader's db size become 11GB right after the migration? Or does it grow to 11GB eventually? If the latter, how long does it take to grow to 11GB? Hours? Days?
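For reference, a minimal sketch of issuing a defrag to each member through the clientv3 maintenance API. The endpoint addresses are placeholders, and the import path assumes etcd v3.4 (older releases used github.com/coreos/etcd/clientv3):

```go
// Sketch: defragment each member via the etcd clientv3 maintenance API.
// Endpoint addresses are placeholders; substitute your own cluster's.
package main

import (
	"context"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	endpoints := []string{"https://master-1:2379", "https://master-2:2379", "https://master-3:2379"}

	cli, err := clientv3.New(clientv3.Config{Endpoints: endpoints, DialTimeout: 5 * time.Second})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	// Defragment is issued per endpoint; it rewrites the boltdb file and
	// reclaims space left behind by compacted revisions.
	for _, ep := range endpoints {
		ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
		_, err := cli.Defragment(ctx, ep)
		cancel()
		if err != nil {
			log.Printf("defragment %s failed: %v", ep, err)
			continue
		}
		log.Printf("defragmented %s", ep)
	}
}
```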
@xiang90
@armstrongli Is it still growing now after your defrag?
@xiang90 Not yet. The cluster ran for weeks until it exceeded the database space quota; it has only been running for 2 days since the defrag.
@armstrongli 11GB/x weeks where x < 5 ~ 2GB/week ~ 300MB/day. You should be able to notice it even after 2 days if it continues to grow, I guess.
@armstrongli Any updates on this?
@xiang90 The data size doesn't change a lot after running my cluster for 1 week. Closing this issue for now.
@armstrongli OK. I guess there is something wrong with your initial setup or migration. Let us know if you run into this again.
@heyitsanthony @gyuho @fanminshi Other people have reported similar behavior. We should probably look into this problem.
@armstrongli Do you remember if you took snapshots from that etcd server periodically?
@xiang90 sorry for this late response.
@armstrongli Have you seen it recently? Any chance you can reproduce it?
no. i haven't found the root cause. we are using it for kubernetes clusters in our company.
So you are able to somehow reproduce it?
I redid what I had done, but the unbalanced data didn't get reproduced. :(
The reproduction steps are a little different from what happened when the data became unbalanced: I used the backup data instead of the hot data (which contains all the tombstone/historical data).
@armstrongli As long as you can reproduce it, it will give us a better chance to get it fixed.
@xiang90 I noticed that one of our clusters encountered data unbalance again after running for months, and it caused some very weird issues on the ETCD cluster.
The status of kubernetes shows that the data is different among the kubernetes master nodes. From master-1 & master-3, the data is the same. Master-2 is totally different from the others.
We didn't do anything on ETCD; we just let it run for months.
Looks like the raft algorithm in ETCD encounters some corner cases it can't handle.
is it a fresh v3 cluster? or migrated from v2?
Migrated from v2
@armstrongli I would imagine it was not synced from the very beginning. Are you sure they are synced?
Yes, they were synced and had been running healthily for months. And we hit this issue today.
can you provide me the logs of all the members from around when it happened?
Looks like this member didn't purge its history. Or there could be some error in the storage layer that caused it to keep the wrong history logs.
The weirdest thing is that the leader election kept running for a long time. Master-2 should step down, since the cluster has a leader, and should follow the leader's steps.
my bad about leader election.
@xiang90 per my understanding, a leader election should happen when a member can't get heartbeats from the leader; it converts into a candidate and starts a leader election. If the member gets the cluster leader info and follows it, then it should catch up with the cluster and serve traffic. Right now the status of the member looks like it has caught up with the cluster and has been serving traffic for some time.
Hi, if you can investigate the issue and find the root cause, that would be great! But you probably have to look into the codebase more and understand how exactly etcd works before you start the investigation; it does not work the way you assume. The best option here is probably to provide enough information for the etcd dev team to help. We need the full log from around when the bad thing happened (24 hours before and 24 hours after). Do not truncate the log or skip log entries. The raw log would be best.
How do you know they are synced? Do you have any monitoring data?
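One hedged way to sanity-check whether members look in sync is to compare each endpoint's reported status (leader, raft index, revision, db size). A minimal sketch against the clientv3 maintenance API, with placeholder endpoints:

```go
// Sketch: compare per-member status to check whether members look in sync.
// Endpoint addresses are placeholders.
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	endpoints := []string{"https://master-1:2379", "https://master-2:2379", "https://master-3:2379"}

	cli, err := clientv3.New(clientv3.Config{Endpoints: endpoints, DialTimeout: 5 * time.Second})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	for _, ep := range endpoints {
		ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
		st, err := cli.Status(ctx, ep)
		cancel()
		if err != nil {
			log.Printf("status %s failed: %v", ep, err)
			continue
		}
		// Members that agree on leader, raft index, and revision are most
		// likely in sync; a wildly different db size is suspicious.
		fmt.Printf("%s leader=%x raftIndex=%d revision=%d dbSize=%d\n",
			ep, st.Leader, st.RaftIndex, st.Header.Revision, st.DbSize)
	}
}
```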
@xiang90
@armstrongli I read through the log. I did not find anything in the log that could cause data inconsistency. I suspect that the data had already become inconsistent before 5.18. I guess you noticed it since
My confusion:
The
Related issue: @xiang90 I have successfully reproduced the data-unbalance state of the etcd cluster. I have one test kubernetes cluster and have kept it running for months. The difference between k8s and the benchmark is that
Then I ran defrag on the cluster, and you can see the differences between
after defragging all the members, they have the same data size now.
The snapshot sizes vary over a large range, too.
can you provide me the steps to reproduce it? so that we can also reproduce it in our environment?
I compared 2 snapshots, one before defrag and one after defrag (after the defrag and running for ~3 days). There are a lot of
It's obvious that there are a lot of
I thought about the storage model of bolt DB: it defines buckets, fills the buckets with values, and expands the file size when it reaches the limit. My guess (I haven't reviewed the code details): after compacting and deleting the old revisions, bolt DB doesn't reuse the blank pages (blocks) anymore and keeps allocating more space to place the data, especially since the data is not solid as stone; it's porous as bread.
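If that guess were true, the free-page statistics of the db file should show it. A minimal sketch, assuming a copy of the member's db file and the standalone go.etcd.io/bbolt package, that prints how much of the file is free pages rather than live data (the path is a placeholder):

```go
// Sketch: open a copy of an etcd member's db file read-only and print the
// boltdb free-page statistics.
package main

import (
	"fmt"
	"log"

	bolt "go.etcd.io/bbolt"
)

func main() {
	// Typically <data-dir>/member/snap/db; operate on a copy, not the live file.
	db, err := bolt.Open("/tmp/etcd-db-copy", 0600, &bolt.Options{ReadOnly: true})
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	s := db.Stats()
	fmt.Printf("free pages: %d, pending pages: %d, free bytes: %d, freelist bytes: %d\n",
		s.FreePageN, s.PendingPageN, s.FreeAlloc, s.FreelistInuse)
}
```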
@armstrongli We want to understand how your db size got unbalanced among members. The large db size is another issue; it is relevant, but not what we are talking about here. Again, we need to get a reproduction from you to confirm exactly what happened in your environment.
@armstrongli kindly ping.
@xiang90 I have been running benchmarks on our cluster, and I can't reproduce it that way; it only shows up after just keeping kubernetes running for months. What's more, the db size can exceed the 8GB quota setting.
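When the backend quota is exceeded, etcd raises a NOSPACE alarm on the affected member. A minimal sketch, with a placeholder endpoint, that lists active alarms through the clientv3 maintenance API:

```go
// Sketch: list active alarms (e.g. NOSPACE after --quota-backend-bytes is hit).
// The endpoint is a placeholder.
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"go.etcd.io/etcd/clientv3"
)

func main() {
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://master-1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	resp, err := cli.AlarmList(ctx)
	if err != nil {
		log.Fatal(err)
	}
	if len(resp.Alarms) == 0 {
		fmt.Println("no active alarms")
		return
	}
	for _, a := range resp.Alarms {
		// Alarm is an enum; NOSPACE means the member hit its backend quota.
		fmt.Printf("member %x: alarm %v\n", a.MemberID, a.Alarm)
	}
}
```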
well. maybe k8s was doing something strange. or something else happened to the machine. we do not know. and there is no way to help you debug it. I do not really believe this is an etcd issue. if it is, it is probably a dup of #8009.
@xiang90 I've had the same problem lately after I migrated from v2 and updated the kubernetes version from 1.4.7 to 1.6.4. There are many anomalies: the etcd data grew very fast and became unbalanced among members, and the data was rolled back to an earlier version after I added the startup parameter (--quota-backend-bytes=5368709120) and restarted the etcd service. Because this problem can lead to periodic problems in the kubernetes cluster, the impact is very large.
I am facing a similar issue. Is there any permanent fix for this?
As I mentioned, this issue is probably a dup of #8009. I closed this one since it is messy and has no reproduction steps or useful information. Please follow up on #8009. Or, if you can reproduce your issue and believe it is different from #8009, please create a new one. Please do not continue to comment on this closed issue.
I have a 4-member cluster, and after running for days, one member's DB size is far larger than the others'.
Member 87e6a178ab5a0f63 has a DB file of 11 GB, but the other members have only 57 MB. There are a lot of such logs in member 87e6a178ab5a0f63.
The cluster shares the same dataset, so the data among the members should be almost the same size, e.g. 57~65 MB.
I'm expecting a response :)