Remove clock drift checks from connectivity monitor #16432
Comments
hi @serathius
Opening for discussion. @ptabor @ahrtr @jmhbnz ccing everyone involved with the lease work @xiang90 @gyuho @hexfusion @yichengq @jonboulle @heyitsanthony
The lease depends on wall time; if a member's local clock drifts, then the lease will be affected. Simply put, the existing lease has a big design issue, and we need to think about how to refactor it, which has already been tracked on the roadmap.
I don't think the impact of clock drift on wall time matters. By impact on leases I mean consistency issues: situations where clock drift could cause one member to consider a lease expired while another member doesn't. To clarify, by clock drift I mean a small error in the physical clock that accumulates over time. It might be an error of 1 second per month that over a year accumulates to a couple of minutes. This is negligible for the lease wall time calculation, as leases are meant to be short. The Kubernetes lease for events is unusual because it's 2h; even for a 2h lease the wall time impact of clock drift should be negligible (below 1 second), which is acceptable. Clock drift between members would matter if TTL were calculated as an exact deadline, not as a time difference. For example, if we have a lease with TTL 1h, it doesn't matter if one member calculates time from 18:00:00 to 19:00:00 while another member is 10 seconds ahead and calculates from 18:00:10 to 19:00:10.
Two things:
Thanks @xiang90 for confirming there is no correctness issue. I can agree that having clock drift in a cluster is not great, however this is a problem external to etcd. Many assumptions break with clock drift; take centralized logging, which becomes useless for debugging if logs are collected from nodes with clock drift. I don't think this is inherently an etcd problem. I've had multiple scared users asking me what the impact on etcd is. Will it not perform well? Is the whole cluster's consistency at risk? Overall I think it's not a good idea to try to solve problems that users don't have. Etcd is not a monitoring system, so it should not monitor or alert on clock drift; it just confuses users about why etcd cares about clock drift. We can warn users that clock drift makes etcd debugging hard. We can recommend that users run NTP and monitor their clock drift, even provide a link to external tools, but etcd should not take this responsibility on itself.
hi @serathius, @xiang90
I'm also +1 for removing it. WDYT about adding a timestamp to endpoint/status? That way we can still have some indication when we go through (disconnected) customer logs.
@Aditya-Sood We haven't made a decision on how to proceed yet.
Don't think it will help. Clock drift at the moment of the request can be totally unrelated to the clock drift present at the moment the logs were written.
As confirmed in #16432 (comment), I would like to repeat the proposal to remove it. ping @ahrtr @jmhbnz @wenjiaswe
Hey Team - I've been following this thread and held off commenting as I'm still not fully familiar with the underlying code in question. However, to answer the ping above, my vote would be to continue to inform users when clock drift over a certain threshold exists, but ensure this is done in such a way that it is clear there is no impact on etcd consistency.
Isn't this a problem caused by clock drift? Especially when the problematic member is the leader. Overall, I don't think this ticket deserves much discussion time before the issue I mentioned in #16432 (comment) is resolved, especially #15247.
No, because:
FYI, it's being executed by quorum instead of being agreed on by quorum. Also, per your logic, the issue #15247 should NOT happen, because the out-of-date leader (which gets stuck on writing for a long time) will never get consensus.
Please do not assume any use cases.
What I meant here is that the decision that a lease should be invalidated is made by the leader and then proposed to raft.
No, that's not true. My understanding (please correct me if any of these points is incorrect):
Issue #15247 is caused by point 2a not being properly executed. The old leader doesn't know that it should step down, and its countdown clock keeps ticking. This results in the old leader executing point 3 even though it shouldn't. I pointed out those issues in #15944 a long time ago. Clock drift doesn't influence any of those steps, as the cluster only depends on the countdown clock on the leader. If the leader changes, the TTL is reset. If the leader's clock is 10 seconds behind other members, it doesn't matter; the time difference it counts will still be TTL.
We need to make assumptions, especially about leases, which were designed for short-lived leader election tokens. 2-hour leases are already a big problem for Kubernetes due to the lack of checkpointing. Imagine that you generate 1 GB of Kubernetes events within an hour and you have a 1-hour TTL. Every time there is a leader change (which can easily happen multiple times in an hour), the TTL is reset. After one leader change you have 2 GB, after two leader changes 3 GB, and so on.
What would you like to be added?
I would like to propose removing the log message
`prober found high clock drift`
as it incorrectly implies that etcd is impacted by clock drift. To my knowledge, there is no impact of clock difference on etcd version 3.
Raft itself doesn't depend on time in any way. It measures the passage of time for things like health probes, but doesn't compare time between members.
The only part of etcd that could be impacted is Leases, see #9768 (comment).
The connectivity monitor that reports the time drift was introduced for v2 etcd (#3210).
In etcd v3, leases were rewritten to depend on time differences, so they should not be affected (#3834).
Leases also use monotonic time (#6888, #8507), meaning wall-clock changes should not impact TTL.
I expect the connectivity monitor stayed because etcd v3.5 still officially supports the v2 API.
The next release, v3.6, removes the v2 API, so we can remove the clock drift detection too.
Please let me know if you are aware of any place that etcd could be impacted by clock drift.
Why is this needed?
Prevents user confusion about clock drift impact on etcd.