-
Notifications
You must be signed in to change notification settings - Fork 9.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[3.5] gRPC health server sets serving status to NOT_SERVING on defrag #17914
[3.5] gRPC health server sets serving status to NOT_SERVING on defrag #17914
Conversation
gRPC health server sets serving status to NOT_SERVING on defrag Backport from 3.6 in etcd-io#16278 Co-authored-by: Chao Chen <[email protected]> Signed-off-by: Thomas Jungblut <[email protected]>
1794125
to
750bc0b
Compare
the grpcProxy failure seems to be #16475 which wasn't backported to 3.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tjungblu, the backport looks clean.
Click approve too soon: there is a follow up PR #16836 which improves code readability. Could you please also take that one? There is another in flight PR fixing the insecure / secure RPC servers set up #16959.. Although this is not a blocker of this PR since this is protected by an experimental feature flag and still improves the availability, we should document that in the help messages clearly. |
Thanks @chaochn47, I'll pick the other PR too and let you know |
Backport from 3.6 in etcd-io#16836 Co-authored-by: Chao Chen <[email protected]> Signed-off-by: Thomas Jungblut <[email protected]>
@chaochn47 please take a look again with your refactoring applied otherwise @ahrtr and @serathius PTAL |
@@ -301,6 +301,7 @@ func newConfig() *config { | |||
fs.DurationVar(&cfg.ec.ExperimentalWarningApplyDuration, "experimental-warning-apply-duration", cfg.ec.ExperimentalWarningApplyDuration, "Time duration after which a warning is generated if request takes more time.") | |||
fs.BoolVar(&cfg.ec.ExperimentalMemoryMlock, "experimental-memory-mlock", cfg.ec.ExperimentalMemoryMlock, "Enable to enforce etcd pages (in particular bbolt) to stay in RAM.") | |||
fs.BoolVar(&cfg.ec.ExperimentalTxnModeWriteWithSharedBuffer, "experimental-txn-mode-write-with-shared-buffer", true, "Enable the write transaction to use a shared buffer in its readonly check operations.") | |||
fs.BoolVar(&cfg.ec.ExperimentalStopGRPCServiceOnDefrag, "experimental-stop-grpc-service-on-defrag", cfg.ec.ExperimentalStopGRPCServiceOnDefrag, "Enable etcd gRPC service to stop serving client requests on defragmentation.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding experimental
prefix and removing it in future is a breaking change as we discussed in #17657. Also see #17657
Are we still happy to add prefix experimental
for an flag?
I know this PR just backports changes from main to 3.5, so the suggestion (adding experimental
in description only) is actually for the main branch (update main firstly). Do we want to follow the suggestion for now? Or do we have an agreement on #17657 and #17657 before we add any new flags? @fuweid @jmhbnz @serathius @siyuanfoundation @spzala
fs.BoolVar(&cfg.ec.ExperimentalStopGRPCServiceOnDefrag, "experimental-stop-grpc-service-on-defrag", cfg.ec.ExperimentalStopGRPCServiceOnDefrag, "Enable etcd gRPC service to stop serving client requests on defragmentation.") | |
fs.BoolVar(&cfg.ec.ExperimentalStopGRPCServiceOnDefrag, "stop-grpc-service-on-defrag", cfg.ec.ExperimentalStopGRPCServiceOnDefrag, "(Experimental) Enable etcd gRPC service to stop serving client requests on defragmentation.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As part of #17657 we will still need to support migration from the old flag naming scheme (with prefix) to the new one. One more flag should not change much as we already have other flags to migrate. So I don't think we need to block on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy for this to proceed using older approach. We are still working on the KEP for feature flags so it will be a while off and agree with @serathius above we don't want to block this work.
LGTM, thanks @tjungblu !! |
gRPC health server sets serving status to NOT_SERVING on defrag.
Backported from 3.6 in #16278
would be awesome to have this available in 3.5 already, we ran into this recently with a performance test when running etcd with 38 gigs.