worker, ha: For errors that can only be attempted a limited number of times, the number of attempts is limited #1396
will review later
dm/worker/server.go
Outdated
select {
case <-s.ctx.Done():
	return
case <-time.After(keepaliveTimeout):
The sleep time should be related to s.cfg.KeepAliveTTL, for example, 2 * s.cfg.KeepAliveTTL.
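The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `retryInterval` and `waitForRetry` are hypothetical helper names, and the TTL is assumed to be an integer number of seconds (the later revision of the diff converts it with `time.Duration(s.cfg.KeepAliveTTL) * time.Second`).

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// retryInterval derives the wait between keepalive attempts from the TTL
// (assumed to be in seconds), using the 2x factor suggested in the review.
// Hypothetical helper, not part of the DM code base.
func retryInterval(keepAliveTTLSeconds int64) time.Duration {
	return 2 * time.Duration(keepAliveTTLSeconds) * time.Second
}

// waitForRetry blocks until the interval elapses or ctx is cancelled.
// It returns true if the caller should retry and false if it should stop.
func waitForRetry(ctx context.Context, interval time.Duration) bool {
	timer := time.NewTimer(interval)
	defer timer.Stop() // release the timer if ctx is cancelled first
	select {
	case <-ctx.Done():
		return false
	case <-timer.C:
		return true
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate a server shutdown before the wait starts

	interval := retryInterval(30)
	fmt.Println("interval:", interval)
	fmt.Println("retry:", waitForRetry(ctx, interval))
}
```

Tying the wait to the TTL means a shorter lease leads to faster retries, while a fixed timeout could be either too aggressive or too slow depending on the configuration.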
rest LGTM. good job
/lgtm
and please help check other
PTAL @3pointer @lichunzhu
@@ -110,7 +110,7 @@ func (s *Server) KeepAlive() {
		return // return if failed to stop the worker.
	}
	select {
	case <-s.ctx.Done():
Suggest using another variable to save s.ctx here to avoid a data race.
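One way to read this suggestion is sketched below. It assumes the race comes from `s.ctx` being replaced (for example on restart) while the keepalive loop is selecting on it. The `server` type, its mutex, and the method names are all hypothetical stand-ins, not the actual DM worker `Server`.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// server is a hypothetical stand-in for the worker Server.
type server struct {
	mu     sync.Mutex
	ctx    context.Context
	cancel context.CancelFunc
}

func newServer() *server {
	s := &server{}
	s.ctx, s.cancel = context.WithCancel(context.Background())
	return s
}

// stop cancels the current context.
func (s *server) stop() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cancel()
}

// reset replaces ctx, as a restart might do concurrently with the keepalive loop.
func (s *server) reset() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.cancel()
	s.ctx, s.cancel = context.WithCancel(context.Background())
}

// waitOrStop copies s.ctx into a local variable under the lock and then
// only touches the local copy, so a concurrent reset cannot race with the
// select. It returns false if the snapshot context was cancelled.
func (s *server) waitOrStop(d time.Duration) bool {
	s.mu.Lock()
	ctx := s.ctx // local snapshot of the shared field
	s.mu.Unlock()

	select {
	case <-ctx.Done():
		return false
	case <-time.After(d):
		return true
	}
}

func main() {
	s := newServer()
	fmt.Println(s.waitOrStop(time.Millisecond)) // context still live
	s.stop()
	fmt.Println(s.waitOrStop(time.Hour)) // returns immediately after stop
}
```

Whether the real fix guards the field with a lock or simply captures the context before starting the goroutine is not stated in the thread; the sketch only shows the local-copy idea.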
dm/worker/server.go
Outdated
select {
case <-s.ctx.Done():
	return
case <-time.After(time.Duration(s.cfg.KeepAliveTTL) * time.Second * 2):
Can we start another keepalive only after we are sure the previous one has exited? For example, by using another WaitGroup.
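A rough sketch of what such a WaitGroup guard could look like follows. `keepAliveRunner`, `restart`, and `maxConcurrentLoops` are hypothetical names introduced for illustration, and the loop body is a placeholder rather than the real keepalive logic.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// keepAliveRunner is a hypothetical helper that allows at most one
// keepalive loop to run at a time.
type keepAliveRunner struct {
	mu     sync.Mutex
	wg     sync.WaitGroup
	cancel context.CancelFunc
}

// restart cancels the previous loop, waits for it to return, and only
// then starts a new one.
func (r *keepAliveRunner) restart(loop func(ctx context.Context)) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.cancel != nil {
		r.cancel()
	}
	r.wg.Wait() // block until the previous loop has fully exited

	ctx, cancel := context.WithCancel(context.Background())
	r.cancel = cancel
	r.wg.Add(1)
	go func() {
		defer r.wg.Done()
		loop(ctx)
	}()
}

// stop cancels the current loop, if any, and waits for it to exit.
func (r *keepAliveRunner) stop() {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.cancel != nil {
		r.cancel()
		r.cancel = nil
	}
	r.wg.Wait()
}

// maxConcurrentLoops restarts a placeholder loop several times and
// reports the highest number of loops that were alive at the same time.
func maxConcurrentLoops(restarts int) int {
	var r keepAliveRunner
	var mu sync.Mutex
	live, peak := 0, 0

	loop := func(ctx context.Context) {
		mu.Lock()
		live++
		if live > peak {
			peak = live
		}
		mu.Unlock()

		<-ctx.Done() // placeholder for the real keepalive work

		mu.Lock()
		live--
		mu.Unlock()
	}

	for i := 0; i < restarts; i++ {
		r.restart(loop)
	}
	r.stop()
	return peak
}

func main() {
	fmt.Println("peak concurrent loops:", maxConcurrentLoops(5))
}
```

Because `restart` calls `wg.Wait()` before `wg.Add(1)`, a new loop cannot overlap with the old one even if the old one is slow to notice cancellation.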
/lgtm
Signed-off-by: ti-srebot <[email protected]>
Cherry-picked to release-2.0 in PR #1425
What problem does this PR solve?
The worker received a bound watch event but failed to read the bound information from etcd. [part of #1388] (#1388)
What is changed and how it works?
Add the "failed to read bound information" error to the RetryableError type, but limit the number of retries.
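The idea of a retryable error with a bounded retry count can be sketched as below. This is an illustration under assumptions, not the PR's implementation: `errReadBound`, `isRetryable`, `retryLimited`, and `failNTimes` are hypothetical names, and the sketch omits the wait between attempts that the reviewed code performs.

```go
package main

import (
	"errors"
	"fmt"
)

// errReadBound is a hypothetical sentinel standing in for
// "failed to read bound information from etcd".
var errReadBound = errors.New("failed to read bound information")

// isRetryable reports whether err belongs to the retryable class.
func isRetryable(err error) bool {
	return errors.Is(err, errReadBound)
}

// retryLimited calls op until it succeeds, returns a non-retryable error,
// or maxRetries retries have been used. It returns the total number of
// attempts made and the last error.
func retryLimited(maxRetries int, op func() error) (attempts int, err error) {
	for attempts = 1; ; attempts++ {
		err = op()
		if err == nil || !isRetryable(err) || attempts > maxRetries {
			return attempts, err
		}
	}
}

// failNTimes returns an operation that fails with errReadBound n times
// and succeeds afterwards. Used only to exercise the sketch.
func failNTimes(n int) func() error {
	return func() error {
		if n > 0 {
			n--
			return fmt.Errorf("attempt failed: %w", errReadBound)
		}
		return nil
	}
}

func main() {
	attempts, err := retryLimited(3, failNTimes(2))
	fmt.Println(attempts, err)

	attempts, err = retryLimited(3, failNTimes(10))
	fmt.Println(attempts, err)
}
```

Bounding the retries keeps a persistent etcd read failure from turning into an endless loop, while still tolerating a transient one.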
Check List
Tests
Code changes
Side effects
Related changes