Skip to content
This repository has been archived by the owner on Feb 18, 2021. It is now read-only.

Bug fixes of multi_zone consumer group #170

Merged
merged 5 commits into from
May 1, 2017
Merged

Bug fixes of multi_zone consumer group #170

merged 5 commits into from
May 1, 2017

Conversation

datoug
Copy link
Contributor

@datoug datoug commented Apr 25, 2017

Fix a couple of bugs:

  1. If we couldn't get active zone from config, default to NOT consume (otherwise we may allow consuming in both zones)
  2. For GetOutputHost call, make the empty host a valid case.

@datoug datoug requested review from venkat1109 and kirg April 25, 2017 22:19
Copy link
Contributor

@venkat1109 venkat1109 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good !

return strings.EqualFold(zone, cgDesc.GetActiveZone())
}

if len(dConfig.ActiveZone) > 0 {
return strings.EqualFold(zone, dConfig.ActiveZone)
}

return true
return false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also add a log or metric when both dConfig.ActiveZone and cgDesc.GetActiveZone() are empty ?
just want to make sure we have mechanisms to verify correctness as we roll this out initially.

@@ -385,19 +385,17 @@ func (mcp *Mcp) GetOutputHosts(ctx thrift.Context, inReq *c.GetOutputHostsReques
}

response := func(outputHostIDs []string, err error) (*c.GetOutputHostsResult_, error) {
if len(outputHostIDs) < 1 {
// only count as failure if our answer contains no endpoints at all
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's revert to the same code as before:
https://github.com/uber/cherami-server/blob/f47853600f4ebb2fc74a41e3a7d3e11ab6f07031/services/controllerhost/controllerhost.go

checking for err != nil still changes the behavior.

@datoug
Copy link
Contributor Author

datoug commented Apr 26, 2017

For the bug fix in controller, I reverted back to the original change but added another variable in the cache called consumeDisabled to distinguish the multi_zone scenario(a ok case) and the no output host case(an error case).

Also I fixed another bug in replicator. In authoritative zone, we skipped the destination and cg reconciliation, so localCgs variable is always empty, which results in no cg extent reconciliation for authoritative zone. Now I'm moving the computation of localCgs to the outer scope.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.2%) to 68.376% when pulling 6332a02 on bugs into 6442730 on master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.3%) to 68.28% when pulling 7b56357 on bugs into 6442730 on master.

@datoug
Copy link
Contributor Author

datoug commented Apr 28, 2017

@venkat1109 could you take another look?

}

return true
context.log.Warn(`no active zone from dynamic config !`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be a good idea to emit a metric here

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.1%) to 68.511% when pulling f6cdebd on bugs into 6442730 on master.

@datoug datoug merged commit ba68fc2 into master May 1, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants