-
Notifications
You must be signed in to change notification settings - Fork 681
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix xDS protocol mishandling #1013
Conversation
Signed-off-by: Brandon Cook <[email protected]>
@@ -40,10 +40,10 @@ func (xh *xdsHandler) fetch(req *v2.DiscoveryRequest) (*v2.DiscoveryResponse, er | |||
} | |||
resources, err := toAny(r, toFilter(req.ResourceNames)) | |||
return &v2.DiscoveryResponse{ | |||
VersionInfo: "0", | |||
VersionInfo: req.VersionInfo, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this fix the issue (I would love it if it does)?? I'll go test the branch ;) . Looking at the xDS protocol diagram, the management server is supposed to control version and nonce values, passing them back as part of DiscoveryResponse to be used in the NACK/ACK DiscoveryRequest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment we always send version_info="0", which is not correct, it should change. Looking through the code on the envoy side there is no requirement for version_info to increment, I've not found any place where two versions are sorted, they are only compared for equality.
I'm not sure this fix is correct, in the initial discovery request version info will be "", so rather than replying with "0", we'd reply with "" which has some unexpected consequences as "" is treated as a marker for some conditions.
This is also why the tests pass with the versioninfo and nonce fields removed in all the internal/e2e
test cases. I don't think this is correct.
It's easy enough to get the right version info, it's in last
. We can use the string form of last
for VersionInfo.
For the nonce, this is only used in the outer loop of GrpcMuxImpl::onDiscoveryResponse
on the Envoy side. Its never compared to a previous nonce, or sorted for newer or olderness, despite what the xDS protocol document says. This is sort of backed up by the observation that if these were serial numbers, they would be sent on the wire as numbers, not opaque strings.
I think for the moment we can use the string form of last here, at least it will change on every response.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for working on this. I really appreciate you digging into the problem and proposing a fix.
I've spent today comparing the management server impl in envoyproxy/go-control-plane and the code on the Envoy side as well as some of the envoy test cases. I think your diagnosis is correct; version number and/or nonce should change on each request/response cycle -- however I don't think the solution you've proposed is correct. Please see my next comment for details.
@@ -40,10 +40,10 @@ func (xh *xdsHandler) fetch(req *v2.DiscoveryRequest) (*v2.DiscoveryResponse, er | |||
} | |||
resources, err := toAny(r, toFilter(req.ResourceNames)) | |||
return &v2.DiscoveryResponse{ | |||
VersionInfo: "0", | |||
VersionInfo: req.VersionInfo, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At the moment we always send version_info="0", which is not correct, it should change. Looking through the code on the envoy side there is no requirement for version_info to increment, I've not found any place where two versions are sorted, they are only compared for equality.
I'm not sure this fix is correct, in the initial discovery request version info will be "", so rather than replying with "0", we'd reply with "" which has some unexpected consequences as "" is treated as a marker for some conditions.
This is also why the tests pass with the versioninfo and nonce fields removed in all the internal/e2e
test cases. I don't think this is correct.
It's easy enough to get the right version info, it's in last
. We can use the string form of last
for VersionInfo.
For the nonce, this is only used in the outer loop of GrpcMuxImpl::onDiscoveryResponse
on the Envoy side. Its never compared to a previous nonce, or sorted for newer or olderness, despite what the xDS protocol document says. This is sort of backed up by the observation that if these were serial numbers, they would be sent on the wire as numbers, not opaque strings.
I think for the moment we can use the string form of last here, at least it will change on every response.
@phylake can you please try this diff instead
|
Closing in favor of #1016 |
Signed-off-by: Brandon Cook [email protected]
fixes #499