Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add xds-server implementation #176

Merged
merged 11 commits into from
Jul 29, 2022
Merged

Conversation

youngnick
Copy link
Contributor

@youngnick youngnick commented Jul 22, 2022

Fixes #40.

I've made a few related updates here:

  • Added a shim (LogrWrapper) to wrap the logr.Logger in some extra methods so that we can pass it to some go-control-plane constructors.
  • Added internal/xds/cache to handle the snapshot cache that's required to use go-control-plane.
  • Moved the Context struct out of internal/xds/translator into internal/types because I needed it in internal/cache.
  • The Context struct was renamed to ResourceVersionTable, which isn't a great name, but was the best I could come up with. Alternatives considered: CacheVersion, ResourceVersion, other similiar ideas. I think that calling it a Context risks confusion with context.Context.
  • Added a gateway xdstest command that will run an xDS server for you to connect to to validate this works. This runs through three different IRs and updates the xDS caches with a new translated version every 10 seconds, so that you can see changes run through the system. It's deliberately left way too verbose, since we want to get rid of it anyway.

I've got the Delta/incremental xDS stuff to work by having the callbacks manage Envoy nodeIDs and ensure that there's a snapshot per nodeID. This is a bit hacky and should be discussed with go-control-plane to see if there's a better way.

Still todo (I'll log issues for stuff we don't have covered already):

  • Reopen Select Logging Framework #76 and we'll relitigate, keeping in mind that go-control-plane at least expects levelled logs, not V-style ones. logr is probably not a good fit for this project, sadly.
  • Follow up with go-control-plane about the callback hacks for managing per-nodeID caches.
  • Once we have everything wired up correctly, remove gateway xdstest as a command.
  • I'll see if I can figure out a way to test this properly - I think that we could make a Go xDS client, and do an almost-e2e test that IRs produce xDS outputs, but I'll write something up about it first.
  • Longer-term, we'll need to figure out how to handle NACKs, internal/xds/cache/snapshotcache.go is the place where that chain of events will begin.

@youngnick youngnick requested a review from a team as a code owner July 22, 2022 05:59
@codecov-commenter
Copy link

codecov-commenter commented Jul 22, 2022

Codecov Report

Merging #176 (fbfb23b) into main (7246e3a) will decrease coverage by 5.82%.
The diff coverage is 6.97%.

@@            Coverage Diff             @@
##             main     #176      +/-   ##
==========================================
- Coverage   67.57%   61.74%   -5.83%     
==========================================
  Files          20       20              
  Lines        1659     1814     +155     
==========================================
- Hits         1121     1120       -1     
- Misses        477      632     +155     
- Partials       61       62       +1     
Impacted Files Coverage Δ
internal/cmd/xdstest.go 3.65% <3.65%> (ø)
internal/xds/translator/translator.go 64.93% <71.42%> (ø)
internal/cmd/root.go 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us.

// xDSTest implements the command.
// This is deliberately verbose and unoptimized, since the purpose
// is just to illustrate how the flow will need to work.
func xDSTest() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since the input is fixed, this function would be could be a good candidate for a test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but in order to test it, we'll need an xDS client, which I'll need to write. That will be a reasonable amount of work, so I wanted to do it in a followup.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's roll with the test cmd and create the xds client as part of #177


}

func (s *snapshotcache) OnStreamRequest(streamID int64, req *envoy_service_discovery_v3.DiscoveryRequest) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

curious why SoTW callbacks are implemented, we should just log and return imho

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not implementing the full protocol when we can pretty easily makes me uncomfortable - if we do happen to make an error with the Delta callbacks or implementation, the SOTW is a good fallback to have available for operational purposes.

I don't think that not meeting expectations about what variants we support is worth the small saving in development effort.

// Yes, the different ordering in the name is part of the go-control-plane interface.
func (s *snapshotcache) OnDeltaStreamOpen(ctx context.Context, streamID int64, typeURL string) error {

s.streamIDNodeID[streamID] = ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this tracks rogue streams ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It ensures that the assignment to s.streamIDNodeID[streamID] in the StreamRequest callback will always succeed, since every stream has StreamOpen called first, then StreamRequest per request received. This could be more defensive though, I'll update.


// If no snapshot has been generated yet, we can't do anything, so don't mess with this request.
// go-control-plane will respond with an empty response, then send an update when a snapshot is generated.
if s.lastSnapshot == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if s.lastSnapshot == nil {
if s.lastSnapshot != nil {
err := s.SetSnapshot(context.TODO(), nodeID, s.lastSnapshot); err != nil {
...
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, I'm using a Go idiom that basically says "The leftmost side of the code should be the happy path, exceptions should be in if blocks." That way we're not indending most of the rest of the function.

Note that this is a different case to the next one - this one is that no snapshot has been generated by the rest of EG yet, the s.GetSnapshot(nodeID) call below checks if there's a snapshot with the current Node.

@youngnick
Copy link
Contributor Author

When reviewing the code after @arkodg's comments, I realized that these callbacks need to be concurrent-safe, so I sprinkled some sync.Mutex on it.

I also realized that requests don't always have to have node ID set, so tidied up the handling there too.

arkodg
arkodg previously approved these changes Jul 26, 2022
Copy link
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks for implementing delta xDS
hoping the xdstest can be moved to a go test function in a followup PR

@danehans
Copy link
Contributor

hoping the xdstest can be moved to a go test function in a followup PR

@youngnick when you have a moment, please create an issue and xref.

@youngnick
Copy link
Contributor Author

I created #177 to cover setting up automated testing for the xDS server.

LukeShu
LukeShu previously approved these changes Jul 28, 2022
Copy link
Contributor

@LukeShu LukeShu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A nit, and 2 minor suggestions. But IMO mergeable.

internal/cmd/xdstest.go Outdated Show resolved Hide resolved
Comment on lines 1 to 16
package log

import (
"fmt"

"github.com/go-logr/logr"
)

// LogrWrapper is a nasty hack for turning the logr.Logger we get from NewLogger()
// into something that go-control-plane can accept.
// It seems pretty silly to take a zap logger, which is levelled, turn it into
// a V-style logr Logger, then turn it back again with this, but here we are.
// TODO(youngnick): Reopen the logging library discussion then do something about this.
type LogrWrapper struct {
logr logr.Logger
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't have this in a separate package or be exported; that only serves to encourage using it instead of the preferred logging abstraction (logr). IMO this should be a private implementation detail of internal/xds (or internal/xds/cache?).

internal/cmd/xdstest.go Outdated Show resolved Hide resolved
danehans
danehans previously approved these changes Jul 28, 2022
// xDSTest implements the command.
// This is deliberately verbose and unoptimized, since the purpose
// is just to illustrate how the flow will need to work.
func xDSTest() error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's roll with the test cmd and create the xds client as part of #177

@youngnick youngnick dismissed stale reviews from danehans, LukeShu, and arkodg via 883be6c July 29, 2022 06:08
Nick Young added 11 commits July 29, 2022 06:33
Signed-off-by: Nick Young <[email protected]>
Signed-off-by: Nick Young <[email protected]>
Signed-off-by: Nick Young <[email protected]>
Signed-off-by: Nick Young <[email protected]>
Signed-off-by: Nick Young <[email protected]>
Signed-off-by: Nick Young <[email protected]>
Signed-off-by: Nick Young <[email protected]>
@youngnick youngnick force-pushed the add-xds-server branch 2 times, most recently from a8a8549 to fbfb23b Compare July 29, 2022 06:36
@danehans danehans merged commit e20d7ad into envoyproxy:main Jul 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Initial xDS Server Design and Implementation
5 participants