
xDS conformance test suite #11299

Open
htuch opened this issue May 22, 2020 · 11 comments
Labels
api/v3 Major version release @ end of Q3 2019 help wanted Needs help!

Comments

@htuch
Member

htuch commented May 22, 2020

It would be helpful to control plane and client implementors to have an xDS conformance test suite that is independent of Envoy. I think this would largely look like some scripts (Python or Go) that create xDS gRPC connections and exercise some xDS exchanges.

A priority area here is delta xDS, which control plane implementors consider to be challenging to verify.

I think we could separate this work into two parts: building out the basic test infra, and then building out an increasingly complete test suite.

@snowp @howardjohn @derekargueta
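The exchange such scripts would drive can be sketched without a live gRPC connection. Below is a minimal, transport-free model of one State-of-the-World round trip, using hypothetical plain structs as stand-ins for the real proto messages (field names follow the xDS protocol docs, but this is an illustration, not the generated API):

```go
package main

import "fmt"

// Pared-down stand-ins for the xDS proto messages; payloads are elided
// and resources are represented by name only.
type DiscoveryRequest struct {
	TypeUrl       string
	VersionInfo   string // last version the client accepted
	ResponseNonce string // nonce of the response being ACKed
	ResourceNames []string
}

type DiscoveryResponse struct {
	TypeUrl     string
	VersionInfo string
	Nonce       string
	Resources   []string
}

// ack builds the request a conforming client sends after accepting resp:
// it echoes the response's version and nonce.
func ack(resp DiscoveryResponse, names []string) DiscoveryRequest {
	return DiscoveryRequest{
		TypeUrl:       resp.TypeUrl,
		VersionInfo:   resp.VersionInfo,
		ResponseNonce: resp.Nonce,
		ResourceNames: names,
	}
}

func main() {
	resp := DiscoveryResponse{
		TypeUrl:     "type.googleapis.com/envoy.config.cluster.v3.Cluster",
		VersionInfo: "v1",
		Nonce:       "n1",
	}
	fmt.Println(ack(resp, nil).VersionInfo) // prints "v1"
}
```

A test driver would assert on exactly these echoed fields to decide whether an exchange completed correctly.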

@htuch htuch added api/v3 Major version release @ end of Q3 2019 help wanted Needs help! labels May 22, 2020
@htuch
Member Author

htuch commented May 22, 2020

@caniszczyk @mattklein123 I remember a while back we were discussing possibilities here; I think there was someone at Lyft interested in this, and also CNCF funding. In any case, it would be rad to learn what is possible here.

@derekargueta
Member

Thinking about this abstractly, I think starting with client conformance might be much easier, even though it might have less value since, as far as I know, Envoy is the only conforming client (it could be nice as out-of-tree integration tests, I suppose).

Control plane conformance might be a bit difficult since there isn't a universal way to set the backing resources such as endpoints, listeners, routes, etc. Many control planes use Kubernetes ConfigMaps, some use databases like PostgreSQL, and so on. So for the control plane being tested, the test suite would have to know how to set up the backing store such that the control plane will return certain resources; otherwise the test is non-deterministic. This is especially true of delta xDS, where verifying correctness means validating which resources are not being sent. It becomes even more challenging for everyone building a private internal control plane with its own idioms.

As a more concrete example, we could write a Go script that sends a DiscoveryRequest to the control plane with an empty resource_names to indicate "return all resources". But without a way to set mock return values, if the control plane returns an empty list it's difficult to discern whether there really are no resources to return or whether there's a bug and the control plane is not conforming. Ideally we'd have something like

[ test client scripts ] <---> [ control plane being tested ] <--- [ test data files to populate store ]

where the control plane can consume a file specification to return mock data, but that might be a moonshot. Perhaps it's something we could bake into the (go|python|java)-control-plane libraries to make it an easy API to adopt?

@htuch
Member Author

htuch commented May 24, 2020

@derekargueta yeah, this is a good point. I think a well-written xDS server should have some ability to mock here, i.e. there is some support for dependency injection. Something like go-control-plane should hide the intricacies of things like nonce handling or delta discovery from the rest of the system, so this is the main validation target, not the entire control plane. Some kind of test harness for xDS server configuration makes sense.
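The dependency-injection seam described here could be as small as an interface the xDS server reads resources through, so a harness can substitute a fake and validate only the protocol logic. A sketch with illustrative names (this is not a real go-control-plane API):

```go
package main

import "fmt"

// ResourceSource is a hypothetical seam: the xDS server asks it which
// resources exist, instead of reaching into the real config pipeline.
type ResourceSource interface {
	// Names returns the resource names currently available for a type URL.
	Names(typeURL string) []string
}

// fakeSource is what a conformance harness would inject.
type fakeSource struct {
	byType map[string][]string
}

func (f fakeSource) Names(typeURL string) []string { return f.byType[typeURL] }

func main() {
	var src ResourceSource = fakeSource{byType: map[string][]string{
		"type.googleapis.com/envoy.config.cluster.v3.Cluster": {"cluster-a"},
	}}
	fmt.Println(src.Names("type.googleapis.com/envoy.config.cluster.v3.Cluster"))
}
```

Anything behind the interface (databases, ConfigMaps, internal idioms) stays out of scope for the conformance run.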

@snowp
Contributor

snowp commented May 24, 2020

I think I'm saying the same thing as Harvey here, but one thing to keep in mind is that the scope of these tests doesn't need to expand to handle all kinds of control plane interactions: the goal would be to validate the xDS implementation specifically, not the configuration pipeline etc.

In practice, this means the conformance tests run against the go-control-plane code rather than the various control plane implementations that use it, since the library doesn't leak its xDS implementation to them.

I can imagine implementing some basic mock API (file-based, gRPC, whatever) that would simply update resources for a specific client (identified by its Node struct), with the ability to update things at the required granularity to verify that the xDS server behaves as expected. Given that the xDS protocol isolates individual clients, all we need to express is changes to the resources required by the one client that's connecting.

For go-control-plane & co. this could likely be a small implementation that sits on top of its API, mapping mock calls onto snapshot updates. For other implementations, I can imagine this encouraging a logical split within the code base between the xDS implementation and the rest (if one doesn't already exist), allowing them to integrate their xDS bits with the mock API and run against the conformance test.
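The per-client isolation described above can be modeled as a cache keyed by node ID, loosely mirroring how go-control-plane's snapshot cache works. This is a plain-map sketch of the mock API idea, not the real library:

```go
package main

import "fmt"

// MockCache holds per-node resource sets: node ID -> type URL -> names.
// Updating one node's snapshot cannot affect any other connected client.
type MockCache struct {
	snapshots map[string]map[string][]string
}

func NewMockCache() *MockCache {
	return &MockCache{snapshots: map[string]map[string][]string{}}
}

// SetResources replaces the resources of one type for one node.
func (c *MockCache) SetResources(node, typeURL string, names []string) {
	if c.snapshots[node] == nil {
		c.snapshots[node] = map[string][]string{}
	}
	c.snapshots[node][typeURL] = names
}

// Resources returns the current names for a node and type (nil if unset).
func (c *MockCache) Resources(node, typeURL string) []string {
	return c.snapshots[node][typeURL]
}

func main() {
	c := NewMockCache()
	c.SetResources("conformance-test-node", "eds", []string{"mock-1", "mock-2"})
	fmt.Println(c.Resources("conformance-test-node", "eds")) // prints "[mock-1 mock-2]"
}
```

A file-based or gRPC front end for the mock API would simply translate its calls into SetResources-style updates.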

@howardjohn
Contributor

This would be super useful for us. We have a test xDS server, and often I cannot tell what is part of the xDS spec, what is an Envoy implementation detail, and what is an implementation detail of our test server. This is especially common around handling ACKs.
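The ACK handling in question is one of the things the xDS protocol docs do pin down: a request only (N)ACKs the most recent response if its response_nonce matches that response's nonce, and it is a NACK when error_detail is populated (version_info then carries the last accepted version, not the rejected one). A minimal classifier over pared-down stand-in structs:

```go
package main

import "fmt"

// Request models only the ACK-relevant fields of a DiscoveryRequest.
type Request struct {
	VersionInfo   string
	ResponseNonce string
	ErrorDetail   string // empty when the update was accepted
}

// Classify decides how a server should interpret req relative to the
// nonce of the most recently sent response.
func Classify(req Request, lastNonce string) string {
	if req.ResponseNonce != lastNonce {
		// Response to an older push; neither ACK nor NACK of the latest.
		return "stale"
	}
	if req.ErrorDetail != "" {
		return "NACK"
	}
	return "ACK"
}

func main() {
	fmt.Println(Classify(Request{VersionInfo: "v2", ResponseNonce: "n2"}, "n2"))                            // ACK
	fmt.Println(Classify(Request{VersionInfo: "v1", ResponseNonce: "n2", ErrorDetail: "bad config"}, "n2")) // NACK
}
```

A conformance test could encode exactly this table, which would separate spec behavior from Envoy and test-server quirks.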

@howardjohn
Contributor

I have been playing around with this a bit; some thoughts:

I think implementing this in Go will be the most useful because:

  • It would be nice to provide the ability to just call xdsconformance.RunConformanceTest(t) from an existing Go test. I realize not everyone has a Go control plane, but it's certainly one of the most popular choices
  • It seems a hard requirement to provide a standalone binary that can run the tests, to support non-Go control planes. Doing this with Python can be a bit of a pain, which pushes me towards Go
  • We can rely on go-control-plane for the generated Go structs. I'm not sure what this looks like in Python

Regarding test input, there are a fair number of parameters we will need (thinking from a server conformance perspective):

  • Address/connection settings (such as TLS) for the control plane
  • What feature sets are supported; a server may serve CDS but not LDS, for example
    • Likely want a label selection mechanism?
  • Node information to pass, unless we decide that a conformant xDS server must accept arbitrary Nodes?
  • What resources are available. This is mentioned above regarding the mock API. I am not sure we actually need deep introspection of the resources returned; I would think most tests will just check the DiscoveryResponse and not unmarshal the Resources field, as that would break pretty quickly for any custom filters. It may be possible to instead just declare the resource names available. For example, I may decide that in my control plane, for node=conformance-test-node, I return endpoints mock-1 and mock-2. So in the test input I configure the node to identify as conformance-test-node, and I put endpoints: [mock-1, mock-2]. The test will then send an EDS request with this node, requesting the mock-1 and mock-2 endpoints. Users who do not have or want a mock could instead configure the test to declare endpoints like some-real-endpoint and run it against a live deployed xDS server.
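The parameters listed above could be gathered into one test input structure. A sketch in which every field name is hypothetical, not part of any agreed format:

```go
package main

import "fmt"

// ConformanceConfig is one possible shape for the test input:
// connection settings, supported feature sets, the node to identify as,
// and the resource names declared available (no payload introspection).
type ConformanceConfig struct {
	Address   string              // control plane address, e.g. "localhost:18000"
	UseTLS    bool                // connection settings
	Features  []string            // supported services, e.g. "CDS", "EDS"
	NodeID    string              // node the tests identify as
	Resources map[string][]string // resource kind -> declared names
}

func main() {
	cfg := ConformanceConfig{
		Address:  "localhost:18000",
		Features: []string{"EDS"},
		NodeID:   "conformance-test-node",
		Resources: map[string][]string{
			"endpoints": {"mock-1", "mock-2"},
		},
	}
	fmt.Println(cfg.NodeID, cfg.Resources["endpoints"]) // prints "conformance-test-node [mock-1 mock-2]"
}
```

The same struct would back both the xdsconformance.RunConformanceTest(t) path and a standalone binary reading a config file.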

I am building out some of the basic infra for this and trying to populate some super basic tests to see what this can look like at https://github.com/howardjohn/xds-conformance. I am hoping we can get some idea there of how this will work, then merge it into the envoyproxy org and add the actual tests.

@htuch
Member Author

htuch commented Jun 23, 2020

As discussed IRL, I agree we don't need deep mocking of resources, but there will need to be some sort of shim to allow named resources of different types to be created, deleted, and updated (including their version).
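The shim described here can stay very small: named resources per type, with the shim bumping a version on every change so tests can assert on version propagation. An illustrative in-memory sketch (not a proposed API):

```go
package main

import (
	"fmt"
	"strconv"
)

// Shim tracks named resources per type URL and assigns a monotonically
// increasing version to every mutation.
type Shim struct {
	version   int
	resources map[string]map[string]bool // type URL -> set of names
}

func NewShim() *Shim {
	return &Shim{resources: map[string]map[string]bool{}}
}

func (s *Shim) bump() string {
	s.version++
	return strconv.Itoa(s.version)
}

// Upsert creates or updates a named resource and returns the new version.
func (s *Shim) Upsert(typeURL, name string) string {
	if s.resources[typeURL] == nil {
		s.resources[typeURL] = map[string]bool{}
	}
	s.resources[typeURL][name] = true
	return s.bump()
}

// Delete removes a named resource and returns the new version.
func (s *Shim) Delete(typeURL, name string) string {
	delete(s.resources[typeURL], name)
	return s.bump()
}

func main() {
	sh := NewShim()
	fmt.Println(sh.Upsert("eds", "mock-1")) // prints "1"
	fmt.Println(sh.Delete("eds", "mock-1")) // prints "2"
}
```

A delta xDS test could then drive Upsert/Delete and assert that the server advertises exactly the changed names at the new version.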

@htuch
Member Author

htuch commented Aug 24, 2020

@howardjohn where did you end up with your explorations?

@htuch
Member Author

htuch commented Sep 17, 2020

CNCF and @envoyproxy/api-shepherds have an RFP for vendors to bid on a project to build this in #13165.

@htuch
Member Author

htuch commented Apr 5, 2021

@howardjohn do you have any additional update? HH are spinning up on this project and would like to know if there's anything to be learned from the Istio experience. Thanks.

@howardjohn
Contributor

@htuch beyond #11299 (comment) I haven't explored this much. I ran into issues figuring out the "mock API" or inputs and how to make them generic. As a result, we just focused on expanding our own Istio-specific "conformance" tests, which use a fake ADS client with Istio-specific configurations. Most of these are in https://github.com/istio/istio/blob/8105c2bb98582ee78519dd7a19c7a5f1ab3faba6/pilot/pkg/xds/ads_test.go#L50-L49.
