Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xds: use filter type URL as the primary way to discover extensions #9618

Merged
merged 32 commits into from
Jan 23, 2020

Conversation

kyessenov
Copy link
Contributor

@kyessenov kyessenov commented Jan 9, 2020

Signed-off-by: Kuat Yessenov [email protected]

Description: use the type URL in typed_config to select an extension instead of name whenever it uniquely identifies it
Risk Level: medium (might affect invalid config with one extension proto used for another extension)
Testing: unit
Docs Changes: updated configuration overview section
Release Notes: use the config type URL for extension discovery
Fixes: #9587

@kyessenov kyessenov marked this pull request as ready for review January 9, 2020 03:08
Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
@htuch htuch self-assigned this Jan 9, 2020
@htuch
Copy link
Member

htuch commented Jan 9, 2020

@kyessenov QQ for context: do we want to land this prior to 1.13.0 (given all the v3 disruption already happening) or wait until after? What's the thinking from Istio-side?

@mattklein123
Copy link
Member

Can someone describe simply how this effects existing configurations of filters that are either using the deprecated config field or have no config at all? Meaning, how is this backwards compatible until configuration is potentially updated? It's not acceptable to have any breaking changes here that require a lockstep config change.

@mattklein123 mattklein123 self-assigned this Jan 9, 2020
@mattklein123
Copy link
Member

(And yes this is going to need some detailed documentation somewhere in the configuration guide including an overview of Any/Typed Struct/etc. This is all getting super confusing. I would recommend a new section somewhere in here: https://www.envoyproxy.io/docs/envoy/latest/configuration/overview/v2_overview (that page needs to be split up into multiple pages but that's a different issue which I can potentially tackle and will open an issue on)

@kyessenov
Copy link
Contributor Author

kyessenov commented Jan 9, 2020

@htuch We don't need it immediately, but given that it takes awhile for changes to propagate, it would be good to get in soon so that we can use it in the extension API. It doesn't need to be in 1.13.

@mattklein123 Any config that is not typed (deprecated struct, or struct inside typed_config) falls back to using name just like before. If the config is typed (e.g. any new empty type extension), then the type is used to look up an extension. The risk is that when the config proto and extension name don't match, Envoy silently accepts it by doing struct merge, but will stop working after this PR.

@mattklein123
Copy link
Member

@kyessenov and if someone is still using the deprecated raw config field (not typed config) it uses name also, right? If so this makes sense to me and I don't see any strong reason to wait till after 1.13.0 for this, other than more regression risk (which might be a good enough reason to wait).

@kyessenov
Copy link
Contributor Author

Yes, basically any struct (old or extensions that don't want to have proto config type) is filtered both in the type registry and in lookup. So it will only kick-in if there is some non-struct type_url specified either in TypedStruct or directly in typed_config.

@mattklein123
Copy link
Member

Yes, basically any struct (old or extensions that don't want to have proto config type) is filtered both in the type registry and in lookup. So it will only kick-in if there is some non-struct type_url specified either in TypedStruct or directly in typed_config.

Awesome, thanks. Will review in detail tomorrow when I am fresh.

Copy link
Member

@htuch htuch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, the main thing I'm wondering about is the factories that still take Struct, this seems to be the same problem as Empty..
/wait

continue;
}

// skip untyped factories and factories that consume Struct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which factories consume Struct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only test factories AFAICT.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels kind of bad that we need to have prod code here that exists just for test, any way we can improve that, e.g. switch tests to use some canonical TestConfigMessage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't think deeply about filters that use struct as config type. I'd out-right ban Struct but that requires deprecation and will to force everyone to create their own proto types. Can this be done separately? What we did with Empty was easier since it was mostly used in Envoy code base.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

People are definitely using Empty outside of Envoy (we have filters at Lyft that have no config), but the Empty case should "just work" as there is no config needed. I agree with @kyessenov that we can't ban struct without a deprecation cycle as we don't know if anyone is using Struct for their filter configs. We could deprecate it though, but that would require a warning/stat/etc. and the full dance. Can you file an issue and TODO a follow up here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

source/common/config/utility.h Outdated Show resolved Hide resolved
source/common/config/utility.h Outdated Show resolved Hide resolved
source/common/config/utility.h Outdated Show resolved Hide resolved
source/common/config/utility.h Outdated Show resolved Hide resolved
Signed-off-by: Kuat Yessenov <[email protected]>
@kyessenov
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

🔨 rebuilding ci/circleci: coverage (failed build)

🐱

Caused by: a #9618 (comment) was created by @kyessenov.

see: more, trace.

Signed-off-by: Kuat Yessenov <[email protected]>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general this LGTM but definitely needs docs per previous discussion. Thank you!

/wait

include/envoy/registry/registry.h Outdated Show resolved Hide resolved
include/envoy/registry/registry.h Outdated Show resolved Hide resolved
continue;
}

// skip untyped factories and factories that consume Struct
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

People are definitely using Empty outside of Envoy (we have filters at Lyft that have no config), but the Empty case should "just work" as there is no config needed. I agree with @kyessenov that we can't ban struct without a deprecation cycle as we don't know if anyone is using Struct for their filter configs. We could deprecate it though, but that would require a warning/stat/etc. and the full dance. Can you file an issue and TODO a follow up here?

include/envoy/registry/registry.h Outdated Show resolved Hide resolved
Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
Signed-off-by: Kuat Yessenov <[email protected]>
@kyessenov
Copy link
Contributor Author

I'm not sure how to debug the coverage test failure -- the failing test appear to pass locally with and without ASAN. Any clue would be appreciated.

@kyessenov
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

🔨 rebuilding ci/circleci: coverage (failed build)

🐱

Caused by: a #9618 (comment) was created by @kyessenov.

see: more, trace.

@kyessenov
Copy link
Contributor Author

/retest

@repokitteh-read-only
Copy link

🔨 rebuilding ci/circleci: coverage (failed build)

🐱

Caused by: a #9618 (comment) was created by @kyessenov.

see: more, trace.

@mattklein123
Copy link
Member

The reason for coverage giving me so much grief is because there are RegisterFactory calls inside test bodies in some tests that are picked up by other tests once they are combined into a single binary. We should probably ban global registration inside tests.

IIRC there are test hooks that should allow for both registering and unregistering, but yeah, this will need some cleaning up.

@kyessenov
Copy link
Contributor Author

Now, it's ADS integration test and the error makes no sense. It says the Any type does not match the requested resource type. Same test passes alone. Is there some test that overrides something in protobuf globally?

@kyessenov
Copy link
Contributor Author

kyessenov commented Jan 23, 2020

Found a second build bug: there are two TestFilters defined in router_test and router_upstream_log_test. They are almost the same except one has an extra field. Due to some randomness, the shorter wins sometimes and leads to undefined behavior.

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

@mattklein123 mattklein123 merged commit 08ec4dd into envoyproxy:master Jan 23, 2020
junr03 added a commit to envoyproxy/envoy-mobile that referenced this pull request Jan 30, 2020
Signed-off-by: Jose Nino [email protected]
Description: envoyproxy/envoy#9618 broke the iOS build due to missing symbols. envoyproxy/envoy#9875 fixes. However, in order to expedite a clean master branch this PR moves the Envoy ref back to a stable place. Note that Android logging is reverted. Also note that CI for iOS was not testing for liveliness, which is how the breakage went through in the first place. This PR also fixes that.
Risk Level: low
Testing: CI

Fixes #646

Signed-off-by: Jose Nino <[email protected]>
jpsim pushed a commit that referenced this pull request Nov 28, 2022
Signed-off-by: Jose Nino [email protected]
Description: #9618 broke the iOS build due to missing symbols. #9875 fixes. However, in order to expedite a clean master branch this PR moves the Envoy ref back to a stable place. Note that Android logging is reverted. Also note that CI for iOS was not testing for liveliness, which is how the breakage went through in the first place. This PR also fixes that.
Risk Level: low
Testing: CI

Fixes #646

Signed-off-by: Jose Nino <[email protected]>
Signed-off-by: JP Simard <[email protected]>
jpsim pushed a commit that referenced this pull request Nov 29, 2022
Signed-off-by: Jose Nino [email protected]
Description: #9618 broke the iOS build due to missing symbols. #9875 fixes. However, in order to expedite a clean master branch this PR moves the Envoy ref back to a stable place. Note that Android logging is reverted. Also note that CI for iOS was not testing for liveliness, which is how the breakage went through in the first place. This PR also fixes that.
Risk Level: low
Testing: CI

Fixes #646

Signed-off-by: Jose Nino <[email protected]>
Signed-off-by: JP Simard <[email protected]>
@markdroth markdroth mentioned this pull request Oct 21, 2024
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

allow loading extensions by type URL
4 participants