Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove global runtime: rebase runtime features on ABSL_flags #19847

Closed
alyssawilk opened this issue Feb 7, 2022 · 16 comments · Fixed by #19880 or #20337
Closed

remove global runtime: rebase runtime features on ABSL_flags #19847

alyssawilk opened this issue Feb 7, 2022 · 16 comments · Fixed by #19880 or #20337
Assignees
Labels
enhancement Feature requests. Not bugs or questions.

Comments

@alyssawilk
Copy link
Contributor

So as part of
envoyproxy/envoy-mobile#2003 we need to fix the runtime singleton problem.

Actual runtime is no problem but having runtimeFeatureEnabled point at an actual runtime is basically impossible to fix.

I propose replacing the runtime guards with ABSL flags: still setting them via RTDS but being able to access them directly as flags. This solves both a bunch of test issues where flags were accessed before/after the envoy server comes up and down, and allows for removing the singleton issue for envoy mobile.

E-M won't support different values for runtime across different instances, but we can't do that without wiring some server-specific handle to all the locations where we use runtime which would make adding new values prohibitively cumbersome anyway.

@alyssawilk alyssawilk added the enhancement Feature requests. Not bugs or questions. label Feb 7, 2022
@alyssawilk
Copy link
Contributor Author

cc @envoyproxy/envoy-maintainers for thoughts

@jmarantz
Copy link
Contributor

jmarantz commented Feb 7, 2022

@krajshiva FYI

@yanavlasov
Copy link
Contributor

How is this going to work together with TCLAP?

@alyssawilk
Copy link
Contributor Author

will test to ensure no change but I don't intend to use the command line parsing utilities, just the in-memory classes/macros.

@mattklein123
Copy link
Member

Aren't gflags process singletons? How does that fix the problem? Or it's OK to have the feature enabled stuff be global?

@yanavlasov
Copy link
Contributor

I see, if we do not intend to set them through command line, then there are no concerns. I think one other concern is that ABSL flags use synchronization. And I think Envoy's runtime has per worker snapshots. This may impact performance if flag is accessed in a hot path. Even latching it per connection may lead to contention.

@alyssawilk
Copy link
Contributor Author

@mattklein123 they're functionally thread-safe globals. the problem isn't that they're global (we can't avoid that if you want to allow runtime access from any point in the code) the problem is they're globals with lifetime issues. that if you have Envoy A and Envoy B, and the "single singleton pointer" for runtime points at A's runtime, and then A gets deleted just as B is doing a runtime flag lookup, you crash.
If we actually make our globals global, then you cant have 2 instances with different values, but you avoid lifetime and thread safety issues.

@mattklein123
Copy link
Member

Yup OK that seems fine for those flags. What will be the semantic if the values are set from multiple runtimes? I assume last one wins? This is an edge case and I think anything is fine but we should decide/document the behavior.

@alyssawilk
Copy link
Contributor Author

exactly that. I planned on adding docs to E-M for running concurrent Envoy's. Could also doc up in runtime but I can't imagine it being a good idea for anyone to run multiple server side proxies simultaneously so I'd rather not mention the possibility upstream :-P

@jmarantz
Copy link
Contributor

jmarantz commented Feb 8, 2022

You've probably explored this already, but could we plumb access to the instance-specific Runtime object everywhere that's needed via the already-laid conduit that is Api::Api?

@alyssawilk
Copy link
Contributor Author

nope. most of the value of runtime is folks can effortlessly guard refactors. If they have to plumb through 20 layers of API, they're not going to do it, and then folks start making high risk functional changes in the clear.
If you look at where runtime is used, API is often not available, which is why the decision was made to allow it to be global several years back.

@alyssawilk
Copy link
Contributor Author

Ok I have a draft up at #19880 for the curious. Still only draft as I have to tear down the singleton access and figure out what to do with getInteger. I'm inclined to special case the existing cases, see if we can deprecate them entirely, and say it shouldn't be used.
Also I'm wondering if folks have opinions on if we should encourage folks to use flag directly in future to save the hash map look up

@alyssawilk alyssawilk changed the title rebase runtime features on ABSL_flags remove global runtime: rebase runtime features on ABSL_flags Feb 10, 2022
@alyssawilk
Copy link
Contributor Author

alyssawilk commented Feb 10, 2022

Turns out flags was the least annoying part of this.
So as Matt predicted, once we had the singleton folks used it in unexpected ways. We've got some stats on deprecated config (removing: #19906) global connection enforcement (plumbing runtime: #19907), quic enablement (will plumb), and then regex size stats (see below)

I'm not sure what to do with regex: I'm inclined to remove the stats, and then deprecate the config (to see if anyone is using it, and if they are leave it as an int-flag) but would appreciate others' thoughts for if their intuition is that folks need the stats and I need to plumb runtime to everywhere we create regex @mattklein123 @htuch
source here: https://github.com/envoyproxy/envoy/blob/main/source/common/common/regex.cc#L24

@jmarantz
Copy link
Contributor

Hi -- I can't remember what specifically I was involved in with regard to the runtime tweak and histogram tracking regex program size in regex.cc, but I was at least aware of it :)

TL;DR I think it'd be fine to just rip all that out and leave in the runtime int setting, because I think no one is looking at the histogram, and its access of rootScope() may complicate things for you.

Is the access to rootScope() the thing that you were not expecting?

@htuch
Copy link
Member

htuch commented Feb 11, 2022

The notion of getRootScope() in the runtime seems like mission creep.

@mattklein123
Copy link
Member

Yeah I would just rip out the regex stuff.

alyssawilk added a commit that referenced this issue Feb 15, 2022
Risk Level: low
Testing: n/a
Docs Changes: n/a
part of #19847

Signed-off-by: Alyssa Wilk <[email protected]>
alyssawilk added a commit that referenced this issue Feb 16, 2022
Risk Level: low
Testing: n/a
Docs Changes: n/a
Release Notes: inline
Part of #19847

Signed-off-by: Alyssa Wilk <[email protected]>
alyssawilk added a commit that referenced this issue Feb 16, 2022
Risk Level: low
Testing: existing tests
Docs Changes: n/a
Release Notes: n/a
part of #19847

Signed-off-by: Alyssa Wilk <[email protected]>
alyssawilk added a commit that referenced this issue Feb 17, 2022
alyssawilk added a commit that referenced this issue Feb 17, 2022
Note: this does NOT remove the runtime singleton, which will be done in a follow-up PR.

Risk Level: high
Testing: existing tests cover
Docs Changes: n/a
Release Notes:
Fixes #19847
Part of envoyproxy/envoy-mobile#2003

Signed-off-by: Alyssa Wilk <[email protected]>
alyssawilk added a commit that referenced this issue Mar 1, 2022
Functionally this

handles the multi-envoy signal handler crash
skips instantiating a runtime singleton (off by default, must stay off until remove global runtime: rebase runtime features on ABSL_flags #19847 is closed)
Multi-envoy does not correctly support runtime flags or deprecation stats due to #19847 being incomplete. It can still handle proxy traffic client - L1 - L2 - upstream as shown in test.

Risk Level: low
Testing: yes
Docs Changes: n/a
Release Notes: n/a
Part of envoyproxy/envoy-mobile#2003

Signed-off-by: Alyssa Wilk <[email protected]>
rexnp pushed a commit to rexnp/envoy that referenced this issue Mar 2, 2022
Functionally this

handles the multi-envoy signal handler crash
skips instantiating a runtime singleton (off by default, must stay off until remove global runtime: rebase runtime features on ABSL_flags envoyproxy#19847 is closed)
Multi-envoy does not correctly support runtime flags or deprecation stats due to envoyproxy#19847 being incomplete. It can still handle proxy traffic client - L1 - L2 - upstream as shown in test.

Risk Level: low
Testing: yes
Docs Changes: n/a
Release Notes: n/a
Part of envoyproxy/envoy-mobile#2003

Signed-off-by: Alyssa Wilk <[email protected]>
Signed-off-by: Rex Chang <[email protected]>
wrowe pushed a commit that referenced this issue Mar 15, 2022
* test: adding a multi-envoy test (#20016)

Functionally this

handles the multi-envoy signal handler crash
skips instantiating a runtime singleton (off by default, must stay off until remove global runtime: rebase runtime features on ABSL_flags #19847 is closed)
Multi-envoy does not correctly support runtime flags or deprecation stats due to #19847 being incomplete. It can still handle proxy traffic client - L1 - L2 - upstream as shown in test.

Risk Level: low
Testing: yes
Docs Changes: n/a
Release Notes: n/a
Part of envoyproxy/envoy-mobile#2003

Signed-off-by: Alyssa Wilk <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* Add a congestionWindowInBytes method to Envoy::Network::Connection (#20105)

Signed-off-by: Bin Wu <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* Update QUICHE from 50f15e7a5 to cf1588207 (#20154)

https://github.com/google/quiche/compare/50f15e7a5..cf1588207

$ git log 50f15e7a5..cf1588207 --date=short --no-merges --format="%ad %al %s"

2022-02-28 wub Deprecate --gfe2_reloadable_flag_quic_crypto_noop_if_disconnected_after_process_chlo.
2022-02-27 vasilvv Remove QuicheMemSlice(QuicUniqueBufferPtr, size_t) constructor.
2022-02-26 fayang Use std::string instead of absl::string_view in CryptoBufferMap.
2022-02-25 bnc Ignore incoming HTTP/3 MAX_PUSH_ID frames.
2022-02-25 bnc Remove Http3DebugVisitor::OnMaxPushIdFrameSent().
2022-02-25 bnc Remove QuicSpdySession::CanCreatePushStreamWithId().
2022-02-25 fayang Deprecate gfe2_reloadable_flag_quic_single_ack_in_packet2.

Signed-off-by: Alyssa Wilk <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* build(deps): bump actions/stale from 4.1.0 to 5 (#20159)

Bumps [actions/stale](https://github.com/actions/stale) from 4.1.0 to 5.
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](actions/stale@v4.1.0...v5)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Rex Chang <[email protected]>

* admin: improve test coverage and increase the coverage-percent threshold (#20025)

Adds a missing test for recent lookups now that there are no more fake symbol tables. Adds tests for a variety of override methods defined in admin.h that were previously hard to hit.

Adds a benchmark test to establish a baseline for the speedups in #19693

Signed-off-by: Joshua Marantz <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* test: removing a bunch of direct runtime singleton access (#19993)

Signed-off-by: Alyssa Wilk <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* build(deps): bump grpcio-tools in /examples/grpc-bridge/client (#20040)

Bumps [grpcio-tools](https://github.com/grpc/grpc) from 1.43.0 to 1.44.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](grpc/grpc@v1.43.0...v1.44.0)

---
updated-dependencies:
- dependency-name: grpcio-tools
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Rex Chang <[email protected]>

* adds to spellcheck

Signed-off-by: Rex Chang <[email protected]>

* xray tracer: set subsegment type for child spans (#2)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* Xray subsegment (#3)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* updates xray subsegment name to use operation name (instead of parent's span name)

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* Xray subsegment (#4)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* updates xray subsegment name to use operation name (instead of parent's span name)

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* adds to spell check dictionary

Signed-off-by: Rex Chang <[email protected]>

* fixes spellcheck

Signed-off-by: Rex Chang <[email protected]>

* adds to spellcheck

Signed-off-by: Rex Chang <[email protected]>

xray tracer: set subsegment type for child spans (#2)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

Xray subsegment (#3)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* updates xray subsegment name to use operation name (instead of parent's span name)

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

Xray subsegment (#4)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* updates xray subsegment name to use operation name (instead of parent's span name)

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* adds to spell check dictionary

Signed-off-by: Rex Chang <[email protected]>

fixes spellcheck

Signed-off-by: Rex Chang <[email protected]>

* fixes spell check

Signed-off-by: Rex Chang <[email protected]>

Co-authored-by: alyssawilk <[email protected]>
Co-authored-by: Bin Wu <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Joshua Marantz <[email protected]>
alyssawilk added a commit that referenced this issue Mar 17, 2022
This PR has a pretty high chance of CI breaking post-submit, as anyone else who references global runtime will have clean merges, but have their tests fail.

Risk Level: Low for upstream Envoy, high of downstream test breakages.
Testing: n/a
Docs Changes: n/a
Release Notes: inline
Actually fixes #19847

Signed-off-by: Alyssa Wilk <[email protected]>
ravenblackx pushed a commit to ravenblackx/envoy that referenced this issue Jun 8, 2022
…yproxy#20170)

* test: adding a multi-envoy test (envoyproxy#20016)

Functionally this

handles the multi-envoy signal handler crash
skips instantiating a runtime singleton (off by default, must stay off until remove global runtime: rebase runtime features on ABSL_flags envoyproxy#19847 is closed)
Multi-envoy does not correctly support runtime flags or deprecation stats due to envoyproxy#19847 being incomplete. It can still handle proxy traffic client - L1 - L2 - upstream as shown in test.

Risk Level: low
Testing: yes
Docs Changes: n/a
Release Notes: n/a
Part of envoyproxy/envoy-mobile#2003

Signed-off-by: Alyssa Wilk <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* Add a congestionWindowInBytes method to Envoy::Network::Connection (envoyproxy#20105)

Signed-off-by: Bin Wu <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* Update QUICHE from 50f15e7a5 to cf1588207 (envoyproxy#20154)

https://github.com/google/quiche/compare/50f15e7a5..cf1588207

$ git log 50f15e7a5..cf1588207 --date=short --no-merges --format="%ad %al %s"

2022-02-28 wub Deprecate --gfe2_reloadable_flag_quic_crypto_noop_if_disconnected_after_process_chlo.
2022-02-27 vasilvv Remove QuicheMemSlice(QuicUniqueBufferPtr, size_t) constructor.
2022-02-26 fayang Use std::string instead of absl::string_view in CryptoBufferMap.
2022-02-25 bnc Ignore incoming HTTP/3 MAX_PUSH_ID frames.
2022-02-25 bnc Remove Http3DebugVisitor::OnMaxPushIdFrameSent().
2022-02-25 bnc Remove QuicSpdySession::CanCreatePushStreamWithId().
2022-02-25 fayang Deprecate gfe2_reloadable_flag_quic_single_ack_in_packet2.

Signed-off-by: Alyssa Wilk <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* build(deps): bump actions/stale from 4.1.0 to 5 (envoyproxy#20159)

Bumps [actions/stale](https://github.com/actions/stale) from 4.1.0 to 5.
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](actions/stale@v4.1.0...v5)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Rex Chang <[email protected]>

* admin: improve test coverage and increase the coverage-percent threshold (envoyproxy#20025)

Adds a missing test for recent lookups now that there are no more fake symbol tables. Adds tests for a variety of override methods defined in admin.h that were previously hard to hit.

Adds a benchmark test to establish a baseline for the speedups in envoyproxy#19693

Signed-off-by: Joshua Marantz <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* test: removing a bunch of direct runtime singleton access (envoyproxy#19993)

Signed-off-by: Alyssa Wilk <[email protected]>
Signed-off-by: Rex Chang <[email protected]>

* build(deps): bump grpcio-tools in /examples/grpc-bridge/client (envoyproxy#20040)

Bumps [grpcio-tools](https://github.com/grpc/grpc) from 1.43.0 to 1.44.0.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](grpc/grpc@v1.43.0...v1.44.0)

---
updated-dependencies:
- dependency-name: grpcio-tools
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Rex Chang <[email protected]>

* adds to spellcheck

Signed-off-by: Rex Chang <[email protected]>

* xray tracer: set subsegment type for child spans (#2)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* Xray subsegment (#3)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* updates xray subsegment name to use operation name (instead of parent's span name)

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* Xray subsegment (#4)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* updates xray subsegment name to use operation name (instead of parent's span name)

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* adds to spell check dictionary

Signed-off-by: Rex Chang <[email protected]>

* fixes spellcheck

Signed-off-by: Rex Chang <[email protected]>

* adds to spellcheck

Signed-off-by: Rex Chang <[email protected]>

xray tracer: set subsegment type for child spans (#2)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

Xray subsegment (#3)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* updates xray subsegment name to use operation name (instead of parent's span name)

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

Xray subsegment (#4)

* xray tracer: set subsegment type for child spans

Signed-off-by: Rex Chang <[email protected]>

* adds test coverage

Signed-off-by: Rex Chang <[email protected]>

* updates xray subsegment name to use operation name (instead of parent's span name)

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* updates doc

Signed-off-by: Rex Chang <[email protected]>

* adds to spell check dictionary

Signed-off-by: Rex Chang <[email protected]>

fixes spellcheck

Signed-off-by: Rex Chang <[email protected]>

* fixes spell check

Signed-off-by: Rex Chang <[email protected]>

Co-authored-by: alyssawilk <[email protected]>
Co-authored-by: Bin Wu <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Joshua Marantz <[email protected]>
ravenblackx pushed a commit to ravenblackx/envoy that referenced this issue Jun 8, 2022
This PR has a pretty high chance of CI breaking post-submit, as anyone else who references global runtime will have clean merges, but have their tests fail.

Risk Level: Low for upstream Envoy, high of downstream test breakages.
Testing: n/a
Docs Changes: n/a
Release Notes: inline
Actually fixes envoyproxy#19847

Signed-off-by: Alyssa Wilk <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Feature requests. Not bugs or questions.
Projects
None yet
5 participants