Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Least request lb enable full scan mode #1

Open
wants to merge 200 commits into
base: main
Choose a base branch
from

Conversation

jkirschner-hashicorp
Copy link
Owner

This is a WIP attempt to combine the PRs related to full scan mode (primarily authored by @barroca) into a single PR.

This branch was made relative to Envoy main on Dec 20.

@@ -169,6 +175,14 @@ removed_config_or_runtime:
runtime flag and legacy code path.

new_features:
- area: upstream
change: |
Added :ref:`enable_full_scan <envoy_v3_api_msg_extensions.load_balancing_policies.least_request.v3.LeastRequest>`
Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Fix this, since the bool has been converted to an enum

@jkirschner-hashicorp jkirschner-hashicorp force-pushed the least_request_lb_enable_full_scan_mode branch from 33efb88 to dfaa3b6 Compare December 23, 2023 21:41
jkirschner-hashicorp pushed a commit that referenced this pull request Jan 16, 2024
Commit Message: the probing socket is released when port migration fails. If this happens in response to an incoming packet during an I/O event, the follow socket read could cause use-after-free.

[2024-01-08 16:30:53.386][12][critical][backtrace] [./source/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x0
[2024-01-08 16:30:53.387][12][critical][backtrace] [./source/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2024-01-08 16:30:53.387][12][critical][backtrace] [./source/server/backtrace.h:92] Envoy version: 0/1.29.0-dev/test/DEBUG/BoringSSL
[2024-01-08 16:30:53.413][12][critical][backtrace] [./source/server/backtrace.h:96] #0: Envoy::SignalAction::sigHandler() [0x55bb876d499e]
[2024-01-08 16:30:53.413][12][critical][backtrace] [./source/server/backtrace.h:98] #1: [0x7f55fbf92510]
[2024-01-08 16:30:53.440][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#2: Envoy::Network::Utility::readPacketsFromSocket() [0x55bb875de0ef]
[2024-01-08 16:30:53.466][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#3: Envoy::Quic::EnvoyQuicClientConnection::onFileEvent() [0x55bb8663e1eb]
[2024-01-08 16:30:53.492][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#4: Envoy::Quic::EnvoyQuicClientConnection::setUpConnectionSocket()::$_0::operator()() [0x55bb8663f192]
[2024-01-08 16:30:53.518][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#5: std::__invoke_impl<>() [0x55bb8663f151]
[2024-01-08 16:30:53.544][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#6: std::__invoke_r<>() [0x55bb8663f0e2]
[2024-01-08 16:30:53.569][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#7: std::_Function_handler<>::_M_invoke() [0x55bb8663efc2]
[2024-01-08 16:30:53.595][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#8: std::function<>::operator()() [0x55bb85cb8f44]
[2024-01-08 16:30:53.621][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#9: Envoy::Event::DispatcherImpl::createFileEvent()::$_5::operator()() [0x55bb8722560f]
[2024-01-08 16:30:53.648][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#10: std::__invoke_impl<>() [0x55bb872255c1]
[2024-01-08 16:30:53.674][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#11: std::__invoke_r<>() [0x55bb87225562]
[2024-01-08 16:30:53.700][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#12: std::_Function_handler<>::_M_invoke() [0x55bb872253e2]
[2024-01-08 16:30:53.700][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#13: std::function<>::operator()() [0x55bb85cb8f44]
[2024-01-08 16:30:53.726][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#14: Envoy::Event::FileEventImpl::mergeInjectedEventsAndRunCb() [0x55bb872358ec]
[2024-01-08 16:30:53.752][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#15: Envoy::Event::FileEventImpl::assignEvents()::$_1::operator()() [0x55bb87235ed1]
[2024-01-08 16:30:53.778][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#16: Envoy::Event::FileEventImpl::assignEvents()::$_1::__invoke() [0x55bb87235949]
[2024-01-08 16:30:53.804][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#17: event_persist_closure [0x55bb87fab72b]
[2024-01-08 16:30:53.830][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#18: event_process_active_single_queue [0x55bb87faada2]
[2024-01-08 16:30:53.856][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#19: event_process_active [0x55bb87fa56c8]
[2024-01-08 16:30:53.882][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#20: event_base_loop [0x55bb87fa45cc]
[2024-01-08 16:30:53.908][12][critical][backtrace] [./source/server/backtrace.h:96] envoyproxy#21: Envoy::Event::LibeventScheduler::run() [0x55bb8760a59f]
Risk Level: low
Testing: new unit test
Docs Changes: N/A
Release Notes: Yes
Platform Specific Features: N/A

Signed-off-by: Dan Zhang <[email protected]>
Co-authored-by: Dan Zhang <[email protected]>
@jkirschner-hashicorp jkirschner-hashicorp force-pushed the least_request_lb_enable_full_scan_mode branch from 9024501 to 0cf6424 Compare January 16, 2024 19:52
By default, the least request load balancer returns the host with the fewest
active requests from a set of N randomly selected hosts.

This introduces a new "full scan" selection method that returns the host with
the fewest number of active requests from all hosts. If multiple hosts are
tied for "least", a host is selected using an approximately random means that
is time and memory efficient. The selection probability among tied hosts may
not be equal, depending on:
- The host count
- How many hosts are tied
- How those tied hosts are distributed within the hosts array

Signed-off-by: Jared Kirschner <[email protected]>
Signed-off-by: Leonardo da Mata <[email protected]>
@jkirschner-hashicorp jkirschner-hashicorp force-pushed the least_request_lb_enable_full_scan_mode branch from 0cf6424 to 47e2838 Compare January 16, 2024 19:59
alyssawilk and others added 23 commits January 16, 2024 15:34
Use the same TIMEOUT_FACTOR when running TSAN, MSAN, and ASAN since running the tests under sanitizers can take longer. This also renames TSAN_TIMEOUT_FACTOR to TIMEOUT_FACTOR to reflect the name.

Risk Level: low (tests only)
Testing: n/a
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Fredy Wijaya <[email protected]>
* Update RELEASES.md to less outdated instructions

Signed-off-by: Raven Black <[email protected]>
Additional Description: The CryptoMB private key provider only supports RSA at the time, the patch adds ECDSA support to it.
Risk Level: Low (as contrib extension)
Testing: Unit and integration tests
Docs Changes: N/A
Release Notes: N/A
Platform Specific Features: Requires AVX512 or equivalent CPU instruction set

Signed-off-by: Xie Zhihao <[email protected]>
The OpenTelemetry exporter was previously using client-side streaming for exporting trace data. This appears to work fine with the reference OpenTelemetry collector, but some collectors (DataPrepper for one example) instead seem to expect unary RPC calls for trace exports (in accordance with the OpenTelemetry specification). It seems that Data Prepper appears to rely specifically on the end-stream flag arriving in order for it to properly process the trace exports.

Risk Level: medium
Testing: no new tests

Fixes envoyproxy#31497 

Signed-off-by: Ashish Banerjee <[email protected]>
See envoyproxy/go-control-plane#824 for more information

This PR adds the vtprotobuf protoc plugin for Go. This works on top of the existing protoc-gen-go, to add optimized marshal functions that callers can opt in to using. This is not like gogo, which was a very invasive change -- everything is layered and opt-in. See issue for benchmarks, etc.

Additionally, to avoid possible binary size increase, the entire new code is protected under a go build tag. Users will need to opt-in at build time (-tags=vtprotobuf). By default, there is no impact for users at all.

Risk Level: Low - only additional opt-in code
Testing: Manually tested in Istio codebase

Signed-off-by: John Howard <[email protected]>
…red/golang (envoyproxy#31860)

build(deps): bump golang in /examples/shared/golang

Bumps golang from `cbee5d2` to `c4b696f`.

---
updated-dependencies:
- dependency-name: golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…in /examples/shared/node (envoyproxy#31859)

build(deps): bump node in /examples/shared/node

Bumps node from 21.5-bookworm-slim to 21.6-bookworm-slim.

---
updated-dependencies:
- dependency-name: node
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
… initialized (envoyproxy#31834)

* initialize variable request_duration_

Signed-off-by: Zhiyong Yang <[email protected]>
…oxy#31852)

Moves some frame accounting to onBeforeFrameReceived.

Signed-off-by: Biren Roy <[email protected]>
…p2_settings` internally (envoyproxy#31875)

This is refactoring in preparation for removing the nghttp2_callback API layer.

Commit Message: http2: updates CodecImpl to use a more abstract data type than nghttp2_settings internally
Additional Description:
Risk Level: low; refactoring, no functional change intended
Testing: ran unit and integration tests locally
Docs Changes:
Release Notes:
Platform Specific Features:

Signed-off-by: Biren Roy <[email protected]>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Co-authored-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
…31888)

Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Co-authored-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
…oxy#31891)

Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
Signed-off-by: dependency-envoy[bot] <148525496+dependency-envoy[bot]@users.noreply.github.com>
gRPC clients expect that request status will be communicated using the gRPC status header, 
and that HTTP status for responses should always be 200 for properly behaving servers.
This commit updates the grpc_http1_reverse_bridge to follow this behavior.

Risk Level: medium
Testing: unit test, manually tested with grpcurl
Docs Changes: n/a
Release Notes: updated
Runtime guard: envoy.reloadable_features.grpc_http1_reverse_bridge_change_http_status

Signed-off-by: Henry Qin <[email protected]>
Signed-off-by: Henry Qin <[email protected]>
dependabot bot and others added 30 commits February 2, 2024 11:50
…hared/postgres (envoyproxy#32158)

build(deps): bump postgres in /examples/shared/postgres

Bumps postgres from `49c276f` to `db2d3c8`.

---
updated-dependencies:
- dependency-name: postgres
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…al_ratelimit with 1 update (envoyproxy#32162)

build(deps): bump the examples-local-ratelimit group

Bumps the examples-local-ratelimit group in /examples/local_ratelimit with 1 update: nginx.


Updates `nginx` from `4c0fdaa` to `6eb9534`

---
updated-dependencies:
- dependency-name: nginx
  dependency-type: direct:production
  dependency-group: examples-local-ratelimit
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
)

Currently, if propagate_response_headers option is enabled, either for UDP tunneling and TCP tunneling, and retry is also enabled, the second HTTP response headers will cause a crash in Envoy, since the first filter state object is set with ReadOnly attribute. Fixing this by changing to Mutable with relevant tests that check this use case.

Risk Level: low
Testing: integration tests
Docs Changes: None
Release Notes: updated
Platform Specific Features: None
Signed-off-by: ohadvano <[email protected]>
Add Envoy::ExecutionContext, it can be inherited by subclasses to represent arbitrary information associated with the execution of a piece of code. Its activate/deactivate methods are called when the said execution starts/ends.

This is useful for things like resource accounting - see github issue for more details.

Commit Message: Add Envoy::ExecutionContext.
Additional Description:
Risk Level: Low. It is no op in open source code.
Testing: Unit test in test/common/common/execution_context_test.cc.
Docs Changes: N/A
Release Notes: N/A

Signed-off-by: Bin Wu <[email protected]>
It is causing test flakes and friction on CI. I was able to reproduce
this, and the crash happens in
https://github.com/envoyproxy/envoy/blob/faec43168bb50028b4b4351f477dfef40e0fb227/test/integration/fake_upstream.h#L714.

But I'm not sure cause, so will disable it for now until we have
deflaked it.

See envoyproxy#32151 for details.

Signed-off-by: Ali Beyad <[email protected]>
…oyproxy#32105)

* stream info: rename the extended response flag and response flag

Signed-off-by: wbpcode <[email protected]>

* minor update

Signed-off-by: wbpcode <[email protected]>

* address comments

Signed-off-by: wbpcode <[email protected]>

---------

Signed-off-by: wbpcode <[email protected]>
* Exclude hosts in DRAINING state

Signed-off-by: Networking team <[email protected]>

* WIP

Signed-off-by: Networking team <[email protected]>

* fix upstream_impl_test

Signed-off-by: Networking team <[email protected]>

* add more tests

Signed-off-by: Networking team <[email protected]>

* fix lint

Signed-off-by: Networking team <[email protected]>

* fix cluster_handler_test

Signed-off-by: Networking team <[email protected]>

* fix lint

Signed-off-by: Networking team <[email protected]>

* rm excluded_via_eds_draining

Signed-off-by: Chao-Han Tsai <[email protected]>

* rename

Signed-off-by: Chao-Han Tsai <[email protected]>

* restore

Signed-off-by: Chao-Han Tsai <[email protected]>

* remove comment

Signed-off-by: Chao-Han Tsai <[email protected]>

* add runtime guard and changelog

Signed-off-by: Networking team <[email protected]>

* fix lint

Signed-off-by: Networking team <[email protected]>

---------

Signed-off-by: Networking team <[email protected]>
Signed-off-by: Chao-Han Tsai <[email protected]>
Co-authored-by: Networking team <[email protected]>
…roxy#32193)

Bumps [orjson](https://github.com/ijl/orjson) from 3.9.12 to 3.9.13.
- [Release notes](https://github.com/ijl/orjson/releases)
- [Changelog](https://github.com/ijl/orjson/blob/master/CHANGELOG.md)
- [Commits](ijl/orjson@3.9.12...3.9.13)

---
updated-dependencies:
- dependency-name: orjson
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…hared/postgres (envoyproxy#32195)

build(deps): bump postgres in /examples/shared/postgres

Bumps postgres from `db2d3c8` to `4d1b17a`.

---
updated-dependencies:
- dependency-name: postgres
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
envoyproxy#32190)

Bumps redis from `247eb51` to `f44e917`.

---
updated-dependencies:
- dependency-name: redis
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…red/golang (envoyproxy#32191)

build(deps): bump golang in /examples/shared/golang

Bumps golang from `cf1c8a7` to `3efef61`.

---
updated-dependencies:
- dependency-name: golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…al_ratelimit with 1 update (envoyproxy#32189)

build(deps): bump the examples-local-ratelimit group

Bumps the examples-local-ratelimit group in /examples/local_ratelimit with 1 update: nginx.


Updates `nginx` from `6eb9534` to `5f44022`

---
updated-dependencies:
- dependency-name: nginx
  dependency-type: direct:production
  dependency-group: examples-local-ratelimit
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…xamples/zipkin (envoyproxy#32097)

build(deps): bump openzipkin/zipkin in /examples/zipkin

Bumps openzipkin/zipkin from `24164b3` to `4c7427b`.

---
updated-dependencies:
- dependency-name: openzipkin/zipkin
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…les/skywalking (envoyproxy#32188)

build(deps): bump elasticsearch in /examples/skywalking

Bumps elasticsearch from `60babae` to `2529e4b`.

---
updated-dependencies:
- dependency-name: elasticsearch
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
The ResetWithBidiTrafficExplicitData was crashing sporadically with
memory errors when run for HTTP3. Running with msan showed that the
crashes were caused in headers_callback.setReady() being called after
the test finished and member variables of the test were already
destroyed.

I noticed that in the test, we didn't call headers_callback.waitReady()
when the test was running for HTTP3.  It wasn't clear to me why, but
that seemed like the cause.  I removed the if-guard so we always call
headers_callback.waitReady(), and the test seems to pass now (over 500
runs).

Fixes envoyproxy#32024

Signed-off-by: Ali Beyad <[email protected]>
Lowering API limits so we can more easily test dns refresh.
Adding a bunch of e2e DNS refresh tests.

Risk Level: low
Testing: yes
Docs Changes: inline
Release Notes:
Signed-off-by: Alyssa Wilk <[email protected]>
…roxy#32183)

We followed the C++ style to register configFactory, but it's not a proper choice for Golang filter.
Here is the Reasons:

Golang introduced the ConfigParser interface to Parse/validate a config, configFactory does not need to parse config.
better performance. By using fiterFactory, we could ommit generate a closure per http request, that may take ~1ns.
Yep, this is a breaking change, people need to change the register API.
So, I suggest we fix it earlier.
Signed-off-by: doujiang24 <[email protected]>
…histograms (envoyproxy#32197)

Commit Message: Works around n^2 issue when destructing hundreds of thousands of mocks in the benchmark context. It does thi s by:

only initializing the hosts when per-host stats are enabled in the cluster config
adding a 'disabled' and 'enabled' variant to the histogram-json test so that it's easy to filter using --benchmark_filter=disabled to avoid initializing the hosts.
See also: envoyproxy#32198 -- this doesn't fix that bug but we could in theory revert this PR once that bug is fixed.
Additional Description:
Risk Level: low
Testing: just the one test
Docs Changes: n/a
Release Notes: n/a
Platform Specific Features: n/a

Signed-off-by: Joshua Marantz <[email protected]>
…nvoyproxy#32174)

mobile: Fix the singleton implementation to be thread-safe

Signed-off-by: Fredy Wijaya <[email protected]>
…d/node (envoyproxy#32192)

build(deps): bump node in /examples/shared/node

Bumps node from `0ca0e0a` to `5792ef2`.

---
updated-dependencies:
- dependency-name: node
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.