-
Notifications
You must be signed in to change notification settings - Fork 369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Wasm OCI image #3564
feat: Wasm OCI image #3564
Conversation
7169309
to
3916bdf
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3564 +/- ##
==========================================
+ Coverage 68.55% 68.75% +0.20%
==========================================
Files 170 175 +5
Lines 20690 21484 +794
==========================================
+ Hits 14183 14771 +588
- Misses 5492 5636 +144
- Partials 1015 1077 +62 ☔ View full report in Codecov by Sentry. |
4164371
to
a2f8b6c
Compare
Hi @envoyproxy/gateway-reviewers my apologies for the large PR. I would have broken it into multiple smaller ones if possible :-). A lot of *.out.yaml test files have been updated due to the addition of the new wasm HTTP server static cluster to the Envoy bootstrap config. If you're going to review it, the most significant changes are in the internal/wasm package, which implements the Wasm file cache and an HTTP server to serve these files to the Envoy fleet. Additionally, the Gateway API runner has been modified to initialize the Wasm cache server. |
db8908d
to
236e316
Compare
7dccbbf
to
7f8f8e0
Compare
Signed-off-by: Huabing Zhao <[email protected]>
7f8f8e0
to
e8a3fce
Compare
Signed-off-by: Huabing Zhao <[email protected]>
) | ||
|
||
wasmRemoteFetchCount = metrics.NewCounter( | ||
"wasm_remote_fetch_count", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prob needs to be split up into 2 - error, total
cc @shawnh2 any prior art here for naming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two different naming styles in EG:
xds_snapshot_creation_total
xds_snapshot_creation_failed
xds_snapshot_creation_success
status_update_total
status_update_failed_total
status_update_success_total
status_update_conflict_total
Prefer the first one. We also need to align "failed, failure, error" to a single term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 looks more like envoy stats
2 looks like prom naming https://prometheus.io/docs/practices/naming/
@Xunzhuo @shawnh2 can you help drive/unify this naming
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracked with #3684
Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing Zhao <[email protected]>
transport := http.DefaultTransport.(*http.Transport).Clone() | ||
// nolint: gosec | ||
// This is only when a user explicitly sets a flag to enable insecure mode | ||
transport.TLSClientConfig = &tls.Config{InsecureSkipVerify: true} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this configurable in the API ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not yet, we can expose it to API when needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall LGTM !
there are some non blocking comments added in the PR, that can be tackled in follow ups
long term, my vote would be to move the wasm server into its own runner (like ratelimit)
this eliminates the shared fate problem of delaying xDS due to one slow URL download.
this also creates a eventual consistency problem, which can be resolved by implementing retries in the wasm cluster in envoy
Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing Zhao <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, can we have some sort of a feature toggle here? Some components like the runner and the servers will execute even when the feature is not in use. LAter we can expose this by default, when we're confident and/or extracted to a different component.
func (r *Runner) Name() string { | ||
return string(egv1a1.LogComponentGatewayAPIRunner) | ||
} | ||
|
||
// Start starts the gateway-api translator runner | ||
func (r *Runner) Start(ctx context.Context) (err error) { | ||
r.Logger = r.Logger.WithName(r.Name()).WithValues("runner", r.Name()) | ||
|
||
go r.startWasmCache(ctx) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if cache fails to start, should we fail EG or disable cache-related translation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fail the Wasm translation in EEP gateway API translation if cache fails to start.
require ( | ||
fortio.org/fortio v1.65.0 | ||
fortio.org/log v1.12.2 | ||
github.com/Masterminds/semver/v3 v3.2.1 | ||
github.com/cncf/xds/go v0.0.0-20240423153145-555b57ec207b | ||
github.com/davecgh/go-spew v1.1.1 | ||
github.com/davecgh/go-spew v1.1.2-0.20180830191138-d8f796af33cc | ||
github.com/docker/cli v26.1.3+incompatible |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any way to avoid the docker dependency and use more generic oci libs here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docker lib is used to parse docker compatible config file to get auth info.
Signed-off-by: Huabing Zhao <[email protected]>
Signed-off-by: Huabing Zhao <[email protected]>
We probably could:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing as most of the implementation that will execute regardless of EEP presence is taken from istio and already has sufficient production burn time, I take back my request for a feature gate.
* support Wasm OCI image Signed-off-by: Huabing Zhao <[email protected]> * set up test registry Signed-off-by: Huabing Zhao <[email protected]> * add test for registry authn Signed-off-by: Huabing Zhao <[email protected]> * fix lint Signed-off-by: Huabing Zhao <[email protected]> * fix e2e Signed-off-by: Huabing Zhao <[email protected]> * fix e2e Signed-off-by: Huabing Zhao <[email protected]> * add test for unauthed private image Signed-off-by: Huabing Zhao <[email protected]> * fix e2e Signed-off-by: Huabing Zhao <[email protected]> * fix e2e Signed-off-by: Huabing Zhao <[email protected]> * fix lint Signed-off-by: Huabing Zhao <[email protected]> * refactor Signed-off-by: Huabing Zhao <[email protected]> * add max failed attempts limit Signed-off-by: Huabing Zhao <[email protected]> * remove retries Signed-off-by: Huabing Zhao <[email protected]> * clean up e2e tests Signed-off-by: Huabing Zhao <[email protected]> * add e2e test for wrong password Signed-off-by: Huabing Zhao <[email protected]> * Update api/v1alpha1/authorization_types.go Co-authored-by: Arko Dasgupta <[email protected]> Signed-off-by: Huabing Zhao <[email protected]> * Update api/v1alpha1/wasm_types.go Co-authored-by: Arko Dasgupta <[email protected]> Signed-off-by: Huabing Zhao <[email protected]> * remove unnecessary replace Signed-off-by: Huabing Zhao <[email protected]> * remove set package Signed-off-by: Huabing Zhao <[email protected]> * fix gen check Signed-off-by: Huabing Zhao <[email protected]> * add test for failed attempts Signed-off-by: Huabing Zhao <[email protected]> * address comments Signed-off-by: Huabing Zhao <[email protected]> * address comments Signed-off-by: Huabing Zhao <[email protected]> * minor wording Signed-off-by: Huabing Zhao <[email protected]> * move sha256 inside code source Signed-off-by: Huabing Zhao <[email protected]> * address comments Signed-off-by: Huabing Zhao <[email protected]> * fix e2e Signed-off-by: Huabing Zhao <[email protected]> * fix flaky test Signed-off-by: Huabing Zhao <[email protected]> * change comments Signed-off-by: Huabing Zhao <[email protected]> * address comments Signed-off-by: Huabing Zhao <[email protected]> * address comments Signed-off-by: Huabing Zhao <[email protected]> * fail the eep translation if the wasm cache failed to start Signed-off-by: Huabing Zhao <[email protected]> --------- Signed-off-by: Huabing Zhao <[email protected]> Co-authored-by: Arko Dasgupta <[email protected]>
This PR adds support for Wasm OCI image format to EG.
It allows EG to download Wasm images from remote registries and serve them to the Envoy fleet via a local HTTP server inside EG running on 18002.
EnvoyExtensionPolicy
to point to the local EG HTTP server.Implements: #3304
Design: #3313