Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API: Support FaultInjection in BackendTrafficPolicy #2304

Merged
merged 6 commits into from
Dec 21, 2023

Conversation

tmsnan
Copy link
Contributor

@tmsnan tmsnan commented Dec 15, 2023

What type of PR is this?

API: Support FaultInjection in BackendTrafficPolicy
Provide basic fault injection capabilities:
Supports injecting fixed time delay or abort errors by percentage

Example

---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: example-policy-with-fault-injection
spec:
 [...]
  faultInjection: # optional
    abort: # optional
      statusCode: # required
      percentage: # default 100
    delay: # optional
      fixedDelay: # required
      percentage: # default 100
  
[...]    

Relates to #2124

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

@tmsnan tmsnan requested a review from a team as a code owner December 15, 2023 06:20
Copy link

🚀 Thank you for contributing to the Envoy Gateway project! 🚀

Before merging, please ensure to follow the process below:

  1. Requesting Reviews:
  • cc @envoyproxy/gateway-reviewers team for an initial review.
  • After the initial review, reviewers should request the @envoyproxy/gateway-maintainers team for further review.
  1. Review Approval:
  • Your PR needs to receive at least two approvals.
  • At least one approval must come from a member of the gateway-maintainers team.

NOTE: Once your PR is under review, please do not rebase and force push it. Otherwise, it will force your reviewers to review the PR from scratch rather than simply look at your latest changes.

What's more, you can help expedite the processing of your PR by
  • Ensuring you have self-reviewed your work according to the project's Contribution Guidelines.
  • If your PR addresses a specific issue, make sure to mention it in the PR description.
  • Respond promptly if there are any test failures or suggestions for improvements that we comment on.

Copy link

github-actions bot commented Dec 15, 2023

@github-actions github-actions bot temporarily deployed to pull request December 15, 2023 06:23 Inactive
Copy link

codecov bot commented Dec 15, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (64d7152) 64.54% compared to head (fcd4f79) 64.53%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2304      +/-   ##
==========================================
- Coverage   64.54%   64.53%   -0.02%     
==========================================
  Files         112      112              
  Lines       15949    15949              
==========================================
- Hits        10295    10293       -2     
- Misses       5007     5009       +2     
  Partials      647      647              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -65,6 +65,10 @@ type BackendTrafficPolicySpec struct {
//
// +optional
TCPKeepalive *TCPKeepalive `json:"tcpKeepalive,omitempty"`

// FaultInjection defines the fault injection policy to be applied. Support delays and aborts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// FaultInjection defines the fault injection policy to be applied. Support delays and aborts.
// FaultInjection defines the fault injection policy to be applied. This configuration can be used to
// inject delays and abort requests to mimic failure scenarios such as service failures and overloads.

// FaultInjection defines the fault injection policy to be applied. Support delays and aborts.
type FaultInjection struct {

// If specified, the delay will inject a fixed delay into the request
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If specified, the delay will inject a fixed delay into the request
// If specified, a delay will be injected into the request

// +optional
Delay *DelayConfig `json:"delay,omitempty"`

// If specified, the abort will abort the request with the specified HTTP status code
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// If specified, the abort will abort the request with the specified HTTP status code
// If specified, the request will be aborted if it meets the configuration criteria.

// If specified, the delay will inject a fixed delay into the request
//
// +optional
Delay *DelayConfig `json:"delay,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Delay *DelayConfig `json:"delay,omitempty"`
Delay *FaultInjectionDelay `json:"delay,omitempty"`

// If specified, the abort will abort the request with the specified HTTP status code
//
// +optional
Abort *AbortConfig `json:"abort,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Abort *AbortConfig `json:"abort,omitempty"`
Abort *FaultInjectionAbort `json:"abort,omitempty"`

// +required
FixedDelay *metav1.Duration `json:"fixedDelay"`

// Percentage specifies the percentage of requests to be delayed. Default 100%, if set 0, no requests will be delayed.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we can only express int e.g. 1%, 2%, but not fractions such as 0.5% ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update

// +required
StatusCode *int `json:"statusCode"`

// Percentage specifies the percentage of requests to be aborted. Default 100%, if set 0, no requests will be aborted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


// AbortConfig defines the abort fault injection configuration
type AbortConfig struct {
// StatusCode specifies the HTTP/GRPC status code to be returned
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

grpc status is different from http status

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one way to define this is

status:
  type: HTTP
  http: 200

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or define httpStatus & grpcStatus and use CEL to make sure both cannot be simultaneously defined

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, used CEL

@tmsnan tmsnan force-pushed the dev-add-fault-injection branch from c197ba1 to c5221fe Compare December 18, 2023 11:59
@github-actions github-actions bot temporarily deployed to pull request December 18, 2023 12:02 Inactive
// Percentage specifies the percentage of requests to be delayed. Default 100%, if set 0, no requests will be delayed. Accuracy to 0.0001%.
// +optional
// +kubebuilder:default=100
Percentage *float64 `json:"percentage"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

omitempty?

// +union
//
// +kubebuilder:validation:XValidation:rule=" !(has(self.httpStatus) && has(self.grpcStatus)) && (has(self.httpStatus) || has(self.grpcStatus))",message="httpStatus and grpcStatus cannot be simultaneously defined."
type FaultInjectionAbort struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also missing omitempty for optional fields

@@ -30,7 +30,7 @@ CONTROLLERGEN_OBJECT_FLAGS := object:headerFile="$(ROOT_DIR)/tools/boilerplate/
.PHONY: manifests
manifests: $(tools/controller-gen) generate-gwapi-manifests ## Generate WebhookConfiguration and CustomResourceDefinition objects.
@$(LOG_TARGET)
$(tools/controller-gen) crd paths="./..." output:crd:artifacts:config=charts/gateway-helm/crds/generated
$(tools/controller-gen) crd:allowDangerousTypes=true paths="./..." output:crd:artifacts:config=charts/gateway-helm/crds/generated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why set this to be true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because of float field

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it

api/v1alpha1/fault_injection.go Outdated Show resolved Hide resolved
@@ -0,0 +1,220 @@
apiVersion: apps/v1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this shouldnt be part of this PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, my bad.

// FaultInjectionAbort defines the abort fault injection configuration
// +union
//
// +kubebuilder:validation:XValidation:rule=" !(has(self.httpStatus) && has(self.grpcStatus)) && (has(self.httpStatus) || has(self.grpcStatus))",message="httpStatus and grpcStatus cannot be simultaneously defined."
Copy link
Contributor

@arkodg arkodg Dec 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be !(has(self.httpStatus) && has(self.grpcStatus)) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah,Add verification that at least one of httpStatus and grpcStatus is set

@github-actions github-actions bot temporarily deployed to pull request December 19, 2023 02:35 Inactive
Signed-off-by: zhaonan <[email protected]>
@tmsnan tmsnan force-pushed the dev-add-fault-injection branch from 661b177 to c5fcea9 Compare December 19, 2023 02:40
@github-actions github-actions bot temporarily deployed to pull request December 19, 2023 02:44 Inactive
// +union
//
// +kubebuilder:validation:XValidation:rule=" !(has(self.httpStatus) && has(self.grpcStatus)) ",message="httpStatus and grpcStatus cannot be simultaneously defined."
// +kubebuilder:validation:XValidation:rule=" has(self.httpStatus) || has(self.grpcStatus) ",message="httpStatus and grpcStatus are set at least one."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is L45 needed ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arkodg
arkodg previously approved these changes Dec 19, 2023
Copy link
Contributor

@arkodg arkodg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM thanks for introducing this API
adding a minor non blocking comment

@arkodg arkodg requested review from a team December 19, 2023 22:51
@arkodg arkodg requested a review from shawnh2 December 19, 2023 22:51
zirain
zirain previously approved these changes Dec 20, 2023
@tmsnan tmsnan dismissed stale reviews from zirain and arkodg via 7fddc46 December 20, 2023 02:51
Copy link
Contributor

@shawnh2 shawnh2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Copy link
Member

@Xunzhuo Xunzhuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks

@Xunzhuo
Copy link
Member

Xunzhuo commented Dec 20, 2023

/retest

@zirain zirain merged commit c2f88f0 into envoyproxy:main Dec 21, 2023
18 checks passed
@tmsnan tmsnan deleted the dev-add-fault-injection branch December 21, 2023 03:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

5 participants