aws-custom-route-controller events are wrongly categorized as ERR_RETRYABLE_INFRA_DEPENDENCIES instead of ERR_INFRA_UNAUTHENTICATED #1137

Open
ialidzhikov opened this issue Nov 20, 2024 · 0 comments
Labels
area/ops-productivity Operator productivity related (how to improve operations) kind/bug Bug platform/aws Amazon web services platform/infrastructure

How to categorize this issue?

/area ops-productivity
/kind bug
/platform aws

What happened:
The Shoot credentials became invalid. The ControlPlane then became unhealthy with:

status:
  conditions:
  - codes:
    - ERR_RETRYABLE_INFRA_DEPENDENCIES
    lastTransitionTime: "2024-11-20T10:37:28Z"
    lastUpdateTime: "2024-11-20T10:49:01Z"
    message: "[aws-custom-route-controller] RoutesUpdateFailed: (combined from similar
      events): AuthFailure: AWS was not able to validate the provided access credentials\n\tstatus
      code: 401, request id: <id>."
    reason: HealthCheckUnsuccessful
    status: "False"
    type: ControlPlaneHealthy

IMO, ERR_RETRYABLE_INFRA_DEPENDENCIES is the wrong code here. It should be ERR_INFRA_UNAUTHENTICATED.

The error string AuthFailure is already marked as ERR_INFRA_UNAUTHENTICATED by the following regexp:

unauthenticatedRegexp = regexp.MustCompile(`(?i)(AuthFailure|InvalidAccessKeyId|InvalidSecretAccessKey)`)

The custom handling for the events, however, only checks for RouteLimitExceeded and falls back to ERR_RETRYABLE_INFRA_DEPENDENCIES for every other message:

var codes []gardencorev1beta1.ErrorCode
if strings.Contains(newestEvent.Message, "RouteLimitExceeded") {
	codes = append(codes, gardencorev1beta1.ErrorInfraQuotaExceeded)
} else {
	codes = append(codes, gardencorev1beta1.ErrorRetryableInfraDependencies)
}
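
One possible direction for a fix, as a minimal sketch rather than the actual patch: check the event message against the unauthenticated regexp before falling back to the retryable code. The function name `codesForEventMessage` and the local `ErrorCode` type with its constants are hypothetical stand-ins for the `gardencorev1beta1` identifiers, so that the example compiles on its own. The string value used for the quota code is an assumption. The regexp is copied from the snippet quoted above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// ErrorCode and the constants below are stand-ins (assumption) for the
// gardencorev1beta1 error codes, so the sketch builds without Gardener.
type ErrorCode string

const (
	ErrorInfraUnauthenticated       ErrorCode = "ERR_INFRA_UNAUTHENTICATED"
	ErrorInfraQuotaExceeded         ErrorCode = "ERR_INFRA_QUOTA_EXCEEDED" // assumed value
	ErrorRetryableInfraDependencies ErrorCode = "ERR_RETRYABLE_INFRA_DEPENDENCIES"
)

// Regexp as quoted in this issue.
var unauthenticatedRegexp = regexp.MustCompile(`(?i)(AuthFailure|InvalidAccessKeyId|InvalidSecretAccessKey)`)

// codesForEventMessage is a hypothetical helper that maps the message of the
// newest event to error codes. Authentication failures are checked first, so
// they are no longer swallowed by the retryable fallback.
func codesForEventMessage(message string) []ErrorCode {
	switch {
	case unauthenticatedRegexp.MatchString(message):
		return []ErrorCode{ErrorInfraUnauthenticated}
	case strings.Contains(message, "RouteLimitExceeded"):
		return []ErrorCode{ErrorInfraQuotaExceeded}
	default:
		return []ErrorCode{ErrorRetryableInfraDependencies}
	}
}

func main() {
	msg := "RoutesUpdateFailed: (combined from similar events): AuthFailure: " +
		"AWS was not able to validate the provided access credentials"
	fmt.Println(codesForEventMessage(msg))
}
```

With this ordering, the message from the condition above would be classified as an authentication problem, while RouteLimitExceeded and all other messages keep their current codes.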

What you expected to happen:
AuthFailure errors should be flagged with ERR_INFRA_UNAUTHENTICATED, not with ERR_RETRYABLE_INFRA_DEPENDENCIES.

How to reproduce it (as minimally and precisely as possible):
See above.

Anything else we need to know?:
N/A

Environment:

  • Gardener version (if relevant):
  • Extension version: v1.58.3
  • Kubernetes version (use kubectl version):
  • Cloud provider or hardware configuration:
  • Others:
@gardener-robot gardener-robot added area/ops-productivity Operator productivity related (how to improve operations) kind/bug Bug platform/aws Amazon web services platform/infrastructure labels Nov 20, 2024