BackendTrafficPolicy causes EDS fetch failure #2345

Closed
guydc opened this issue Dec 22, 2023 · 3 comments · Fixed by #2388
Labels: area/xds-server, help wanted, triage

Comments

@guydc
Contributor

guydc commented Dec 22, 2023

Description:

Applying any BackendTrafficPolicy causes the endpoints of the relevant cluster to be removed.

Repro steps:

  1. Install EG 0.6.0
  2. Apply the quickstart settings: https://gateway.envoyproxy.io/v0.6.0/user/quickstart/
  3. curl --verbose --header "Host: www.example.com" http://$GATEWAY_HOST/get works
  4. Apply any non-trivial BackendTrafficPolicy, e.g.:
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: lb-example
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: backend
  loadBalancer:
    type: ConsistentHash
    consistentHash:
      type: SourceIP
  5. Wait 30 seconds
  6. curl --verbose --header "Host: www.example.com" http://$GATEWAY_HOST/get fails with unhealthy upstream

Environment:
EG v0.6.0, v0.0.0-latest

Logs:
Envoy Logs (some time after BTP applied):

[2023-12-22 17:18:10.737][1][warning][config] [source/extensions/config_subscription/grpc/grpc_subscription_impl.cc:130] gRPC config: initial fetch timed out for type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment

Envoy Config Dump (before BTP):

   "dynamic_endpoint_configs": [
    {
     "endpoint_config": {
      "@type": "type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment",
      "cluster_name": "httproute/default/backend/rule/0",
      "endpoints": [
       {
        "locality": {},
        "lb_endpoints": [
         {
          "endpoint": {
           "address": {
            "socket_address": {
             "address": "10.244.0.7",
             "port_value": 3000
            }
           },
           "health_check_config": {}
          },
          "health_status": "HEALTHY",
          "load_balancing_weight": 1
         }
        ],
        "load_balancing_weight": 1
       }
      ],
      "policy": {
       "overprovisioning_factor": 140
      }
     }
    }
   ]
  }

Envoy Config Dump (after BTP applied):

   "dynamic_endpoint_configs": [
    {
     "endpoint_config": {
      "@type": "type.googleapis.com/envoy.config.endpoint.v3.ClusterLoadAssignment",
      "cluster_name": "httproute/default/backend/rule/0",
      "policy": {
       "overprovisioning_factor": 140
      }
     }
    }
   ]
  },
guydc added the triage label Dec 22, 2023
arkodg added the help wanted label Dec 22, 2023
@shawnh2
Contributor

shawnh2 commented Dec 28, 2023

It seems related to envoyproxy/go-control-plane#583.

According to my debug log, after the BTP is applied the endpoint xDS resources are generated correctly, and the snapshot is set successfully for the corresponding node:

err := s.SetSnapshot(context.TODO(), node, snapshot)

shawnh2 added the area/xds-server label Dec 28, 2023
@tmsnan
Contributor

tmsnan commented Dec 29, 2023

Same issue as #2261.
Maybe we can use EDS caching to solve this problem.

envoyproxy/envoy#28877 (comment)

envoyproxy/envoy#13009.

@arkodg
Contributor

arkodg commented Dec 29, 2023

Wow, good find @tmsnan, thanks for root-causing this. Agreed, let's set envoy.restart_features.use_eds_cache_for_ads=true
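
For reference, a minimal sketch of how this runtime flag could be set in an Envoy bootstrap, assuming a static runtime layer is acceptable. The layer name global_config is an arbitrary placeholder, and the flag only has an effect on Envoy versions that ship it. How the fix in #2388 actually wires this into the bootstrap generated by Envoy Gateway is not shown in this thread.

```yaml
# Sketch only: a bootstrap fragment, not a complete Envoy configuration.
layered_runtime:
  layers:
  - name: global_config   # placeholder layer name (assumption)
    static_layer:
      # Lets Envoy reuse cached EDS assignments when a cluster is updated over ADS
      envoy.restart_features.use_eds_cache_for_ads: true
```

Since this is a restart feature, it presumably has to be present when Envoy starts, so it belongs in the bootstrap rather than in a runtime override applied later.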
