
Is it possible to use the standard k8s topology labels? #729

Closed
TBBle opened this issue Feb 6, 2021 · 37 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

@TBBle

TBBle commented Feb 6, 2021

Is your feature request related to a problem?/Why is this needed

Looking into the interactions of this driver with the Cluster Autoscaler, i.e., kubernetes/autoscaler#3230 (comment), I have been unable to identify a good way to support the scale-from-zero case without needing to either change Cluster Autoscaler to always add this driver's topology label, or have each node group's ASG manually specify the topology label topology.ebs.csi.aws.com/zone.

/feature

Describe the solution you'd like in detail

A parameter to specify the topology label used for PV dynamic allocation, which, in the EKS case, can be set to topology.kubernetes.io/zone (or failure-domain.beta.kubernetes.io/zone on older clusters) and behave correctly.

Describe alternatives you've considered

  • Hard-coding the topology.ebs.csi.aws.com/zone label into Cluster Autoscaler's AWS support, the same way the standard zone label is hard-coded.
  • Manually including the tag k8s.io/cluster-autoscaler/node-template/label/topology.ebs.csi.aws.com/zone on each eksctl node group definition to which the EBS CSI DaemonSet will be deployed, i.e. manually matching the local taint/toleration setup (see the sketch below).

See kubernetes/autoscaler#3230 for more details on these alternatives.
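
For illustration, here's a minimal sketch of that second workaround as an eksctl ClusterConfig fragment. The cluster name, region, and node group name are hypothetical; the tag key follows Cluster Autoscaler's node-template convention:

```yaml
# Hypothetical eksctl config fragment: tag the node group so Cluster
# Autoscaler's node template knows that nodes scaled up from zero will
# carry the driver's zone label.
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: example-cluster   # hypothetical cluster name
  region: us-west-2       # hypothetical region
nodeGroups:
  - name: ng-us-west-2a   # hypothetical node group
    availabilityZones: ["us-west-2a"]
    tags:
      k8s.io/cluster-autoscaler/node-template/label/topology.ebs.csi.aws.com/zone: us-west-2a
```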

Additional context

I'm not sure what the original reason was for using a new label, which is why I suggest making it an option rather than changing the default. I'm guessing that in non-EKS deployments on EC2, the standard zone label may not reliably be the AWS zone if the AWS Cloud Controller is not in use?

That said, if the reason is that the label must be owned by the driver for spec or behaviour reasons, then this wouldn't be feasible at all, and we'll probably be stuck manually ensuring that the node groups with the tag correctly match the tolerations of the DaemonSet.

@ayberk
Contributor

ayberk commented Feb 16, 2021

I really don't know why we didn't use the well-known label. @wongma7 @leakingtapan maybe?

I agree we should just use them instead. Initially we should support both labels though so we can provide a deprecation warning.

@leakingtapan
Contributor

leakingtapan commented Feb 16, 2021

It is because the well-known label wasn't well-known yet by the time this driver's topology key was set. There used to be a separate label failure-domain.beta.kubernetes.io/zone and it got renamed to topology.kubernetes.io/zone as part of the work from sig-cloudprovider. See here. And see here for when this driver's label was set.

Given that the driver is not GA yet, I think it makes sense to fix the label before GA if there are no other concerns. But also keep in mind to update the CSI translation lib for CSI migration here, to avoid breaking CSI migration.

Hope this helps

@leakingtapan
Contributor

related to: #675

@ayberk
Contributor

ayberk commented Feb 17, 2021

That's really helpful, thanks a lot @leakingtapan!

Since this requires k/k changes, I'll try to come up with a plan.

/assign

@ayberk
Contributor

ayberk commented Feb 26, 2021

We'll need to gracefully retire the current label (topology key), so we need to start with writing both.

  1. Driver starts writing both keys: topology.ebs.csi.aws.com/zone and topology.kubernetes.io/zone. For topology constraints it should use the new one if available, or fall back to the old one.
  2. Translation lib similarly has to read both when converting to in-tree. However, we have two options while converting from in-tree to CSI:
    1. Only write the new one. This would break backward compatibility, and older versions of the driver wouldn't be able to read the topology key.
    2. Write both keys so we don't break older versions. Since we're already beta, this is preferred.
  3. At this point all e2e tests should still pass.
  4. Update the e2e tests to use the new key. Unit tests should suffice for supporting the old key.
  5. Remove the old key. Eventually.

Skipping the beta labels should be safe since they're only used by the translation lib.

cc @wongma7 for sanity check

@wongma7
Contributor

wongma7 commented Feb 27, 2021

sounds good to me. I understand this concretely to mean:

NodeGetInfo must respond such that CSINode.topologyKeys has both oldKey and newKey.

CreateVolume, if provided a topology requirement in the request, must extract the zone from newKey if it exists, or oldKey if not

CreateVolume must respond such that PVs have nodeAffinity.nodeSelector oldKey==A || newKey==A (by returning an AccessibleTopology array containing both)
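
As a rough sketch (not the driver's exact output; names and the zone value are illustrative), the resulting API objects would look something like:

```yaml
# Sketch: CSINode after NodeGetInfo reports both topology keys
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  name: ip-10-0-0-1.example.internal   # hypothetical node name
spec:
  drivers:
    - name: ebs.csi.aws.com
      nodeID: i-0123456789abcdef0      # hypothetical EC2 instance ID
      topologyKeys:
        - topology.ebs.csi.aws.com/zone
        - topology.kubernetes.io/zone
---
# Sketch: PV produced from an AccessibleTopology array carrying both keys.
# nodeSelectorTerms are ORed, which gives oldKey==A || newKey==A.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-example                     # hypothetical PV name
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.ebs.csi.aws.com/zone
              operator: In
              values: ["us-west-2a"]
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values: ["us-west-2a"]
```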

@TBBle
Author

TBBle commented Mar 28, 2021

If I'm following correctly, landing #773 would be sufficient for Cluster Autoscaler to operate correctly with --balancing-ignore-label=topology.ebs.csi.aws.com/zone, because #773 is ensuring that it has the same value as topology.kubernetes.io/zone. So CA can rely on the well-known label for "will this Pod be schedulable on this node-template", even though csi-translation-lib is hinging off the deprecated-but-identical value when it actually interacts with the scheduler.

It looks like #773 missed the 0.10 release branch, so I guess it'll be part of 0.11?
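
For reference, a sketch of how that --balancing-ignore-label flag might be wired into a Cluster Autoscaler Deployment; the image tag and companion flags here are illustrative:

```yaml
# Fragment of a cluster-autoscaler Deployment pod spec (illustrative)
containers:
  - name: cluster-autoscaler
    image: registry.k8s.io/autoscaling/cluster-autoscaler:v1.26.0  # hypothetical tag
    command:
      - ./cluster-autoscaler
      - --cloud-provider=aws
      - --balance-similar-node-groups=true
      - --balancing-ignore-label=topology.ebs.csi.aws.com/zone
```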

@ayberk
Contributor

ayberk commented Mar 29, 2021

You're right, #773 is enough. The remaining work is irrelevant to this use case.

I can see it in the release PR. We haven't released it yet, but 0.10 will include it. See #811

@denniswebb

We are running 0.10 and hit an issue where it complains about not having a topology key on the node when provisioning a new volume from a PVC. Does #773 remove the need for the key topology.ebs.csi.aws.com/zone, or is it still required for now?

@ayberk
Contributor

ayberk commented Apr 6, 2021

For now you should still use both. The driver itself prioritizes topology.kubernetes.io/zone, but if it's not present it falls back to topology.ebs.csi.aws.com/zone. We haven't updated csi-translation-lib upstream yet though, so we're still writing and reading both keys.

Do you mind opening an issue, ideally with more details?

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 5, 2021
@wongma7
Contributor

wongma7 commented Jul 6, 2021

/remove-lifecycle stale

Label best practices need discussion with Kubernetes upstream. It's not clear at this time whether we should be putting the standard label on nodes, or if the standard label should be reserved for use by in-tree drivers (which simplifies migration, because migration translates from our custom label to the standard label).

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 6, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 4, 2021
@k1rk

k1rk commented Oct 5, 2021

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 5, 2021
elmiko added a commit to elmiko/cluster-autoscaler-operator that referenced this issue Nov 4, 2021
when balancing similar nodes, the autoscaler will look for differences
in labels as one of its checks for similarity. by default it will ignore
the well known kubernetes zone label (topology.kubernetes.io/zone), but
the aws-ebs-csi-driver will add a unique label for its zone,
`topology.ebs.csi.aws.com/zone`. this label will eventually be
deprecated in favor of the common well known topology label. for more
information about this see the references.

ref: kubernetes/autoscaler#3230
ref: kubernetes-sigs/aws-ebs-csi-driver#729
elmiko added a commit to elmiko/kubernetes-autoscaler that referenced this issue Nov 10, 2021
This change adds the aforementioned label to the list of ignored labels
in the AWS nodegroupset processor. This change is being made in response
to the addition of this label by the aws-ebs-csi-driver. This label will
eventually be deprecated by the driver, but its use will prevent AWS
users from properly balancing similar nodes. Also adds unit test for the
AWS processor.

ref: kubernetes#3230
ref: kubernetes-sigs/aws-ebs-csi-driver#729
@ConnorJC3
Contributor

/close

fixed in #1360 - available in the driver since version v1.12.0

@k8s-ci-robot
Contributor

@ConnorJC3: Closing this issue.

In response to this:

/close

fixed in #1360 - available in the driver since version v1.12.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jbg

jbg commented Mar 6, 2023

@ConnorJC3 I think this should be re-opened. From what I can see, the fix was reverted because it was causing duplicate topology selector terms in the CSI migration case.

So it's still the case, even with the latest driver version, that provisioned PVs select only on the non-standardised topology.ebs.csi.aws.com/zone label, and don't work out of the box with cluster-autoscaler when scaling node groups up from zero. (You have to manually label the ASGs with a node template so that cluster-autoscaler knows that the first node will have the topology.ebs.csi.aws.com/zone label once it starts up.)

One way to solve this without harming compatibility with existing clusters would be to provide an option, disabled by default, to use the standardised topology.kubernetes.io/zone label. This seems to be what is discussed in #962, which was closed due to inactivity.

@ConnorJC3
Contributor

Ah, that is correct, I will reopen this, sorry. I will look into fixing this completely in the near-ish future.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 4, 2023
@z0rc

z0rc commented Jun 5, 2023

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 5, 2023
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 21, 2024
@z0rc

z0rc commented Jan 22, 2024

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 22, 2024
@ConnorJC3
Contributor

ConnorJC3 commented Feb 13, 2024

/lifecycle frozen

Hi, sorry about the delay on this, the team has decided on a plan:

We do want to move over to the standard labels, but don't want to immediately do so as it breaks downgrading and is a minor breaking change. Thus, the plan is:

  1. Immediately change the node to report both our non-standard label as well as the standard label. This enables forward compatibility for a future release when the topology on the volume is switched over (being worked on in #1931: Report zone via well-known topology key in NodeGetInfo).
  2. Announce, in the next release of the EBS CSI Driver (currently planned for mid/late February), the intent to change the topology label for created volumes to the standard label in a future release.
  3. In some future release (exact timeline TBD, but likely 2-3 monthly releases after the announcement), switch the driver over to report the standard label for all created volumes instead of the non-standard label. After upgrading to this version, users will only be able to downgrade to the February release or later (or else their volumes will not be able to schedule). See the sketch after this list.
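
A rough sketch of what step 3 changes on a newly created volume's PV (zone value illustrative). Nodes keep serving both labels, so volumes created before the switch still schedule:

```yaml
# Before step 3: PVs created by the driver select on the non-standard key
nodeAffinity:
  required:
    nodeSelectorTerms:
      - matchExpressions:
          - key: topology.ebs.csi.aws.com/zone
            operator: In
            values: ["us-east-1a"]
---
# After step 3: new PVs select on the standard key instead
nodeAffinity:
  required:
    nodeSelectorTerms:
      - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values: ["us-east-1a"]
```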

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Feb 13, 2024
@lgfa29

lgfa29 commented Mar 7, 2024

Hi @ConnorJC3 👋

I am an engineer on the Nomad team and I think #1931 may have created an irreconcilable topology.

With v1.28.0, the created volume only has the topology segment "topology.ebs.csi.aws.com/zone": "eu-west-2a", while the node has both domains in the same segment. Since segments are compared with AND, volumes are never able to be scheduled, as sketched below.
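
A pseudo-structure sketch of the mismatch (not actual Nomad syntax; values from the example above):

```yaml
# Volume created by driver v1.28.0: a single segment with only the driver key.
volume_topology:
  segments:
    topology.ebs.csi.aws.com/zone: eu-west-2a

# Node: one segment carrying both keys. A scheduler that requires every key
# in the node segment to match (AND) can never satisfy it from the volume's
# single-key segment.
node_topology:
  segments:
    topology.ebs.csi.aws.com/zone: eu-west-2a
    topology.kubernetes.io/zone: eu-west-2a
```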

On step 3, if the volumes only receive the new label, would the non-standard label be removed from node as well? Otherwise the irreconcilable topology would still be a problem.

Apologies for my ignorance on this, but where, currently, is the standard label applied to the volume on creation? Otherwise I would imagine the same problem would be happening on Kubernetes clusters?

Thank you!

@ConnorJC3
Contributor

Hi @lgfa29 - My understanding of the CSI spec is that a volume with key topology.ebs.csi.aws.com/zone should match a node with keys topology.ebs.csi.aws.com/zone and topology.kubernetes.io/zone (as long as the value for topology.ebs.csi.aws.com/zone matches). In other words, essentially extra key(s) on the node should not be considered a mismatch.

This is also how it is implemented in Kubernetes: https://github.com/kubernetes/kubernetes/blob/909faa3a9b33161992af0684c5582353b5be6d9b/pkg/scheduler/framework/plugins/volumezone/volume_zone.go#L187-L193

// ... The requirement is that any
// volume zone-labels must match the equivalent zone-labels on the node.  It is OK for
// the node to have more zone-label constraints (for example, a hypothetical replicated
// volume might allow region-wide access)

On step 3, if the volumes only receive the new label, would the non-standard label be removed from node as well? Otherwise the irreconcilable topology would still be a problem.

The intent is to serve both labels from the node indefinitely, to support volumes created with the old topology key to continue to work on new versions of the EBS CSI Driver.

Apologies for my ignorance on this, but where, currently, is the standard label applied to the volume on creation?

Currently volumes receive topology.ebs.csi.aws.com/zone upon creation. We intend at some future date to switch this to the now-standard label topology.kubernetes.io/zone.

@lgfa29

lgfa29 commented Mar 7, 2024

Thank you for the extra information @ConnorJC3.

I've opened hashicorp/nomad#20094 in the Nomad repo for us to track this.

graveland added a commit to timescale/aws-ebs-csi-driver that referenced this issue Jul 30, 2024
This is the latest version where the topology key is still
topology.ebs.csi.aws.com/zone. Version 1.33.0 switched to
the well-known topology.kubernetes.io/zone, and is a breaking
change that we'll have to plan for.

kubernetes-sigs#729 (comment)
@ConnorJC3
Contributor

ConnorJC3 commented Aug 6, 2024

/close

As of v1.33.0 of the EBS CSI Driver, the driver creates volumes with the AZ topology label topology.kubernetes.io/zone. As a result, the three-step plan from #729 (comment) is complete. See the CHANGELOG for more info.

@k8s-ci-robot
Contributor

@ConnorJC3: Closing this issue.

In response to this:

/close

As of v1.33.0 of the EBS CSI Driver, the driver creates volumes with the AZ topology label topology.kubernetes.io/zone. As a result, the three-step plan from #729 (comment) is complete.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
