Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(scheduling): add volume group capacity tracking #21

Merged
merged 1 commit into from
Mar 10, 2021

Conversation

iyashu
Copy link
Contributor

@iyashu iyashu commented Feb 11, 2021

Why is this PR required? What issue does it fix?:
With k8s version >= 1.19, there is feature(alpha stage) added where the kube-scheduler gets influenced by storage capacity available on nodes during filtering stage. Without CSIStorageCapacity tracking, a pod with a volume that uses delayed binding may get scheduled multiple times, but might always land on the same node unless there are multiple nodes with equal priority. More details are mentioned in KEP 1472 & external provisioner documentation.

What this PR does?:
We introduced a new custom resource called LVMNode recording all the available vgs and its corresponding attributes in each node. Openebs node plugin running on each node will periodically scan vgs and reconcile LVMNode resource. At controller side, we've implemented CSI GetCapacity method called by k8s external-provisioner for creating/updating CSIStorageCapacity objects which in turn used by kube-scheduler for each node topology. K8s external-provisioner calls GetCapacity for every combination of storage class and node topology segments (individual nodes in our case). So, volume group configured in storage class is also a part of args in GetCapacity method. In case, there are multiple volume group (with multi pool support) available on nodes, we choose the volume group having maximum free capacity & its name matching to volgroup parameter of the GetCapacity call.

Does this PR require any upgrade changes?:
No

If the changes in this PR are manually verified, list down the scenarios covered::
Consider a kubernetes cluster having 5 worker nodes (besides master ones) each having a storage capacity of size 32 Gi. Create a stateful set cluster say sts-a of size 4 each requesting storage capacity(pvc) of size 20Gi. Now create another stateful set cluster say sts-b of size 1 again requesting storage capacity(pvc) of size 20Gi. We'll see that pod in sts-b will be scheduled (by kube-scheduler) on right node having enough capacity.

Any additional information for your reviewer? :
Mention if this PR is part of any design or a continuation of previous PRs

Checklist:

  • Fixes #
  • PR Title follows the convention of <type>(<scope>): <subject>
  • Has the change log section been updated?
  • Commit has unit tests
  • Commit has integration tests
  • (Optional) Are upgrade changes included in this PR? If not, mention the issue/PR to track:
  • (Optional) If documentation changes are required, which issue on https://github.com/openebs/openebs-docs is used to track them:

@codecov-io
Copy link

Codecov Report

Merging #21 (ef941f5) into master (311e12b) will decrease coverage by 0.11%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff            @@
##           master     #21      +/-   ##
=========================================
- Coverage    1.20%   1.08%   -0.12%     
=========================================
  Files          11      11              
  Lines         831     920      +89     
=========================================
  Hits           10      10              
- Misses        821     910      +89     
Impacted Files Coverage Δ
pkg/driver/agent.go 0.00% <0.00%> (ø)
pkg/driver/controller.go 0.63% <0.00%> (-0.15%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 311e12b...ef941f5. Read the comment docs.

@iyashu
Copy link
Contributor Author

iyashu commented Feb 16, 2021

Gentle reminder @pawanpraka1 @akhilerm

pkg/driver/controller.go Outdated Show resolved Hide resolved
@pawanpraka1 pawanpraka1 added this to the v0.3 milestone Feb 17, 2021
# to generate the CRD definition

---
apiVersion: apiextensions.k8s.io/v1beta1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can start using v1 CRDs version ..v1beta going to be deprecated in k8s 1.22

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had initially configured the same, then saw that other crds (LVMSnapshot, LVMVolume etc) are using v1beta1. I think, we can take this up as part of separate pull request as it'll required changes to other crds as well.

kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.2.8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can start using the controller-gen verison v0.4.0 version which will basically generates the v1 CRD version by default.

@iyashu iyashu force-pushed the storage-capacity branch 2 times, most recently from 878542c to e401866 Compare March 2, 2021 13:14
@iyashu iyashu force-pushed the storage-capacity branch from e401866 to bf82f9f Compare March 5, 2021 18:26
Copy link
Contributor

@akhilerm akhilerm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@prateekpandey14 prateekpandey14 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, some of the changes like CRDs version etc going to be fixed in upcoming PRs

Copy link
Contributor

@pawanpraka1 pawanpraka1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@iyashu why do we need watcher for lvmnode object? does node daemonset need to take any action as per modification/updation of lvmnode object?

pkg/driver/agent.go Show resolved Hide resolved
@pawanpraka1 pawanpraka1 merged commit 3df971c into openebs:master Mar 10, 2021
akhilerm pushed a commit to akhilerm/lvm-localpv that referenced this pull request Mar 10, 2021
pawanpraka1 pushed a commit that referenced this pull request Mar 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants