feat(scheduling): add volume group capacity tracking #21
Conversation
Force-pushed from 48285d7 to ef941f5
Codecov Report
Coverage Diff (master vs #21):
- Coverage: 1.20% → 1.08% (-0.12%)
- Files: 11 → 11
- Lines: 831 → 920 (+89)
- Hits: 10 → 10
- Misses: 821 → 910 (+89)
Continue to review full report at Codecov.
Gentle reminder @pawanpraka1 @akhilerm
Force-pushed from 4bdd2cf to 263467b
# to generate the CRD definition

---
apiVersion: apiextensions.k8s.io/v1beta1
We can start using the v1 CRD version; v1beta1 is going to be deprecated in k8s 1.22.
Yeah, I had initially configured the same, then saw that the other CRDs (LVMSnapshot, LVMVolume etc.) are using v1beta1. I think we can take this up as part of a separate pull request, as it'll require changes to the other CRDs as well.
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.2.8
We can start using controller-gen version v0.4.0, which generates the v1 CRD version by default.
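For reference, a minimal sketch of how the generated header might look after moving to controller-gen v0.4.0 and the v1 CRD API; the CRD name shown is an assumption for the new LVMNode resource and is not taken from this PR's diff:

```yaml
# Hedged sketch only: assumes the CRD group local.openebs.io and plural
# name lvmnodes; the actual generated file may differ.
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.4.0
  name: lvmnodes.local.openebs.io
# (spec omitted)
```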
Force-pushed from 878542c to e401866
Signed-off-by: Yashpal Choudhary <[email protected]>
LGTM
LGTM, some of the changes like the CRDs version etc. are going to be fixed in upcoming PRs.
@iyashu why do we need a watcher for the lvmnode object? Does the node daemonset need to take any action on modification/update of the lvmnode object?
Signed-off-by: Yashpal Choudhary <[email protected]>
Signed-off-by: Yashpal Choudhary <[email protected]>
Why is this PR required? What issue does it fix?:
With k8s version >= 1.19, there is a feature (alpha stage) where the kube-scheduler takes the storage capacity available on nodes into account during the filtering stage. Without CSIStorageCapacity tracking, a pod with a volume that uses delayed binding may get scheduled multiple times, but might always land on the same node unless there are multiple nodes with equal priority. More details are mentioned in KEP 1472 & the external-provisioner documentation.
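For context (not part of this PR's diff), the external-provisioner only publishes CSIStorageCapacity objects when the CSIDriver object opts in to capacity tracking. A minimal sketch, assuming the driver name local.csi.openebs.io and a k8s >= 1.19 cluster with the CSIStorageCapacity feature gate enabled:

```yaml
# Hedged sketch: driver name and field values are assumptions for illustration.
apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: local.csi.openebs.io
spec:
  attachRequired: false
  podInfoOnMount: false
  storageCapacity: true   # alpha in k8s 1.19; lets external-provisioner create CSIStorageCapacity objects
```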
What this PR does?:
We introduced a new custom resource called LVMNode, recording all the available VGs and their corresponding attributes on each node. The OpenEBS node plugin running on each node will periodically scan the VGs and reconcile the LVMNode resource. On the controller side, we've implemented the CSI GetCapacity method, which is called by the k8s external-provisioner for creating/updating CSIStorageCapacity objects, which in turn are used by the kube-scheduler for each node topology. The k8s external-provisioner calls GetCapacity for every combination of storage class and node topology segment (individual nodes in our case), so the volume group configured in the storage class is also part of the args in the GetCapacity call. In case there are multiple volume groups (with multi-pool support) available on a node, we choose the volume group having the maximum free capacity whose name matches the volgroup parameter of the GetCapacity call.

Does this PR require any upgrade changes?:
No
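To make the flow above concrete, here is a rough sketch of the kind of CSIStorageCapacity object the external-provisioner would create per (storage class, node) pair from the GetCapacity response. All names, the namespace, the topology key and the capacity value are illustrative assumptions, and the API version was still alpha in k8s 1.19:

```yaml
# Hedged sketch: object name, namespace, topology key and capacity are
# illustrative; the real objects are generated by the external-provisioner.
apiVersion: storage.k8s.io/v1alpha1
kind: CSIStorageCapacity
metadata:
  name: csisc-worker-1-openebs-lvmpv
  namespace: kube-system
storageClassName: openebs-lvmpv
nodeTopology:
  matchLabels:
    kubernetes.io/hostname: worker-1
capacity: 12Gi   # free space reported by GetCapacity for the matching VG on this node
```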
If the changes in this PR are manually verified, list down the scenarios covered:
Consider a kubernetes cluster having 5 worker nodes (besides the master ones), each having a storage capacity of 32Gi. Create a stateful set, say sts-a, of size 4, with each replica requesting storage capacity (PVC) of 20Gi. Now create another stateful set, say sts-b, of size 1, again requesting storage capacity (PVC) of 20Gi. We'll see that the pod in sts-b will be scheduled (by the kube-scheduler) on the right node having enough capacity (see the illustrative manifest sketched below).

Any additional information for your reviewer?:
Mention if this PR is part of any design or a continuation of previous PRs
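For the verification scenario above, the stateful sets would look roughly like the sketch below; the storage class name, container image and command are assumptions used only to illustrate the setup (sts-b would be identical apart from its name and replicas: 1):

```yaml
# Hedged sketch of sts-a for the manual verification scenario; storage class
# name and image are assumptions, not taken from this PR.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: sts-a
spec:
  serviceName: sts-a
  replicas: 4
  selector:
    matchLabels:
      app: sts-a
  template:
    metadata:
      labels:
        app: sts-a
    spec:
      containers:
      - name: app
        image: busybox
        command: ["sleep", "3600"]
        volumeMounts:
        - name: data
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: openebs-lvmpv   # assumed storage class with delayed (WaitForFirstConsumer) binding
      resources:
        requests:
          storage: 20Gi
```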
Checklist:
<type>(<scope>): <subject>