
feat(provisioning): add support for multiple vg to use for provisioning #28

Merged
merged 1 commit into openebs:master on Apr 15, 2021

Conversation

iyashu
Contributor

@iyashu iyashu commented Mar 3, 2021

Why is this PR required? What issue does it fix?:
See #17 for more details.

What this PR does?:
Since the volgroup param now represents a regex, the controller passes it to the node plugin by setting an additional field called VolGroupRegex on the lvmvolume resource. The node plugin controller then chooses a vg matching the provided regex and sets the VolGroup field on the lvmvolume. If multiple vgs match the regex, the node plugin controller picks the one with the minimum free space (bin packing) still able to accommodate the requested volume capacity.
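
A minimal sketch of that selection logic, with hypothetical type and function names (the actual code in this PR may differ):

package lvm

import "regexp"

// VolumeGroup is a hypothetical stand-in for the node plugin's VG type.
type VolumeGroup struct {
	Name string
	Free int64 // free bytes available in the VG
}

// pickVolGroup picks, among VGs whose names match the regex, the one with the
// least free space that still fits the requested capacity (bin packing), so
// larger VGs stay available for larger volumes.
func pickVolGroup(vgs []VolumeGroup, re *regexp.Regexp, capacity int64) (*VolumeGroup, bool) {
	var best *VolumeGroup
	for i := range vgs {
		vg := &vgs[i]
		if !re.MatchString(vg.Name) || vg.Free < capacity {
			continue
		}
		if best == nil || vg.Free < best.Free {
			best = vg
		}
	}
	return best, best != nil
}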

Does this PR require any upgrade changes?:

If the changes in this PR are manually verified, list down the scenarios covered:
Consider a k8s cluster having 4 nodes with vgs [lvmvg-a, lvmvg-b, lvmvg-a, xyzvg]. Configure the storage class by setting the volgroup parameter to lvmvg* (a regex denoting the lvmvg prefix). After creating a StatefulSet of size 4, the PVCs get scheduled only on the first 3 nodes (since xyzvg doesn't match the volgroup regex).
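
A sketch of such a storage class (the provisioner name is an assumption here; note the parameter is renamed to vgpattern later in this review):

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: openebs-lvmpv            # hypothetical name
provisioner: local.csi.openebs.io   # assumed lvm-localpv provisioner name
parameters:
  volgroup: "lvmvg*"             # treated as a regex: matches lvmvg-a and lvmvg-b, not xyzvg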

Any additional information for your reviewer? :
Mention if this PR is part of any design or a continuation of previous PRs
This pull request is dependent on #21. So, we need to close that first before closing this.

Checklist:

@akhilerm akhilerm requested review from pawanpraka1 and akhilerm and removed request for pawanpraka1 March 12, 2021 07:33
@pawanpraka1 pawanpraka1 added this to the v0.4 milestone Mar 17, 2021
@@ -62,9 +62,14 @@ type VolumeInfo struct {

// VolGroup specifies the name of the volume group where the volume has been created.
// +kubebuilder:validation:Required
// +kubebuilder:validation:MinLength=1
Contributor
remove the required option and add omitempty, json:"volGroup,omitempty".
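
A minimal sketch of the suggested change, with the struct abbreviated to the field under discussion:

package apis

// VolumeInfo is abbreviated here to the field under discussion.
type VolumeInfo struct {
	// VolGroup specifies the name of the volume group where the volume has
	// been created. With the required marker removed and omitempty added,
	// it can stay empty until the node plugin fills it in.
	VolGroup string `json:"volGroup,omitempty"`
}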

Contributor Author
I see the VolumeInfo struct is also being used in the LVMSnapshot crd. I was not sure whether removing the required validation has any effect on it. Can you help me understand what the required validation does for a string type? Does the api-server check that the string is non-empty in case of required? I'm not able to find any official documentation on this.

@@ -250,7 +249,7 @@ func CreateLVMVolume(ctx context.Context, req *csi.CreateVolumeRequest,
}
}

- nmap, err := getNodeMap(params.Scheduler, params.VolumeGroup)
+ nmap, err := getNodeMap(params.Scheduler, params.VgRegex)
Contributor
We should use VolumeGroup if VgRegex is empty here. What will happen if someone upgrades and has a storageclass with volgroup only?

Contributor Author
Here we are not introducing any new storage class param, but rather reusing the same volgroup param. See the other related comment below.

pkg/driver/params.go (outdated, resolved)
@@ -130,6 +130,11 @@ func CreateVolume(vol *apis.LVMVolume) error {

// DestroyVolume deletes the lvm volume
func DestroyVolume(vol *apis.LVMVolume) error {
if vol.Spec.VolGroup == "" {
Contributor
can it happen?

Contributor Author
Maybe in one corner scenario where deletion of the lvmvol happens just after creation but before provisioning, e.g. our creation workflow fails for some reason and gets retried after some time, but a deletion event gets triggered in between.
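
A minimal sketch of the guard this diff adds, with hypothetical types and a hypothetical deleteLV helper standing in for the actual LV removal:

package lvm

// LVMVolume is abbreviated to the fields this sketch needs.
type LVMVolume struct {
	Name string
	Spec struct{ VolGroup string }
}

// deleteLV is a hypothetical helper standing in for the actual lvremove call.
func deleteLV(vg, name string) error { return nil }

// DestroyVolume deletes the lvm volume, guarding the corner case where the
// lvmvolume object is deleted before the node plugin ever picked a vg.
func DestroyVolume(vol *LVMVolume) error {
	if vol.Spec.VolGroup == "" {
		// Nothing was provisioned yet, so there is no LV to remove.
		return nil
	}
	return deleteLV(vol.Spec.VolGroup, vol.Name)
}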

params.VolumeGroup = m["volgroup"]

var err error
if params.VgRegex, err = regexp.Compile(m["volgroup"]); err != nil {
Contributor
Shouldn't we use a volgroupregex key here? Why are we using the volgroup key here?

Contributor Author
With multi-pool support, I think it's not required to introduce another storage class parameter. Here we are using the same existing volgroup parameter, but with slightly different semantics, i.e. the value is now treated as a regex. If the value doesn't contain any special regex characters, the existing semantics remain the same (i.e. an exact match).
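
A minimal sketch of those semantics, assuming a hypothetical list of vg names on the node:

package lvm

import "regexp"

// matchVolGroups returns the vg names matching the volgroup parameter
// compiled as a regex. A plain name without regex metacharacters keeps the
// old behaviour (note that Go's MatchString is unanchored, so the pattern
// matches anywhere in the name unless anchored with ^ and $).
func matchVolGroups(volgroup string, vgNames []string) ([]string, error) {
	re, err := regexp.Compile(volgroup)
	if err != nil {
		return nil, err
	}
	var matched []string
	for _, name := range vgNames {
		if re.MatchString(name) {
			matched = append(matched, name)
		}
	}
	return matched, nil
}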

@pawanpraka1
Contributor

pawanpraka1 commented Mar 19, 2021

@iyashu let me know what you think about this approach:

  1. have the same volgroup parameter in the storageclass, which can be used both for a regex and for providing a volgroup name
  2. the same parameter will be used in the LVMVolume CRD (VolGroup) to pass to the node daemonset by the controller
  3. the node daemonset will find the volume group on the node based on whether it is a regex or a full name
  4. once it finds the best volume group for provisioning the volume, it will update the LVMVolume CRD with that volume group name (update the VolGroup parameter); also, we can put the regex in an annotation on this LVMVolume CRD for debug/other purposes.
  5. now we don't need an extra field and the same VolGroup can be used to provide the regex as well.

@iyashu
Contributor Author

iyashu commented Mar 19, 2021

  4. once it finds the best volume group for provisioning the volume, it will update the LVMVolume CRD with that volume group name (update the VolGroup parameter); also, we can put the regex in an annotation on this LVMVolume CRD for debug/other purposes.

@pawanpraka1 there are some failure recovery scenarios missing in the above approach. Say we provisioned the volume as per the VolGroup regex attribute, but failed to update the same in the api-server (say the kube-api-server is failing intermittently). In that case, the controller retry may end up creating a new lv on some other vg. Although we could check whether that lv has already been created on some vg by parsing the lvs output, it would unnecessarily add complexity to maintaining idempotency in the creation as well as deletion workflows of the control loop. Also, putting the regex in an annotation may not be the right k8s crd development approach, since annotations are not designed for storing spec, just metadata. Here volRegex is the spec (intent) from the controller to the node agent to provision the volume, rather than some additional metadata.

It may not be a good idea to represent different semantics with the same attribute in the crd. First we consider volGroup as intent (a regex) for provisioning the volume, and after provisioning we consider it as output. Since this attribute is used across the project (including the LVMSnapshot crd), it will confuse us wherever we reference it whether it should be treated as intent or output, and one would need to check the status field (i.e. whether it's ready or not) every time to assert it.

With the current implementation, ideally we can move the volGroup field from spec to status in the future, once we do some refactoring, especially decoupling the snapshot crd from the lvm volume spec. But till then we can just mentally think of it as part of the status.

@pawanpraka1
Contributor

In that case, the controller retry may end up creating a new lv on some other vg.

We should handle this case, as it can happen with the current approach also, when it fails to update the status to Ready.

We need to keep the storageclass parameters in sync with the lvm volume CRDs. I agree we should not have different semantics for the same argument, and we can solve that by passing the regex in the annotation, as this is meta info for the lvm volume, and keeping volgroup empty until the node agent fills in that info.

@iyashu
Contributor Author

iyashu commented Mar 30, 2021

@pawanpraka1 updated the pull request as per our discussion today, ptal.

@pawanpraka1
Contributor

@iyashu can you update the storageclass parameter name to vgpattern?

@iyashu
Contributor Author

iyashu commented Apr 11, 2021

@iyashu can you update the storageclass parameter name to vgpattern?

@pawanpraka1 done!

return err
}
if err = lvm.CreateVolume(vol); err == nil {
return lvm.UpdateVolInfo(vol, lvm.LVMStatusReady)
Contributor
there could be a leak here if UpdateVolInfo fails and it picks a new vg for creating the volume. We need to fix it @iyashu

Contributor Author
@iyashu iyashu Apr 14, 2021
@pawanpraka1 I think there will not be any leak here. We set the vg name in the spec first, using lvm.UpdateVolGroup(vol, vg.Name) at line 124 above, before creating the lv. Let's say UpdateVolInfo fails here; in the next sync iteration we first check whether the vg is set in the spec, and accordingly call lvm.CreateVolume with the same vg and then UpdateVolInfo (see lines 104-109). As a result, we'll keep retrying UpdateVolInfo if it keeps failing, with backoff set by the workqueue.
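
A minimal sketch of why this flow is leak-free, with hypothetical helper names standing in for the actual ones:

package lvm

// Hypothetical stand-ins for the node agent's actual types and helpers.
type LVMVolume struct {
	Name string
	Spec struct{ VolGroup string }
}

func pickBestVG(vol *LVMVolume) (string, error)      { return "lvmvg-a", nil } // regex match + bin packing
func updateVolGroup(vol *LVMVolume, vg string) error { vol.Spec.VolGroup = vg; return nil }
func createLV(vol *LVMVolume) error                  { return nil } // assumed idempotent if the LV already exists
func markReady(vol *LVMVolume) error                 { return nil } // the UpdateVolInfo step

// syncVolume shows why a failed markReady cannot leak an LV: the chosen vg is
// persisted into the spec before the LV is created, so a retry reuses the
// same vg instead of picking a new one.
func syncVolume(vol *LVMVolume) error {
	if vol.Spec.VolGroup == "" {
		vg, err := pickBestVG(vol)
		if err != nil {
			return err
		}
		// Persist the choice first; if this update fails, no LV exists yet.
		if err := updateVolGroup(vol, vg); err != nil {
			return err
		}
	}
	// From here, retries always target the same vg; only markReady is
	// re-attempted (with workqueue backoff) until it succeeds.
	if err := createLV(vol); err != nil {
		return err
	}
	return markReady(vol)
}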

Contributor
sounds good. Thanks @iyashu

Contributor

@pawanpraka1 pawanpraka1 left a comment

looks good.

@pawanpraka1 pawanpraka1 modified the milestones: v0.4, v0.5 Apr 15, 2021
@pawanpraka1 pawanpraka1 merged commit 4465d84 into openebs:master Apr 15, 2021

Successfully merging this pull request may close these issues.

support for multiple volume group within a given node