Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support unclear stale pvc #167

Merged
merged 4 commits into from
Nov 29, 2023

Conversation

njuptlzf
Copy link
Contributor

@njuptlzf njuptlzf commented Oct 9, 2023

Pull Request template

Why is this PR required? What issue does it fix?:
fix: #166

What this PR does?:
Provide a switch to turn off the function of clearing stale pvc to avoid being cleared by mistake after restoring stale pvc.

Does this PR require any upgrade changes?:
If you need to turn off the clear stale pvc function, you need to provide environment variables (enabled by default)

        # The NFSGarbageCollectionEnable environment variable is the switch for the garbage collector.(default true)
        - name: OPENEBS_IO_NFS_SERVER_GARBAGE_COLLECTION_ENABLED
          value: "false"

If the changes in this PR are manually verified, list down the scenarios covered::

Any additional information for your reviewer? :
Mention if this PR is part of any design or a continuation of previous PRs

Checklist:

PLEASE REMOVE BELOW INFORMATION BEFORE SUBMITTING

The PR title message must follow convention:
<type>(<scope>): <subject>.

Where:
Most common types are:
* feat - for new features, not a new feature for build script
* fix - for bug fixes or improvements, not a fix for build script
* chore - changes not related to production code
* docs - changes related to documentation
* style - formatting, missing semi colons, linting fix etc; no significant production code changes
* test - adding missing tests, refactoring tests; no production code change
* refactor - refactoring production code, eg. renaming a variable or function name, there should not be any significant production code changes
* cherry-pick - if PR is merged in master branch and raised to release branch(like v0.4.x)

IMPORTANT: Please review the CONTRIBUTING.md file for detailed contributing guidelines.

@njuptlzf njuptlzf force-pushed the cleanup_stale_pvc_enable branch from 199a938 to e99bc67 Compare October 9, 2023 03:58
@njuptlzf
Copy link
Contributor Author

njuptlzf commented Oct 9, 2023

1. My backup method

kubectl exec -n velero $(kubectl get po -n velero -l component=velero -oname | head -n 1) -it -- /velero backup create --include-namespaces=openebs --selector=nfs.openebs.io /nfs-pvc-name=nfs-pvc bk-pv-1

2. Create Disaster

# kubectl delete ns nginx-example

3. My way to restore

  1. Restart nfs-provisioner pod to turn off stale pvc automatic cleaning
         - name: OPENEBS_IO_NFS_SERVER_GARBAGE_COLLECTION_ENABLED
           value: "false"
  1. Restore the rwo volume first
kubectl exec -n velero $(kubectl get po -n velero -l component=velero -oname | head -n 1) -it -- /velero restore create --restore-volumes=true --from-backup bk-pv-1

With this step, I might even take a nap... : )
3. Deploy the app and create a new rwx volume (it will automatically create a new nfs-pvc pod, svc)

kubectl apply -f with-rxw-pvc.yaml
  1. Modify the nfs service of openebs and change the selector to the old nfs-pvc name to use the old rwo volume
  2. Restart the pod that mounts the nfs rwx volume to remount the rwo volume with data
  3. Restart nfs-provisioner pod to enable automatic cleanup of stale pvc
         - name: OPENEBS_IO_NFS_SERVER_GARBAGE_COLLECTION_ENABLED
           value: "true"

@njuptlzf
Copy link
Contributor Author

njuptlzf commented Oct 9, 2023

@niladrih ptal. : )

@niladrih
Copy link
Member

niladrih commented Nov 8, 2023

Hi @njuptlzf, I can see a CI check failing, could you take a look at this?

@lizhifengones
Copy link

Hi @njuptlzf, I can see a CI check failing, could you take a look at this?

I think it may be that the my golang version (the golang version I use go.mod) is inconsistent with the ci golang version(Still can't find it by looking at the ci log), causing go fmt to fail. But I have executed it myself and there is no problem. I will continue to output the same content with ci and correct it.

@@ -58,6 +58,9 @@ const (
// NFSBackendPvcTimeout defines env name to store BackendPvcBoundTimeout value
NFSBackendPvcTimeout menv.ENVKey = "OPENEBS_IO_NFS_SERVER_BACKEND_PVC_TIMEOUT"

// NFSCleanUpStalePvcEnable defines env name to store the value of clean stale pvc enable
NFSCleanUpStalePvcEnable menv.ENVKey = "OPENEBS_IO_NFS_SERVER_CLEANUP_STALE_PVC_ENABLE"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This flag essentially enables/disables garbage collection. Even though the garbage collection currently removes stale PVCs, maybe the flag should be instead called OPENEBS_IO_NFS_SERVER_GARBAGE_COLLECTION_ENABLED? Then in case the logic of what the garbage collector does is changed, this flag won't be misleading :)

Copy link
Contributor Author

@njuptlzf njuptlzf Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though the garbage collection currently removes stale PVCs, maybe the flag should be instead called OPENEBS_IO_NFS_SERVER_GARBAGE_COLLECTION_ENABLED?

sounds good

Then in case the logic of what the garbage collector does is changed, this flag won't be misleading

Yes, when the logic of cleanup is optimized, we actually do not need to turn off recycling. Currently, only for data recovery scenarios, completely turning off can solve my problem.

Signed-off-by: Kasakaze <[email protected]>
@njuptlzf njuptlzf force-pushed the cleanup_stale_pvc_enable branch from 26ae9f1 to 02840d0 Compare November 20, 2023 15:54
@njuptlzf
Copy link
Contributor Author

njuptlzf commented Nov 20, 2023

Sorry, I took too long to respond to this PR due to other things.
I just checked my local code. It was me debugging make test that caused the problem. It has been solved. @niladrih :)

@njuptlzf njuptlzf requested a review from ryshoooo November 20, 2023 16:17
@njuptlzf
Copy link
Contributor Author

@niladrih ptal. :)

@codecov-commenter
Copy link

Codecov Report

Attention: 10 lines in your changes are missing coverage. Please review.

Comparison is base (5f0b16d) 50.17% compared to head (65cb5c8) 50.01%.
Report is 8 commits behind head on develop.

Files Patch % Lines
provisioner/provisioner.go 0.00% 9 Missing ⚠️
provisioner/env.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop     #167      +/-   ##
===========================================
- Coverage    50.17%   50.01%   -0.17%     
===========================================
  Files           38       38              
  Lines         2792     2801       +9     
===========================================
  Hits          1401     1401              
- Misses        1285     1294       +9     
  Partials       106      106              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@niladrih niladrih self-requested a review November 22, 2023 05:50
@niladrih
Copy link
Member

@njuptlzf -- This file https://github.com/openebs/dynamic-nfs-provisioner/blob/develop/deploy/kubectl/openebs-nfs-provisioner.yaml needs to have to ENV in a commented state.

        env:
        #   OPENEBS_IO_K8S_MASTER enables openebs provisioner to connect to K8s
        #   based on this address. This is ignored if empty.
        #   This is supported for openebs provisioner version 0.5.2 onwards
        # - name: OPENEBS_IO_K8S_MASTER
        #   value: "http://10.128.0.12:8080"
        #   OPENEBS_IO_KUBE_CONFIG enables openebs provisioner to connect to K8s
        #   based on this config. This is ignored if empty.
        #   This is supported for openebs provisioner version 0.5.2 onwards
        # - name: OPENEBS_IO_KUBE_CONFIG
        #   value: "/home/ubuntu/.kube/config"
        #   OPENEBS_IO_NFS_SERVER_NODE_AFFINITY defines the node affinity rules to place NFS Server
        #   instance. It accepts affinity rules in multiple ways:
        #   - If NFS Server needs to be placed on storage nodes as well as only in
        #     zone-1 & zone-2 then value can be:
        #     value:  "kubernetes.io/zone:[zone-1,zone-2],kubernetes.io/storage-node".
        #   - If NFS Server needs to be placed only on storage nodes & nfs nodes then
        #     value can be:
        #     value:  "kubernetes.io/storage-node,kubernetes.io/nfs-node"
        # - name: OPENEBS_IO_NFS_SERVER_NODE_AFFINITY
        #   value: "kubernetes.io/storage-node,kubernetes.io/nfs-node"
        #
        #   Provide a switch to turn off the function of clearing stale pvc to avoid
        #.  garbage collecting an NFS backend PVC if the NFS PVC is deleted.
        # - name: OPENEBS_IO_NFS_SERVER_GARBAGE_COLLECTION_ENABLED
        #.  value: false

You may want to add the change in the helm chart as well, but that needs to be in a separate PR. We merge all helm chart changes at release-time, and I'd be able to pull in your change if you raise a commit with the helm chart change separately.

Unique use-case! Thank you for the contribution.

@njuptlzf
Copy link
Contributor Author

njuptlzf commented Nov 22, 2023

@njuptlzf -- This file https://github.com/openebs/dynamic-nfs-provisioner/blob/develop/deploy/kubectl/openebs-nfs-provisioner.yaml needs to have to ENV in a commented state.

        env:
        #   OPENEBS_IO_K8S_MASTER enables openebs provisioner to connect to K8s
        #   based on this address. This is ignored if empty.
        #   This is supported for openebs provisioner version 0.5.2 onwards
        # - name: OPENEBS_IO_K8S_MASTER
        #   value: "http://10.128.0.12:8080"
        #   OPENEBS_IO_KUBE_CONFIG enables openebs provisioner to connect to K8s
        #   based on this config. This is ignored if empty.
        #   This is supported for openebs provisioner version 0.5.2 onwards
        # - name: OPENEBS_IO_KUBE_CONFIG
        #   value: "/home/ubuntu/.kube/config"
        #   OPENEBS_IO_NFS_SERVER_NODE_AFFINITY defines the node affinity rules to place NFS Server
        #   instance. It accepts affinity rules in multiple ways:
        #   - If NFS Server needs to be placed on storage nodes as well as only in
        #     zone-1 & zone-2 then value can be:
        #     value:  "kubernetes.io/zone:[zone-1,zone-2],kubernetes.io/storage-node".
        #   - If NFS Server needs to be placed only on storage nodes & nfs nodes then
        #     value can be:
        #     value:  "kubernetes.io/storage-node,kubernetes.io/nfs-node"
        # - name: OPENEBS_IO_NFS_SERVER_NODE_AFFINITY
        #   value: "kubernetes.io/storage-node,kubernetes.io/nfs-node"
        #
        #   Provide a switch to turn off the function of clearing stale pvc to avoid
        #.  garbage collecting an NFS backend PVC if the NFS PVC is deleted.
        # - name: OPENEBS_IO_NFS_SERVER_GARBAGE_COLLECTION_ENABLED
        #.  value: false

You may want to add the change in the helm chart as well, but that needs to be in a separate PR. We merge all helm chart changes at release-time, and I'd be able to pull in your change if you raise a commit with the helm chart change separately.

Unique use-case! Thank you for the contribution.

@niladrih A must! I will take a moment to introduce this ENV through another PR after testing.
Also, are there any other files besides helm that need to be modified?

@niladrih
Copy link
Member

@njuptlzf -- No. The helm chart is all that is left functionally. But, I'd invite you add a test in tests directory (ginkgo test), and add a .md file in the docs/tutorials directory explaining how this feature works.

@niladrih niladrih merged commit ec6a0f9 into openebs-archive:develop Nov 29, 2023
7 checks passed
@njuptlzf
Copy link
Contributor Author

njuptlzf commented Dec 4, 2023

@njuptlzf -- No. The helm chart is all that is left functionally. But, I'd invite you add a test in tests directory (ginkgo test), and add a .md file in the docs/tutorials directory explaining how this feature works.

@niladrih -- Sorry, I missed it again, I will finish the helm and e2e testing within this week.

njuptlzf added a commit to njuptlzf/dynamic-nfs-provisioner that referenced this pull request Dec 7, 2023
niladrih pushed a commit to niladrih/dynamic-nfs-provisioner that referenced this pull request Dec 8, 2023
niladrih pushed a commit to niladrih/dynamic-nfs-provisioner that referenced this pull request Dec 8, 2023
niladrih added a commit to niladrih/dynamic-nfs-provisioner that referenced this pull request Dec 8, 2023
niladrih added a commit to niladrih/dynamic-nfs-provisioner that referenced this pull request Dec 8, 2023
niladrih added a commit to niladrih/dynamic-nfs-provisioner that referenced this pull request Dec 8, 2023
niladrih added a commit that referenced this pull request Dec 8, 2023
…action (#180)

* chore: add @njuptlzf's change (PR #167) to the changelog

Signed-off-by: Niladri Halder <[email protected]>

* fix(ci): replace crazy-max/ghaction-docker meta with docker/metadata-action

Signed-off-by: Niladri Halder <[email protected]>

---------

Signed-off-by: Niladri Halder <[email protected]>
niladrih pushed a commit to niladrih/dynamic-nfs-provisioner that referenced this pull request Dec 8, 2023
niladrih pushed a commit to niladrih/dynamic-nfs-provisioner that referenced this pull request Dec 8, 2023
niladrih added a commit that referenced this pull request Dec 11, 2023
* fix(ci): update helm/chart-testing-action to v2.6.1

Signed-off-by: Niladri Halder <[email protected]>

* feat(helm): add gcEnable configuration to chart for PR #167

Signed-off-by: Kasakaze <[email protected]>
Signed-off-by: Niladri Halder <[email protected]>

* charts: change helm value enableGarbageCollection to a bool

Signed-off-by: Niladri Halder <[email protected]>

---------

Signed-off-by: Niladri Halder <[email protected]>
Signed-off-by: Kasakaze <[email protected]>
Co-authored-by: Kasakaze <[email protected]>
@njuptlzf njuptlzf deleted the cleanup_stale_pvc_enable branch December 16, 2023 10:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stale nfs-pvc will be automatically deleted, causing backup recovery to fail.
5 participants