disable aufs module #103831

Merged
merged 1 commit into from
Jul 22, 2021

Conversation

lizhuqi
Contributor

@lizhuqi lizhuqi commented Jul 21, 2021

What type of PR is this?

/kind bug

What this PR does / why we need it:

This PR disables the AUFS module, which is no longer used by Docker.

The AUFSUmountHung kernel error slows down the node and causes timeouts for other processes.

We've confirmed that Docker is using overlay2 and containerd is using overlay, so it is safe to disable AUFS.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

None

Disable the aufs module for GCE clusters.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/bug Categorizes issue or PR as related to a bug. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 21, 2021
@k8s-ci-robot
Contributor

@lizhuqi: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Jul 21, 2021
@k8s-ci-robot
Contributor

Hi @lizhuqi. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. area/provider/gcp Issues or PRs related to gcp provider sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 21, 2021
@k8s-ci-robot k8s-ci-robot requested review from karan and roycaihw July 21, 2021 22:20
@cheftako
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 21, 2021
@cheftako
Member

/test pull-kubernetes-e2e-gce-100-performance

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Jul 22, 2021
@lizhuqi lizhuqi changed the title WIP disable aufs module disable aufs module Jul 22, 2021
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 22, 2021
@lizhuqi
Contributor Author

lizhuqi commented Jul 22, 2021

Seeing an error in the containerd log:
msg="skip loading plugin "io.containerd.snapshotter.v1.aufs"..." error="aufs is not supported (modprobe aufs failed: exit status 1 "modprobe: FATAL: Module aufs not found in directory /lib/modules/5.4.129+\n"): skip plugin" type=io.containerd.snapshotter.v1
time="2021-07-22T00:20:42.322087568Z"

Need to check why the performance test tries to load io.containerd.snapshotter.v1.aufs.

The containerd snapshotter should use overlay instead of aufs.
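
To see which plugins containerd skipped across a whole log rather than a single line, a small filter may help. This is a minimal sketch and not part of this PR: the function name `skipped_plugins` and the regular expression are assumptions derived only from the one log line quoted above, and the pattern accepts both plain and backslash-escaped inner quotes.

```python
import re

# Hypothetical pattern, inferred from the single log line quoted in this
# comment: the plugin ID follows 'skip loading plugin' inside quotes that
# may or may not be backslash-escaped.
SKIP_RE = re.compile(r'skip loading plugin \\?"([^"\\]+)\\?"')


def skipped_plugins(log_text):
    """Return the plugin IDs that the log reports as skipped, in order."""
    return SKIP_RE.findall(log_text)
```

A skipped aufs snapshotter plugin on its own does not show that aufs is the snapshotter in use; it only shows that the plugin could not be loaded.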

@lizhuqi
Contributor Author

lizhuqi commented Jul 22, 2021

The test was testing AUFS Snapshotter for containerd. I think we can enable the aufs module in the test.

@lizhuqi
Contributor Author

lizhuqi commented Jul 22, 2021

Confirmed that overlayfs is the default snapshotter for containerd.
The snapshotter can be configured in the containerd config file, /etc/containerd/config.toml. See:
https://github.com/containerd/containerd/blob/main/docs/cri/config.md
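
One way to confirm which snapshotter a node is configured to use is to read that file. The sketch below is an illustration under assumptions, not something this PR adds: it assumes the containerd 1.x CRI config layout, where the setting lives under `plugins."io.containerd.grpc.v1.cri".containerd.snapshotter`, and it falls back to `overlayfs` when the key is absent. The function names are hypothetical, and `tomllib` requires Python 3.11 or later.

```python
CRI_PLUGIN = "io.containerd.grpc.v1.cri"  # assumed: containerd 1.x CRI plugin ID
DEFAULT_SNAPSHOTTER = "overlayfs"         # default, per the comment above


def snapshotter_from_config(config):
    """Return the snapshotter named in a parsed config.toml dictionary."""
    cri = config.get("plugins", {}).get(CRI_PLUGIN, {})
    return cri.get("containerd", {}).get("snapshotter", DEFAULT_SNAPSHOTTER)


def load_snapshotter(path="/etc/containerd/config.toml"):
    """Parse the config file at `path` and return its snapshotter setting."""
    import tomllib  # standard library in Python 3.11+

    with open(path, "rb") as f:  # tomllib.load expects a binary file object
        return snapshotter_from_config(tomllib.load(f))
```

Calling `load_snapshotter()` on a node would report the configured value; other containerd versions may place the key elsewhere, so treat the lookup path as an assumption to verify against the linked config documentation.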

@lizhuqi
Contributor Author

lizhuqi commented Jul 22, 2021

Just compared the log with a successful run. This error also shows up in the successful run, so failing to load io.containerd.snapshotter.v1 is okay. I think I should check whether the module is available before trying to disable it. Will add a check and try again.
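
As an illustration of such a check, the sketch below tests whether a module appears in `/proc/modules`, where the first whitespace-separated field of each line is the module name. It is an assumption about what "available" could mean, not the check that was merged: it detects a currently loaded module only, and the helper names are hypothetical.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:  # skip blank lines
            names.add(fields[0])
    return names


def is_module_loaded(name, proc_modules_text):
    """Return True if `name` is listed as a loaded kernel module."""
    return name in loaded_modules(proc_modules_text)


def read_proc_modules(path="/proc/modules"):
    """Read the kernel's list of loaded modules (Linux only)."""
    with open(path) as f:
        return f.read()
```

On a node, `is_module_loaded("aufs", read_proc_modules())` would indicate whether there is anything to unload before attempting to disable the module. A module that is installed but not loaded would not be detected this way.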

@k8s-ci-robot k8s-ci-robot added this to the v1.22 milestone Jul 22, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 22, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheftako, lizhuqi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 22, 2021
@cheftako
Member

/priority important-soon

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 22, 2021
@cheftako
Member

aufs seems to be a common cause of flaky tests.
It has been linked to extended timeout periods for the kubelet during testing.

@cheftako
Member

/cc @liggitt @jpbetz

@k8s-ci-robot k8s-ci-robot requested review from jpbetz and liggitt July 22, 2021 16:11
@k8s-ci-robot k8s-ci-robot merged commit 9b84e47 into kubernetes:master Jul 22, 2021
@saschagrunert
Member

Do we need to cherry-pick this PR into release-1.22?

k8s-ci-robot added a commit that referenced this pull request Jul 28, 2021
…831-upstream-release-1.20

Automated cherry pick of #103831: disable aufs module
k8s-ci-robot added a commit that referenced this pull request Jul 28, 2021
…831-upstream-release-1.21

Automated cherry pick of #103831: disable aufs module
k8s-ci-robot added a commit that referenced this pull request Jul 28, 2021
…831-upstream-release-1.22

Automated cherry pick of #103831: disable aufs module
@cheftako
Member

> Do we need to cherry-pick this PR into release-1.22?

Yes. #103926

kl52752 pushed a commit to kl52752/kubernetes that referenced this pull request Dec 23, 2022
AUFSUmountHung kernel error slows down the node and caused timeout for other processes. Docker now prefers overlay2 to aufs and containerd prefers overlay so it is safe to disable AUFS.

Replicate of kubernetes#103831

Change-Id: I1b5dc48c50cb0553308e21f3c92e00386f814132
Bug: 193674321
aojea pushed a commit to aojea/kubernetes that referenced this pull request Jun 14, 2023
… used by Docker

AUFSUmountHung kernel error slows down the node and caused timeout for other processes. Docker now prefers overlay2 to aufs and containerd prefers overlay so it is safe to disable AUFS.

Replicate of kubernetes#103831

Change-Id: I1b5dc48c50cb0553308e21f3c92e00386f814132
Bug: 193674321
serathius pushed a commit to serathius/kubernetes that referenced this pull request Mar 14, 2024
… used by Docker

AUFSUmountHung kernel error slows down the node and caused timeout for other processes. Docker now prefers overlay2 to aufs and containerd prefers overlay so it is safe to disable AUFS.

Replicate of kubernetes#103831

Change-Id: I1b5dc48c50cb0553308e21f3c92e00386f814132
Bug: 193674321
hoskeri pushed a commit to hoskeri/kubernetes that referenced this pull request Jul 23, 2024
… used by Docker

AUFSUmountHung kernel error slows down the node and caused timeout for other processes. Docker now prefers overlay2 to aufs and containerd prefers overlay so it is safe to disable AUFS.

Replicate of kubernetes#103831

Change-Id: I1b5dc48c50cb0553308e21f3c92e00386f814132
Bug: 193674321
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/gcp Issues or PRs related to gcp provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
4 participants