crio_cgroupv2_imagefs.ign: run SELinux relabeling service before crio #33963

@bart0sh (Contributor) commented Dec 16, 2024

This is yet another attempt to fix the "sh: error while loading shared libraries: /lib/libc.so.6: cannot apply additional memory protection after relocation: Permission denied" error.

It turned out that the label-graphroot service (semanage fcontext -a -e /var/lib/containers /var/lib/imagefs && restorecon -R -v /var/lib/imagefs) may race with the crio service, which uses this filesystem. Here is an excerpt from a typical serial log of the pull-kubernetes-node-crio-cgrpv2-imagefs-e2e-kubetest2 job showing crio and label-graphroot running in parallel:

Starting label-graphroot.service - Label Graphroot... [  OK  ]
Created slice user-1000.slice - User Slice of UID 1000.
Starting [email protected]…Runtime Directory /run/user/1000... [  OK  ]
Finished [email protected]…r Runtime Directory /run/user/1000.
...
Started crio-conmon-d4d3685b8f9c6d…7b2047bc90e1507937e4dbfa157f.scope. [  OK  ]
Started crio-conmon-daa248760de7bd…2d733f844f872e2d692185f360ad.scope. [  OK  ]
Started crio-d4d3685b8f9c6dd09a258…f4b92a7b2047bc90e1507937e4dbfa157f. [  OK  ]
Started crio-daa248760de7bde2af8e3…b23b142d733f844f872e2d692185f360ad. [  OK  ]
Started crio-conmon-734a303b34e522…edb2df2ecc538ad298a30446f745.scope. [  OK  ]
Started crio-734a303b34e522c76673d…15bde3edb2df2ecc538ad298a30446f745. [  OK  ]
Started crio-conmon-d0f48acf71b1e6…18b36eb0bc78d292703960742bdb.scope. [  OK  ]
...
Started crio-86ee0a10033c780846ad2…e9d5bb6f321cd033a95b1f1a8879514811. [  OK  ]
...
Finished label-graphroot.service - Label Graphroot.
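
The overlap visible in the excerpt above could also be checked mechanically on a raw serial log. The following is a minimal sketch, not part of this PR: the function name `count_crio_during_relabel` is hypothetical, and it assumes the log still contains real ANSI escape sequences and the same "Starting"/"Started"/"Finished" wording shown above.

```shell
#!/bin/sh
# Hypothetical helper: count crio units (container and conmon scopes) that
# were started while label-graphroot.service was still running.
# Reads a serial console log on stdin and prints the count.
count_crio_during_relabel() {
    # Strip ANSI colour sequences (ESC [ ... m) so the patterns match plain text.
    esc=$(printf '\033')
    sed "s/${esc}\[[0-9;]*m//g" | awk '
        /Starting label-graphroot\.service/ { running = 1 }
        /Finished label-graphroot\.service/ { running = 0 }
        running && /Started crio-/          { n++ }
        END { print n + 0 }
    '
}
```

For example, `count_crio_during_relabel < serial.log` (with `serial.log` being a placeholder file name) prints a non-zero number when crio units started before the relabeling finished, which is the situation described in this PR.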

This creates potential for SELinux label mismatches or file system inconsistency, especially if restorecon is still relabeling /var/lib/imagefs while cri-o is accessing the directory.

Ensuring that the label-graphroot service finishes before any cri-o operations start can potentially fix the issue.
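
One way to express this kind of ordering in systemd is a drop-in for crio.service. The sketch below is only an illustration under stated assumptions, not the change made in crio_cgroupv2_imagefs.ign (the diff is not shown here): the function name, the drop-in file name, and the default directory are hypothetical, and it assumes label-graphroot.service is a Type=oneshot unit, which the "Finished" line in the log suggests.

```shell
#!/bin/sh
# Hypothetical helper: write a drop-in that orders crio.service after
# label-graphroot.service. The target directory can be overridden via $1.
write_crio_ordering_dropin() {
    dropin_dir=${1:-/etc/systemd/system/crio.service.d}
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/10-after-label-graphroot.conf" <<'EOF'
[Unit]
# Wants= pulls the relabeling unit into the same start transaction;
# After= delays crio until that unit's start job has completed
# (for a Type=oneshot unit, that is when its command has exited).
Wants=label-graphroot.service
After=label-graphroot.service
EOF
}
```

After writing such a drop-in, `systemctl daemon-reload` makes systemd pick it up, and `systemctl show -p After crio.service` can be used to confirm that the ordering dependency is in place.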

Ref: #32567 (comment), kubernetes/kubernetes#127831

Note: this is a test PR; it is not guaranteed to have the expected effect, as I can't reproduce the issue in my setup.

/sig node
/cc @elieser1101 @kannon92 @haircommander

@k8s-ci-robot k8s-ci-robot added the sig/node Categorizes an issue or PR as relevant to SIG Node. label Dec 16, 2024
@k8s-ci-robot (Contributor)

@bart0sh: GitHub didn't allow me to request PR reviews from the following users: elieser1101.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.


@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Dec 16, 2024
@bart0sh bart0sh force-pushed the PR053-crio_cgroupv2_imagefs-run-semanage-before-crio branch from 81f6047 to 342220a Compare December 16, 2024 13:53
@bart0sh bart0sh changed the title crio_cgroupv2_imagefs.ign: run semanage before crio install crio_cgroupv2_imagefs.ign: run SELinux relabeling service before crio Dec 16, 2024
@haircommander (Contributor)

Great find!
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 16, 2024
@kannon92 (Contributor)

/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bart0sh, kannon92


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 16, 2024
@k8s-ci-robot k8s-ci-robot merged commit 2764954 into kubernetes:master Dec 16, 2024
6 checks passed
Labels: approved, cncf-cla: yes, lgtm, sig/node, sig/testing, size/XS
4 participants