So in summary, taking the tensorflow/tensorflow:2.17.0-gpu-jupyter image as an example:
If you run the image in interactive mode (e.g. by passing bash as the command when starting the container) as the default (root) user, you get a startup message warning you that this can mess up the ownership of new files created on your volumes. So the user presumably then wants to run it as an unprivileged user.
If you then run the image in interactive mode (again passing bash as the command) as an unprivileged user (e.g. UID/GID 1000/1000), the container's startup message tells you everything is fine, but ldconfig fails to run (with the error message "/sbin/ldconfig.real: Can't create temporary cache file /etc/ld.so.cache~: Permission denied"), and perhaps nvidia-smi will not work at all either.
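The two cases above can be reproduced with something like the following sketch. The helper name run_shell and the DOCKER variable are made up for illustration, and GPU-related flags are left out because the issue does not say which ones were used.

```shell
# Image named in the issue.
IMAGE="tensorflow/tensorflow:2.17.0-gpu-jupyter"
# Hypothetical override hook so the command can be swapped out (e.g. for podman).
DOCKER="${DOCKER:-docker}"

# run_shell [UID:GID]
# Start an interactive bash in the image. With no argument the container
# runs as its default (root) user; with an argument it runs as that UID:GID.
run_shell() {
  if [ -n "${1:-}" ]; then
    "$DOCKER" run --rm -it --user "$1" "$IMAGE" bash
  else
    "$DOCKER" run --rm -it "$IMAGE" bash
  fi
}

# Case 1 (root, triggers the file ownership warning):   run_shell
# Case 2 (unprivileged, triggers the ldconfig error):   run_shell 1000:1000
```

Calling run_shell "$(id -u):$(id -g)" would run the container with the calling user's own UID/GID, which is what the startup message recommends.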
So it's a catch-22: whichever way you follow the startup recommendations, you're in the wrong :)
Should we perhaps stop recommending that users run the software as their own UID/GID, and instead just inform them that if they run as root, file ownership/permissions might be a problem or need attention?
Or should we instead run ldconfig as part of the image build (while the build is still running as root) rather than in the bashrc file? Or would that no longer be enough to make nvidia-smi work?
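A third variation, not proposed above and shown only as a sketch, would be to keep the call in bashrc but skip it when it cannot succeed. This assumes the failure comes purely from ldconfig being unable to write its temporary cache file under /etc, as the error message suggests. The function name refresh_ld_cache and the LDCONFIG variable are hypothetical.

```shell
# Hypothetical override hook; defaults to the real ldconfig.
LDCONFIG="${LDCONFIG:-ldconfig}"

# refresh_ld_cache [DIR]
# Run ldconfig only if DIR (default /etc, where ld.so.cache~ is created)
# is writable by the current user; otherwise print a note and carry on.
refresh_ld_cache() {
  cache_dir="${1:-/etc}"
  if [ -w "$cache_dir" ]; then
    "$LDCONFIG"
  else
    echo "skipping ldconfig: $cache_dir is not writable by UID $(id -u)" >&2
  fi
}

# In bashrc the bare ldconfig call would then become:
#   refresh_ld_cache
```

This only hides the error for unprivileged users. Whether the GPU libraries are still found without a refreshed cache is exactly the open question above, so this is no substitute for answering it.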
All that said, if I run the tensorflow/tensorflow:2.17.0-gpu-jupyter image now, I can't find nvidia-smi installed anywhere in it. Is it still there, or can we just get rid of the ldconfig call in bashrc because it's not needed anymore? If so, I suppose the problem goes away.
This follows up on the problem reported in many places, for example at 8bfbcec#r111483861, with a reply by @angerson at 8bfbcec#r111514846.