-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Bugfix] Dockerfile.rocm #141
Conversation
@zstreet87 could you describe what the issue is? |
as the dockerfile is now that env var doesn't get overwritten properly so it's value is "gfx900;gfx906;gfx908;gfx90a;gfx1030;gfx1100;gfx1101;gfx940;gfx941;gfx942" Dockerfile doesn't work when using the same variable name for various things. |
Do you have a container built from this Dockerfile, or another scenario to reproduce?
|
Unfortunately, I don't - this is coming from a client who had the error and then fixed it by changing the name as I have done. Hmm, interesting - the one difference I can think of with your test case is in the dockerfile, that env var is already set from the base docker. |
Just tried using TERM (one of the few vars already present in ubuntu:20.04), and the results are the same |
Okay the plot thickens... This error was found on the AAC cluster which uses podman so the command was Perhaps podman still has this issue? |
IMHO supporting podman is good feature potentially. My reason is that podman's containers are exclusively rootless, which adds stability & security (at the very least to the CI pipeline). That is some extra motivation to explore this issue and find a resolution to it. |
Oh yes, I agree. As more people use vllm in cluster settings, odds are more in favor they will use podman instead of docker for the permission issue @Alexei-V-Ivanov-AMD brought up. Very quirky thing to stumble upon though, for sure. I asked the client to test the minimal reproducer with podman and well update with what they have. |
I'm ok with this change if podman has this issue still and this solves it |
appreciate it @gshtras! Please merge |
@zstreet87 @Alexei-V-Ivanov-AMD On further reflection I'm currently of the opinion that this change should be reverted. First, Docker explicitly supports aliasing
If Podman fails to implement the Docker spec correctly, this is not an issue we should fix in vLLM. Particularly because developers are used to using |
Need to have different names for ARG and ENV for dockerfile to function properly