Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make --gpus work with nvidia gpus #21180

Merged
merged 1 commit into from
Jan 30, 2024
Merged

Conversation

rhatdan
Copy link
Member

@rhatdan rhatdan commented Jan 5, 2024

Somewhat documented here:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html https://stackoverflow.com/questions/25185405/using-gpu-from-a-docker-container

Fixes: #21156
Don't have access to nvidia GPUS, relying on contributor testing.

Does this PR introduce a user-facing change?

--gpus will now work with nvidia gpus.

@openshift-ci openshift-ci bot added release-note approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 5, 2024
@rhatdan
Copy link
Member Author

rhatdan commented Jan 5, 2024

This feels very nvidia specific, but not sure how to make it GPU agnostic.

@rhatdan
Copy link
Member Author

rhatdan commented Jan 5, 2024

@Gnyblast Please try this out.

Copy link

Ephemeral COPR build failed. @containers/packit-build please check.

gpuFlagName := "gpus"
createFlags.String(gpuFlagName, "", "GPU devices to add to the container ('all' to pass all GPUs)")
_ = cmd.RegisterFlagCompletionFunc(gpuFlagName, completion.AutocompleteNone)
_ = createFlags.MarkHidden("gpus")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why hide this flag if we're now supporting it?

@@ -121,6 +121,10 @@ func commonFlags(cmd *cobra.Command) error {
if cmd.Flags().Changed("image-volume") {
cliVals.ImageVolume = cmd.Flag("image-volume").Value.String()
}

if cmd.Flags().Changed("gpus") {
cliVals.Devices = append(cliVals.Devices, "nvidia.com/gpu="+cmd.Flag("gpus").Value.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, did docker and nvidia really hardcode "GPU = nvidia"? The docs seem to say so...

But I can't find a reference to nvidia.com in the docker sources; are we sure that's not actually in the nvidia container plugin?

Looking a bit more closely, this seems to map to a gpu capability: https://github.com/docker/cli/blob/master/opts/gpus_test.go#L24

@Gnyblast
Copy link

Gnyblast commented Jan 6, 2024

@Gnyblast Please try this out.

Shall I compile this version from the source and try?

@rhatdan
Copy link
Member Author

rhatdan commented Jan 6, 2024

Yes
Colin, I don't know, I was thinking that perhaps nvidia creates a /dev/nvidia.com/gpus directory, but got lost in trying to track it down, how this works, and figured to rely on @Gnyblast to test if it worked.

We have other of hidden Docker compatibility flags that we rework to make Podman handle it, without encouraging users to use these flags, which is why I kept --gpus hidden. Nvidia seems to document the Podman way of doing this.

@Luap99
Copy link
Member

Luap99 commented Jan 8, 2024

Note this only touches the cli level, #21156 explicitly talks about using the API so I don't think this can work.
As best I can tell the cli will map this to DeviceRequests option in the API payload which is not read anywhere in our code.
https://github.com/docker/cli/blob/4b06a93c5e43608a44009e272169666022f32cb7/cli/command/container/opts.go#L593C1-L593C1
And the syntax of the options seems to be different from how --device works so I don't think mapping --gpus to --device makes sense besides allowing the use of all, https://github.com/docker/cli/blob/4b06a93c5e43608a44009e272169666022f32cb7/opts/gpus.go#L29

@Gnyblast
Copy link

Gnyblast commented Jan 8, 2024

Note this only touches the cli level, #21156 explicitly talks about using the API so I don't think this can work.

This argument is correct, these application are using podman.sock to communicate, Thus it communicates with API from the looks of it. Having it working from the CLI is good but not something that will solve my problem.

@rhatdan
Copy link
Member Author

rhatdan commented Jan 8, 2024

Ok we could move it to the API level. But still didn't answer @Luap99 question.

@Gnyblast
Copy link

Gnyblast commented Jan 8, 2024

Ok we could move it to the API level. But still didn't answer @Luap99 question.

If you mean this:

And the syntax of the options seems to be different from how --device works so I don't think mapping --gpus to --device makes sense besides allowing the use of all

As far as I see, the way you mapped it on the CLI code was OK since you basically append the option argument to the end of --devices as "nvidia.com/gpu="+cmd.Flag("gpus").Value.String(). Am I missing something?

@rhatdan rhatdan force-pushed the nvidia branch 2 times, most recently from 9ea889a to 3d0230c Compare January 8, 2024 12:25
@rhatdan
Copy link
Member Author

rhatdan commented Jan 8, 2024

Ok I documented --gpus (No longer hidden)
Also added it to the API, I believe.

@Gnyblast
Copy link

Gnyblast commented Jan 8, 2024

Thanks @rhatdan ,
I'll need whole week to be able to compile and test all these within my other tasks because I also have to re-deploy these applications to that server which will be running this specific version of podman. Then I'll rewrite you back about the result.

Thanks a lot.

@mheon
Copy link
Member

mheon commented Jan 8, 2024

Build is broken, looks like a docs error.

On the flag itself: looks like there will be a hard requirement on a CDI config for the associated GPU be present. We'll have to document that.

@Gnyblast
Copy link

Gnyblast commented Jan 9, 2024

Build is broken, looks like a docs error.

On the flag itself: looks like there will be a hard requirement on a CDI config for the associated GPU be present. We'll have to document that.

I'll be awaiting the build issue to get fixed before going forward and try to test it since obviously I will not be able to compile it.

@mheon
Copy link
Member

mheon commented Jan 9, 2024

You should be fine, actually - that's just our CI making sure we don't commit without working manpages. Local builds with make don't have that check.

@Gnyblast
Copy link

Gnyblast commented Jan 10, 2024

For CLI it works:

guney@test:~$ ./podman run --rm --gpus all ubuntu:22.04 nvidia-smi -L
GPU 0: NVIDIA A40 (UUID: GPU-393d7bf9-e15f-c3ec-4216-b16ce9040376)
GPU 1: NVIDIA A40 (UUID: GPU-51e6cab9-7be7-14f0-0bde-939a9f1f76b6)
GPU 2: NVIDIA A40 (UUID: GPU-17cc7282-7b75-873a-78b4-48fb6448b4e4)
GPU 3: NVIDIA A40 (UUID: GPU-3f54228f-419b-6dfc-d0cc-1ee4902f1deb)

For API I couldn't test it yet because looks like podman/docker API changed the way it responds to a specific request. It was retuning status code 200 for /images/create before but with newly compiled binary it return status code 500, thus prefect as soon as it gets 500 it kills that process and does not try to create the container. I'll try to find some workarounds for this to get it tested and gonna post the result here.

This is from Podman 4.5.1

guney@test:/usr/local/bin$ curl -XPOST -I --unix-socket /run/user/${UID}/podman/podman.sock   -H content-type:application/json   "http://d/v1.41/images/create?tag=latest&fromImage=localhost%2Ftest-image"
HTTP/1.1 200 OK
Api-Version: 1.41
Libpod-Api-Version: 4.5.1
Server: Libpod/4.5.1 (linux)
X-Reference-Id: 0xc0001aa100
Date: Wed, 10 Jan 2024 10:38:53 GMT
Transfer-Encoding: chunked

This is from podman 5.0.0-dev that I compiled

guney@test:/usr/local/bin$ curl -XPOST -I --unix-socket /run/user/${UID}/podman/podman.sock   -H content-type:application/json   "http://d/v1.41/images/create?tag=latest&fromImage=localhost%2Ftest-image"
HTTP/1.1 500 Internal Server Error
Api-Version: 1.41
Libpod-Api-Version: 5.0.0-dev
Server: Libpod/5.0.0-dev (linux)
X-Reference-Id: 0xc0000f6c10
Date: Wed, 10 Jan 2024 11:27:15 GMT
Transfer-Encoding: chunked

@rhatdan
Copy link
Member Author

rhatdan commented Jan 12, 2024

I am not sure why it would fail as an API, it should be following the same route.

@rhatdan
Copy link
Member Author

rhatdan commented Jan 12, 2024

$ ./bin/podman run --gpus=all alpine echo hi
Error: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all
$ ./bin/podman --remote run --gpus=all alpine echo hi
Error: preparing container e36eb040c6b9d5279eb878b437c4dbef0ac17c175310e1aa1d9d9cab20b92e23 for attach: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all

Looks like I am getting the same error from both.

@Gnyblast
Copy link

Gnyblast commented Jan 12, 2024

Hi @rhatdan ,

It's not that API is not working. API started to return proper response codes since this commit:

commit d5454189453052a6c50a7af3ff341983f46c7b03                                                                                                                                                        [49/1699]
Author: Valentin Rothberg <[email protected]>
Date:   Tue Jun 20 14:36:29 2023 +0200
                                                    
    compat API push: fix error handling                                                                 
     
    Make sure that the push endpoint does not always return 200 even in case                
    of a push failure.  Some of the code had to be massaged since encoding a
    report implies sending a 200.  

And this breaks how prefect behaves since it was using old docker API and probably all endpoints was returning 200 regardless of the result. We have changed the source code and compiled it again to return http.StatusOK for all endpoints just to test. It's now working OK but looks like prefect also doesn't sending device_requests from it's backend to container. I still cannot verify if API works as expected but I'll try using datalore, hopefully it will tell us if that's working or not. ,

I have another question:
According to this thread docker/docker-py#2395 (comment) docker api uses:

    device_requests=[
        docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])
    ]

above code let's say for Python to attach GPU to the container from API. This is the development you did on API side right?

@Gnyblast
Copy link

Gnyblast commented Jan 12, 2024

$ ./bin/podman run --gpus=all alpine echo hi
Error: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all
$ ./bin/podman --remote run --gpus=all alpine echo hi
Error: preparing container e36eb040c6b9d5279eb878b437c4dbef0ac17c175310e1aa1d9d9cab20b92e23 for attach: setting up CDI devices: unresolvable CDI devices nvidia.com/gpu=all

Looks like I am getting the same error from both.

Does you machine has GPU? Did you followed the steps here in https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html so that you generated the CDI specification file to get used when GPUs getting attached to the container?

@rhatdan
Copy link
Member Author

rhatdan commented Jan 12, 2024

No my laptop does not have GPUs, the errors are what I expected, I was just showing that the APIs seem to be returning the same error with the nvidia.com/gpu modification.

@Gnyblast
Copy link

Gnyblast commented Jan 12, 2024

No my laptop does not have GPUs, the errors are what I expected, I was just showing that the APIs seem to be returning the same error with the nvidia.com/gpu modification.

Yes but Rest API return also HTTP Status Codes which changed on the above commit. Anyway, the other thing I noticed, Rest API SDKs are actually using something called host_configs that has device_requests inside and that's where you attach the GPUs. It looks like different than what you implemented as --gpus all. Do you think this still should work?

@rhatdan
Copy link
Member Author

rhatdan commented Jan 15, 2024

If podman run --gpus ..., worked, then podman run --remote would work when the gpus flag is passed to the server.

So I don't see why this would not work, if you pass the flag via the API.

@rhatdan
Copy link
Member Author

rhatdan commented Jan 15, 2024

There is nothing in this PR that would effect the endpoint, but podman 5.0 is the active upstream repo. Meaning if you build podman you are testing against a 5.0 endpoint.

@rhatdan
Copy link
Member Author

rhatdan commented Jan 27, 2024

@mheon @Luap99 @giuseppe @vrothberg PTANL

I don't see any other changes, can we get this in?

Copy link
Member

@giuseppe giuseppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

openshift-ci bot commented Jan 29, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: giuseppe, rhatdan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rhatdan
Copy link
Member Author

rhatdan commented Jan 29, 2024

@Gnyblast @cgwalters @Luap99 PTAL

@Luap99
Copy link
Member

Luap99 commented Jan 29, 2024

Well this does not fix the API and as such does not fix #21156.

I am not against merging but just pointing out that it does not address the issue and likely only ever works for the all case due the syntax differences otherwise.

@Gnyblast
Copy link

@Gnyblast @cgwalters @Luap99 PTAL

I'm confirming that this development works with podman and podman-remote to attach devices with --gpus all and tested it. But I received some news from prefect that they don't support attaching GPUs to the containers atm from Docker API SDK. Which means I couldn't test it from an API that uses let says Python SDK Code below to send device requests.

    device_requests=[
        docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])
    ]

I'm also not sure how Docker or Podman SDK converts this code to be readable by API. if it's get translated to API as --gpus all or --gpus <id> whatever... it must be good, otherwise if SDK doesn't attach it in a way that the development you did can understand, it will still not be able to function and attach GPUs to container.

@rhatdan
Copy link
Member Author

rhatdan commented Jan 29, 2024

@Luap99 Remind me how this does not fix the API, the GPUS is being passed to the server side, and handled there as I recall.

@rhatdan rhatdan added the lgtm Indicates that a PR is ready to be merged. label Jan 30, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit c41c30b into containers:main Jan 30, 2024
90 of 92 checks passed
@Luap99
Copy link
Member

Luap99 commented Feb 5, 2024

Because docker sends the data via device_requests which is not handled anywhere in podman at all.

@laurensV
Copy link

Because docker sends the data via device_requests which is not handled anywhere in podman at all.

Yeah I had this problem when I wanted to use docker api with podman server, it seems to just ignore device_requests. Do you know if it is even possible to use the docker api to add support for gpu? docker api does have a Devices option, but not sure how to convert --device nvidia.com/gpu=all to what the docker API needs for Devices:
image

@klueska
Copy link

klueska commented Apr 25, 2024

Just dropping this here for reference.

NVIDIA has actively discouraged the use of --gpus in docker since 2020. It has many inconsistencies and never should have been added in the first place. The long-term plan is to standardize on CDI and eventually deprecate --gpus altogether.

@stale-locking-app stale-locking-app bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Jul 25, 2024
@stale-locking-app stale-locking-app bot locked as resolved and limited conversation to collaborators Jul 25, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Podman --gpus all support instead of --devices nvidia.com/gpu=all for compability
8 participants