Support for NVIDIA GPUs under Docker Compose #6691

Closed
collabnix opened this issue May 9, 2019 · 196 comments · Fixed by #7929

@collabnix

Under Docker 19.03.0 Beta 2, support for NVIDIA GPUs has been introduced in the form of the new CLI option --gpus. docker/cli#1714 covers this enablement.

Now one can simply pass the --gpus option for GPU-accelerated Docker-based applications.

$ docker run -it --rm --gpus all ubuntu nvidia-smi
Unable to find image 'ubuntu:latest' locally
latest: Pulling from library/ubuntu
f476d66f5408: Pull complete 
8882c27f669e: Pull complete 
d9af21273955: Pull complete 
f5029279ec12: Pull complete 
Digest: sha256:d26d529daa4d8567167181d9d569f2a85da3c5ecaf539cace2c6223355d69981
Status: Downloaded newer image for ubuntu:latest
Tue May  7 15:52:15 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.116                Driver Version: 390.116                   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   39C    P0    22W /  75W |      0MiB /  7611MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
:~$ 

As of today, Compose doesn't support this. This is a feature request to add NVIDIA GPU support to Compose.
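A hypothetical sketch of what such support could look like at the service level, mirroring the --gpus all example above (the gpus key and the service name are placeholders, not an agreed-upon syntax):

services:
  gpu-test:
    image: ubuntu
    command: nvidia-smi
    # placeholder for a service-level equivalent of "docker run --gpus all"
    gpus: all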

@collabnix changed the title from "Add DeviceRequests to HostConfig to support NVIDIA GPUs under Docker Compose" to "Support for NVIDIA GPUs under Docker Compose" May 9, 2019
@qhaas

qhaas commented Jul 24, 2019

This is of increased importance now that the legacy 'nvidia' runtime appears broken with Docker 19.03.0 and nvidia-container-toolkit-1.0.0-2: NVIDIA/nvidia-docker#1017

$ cat docker-compose.yml
version: '2.3'

services:
  nvidia-smi-test:
    runtime: nvidia
    image: nvidia/cuda:9.2-runtime-centos7

$ docker-compose run nvidia-smi-test
Cannot create container for service nvidia-smi-test: Unknown runtime specified nvidia

This works: docker run --gpus all nvidia/cudagl:9.2-runtime-centos7 nvidia-smi

This does not: docker run --runtime=nvidia nvidia/cudagl:9.2-runtime-centos7 nvidia-smi

@michaelnordmeyer

Any work happening on this?

I got the new Docker CE 19.03.0 on a new Ubuntu 18.04 LTS machine and have the current, matching NVIDIA Container Toolkit (née nvidia-docker2) version, but cannot use it because the docker-compose.yml 3.7 format doesn't support the --gpus flag.

@akiross

akiross commented Jul 24, 2019

Is there a workaround for this?

@kiendang

This works: docker run --gpus all nvidia/cudagl:9.2-runtime-centos7 nvidia-smi

This does not: docker run --runtime=nvidia nvidia/cudagl:9.2-runtime-centos7 nvidia-smi

You need to have

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

in your /etc/docker/daemon.json for --runtime=nvidia to continue working. More info here.
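A quick way to verify the runtime registration, as a sketch (assuming a systemd host and the CUDA image used earlier in this thread):

# restart the daemon so it picks up the new runtimes entry
sudo systemctl restart docker

# the legacy-style invocation should work again
docker run --rm --runtime=nvidia nvidia/cudagl:9.2-runtime-centos7 nvidia-smi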

@VanDavv

VanDavv commented Aug 9, 2019

ping @KlaasH @ulyssessouza @Goryudyuma @chris-crone. Any update on this?

@iedmrc

iedmrc commented Aug 13, 2019

It is an urgent need. Thank you for your effort!

@Daniel451

Is it intended that users manually populate /etc/docker/daemon.json after migrating to docker >= 19.03 and removing nvidia-docker2 to use nvidia-container-toolkit instead?

It seems that this breaks a lot of installations, especially since --gpus is not available in compose.

@andyneff

No, this is a workaround until compose does support the --gpus flag.

@uderik

uderik commented Aug 27, 2019

install nvidia-container-runtime:
https://github.com/NVIDIA/nvidia-container-runtime#docker-engine-setup

add to /etc/docker/daemon.json:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

docker-compose:

  runtime: nvidia
  environment:
    - NVIDIA_VISIBLE_DEVICES=all

@Kwull

Kwull commented Aug 27, 2019

There is no such thing as /usr/bin/nvidia-container-runtime anymore. The issue is still critical.

@uderik

uderik commented Aug 27, 2019

It will help run an NVIDIA environment with docker-compose until docker-compose itself is fixed.

@cheperuiz

install nvidia-container-runtime:
https://github.com/NVIDIA/nvidia-container-runtime#docker-engine-setup

add to /etc/docker/daemon.json:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

docker-compose:

  runtime: nvidia
  environment:
    - NVIDIA_VISIBLE_DEVICES=all

This is not working for me; I'm still getting Unsupported config option for services.myservice: 'runtime' when trying to run docker-compose up.

Any ideas?

@uderik

uderik commented Aug 27, 2019

This is not working for me; I'm still getting Unsupported config option for services.myservice: 'runtime' when trying to run docker-compose up.

Any ideas?

After modifying /etc/docker/daemon.json, restart the docker service:
systemctl restart docker

Use Compose format 2.3 and add runtime: nvidia to your GPU service. Docker Compose must be version 1.19.0 or higher.

docker-compose file:

version: '2.3'

services:
  nvsmi:
    image: ubuntu:16.04
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    command: nvidia-smi

@Kwull

Kwull commented Aug 27, 2019

@cheperuiz, you can set nvidia as the default runtime in daemon.json and then you won't be dependent on docker-compose. But all your docker containers will use the nvidia runtime - I have had no issues so far.

{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

@cheperuiz

Ah! Thank you @Kwull, I missed that default-runtime part... Everything is working now :)

@johncolby

@uderik, runtime is no longer present in the current 3.7 compose file format schema, nor in the pending 3.8 version that should eventually align with Docker 19.03: https://github.com/docker/compose/blob/5e587d574a94e011b029c2fb491fb0f4bdeef71c/compose/config/config_schema_v3.8.json

@andyneff

@johncolby runtime has never been a 3.x flag. It's only present in the 2.x track (2.3 and 2.4).

@cheperuiz

Yeah, I know, and even though my docker-compose.yml file includes version: '2.3' (which has worked in the past), it seems to be ignored by the latest versions...
For future projects, what would be the correct way to enable/disable access to the GPU? Just making it the default + env variables? Or will there be support for the --gpus flag?

@Daniel451

@johncolby what is the replacement for runtime in 3.X?

@johncolby

@Daniel451 I've just been following along peripherally, but it looks like it will be under the generic_resources key, something like:

services:
  my_app:
    deploy:
      resources:
        reservations:
          generic_resources:
            - discrete_resource_spec:
                kind: 'gpu'
                value: 2

(from https://github.com/docker/cli/blob/9a39a1/cli/compose/loader/full-example.yml#L71-L74)
Design document here: https://github.com/docker/swarmkit/blob/master/design/generic_resources.md

Here is the compose issue regarding compose 3.8 schema support, which is already merged in: #6530

On the daemon side the gpu capability can get registered by including it in the daemon.json or dockerd CLI (like the previous hard-coded runtime workaround), something like

/usr/bin/dockerd --node-generic-resource gpu=2

which then gets registered by hooking into the NVIDIA docker utility:
https://github.com/moby/moby/blob/09d0f9/daemon/nvidia_linux.go

It looks like the machinery is basically in place, probably just needs to get documented...
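For reference, a daemon.json sketch of that same registration; the node-generic-resources key and the gpu=2 value mirror the dockerd flag above, but treat this as an assumption based on the comment rather than verified configuration:

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "node-generic-resources": [
        "gpu=2"
    ]
}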

@chongyi-zheng

Any update?

@statikkkkk

Also waiting on updates, using bash with docker run --gpus until the official fix...

@celbirlik

Waiting for updates as well.

@vk1z

vk1z commented Feb 9, 2021

To fix, install the nvidia-container-toolkit (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#setting-up-nvidia-container-toolkit)

  1. docker-compose --version

docker-compose version 1.28.2, build 6763035

  2. Compose file:

    services:
      docker-compose-now-supports-device-requests:
        image: nvidia/cuda:11.0-base
        command: nvidia-smi
        deploy:
          resources:
            reservations:
              devices:
                - capabilities:
                    - gpu

  3. docker-compose up
    Building with native build. Learn about native build in Compose here: https://docs.docker.com/go/compose-native-build/
    Removing vkurien_docker-compose-now-supports-device-requests_1
    Recreating 30ae4cbfb9c1_vkurien_docker-compose-now-supports-device-requests_1 ...
    Attaching to vkurien_docker-compose-now-supports-device-requests_1
    docker-compose-now-supports-device-requests_1 | Tue Feb 9 03:00:36 2021
    docker-compose-now-supports-device-requests_1 | +-----------------------------------------------------------------------------+
    docker-compose-now-supports-device-requests_1 | | NVIDIA-SMI 460.39 Driver Version: 460.39 CUDA Version: 11.2 |
    docker-compose-now-supports-device-requests_1 | |-------------------------------+----------------------+----------------------+
    docker-compose-now-supports-device-requests_1 | | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
    docker-compose-now-supports-device-requests_1 | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
    docker-compose-now-supports-device-requests_1 | | | | MIG M. |
    docker-compose-now-supports-device-requests_1 | |===============================+======================+======================|
    docker-compose-now-supports-device-requests_1 | | 0 GeForce GTX 107... Off | 00000000:07:00.0 On | N/A |
    docker-compose-now-supports-device-requests_1 | | 0% 58C P8 19W / 180W | 500MiB / 8111MiB | 0% Default |
    docker-compose-now-supports-device-requests_1 | | | | N/A |
    docker-compose-now-supports-device-requests_1 | +-------------------------------+----------------------+----------------------+
    docker-compose-now-supports-device-requests_1 |
    docker-compose-now-supports-device-requests_1 | +-----------------------------------------------------------------------------+
    docker-compose-now-supports-device-requests_1 | | Processes: |
    docker-compose-now-supports-device-requests_1 | | GPU GI CI PID Type Process name GPU Memory |
    docker-compose-now-supports-device-requests_1 | | ID ID Usage |
    docker-compose-now-supports-device-requests_1 | |=============================================================================|
    docker-compose-now-supports-device-requests_1 | +-----------------------------------------------------------------------------+
    vkurien_docker-compose-now-supports-device-requests_1 exited with code 0

@Motophan

Motophan commented Feb 9, 2021

(quoting @vk1z's compose file and nvidia-smi output from the previous comment in full)

That won't actually work. I know it looks like it will work, but it will not. Tested with a P2000; do you need logs?

@vk1z

vk1z commented Feb 9, 2021

@Motophan: Define "will not work"; it just worked on my machine (Ubuntu 18.04, GTX 1070) a moment ago, if you take a closer look at what I attached. Try this command, for instance:
sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi

Tell me what you get after installing the nvidia container toolkit and restarting the docker daemon.

@estimadarocha

estimadarocha commented Feb 9, 2021

@vk1z so as far as I understand from your statements, we still need to install nvidia-container-toolkit?

I am running:

Docker version 20.10.3, build 48d30b5
docker-compose version 1.28.2, build 6763035

Update:

After installing nvidia-container-toolkit I can run the nvidia/cuda image and run nvidia-smi.

But...

When trying Plex, as @Motophan said, I can't get access to the GPUs

services:
  plex:
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu

and if I install Portainer and look at it, I can't see the GPU line in the container details, as mentioned here portainer/portainer#4791 (comment) by @xAt0mZ

@vk1z

vk1z commented Feb 9, 2021

@estimadarocha : I am afraid that I don't know about portainer. But I do have some questions for you:

  1. Your understanding on whether we have to run nvidia-container-toolkit is correct.
  2. Did you try the compose file that I had set up earlier (with the right nvidia/cuda image)?
  3. Did sudo docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi work?
  4. Did nvidia-smi run in the container and produce correct output using the compose file?

v

@xAt0mZ

xAt0mZ commented Feb 9, 2021

@vk1z Portainer is a GUI to manage docker/kubernetes endpoints (clusters or standalone) to lower the CLI learning hassle.

@estimadarocha it's a pull request, not merged nor released yet. But it does basically the same thing as using the --gpus CLI option, so if it's not working in your environment with the CLI, it will not work with Portainer either.

@estimadarocha

@vk1z

  1. yes
  2. yes
  3. yes
  4. yes

@xAt0mZ I thought the Portainer implementation was finished.

@vk1z

vk1z commented Feb 9, 2021

@estimadarocha : Thanks for confirming. Therefore it seems to me that from the docker-compose point of view, we are good.

@Xefir

Xefir commented Feb 9, 2021

@estimadarocha What image do you use for Plex?

The image has to be optimised for this kind of work. For example, the official Plex image is NOT compatible with GPUs, regardless of the flags you pass to Docker or the docker-compose file. The folks from linuxserver have done some work to be able to use the GPU, so try their image instead.

@fmoledina

@Xefir I use the official Plex image for GPU transcoding using the nvidia-docker2 runtime. Are you saying that using the --gpus flag wouldn't work?

@euri10

euri10 commented Feb 9, 2021 via email

@estimadarocha

@Xefir I use linuxserver

The question here is related to docker compose...
Or rather, what are the options we need to have present in compose?

Is it only these ones:

services:
  plex:
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu

Is this equal to --gpus all on the direct command line?

Is this enough?

Thanks

@C84186

C84186 commented Feb 10, 2021

Problem Statement - How to enable HW accelerated transcoding for a media server (Jellyfin, etc.)

I'm interested in running a similar media server (jellyfin) w/ HW encoding via docker compose.

I would rather not have to build a GPU compatible image from scratch based off the base images, and instead continue using the standard jellyfin images.

Outdated - I've answered my question below. It's encouraging to see people suggesting that you don't need a runtime-specific image to make this work.

I'm not clear on what I do need to define for my media service in my compose spec in order to have it work correctly.

Could anyone please provide a minimum working example of a mediaserver (I'd prefer jellyfin, but beggars can't be choosers) leveraging this runtime?

Or does #6691 (comment) mean that this runtime isn't even strictly necessary to utilize GPU in your compose services? If so, very exciting.

docker-compose.yml that doesn't work

  jellyfin:
    image: jellyfin/jellyfin
    runtime: nvidia
    environment:
      NVIDIA_VISIBLE_DEVICES: all
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu

or do I also need to set runtime?

Update

The above was not sufficient - when I needed accelerated transcoding, I'd lose the stream.
The jellyfin logs told me ffmpeg was failing:

jellyfin logs
[2021-02-10 14:08:21.598 +11:00] [INF] [34] MediaBrowser.Api.Playback.Hls.DynamicHlsService: /usr/lib/jellyfin-ffmpeg/ffmpeg -c:v h264_cuvid -resize 720x404 -i file:"/path/to/file" -map_metadata -1 -map_chapters -1 -threads 0 -map 0:0 -map 0:1 -map -0:s -codec:v:0 h264_nvenc -pix_fmt yuv420p -preset default -b:v 1878633 -maxrate 1878633 -bufsize 3757266 -profile:v high  -g 72 -keyint_min 72 -sc_threshold 0 -start_at_zero -vsync -1 -codec:a:0 libmp3lame -ac 2 -ab 121367  -copyts -avoid_negative_ts disabled -f hls -max_delay 5000000 -hls_time 3 -individual_header_trailer 0 -hls_segment_type mpegts -start_number 0 -hls_segment_filename "/config/data/transcodes/8e249c1ea338e1b1054a42cbe728d068%d.ts" -hls_playlist_type vod -hls_list_size 0 -y "/config/data/transcodes/8e249c1ea338e1b1054a42cbe728d068.m3u8"
[2021-02-10 14:08:21.636 +11:00] [ERR] [25] MediaBrowser.Api.Playback.Hls.DynamicHlsService: FFMpeg exited with code 1
[2021-02-10 14:08:21.710 +11:00] [WRN] [32] MediaBrowser.Api.Playback.Hls.DynamicHlsService: cannot serve "/config/data/transcodes/8e249c1ea338e1b1054a42cbe728d0680.ts" as transcoding quit before we got there

the ffmpeg logs gave me "Operation not permitted":

FFMPEG logs
[h264_cuvid @ 0x55eb835ca4c0] Cannot load libnvcuvid.so.1
[h264_cuvid @ 0x55eb835ca4c0] Failed loading nvcuvid.
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (h264_cuvid) -> h264 (h264_nvenc))
  Stream #0:1 -> #0:1 (aac (native) -> mp3 (libmp3lame))
Error while opening decoder for input stream #0:0 : Operation not permitted

Fortunately, the fix was easy - this reddit thread gave the answer - the following works:

Working compose definition for hardware transcoding

# docker-compose.override.yml
# my volumes, ports, traefik, most of the "standard" jellyfin env is set elsewhere

  jellyfin:
    image: jellyfin/jellyfin
    runtime: nvidia
    environment:
      NVIDIA_VISIBLE_DEVICES: all
      NVIDIA_DRIVER_CAPABILITIES: all
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu

To be clear - the above is actually from my docker-compose.override file, so isn't a full reprex for running this service.
You can easily substitute in your plex, emby, etc for this, however:

# docker-compose.override.yml

version: "2.4"
services:
  YOUR-SERVICE-NAME:
    runtime: nvidia
    environment:
      NVIDIA_VISIBLE_DEVICES: all
      NVIDIA_DRIVER_CAPABILITIES: all
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu

I'm not actually clear on which parts of the above service definition I really need.

Also, I assume it's possible to set more fine-grained control over which driver capabilities your service needs (transcoding, machine learning acceleration, etc.), but I don't know that I care.

Update: Based off of #6691 (comment) , the following value should suffice:

NVIDIA_DRIVER_CAPABILITIES: 'compute,video,utility'

I think there's redundancy in the use of the runtime: + deploy settings, but hey, if it ain't broke...

@vk1z

vk1z commented Feb 10, 2021

@C84186: Thanks for your work. Frankly this points out a need for more compose "recipes". This thread is serving as a substitute for documentation, alas.

@euri10

euri10 commented Feb 10, 2021

here's my docker-compose for the official Plex image that does hw encoding fine (I just edited out the useless parts).

the only thing I can comment on is that without the 2 NVIDIA env variables (which happen to be mentioned in the linuxserver image doc) there was no hw encoding happening; hope this helps

version: "3.8"
services:
  plex:
    image: plexinc/pms-docker:1.21.3.4014-58bd20c02
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
              - gpu
    environment:
     - TZ=Europe/Paris
     - PLEX_CLAIM=claim-xxx
     - ADVERTISE_IP=https://xxx:443
     - NVIDIA_VISIBLE_DEVICES=all
     - NVIDIA_DRIVER_CAPABILITIES=compute,video,utility
    volumes:
      - /home/xxx/plex/config:/config
      - /home/xxx/plex/transcode/:/transcode
    ports:
      - 32400:32400
    networks:
      - traefik-local
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.plex.rule=Host(`xxx`)"
      - "traefik.http.routers.plex.entrypoints=websecure"
      - "traefik.http.routers.plex.tls.certresolver=myhttpchallenge"
      - "traefik.http.services.plex.loadbalancer.server.port=32400"
    restart: unless-stopped
networks:
  traefik-local:
    external: true

(screenshot attached)

@RyanHakurei

@C84186 I don't think you need runtime: nvidia and iirc the old Nvidia Runtime was deprecated in favor of Nvidia Container Toolkit anyway. I have that omitted and hardware transcoding is working for me:

version: "3.3"
services:
...
# Plex Media Server
    plex:
        restart: always
        container_name: Plex
        network_mode: host
        deploy:
           resources:
             reservations:
               devices:
                 - capabilities:
                   - gpu
        labels:
            - com.centurylinklabs.watchtower.enable=true
        environment:
            - PLEX_CLAIM=Lolno
            - PUID=1000
            - PGID=1000
            - VERSION=latest
            - NVIDIA_VISIBLE_DEVICES=all
            - NVIDIA_DRIVER_CAPABILITIES=all
        volumes:
            - '/mnt/SSD/Sandbox/Plex/Data:/config'
            - '/mnt/Media/Sandbox/Plex/Library:/data/:ro'
            - '/mnt/Media/Sandbox/Plex/Prerolls:/Prerolls:ro'
            - '/mnt/Media/Sandbox/Plex/Sync:/transcode'
            - '/mnt/Media/LetsEncrypt:/Keys:ro'
        image: linuxserver/plex

(screenshots attached)

@estimadarocha

I confirm what @ryaniskira said.

When NVIDIA started to deprecate runtime: nvidia in favor of --gpus all, that is what led to all these needed changes in docker compose and portainer.

So if we use the new options:

deploy:
  resources:
    reservations:
      devices:
        - capabilities:
            - gpu

runtime: nvidia shouldn't be used

@vk1z

vk1z commented Feb 11, 2021

I can't speak to Plex but I don't seem to need the environment variables NVIDIA_VISIBLE_DEVICES or NVIDIA_DRIVER_CAPABILITIES

@RyanHakurei

@vk1z I seem to need them: I passed the GPU to my Boinc container without passing those variables and it did not detect my GPU; after adding them to Boinc's compose file it suddenly started to download tasks from GPUGrid.

@vk1z

vk1z commented Feb 11, 2021

@ryaniskira : Weird. Doesn't engender confidence TBH. Will have to look at this more closely.

@Atralb

Atralb commented Feb 15, 2021

@ryaniskira @estimadarocha You guys are ignorant of the current state of nvidia-docker and are claiming something you heard as true without ever having verified it yourselves.

the old Nvidia Runtime was deprecated in favor of Nvidia Container Toolkit anyway

This is completely false, and even a misunderstanding of the different entities in the nvidia container stack and how they interact together.

You should maybe read the official documentation sometimes: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/arch-overview.html

As we can clearly see, runtime: nvidia is still there, and it is even precisely what is actually leveraged under the hood by the --gpus option.

The "nvidia runtime" is simply a piece of config in your daemon.json that asks to use the Nvidia container toolkit. The latter is absolutely not a replacement of the former, since they are both the same thing, just at different levels.

@Motophan

Motophan commented Feb 15, 2021 via email

@estimadarocha

@Atralb acknowledging my ignorance, I will try to find some of my free time to have a close look at the info you posted. Thanks for the info.

Meanwhile, can you point us to the best approach to use?

@Atralb

Atralb commented Feb 15, 2021

@Motophan @estimadarocha Never set up a jellyfin/plex container yet. But since the new compose spec (docker-compose > 1.28) includes a runtime parameter, I simply use runtime: nvidia in my compose.yaml files for a tensorflow cuda container and my trainings work perfectly.
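A minimal sketch of that kind of setup, assuming a recent docker-compose and a TensorFlow GPU image; the image tag, paths, and script name are placeholders, not taken from the comment:

# compose.yaml (hypothetical)
services:
  training:
    image: tensorflow/tensorflow:latest-gpu
    runtime: nvidia
    volumes:
      - ./src:/workspace
    command: python /workspace/train.py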

@RyanHakurei

@Atralb Woah woah woah, that's a lot of words for "I am fucking wrong and am going to look like an idiot while trying to grandstand above others", as even your own citation states:

With Docker 19.03+, this is fine because Docker directly invokes nvidia-container-toolkit when you pass it the --gpus option instead of relying on the nvidia-container-runtime as a proxy.

So bam, deprecated and no longer needed AS PER YOUR OWN DOCUMENTATION that you so gleefully suggest that I read. Nvidia-container-runtime is no longer needed to proxy things as Docker can directly invoke Nvidia-container-toolkit now. There's also the Archwiki which recommends as much in the Docker page:

Starting from Docker version 19.03, NVIDIA GPUs are natively supported as Docker devices. NVIDIA Container Toolkit is the recommended way of running containers that leverage NVIDIA GPUs.

but I guess they're wrong too huh?

EDIT: Also if you actually, you know, read the thread you would see people having issues trying to invoke the deprecated Nvidia-container-runtime.

@Motophan

Motophan commented Feb 15, 2021 via email

@Atralb

Atralb commented Feb 16, 2021

@ryaniskira Lol, as we all know, all caps is the best argument indeed :).

You're saying a lot of words, but nowhere did you provide an actual source that the "runtime" is deprecated. That's just your sole interpretation of what you're reading.

EDIT: Also if you actually, you know, read the thread you would see people having issues trying to invoke the deprecated Nvidia-container-runtime.

Again, showcasing your ignorance of the history of this issue. What you're linking was during the era of compose file v3, where runtime didn't exist, and people simply didn't read the documentation just like you.

All these comments are void since the new compose spec reintroduced the keyword, and that's exactly the issue here. People are recommending methods which are obsolete - they were developed by the community to fill this gap of runtime in v3, but have no reason to exist now. Which is precisely why I intervened.

But sure, getting all worked up cause you're wrong will surely make you right :).

@chris-crone
Member

Hi all, this thread is getting a bit heated. Let's remember there's a real person on the other side of each comment.

We've updated the official docs with instructions for how to get GPU support working with Compose: https://docs.docker.com/compose/gpu-support/

I've noticed that the prerequisites link there is broken (we'll fix it soon!); you'll need to follow these instructions: https://docs.docker.com/config/containers/resource_constraints/#gpu

I'll be locking this thread. Please open a new issue if you've followed those instructions and still run into an issue.
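For reference, the device-reservation syntax documented on that page looks roughly like the following (adapted from the linked docs; the exact image tag is an assumption):

services:
  test:
    image: nvidia/cuda:11.0-base
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]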

@docker locked as too heated and limited conversation to collaborators Feb 16, 2021