native cuda support #75

Closed
wants to merge 9 commits into from

robertgzr commented Jun 3, 2020

With balena-engine 19.03.13 merged into meta-balena, containers can be given access to the graphics hardware via NVIDIA's container toolkit.

We need:

  • meta-balena >= 2.51.0
  • nvidia-container-toolkit
  • CUDA toolkit kernel and userspace libs

Connects-to: #57

acostach and others added 9 commits June 3, 2020 18:00
For some reason that needs a bit more digging,
the meta-tegra cuda recipe won't get the cuda-repo
deb from the devnet. Let's add an append for now, while testing,
since distributing cuda libraries already needs special
permissions.

Signed-off-by: Alexandru Costache <[email protected]>
We used to mask all graphics recipes,
since graphics packages go to containers.

But now we need some libs and their dependencies
in the hostOS, so let's unmask them.

Signed-off-by: Alexandru Costache <[email protected]>
The base recipe's runtime dependencies render this
package unbuildable. Drop them, as we don't need
docker-ce, cuda-toolkit or the others.

Signed-off-by: Alexandru Costache <[email protected]>
includes the 19.03.13 version of balena-engine

Signed-off-by: Robert Günzler <[email protected]>
@resin-jenkins

Can one of the admins verify this patch?

@robertgzr
Author

robertgzr commented Jun 3, 2020

cc @acostach @dremsol

@acostach
Contributor

acostach commented Jun 4, 2020

@resin-jenkins test this please

robertgzr added a commit to balena-os/balena-supervisor that referenced this pull request Jun 5, 2020
In the absence of an upstream implementation of the DeviceRequest API introduced
as part of Docker API v1.40 we roll our own using a feature label.

As per my comment in the code, we fall back to the default behavior of
docker cli's `--gpus` and request a single device with the `gpu` capability.
The only implementation at the moment is the NVIDIA driver; here:
https://github.com/balena-os/balena-engine/blob/master/daemon/nvidia_linux.go

Background on the composefile implementation:
compose-spec/compose-spec#74
docker/compose#6691

Change-type: patch
Connects-to: balena-os/balena-jetson#75
Signed-off-by: Robert Günzler <[email protected]>
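
For readers following along, here is a rough sketch of what the commit message above describes: turning a service label into a `DeviceRequests` entry that asks for one device with the `gpu` capability. This is not the supervisor's actual code. The label name `example.features.gpu` and both function names are hypothetical, chosen only for illustration; the field names (`Driver`, `Count`, `DeviceIDs`, `Capabilities`, `Options`) follow the DeviceRequest object from Docker API v1.40.

```python
from typing import Any, Dict, List

# Hypothetical label name, used only for illustration; the commit message
# above does not name the feature label.
GPU_FEATURE_LABEL = "example.features.gpu"


def default_gpu_device_request() -> Dict[str, Any]:
    """Build a DeviceRequest asking for one device with the `gpu` capability."""
    return {
        "Driver": "",               # empty: let the engine pick a matching driver
        "Count": 1,                 # a single device, as described in the commit
        "DeviceIDs": [],            # no specific device IDs requested
        "Capabilities": [["gpu"]],  # list of capability sets; here just `gpu`
        "Options": {},
    }


def device_requests_from_labels(labels: Dict[str, str]) -> List[Dict[str, Any]]:
    """Return the DeviceRequests list for a service, based on its labels."""
    value = labels.get(GPU_FEATURE_LABEL, "")
    if value.strip().lower() in ("1", "true"):
        return [default_gpu_device_request()]
    # Label absent or disabled: request no devices.
    return []
```

The resulting list would go under `HostConfig.DeviceRequests` in the container create request, which is where the engine's NVIDIA driver implementation linked above would pick it up.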
@robertgzr
Author

@acostach can we run your test device with my supervisor patch applied? would be good to verify that change suffices before it's merged

@dremsol
Contributor

dremsol commented Jul 3, 2020

@robertgzr any updates regarding this PR?

@acostach
Contributor

acostach commented Aug 1, 2020

All the changes in this PR have been included in #85

For native CUDA support to work, the hostapp extras feature still needs to be implemented in meta-balena. That will allow overlaying the Tegra libraries from a container onto the rootfs. Those libraries are required for the NVIDIA hook to run.

@acostach acostach closed this Aug 1, 2020
pipex pushed a commit to balena-io-modules/balena-compose-experiment that referenced this pull request Mar 29, 2021
In the absence of an upstream implementation of the DeviceRequest API introduced
as part of Docker API v1.40 we roll our own using a feature label.

As per my comment in the code, we fall back to the default behavior of
docker cli's `--gpus` and request a single device with the `gpu` capability.
The only implementation at the moment is the NVIDIA driver; here:
https://github.com/balena-os/balena-engine/blob/master/daemon/nvidia_linux.go

Background on the composefile implementation:
compose-spec/compose-spec#74
docker/compose#6691

Change-type: patch
Connects-to: balena-os/balena-jetson#75
Signed-off-by: Robert Günzler <[email protected]>