Ergonomic Rust CPU to Rust GPU / SPIRV bindings #10
Comments
In the CUDA driver API this binding model is pretty straightforward, and a nice example of how easy it can be from the host side. I'm not entirely sure how we can get this to work nicely in Rust as written, but we can probably get close to it or do better. This is just an example:

```c
CUdeviceptr d_C;
cuMemAlloc(&d_C, mem_size_C);

int Matrix_Width_A = 64;
void *args[5] = { &d_C, &d_A, &d_B, &Matrix_Width_A, &Matrix_Width_B };

cuLaunchKernel(matrixMul, grid.x, grid.y, grid.z,
               block.x, block.y, block.z,
               2 * block_size * block_size * sizeof(float),
               NULL, args, NULL);
```

Essentially, different types of arguments get thrown together into a `void **`: one can mix and match small types (ints, floats, etc.) and resource bindings (actual pointers). CUDA doesn't need much reflection here because buffers are just plain memory and pointers to them (no descriptors for buffers needed). Textures are instead bound to module-exposed publics that you bind with explicit calls before launching the kernel, which isn't as nice.
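Ported to Rust, that same untyped argument packing looks roughly like the sketch below. This is only an illustration of where the type information disappears: `CUdeviceptr` is stubbed out as a `u64` and no actual driver calls are made.

```rust
use std::ffi::c_void;

// Stand-in for the driver API's device pointer type (an assumption for this sketch).
type CUdeviceptr = u64;

fn main() {
    let mut d_a: CUdeviceptr = 0; // would come from cuMemAlloc in real code
    let mut d_b: CUdeviceptr = 0;
    let mut d_c: CUdeviceptr = 0;
    let mut matrix_width_a: i32 = 64;
    let mut matrix_width_b: i32 = 64;

    // The launch call only ever sees *mut c_void; all type information is erased
    // before the kernel parameters reach the driver.
    let args: [*mut c_void; 5] = [
        &mut d_c as *mut _ as *mut c_void,
        &mut d_a as *mut _ as *mut c_void,
        &mut d_b as *mut _ as *mut c_void,
        &mut matrix_width_a as *mut _ as *mut c_void,
        &mut matrix_width_b as *mut _ as *mut c_void,
    ];

    // `args` would then be handed to cuLaunchKernel via FFI.
    let _ = args;
}
```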
Metal argument buffers are very flexible and powerful, and work together with resource descriptors (T#/V#, etc.): https://developer.apple.com/documentation/metal/buffers/about_argument_buffers
SYCL first compiles its GPU-side C++ code along with some sidecar metadata; this metadata is then converted into regular host-side C++ to make setting up bindings as easy as possible. This approach also allows mixing GPU and CPU code in the same file, by doing multiple passes over the same file. I think ideally I'd like to keep command buffers as a first-class citizen within our API, because we need control over when and how work gets scheduled. CUDA tends to hide this from you by creating implicit streams and synchronizing them behind your back, leading to some interesting perf debugging sessions. On top of that, though, I'd like kernel invocations to mimic function call syntax as closely as possible, even if that involves a lot of shenanigans with sidecar data and build.rs tricks.
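A minimal sketch of what that could look like: the command buffer stays explicit, but the dispatch itself reads like a typed function call. All of the types and names here (`CommandBuffer`, `GpuBuffer`, `dispatch`) are hypothetical placeholders, not an existing API.

```rust
use std::marker::PhantomData;

// Hypothetical handle types for the sketch; nothing here records real GPU work.
struct CommandBuffer;
struct GpuBuffer<T>(PhantomData<T>);

impl CommandBuffer {
    fn new() -> Self {
        CommandBuffer
    }

    // In a real implementation this would record a dispatch into the command
    // buffer; here it only demonstrates the intended call-site ergonomics.
    fn dispatch<A>(&mut self, _entry: &str, _groups: [u32; 3], _args: A) {}

    fn submit(self) {}
}

fn main() {
    let a = GpuBuffer::<f32>(PhantomData);
    let b = GpuBuffer::<f32>(PhantomData);
    let c = GpuBuffer::<f32>(PhantomData);

    // Scheduling stays explicit (the command buffer), while the dispatch takes
    // typed arguments instead of an untyped void** array.
    let mut cmd = CommandBuffer::new();
    cmd.dispatch("matrix_mul", [64, 64, 1], (&c, &a, &b, 64u32));
    cmd.submit();
}
```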
I think accel demonstrates the Rust version of what you're talking about with CUDA. The primary issue with it is that it locks you into using accel for host-side types; that is, you can't use accel with another similar crate like RustaCUDA. I'm sure there is a way to use traits or something to allow for different runtimes, but it's complicated.

The other problem is dependencies. With spirv-builder, you just make a crate and can declare device-side deps normally, but with inline device code in a host crate this doesn't work. I think you'd have to have a crate, potentially within the source tree, that could be referenced: the inline SPIR-V would declare a relative path to the device crate. That path can be used to identify the crate, which becomes a SPIR-V module. The inline SPIR-V could then be per entry (including the body of the function), which would be host visible. That is, the inline SPIR-V would be replaced with a stub function that returns something that can be used to acquire the source of the module and the entry name (which could be mangled). The point is that host code then only needs to refer to the stub, and the stub refers to the crate, which specifies the dependencies. Runtime-specific methods could then be implemented, potentially externally, to compile and execute the shader code via the stub (which would know the input kinds/bindings and other attributes of the entry). Potentially this could be type safe somehow, at least partially, i.e. via generics.
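A rough sketch of that stub idea under the assumptions above: `EntryStub`, the module path, and the entry name are all made up for illustration, and in practice a proc macro would generate this rather than it being hand-written.

```rust
use std::marker::PhantomData;

// The stub only knows how to locate the compiled module and its entry point,
// plus the argument types; compilation and dispatch are left to runtime-specific code.
struct EntryStub<Args> {
    module_path: &'static str, // relative path of the device crate / SPIR-V module
    entry_name: &'static str,  // possibly mangled entry point name
    _args: PhantomData<Args>,
}

// What a hypothetical macro might expand to for an entry point
// `fn saxpy(a: f32, x: &[f32], y: &mut [f32])` in a device crate at "../device".
const SAXPY: EntryStub<(f32, *const f32, *mut f32)> = EntryStub {
    module_path: "../device",
    entry_name: "saxpy",
    _args: PhantomData,
};

fn main() {
    // A runtime crate (RustaCUDA, wgpu, ...) could provide launch methods over the
    // stub, e.g. via an extension trait, keeping the stub itself runtime-agnostic.
    println!("module {} entry {}", SAXPY.module_path, SAXPY.entry_name);
}
```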
Closing this old issue. I do still think we have an opportunity to make a really ergonomic connection between Rust on the CPU side and Rust on the GPU side with this project, but this issue is not very actionable and not a current focus, so closing.
This is a design issue to research and figure out how the bindings between shaders and our Rust CPU code (native or WASM) should look. How should resources like buffers, input layouts, samplers, and textures be bound and connected between CPU execution and GPU dispatch?

A bunch of this work can be done today based just on existing SPIR-V shaders, but there will likely be advantages and simplifications once we have Rust on both sides, with GPU shaders compiled from Rust being bound and dispatched from Rust CPU code, so that can also be discussed.

There is previous work to look into as well with respect to SPIR-V reflection and Rust binding, like Rendy's descriptor design.
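As one possible starting point based only on existing SPIR-V, a reflection step could emit a plain Rust description of a shader's bindings for host code to consume. The sketch below is purely illustrative, with made-up type and constant names; it is not Rendy's design or an existing API.

```rust
// Hypothetical reflection output a tool could generate from a SPIR-V module.
#[derive(Debug, Clone, Copy)]
enum DescriptorKind {
    UniformBuffer,
    StorageBuffer,
    SampledImage,
    Sampler,
}

#[derive(Debug, Clone, Copy)]
struct BindingDesc {
    set: u32,
    binding: u32,
    kind: DescriptorKind,
    count: u32,
}

// For a shader taking one uniform buffer plus a texture and sampler, the generated
// table might look like this; host code would build descriptor set layouts from it.
const MATERIAL_BINDINGS: &[BindingDesc] = &[
    BindingDesc { set: 0, binding: 0, kind: DescriptorKind::UniformBuffer, count: 1 },
    BindingDesc { set: 0, binding: 1, kind: DescriptorKind::SampledImage, count: 1 },
    BindingDesc { set: 0, binding: 2, kind: DescriptorKind::Sampler, count: 1 },
];

fn main() {
    for b in MATERIAL_BINDINGS {
        println!("set {} binding {}: {:?} x{}", b.set, b.binding, b.kind, b.count);
    }
}
```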