Merge branch 'master' into spec-const-numthreads
juliusikkala authored Jan 9, 2025
2 parents 736af8f + 6706c1a commit faa38f9
Showing 108 changed files with 2,567 additions and 538 deletions.
3 changes: 3 additions & 0 deletions .github/workflows/ci-examples.sh
@@ -170,6 +170,9 @@ function run_sample {
pushd "$bin_dir" 1>/dev/null 2>&1
if [[ ! "$dry_run" = true ]]; then
./"$sample" "${args[@]}" || result=$?
if [[ -f ./"log-$sample.txt" ]]; then
cat ./"log-$sample.txt"
fi
fi
if [[ $result -eq 0 ]]; then
summary=("${summary[@]}" " success")
199 changes: 199 additions & 0 deletions docs/proposals/015-bindless-t.md
@@ -0,0 +1,199 @@
SP #015 - `Bindless<T>` type
==============

## Status

Author: Yong He

Status: Design review.

Implementation:

Reviewed by: Theresa Foley

## Background

Textures, sampler states and buffers are typically passed to shaders as opaque handles whose size and storage address are undefined. These handles are communicated to the GPU via "bind states" that are modified with host-side APIs. Because a handle has unknown size, shader code cannot read, copy or construct one, and cannot store it in buffer memory. This makes both host code and shader code difficult to write and prevents more flexible encapsulation or clean object-oriented designs.

With recent advances in hardware capabilities, many modern graphics systems are adopting a "bindless" parameter passing idiom, where all resource handles are passed to the shader in a single global array, and all remaining references to textures, buffers or sampler states are represented as a single integer index into the array. This allows shader code to work around the restrictions on opaque handle types.

Direct3D Shader Model 6.6 introduces the "Dynamic Resources" capability, which further simplifies the way to write bindless shader code by removing the need to even declare the global array.

We believe that graphics developers will greatly benefit from a system-defined programming model for the bindless parameter passing idiom that is versatile and cross-platform, and that provides a consistent interface so that different shader libraries using the bindless pattern can interoperate with each other without barriers.

## Proposed Approach

We introduce a `Bindless<T>` type that is defined as:
```slang
struct Bindless<T> : IComparable<Bindless<T>>
where T : IOpaqueHandle
{
__init(uint64_t value);
}
```
Where `IOpaqueHandle` is an interface that is implemented by all texture, buffer and sampler state types:

```slang
enum ResourceKind
{
Unknown, Texture, ConstantBuffer, StorageBuffer, Sampler
}
interface IOpaqueHandle
{
static const ResourceKind kind;
}
```

### Basic Usage

`Bindless<T>` should provide the following features:

- Construction/explicit cast from a 64-bit integer.
- Explicit cast to a 64-bit integer.
- Equality comparison.
- Implicit dereference to `T`.
- Implicit conversion to `T`.

For example:

```slang
uniform Bindless<Texture2D> texture;
uniform Bindless<SamplerState> sampler;
void test()
{
// Explicit cast from bindless handle to uint64_t value.
let idx = (uint64_t)texture;
// Constructing bindless handle from uint64_t value.
let t = Bindless<Texture2D>(idx);
// Comparison.
ASSERT(t == texture);
// OK, `t` is first implicitly dereferenced to produce `Texture2D`, and
// then `Texture2D::Sample` is called.
// The `sampler` argument is implicitly converted from `Bindless<SamplerState>`
// to `SamplerState`.
t.Sample(sampler, float2(0,0));
}
```

A `Bindless<T>` type is a concrete type whose size is always 8 bytes and is internally a 64-bit integer.
This means that you can use a `Bindless<T>` type in any context where an ordinary data type such as `int`
is allowed, for example in buffer elements.

On targets where resource handles are already concrete and sized types, `Bindless<T>` simply translates to `T`.
If the native size or alignment of `T` is less than 8 bytes, it will be rounded up to 8 bytes. If the native size of
`T` is greater than 8 bytes, it will be treated as an opaque type instead of translating to `T`.
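
Because the handle is a plain 64-bit value, host code can write it into buffer memory like any other scalar. The following C++ sketch shows one way a host-side mirror of the handle might look. `BindlessHandle`, `MaterialRecord` and `packRecords` are hypothetical names chosen for illustration and are not part of Slang or of this proposal; the only detail taken from the text above is the 8-byte size.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical host-side mirror of the shader-side `Bindless<T>`.
// `Tag` only distinguishes handle types at compile time and takes no storage.
template <typename Tag>
struct BindlessHandle
{
    alignas(8) std::uint64_t value;

    friend bool operator==(BindlessHandle a, BindlessHandle b) { return a.value == b.value; }
    friend bool operator!=(BindlessHandle a, BindlessHandle b) { return a.value != b.value; }
};

struct Texture2DTag {};
struct SamplerStateTag {};

// Hypothetical record that mixes handles with ordinary data.
struct MaterialRecord
{
    BindlessHandle<Texture2DTag> albedo;
    BindlessHandle<SamplerStateTag> sampler;
    float roughness;
    float padding; // explicit padding so the record size is a multiple of 8
};

static_assert(sizeof(BindlessHandle<Texture2DTag>) == 8, "handle is 8 bytes");
static_assert(sizeof(MaterialRecord) == 24, "two handles plus two floats");
static_assert(std::is_trivially_copyable<MaterialRecord>::value, "safe to memcpy");

// Copies the records into a byte array, ready to be uploaded to a GPU buffer.
std::vector<std::uint8_t> packRecords(const std::vector<MaterialRecord>& records)
{
    std::vector<std::uint8_t> bytes(records.size() * sizeof(MaterialRecord));
    if (!records.empty())
        std::memcpy(bytes.data(), records.data(), bytes.size());
    return bytes;
}
```

A renderer could upload the bytes returned by `packRecords` into a buffer whose element type declares matching `Bindless<T>` fields; the upload call itself depends on the graphics API in use.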

### Obtaining Actual Resource Handle from `Bindless<T>`

Depending on the target platform and the design choices of the user's application, the way to obtain the actual
resource handle from a `Bindless<T>` integer handle can vary. Slang does not dictate how this conversion is done,
and instead, this is left to the user via Slang's link-time specialization ability.

Slang defines the following core module declarations:

```slang
extern T getResourceFromBindlessHandle<T>(uint64_t handle) where T : IOpaqueHandle
{
// Default Implementation
// ...
}
```

`getResourceFromBindlessHandle` converts a bindless handle to the actual opaque resource handle.
If the user does not provide this function, the default implementation defined in the core module is used.

By default, the core module implementation of `getResourceFromBindlessHandle` should use the `ResourceDescriptorHeap` and
`SamplerDescriptorHeap` built-in objects when generating HLSL code. When generating code for other targets, `getResourceFromBindlessHandle`
will fetch the resource handle from a system-defined global array of the corresponding resource type.

If/when SPIR-V is extended to expose capabilities similar to D3D's `ResourceDescriptorHeap` feature, we should change the default implementation
to use that instead. Until we know the default implementation of `getResourceFromBindlessHandle` is stable, we should advise users
to provide their own implementation of `getResourceFromBindlessHandle` to prevent breakage.

If the user application requires a different bindless implementation, this default behavior can be overridden by defining
`getResourceFromBindlessHandle` in user code. Below is a possible user-space implementation of `getResourceFromBindlessHandle`
for Vulkan:

```slang
// All texture and buffer handles are defined in descriptor set 100.
[vk::binding(0, 100)]
__DynamicResource<__DynamicResourceKind.General> resourceHandles[];
// All sampler handles are defined in descriptor set 101.
[vk::binding(0, 101)]
__DynamicResource<__DynamicResourceKind.Sampler> samplerHandles[];
export T getResourceFromBindlessHandle<T>(uint64_t handle) where T : IOpaqueHandle
{
if (T.kind == ResourceKind.Sampler)
return (T)samplerHandles[handle];
else
return (T)resourceHandles[handle];
}
```
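
The dispatch on `T.kind` can be modeled on the CPU to make the lookup rule concrete. The C++ sketch below is only an analogy: `DescriptorTables` and the `int` descriptor ids are hypothetical stand-ins for the two unsized arrays above, and a real implementation runs in the shader, not on the host.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ResourceKind { Unknown, Texture, ConstantBuffer, StorageBuffer, Sampler };

// Stand-ins for opaque resource types; each one advertises its kind,
// mirroring the `IOpaqueHandle` interface from the proposal.
struct Texture2D
{
    static constexpr ResourceKind kind = ResourceKind::Texture;
    int id;
};

struct SamplerState
{
    static constexpr ResourceKind kind = ResourceKind::Sampler;
    int id;
};

// Hypothetical model of the two global descriptor arrays.
struct DescriptorTables
{
    std::vector<int> resources; // textures and buffers
    std::vector<int> samplers;  // sampler states
};

// Picks the table from the static kind of T, then indexes it with the handle.
// `at` throws std::out_of_range for a handle that is not in the table.
template <typename T>
T getResourceFromBindlessHandle(const DescriptorTables& tables, std::uint64_t handle)
{
    const bool isSampler = (T::kind == ResourceKind::Sampler);
    const std::vector<int>& table = isSampler ? tables.samplers : tables.resources;
    return T{table.at(static_cast<std::size_t>(handle))};
}
```

Keeping samplers in their own table matches the split between descriptor sets 100 and 101 in the Slang example; an application that stores everything in one table would drop the branch.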

### Invalid Handle

We reserve `uint64_t.maxValue` as a special handle value of `Bindless<T>` types to mean an invalid/null resource.
This will allow us to optimize `Optional<Bindless<Texture2D>>` to use the reserved value to mean no-value.

The user should also be able to use `Bindless<T>.invalid` to refer to such an invalid value:

```slang
struct Bindless<T> where T:IOpaqueHandle
{
static const Bindless<T> invalid = Bindless<T>(uint64_t.maxValue);
}
```
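
The space saving from the reserved value can be illustrated outside the shader. The C++ sketch below is a host-side analogy, not Slang's implementation: `OptionalTextureHandle` is a hypothetical type that encodes the empty state as the maximum 64-bit value, so it needs no separate flag and stays 8 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for `Bindless<Texture2D>` on the host.
struct TextureHandle
{
    std::uint64_t value;
};

// The reserved value that marks an invalid/null handle.
constexpr std::uint64_t kInvalidHandle = std::numeric_limits<std::uint64_t>::max();

// An optional handle that reuses the reserved value to mean "empty".
class OptionalTextureHandle
{
public:
    OptionalTextureHandle() : m_raw(kInvalidHandle) {}
    explicit OptionalTextureHandle(TextureHandle handle) : m_raw(handle.value) {}

    bool hasValue() const { return m_raw != kInvalidHandle; }

    TextureHandle value() const
    {
        assert(hasValue()); // reading an empty optional is a caller bug
        return TextureHandle{m_raw};
    }

private:
    std::uint64_t m_raw;
};

static_assert(sizeof(OptionalTextureHandle) == sizeof(std::uint64_t),
              "the empty state costs no extra storage");
```

The trade-off is that a valid handle can never use the reserved value, which is why the proposal sets it aside up front.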

## Alternatives Considered

We initially considered supporting a more general `Bindless<T>` where `T` can be any composite type, for example, allowing the following:

```slang
struct Foo
{
Texture2D t;
SamplerState s;
float ordinaryData;
}
uniform Bindless<Foo> foo;
```

which is equivalent to:

```slang
struct Bindless_Foo
{
Bindless<Texture2D> t;
Bindless<SamplerState> s;
float ordinaryData;
}
uniform Bindless_Foo foo;
```

While relaxing `T` this way adds an extra layer of convenience, it introduces complicated
semantic rules to the type system, and there is an increased chance of exposing tricky corner
cases that are hard to get right.

An argument for allowing `T` to be general composite types is that it enables sharing the same
code for both bindless systems and bindful systems. But this argument can also be countered by
allowing the compiler to treat `Bindless<T>` as `T` in a special mode if this feature is found to be useful.

For now we think that restricting `T` to be an `IOpaqueHandle` type will result in a much simpler implementation, and is likely sufficient for current needs. Given that the trend of modern GPU architecture is moving towards bindless idioms and the whole idea of opaque handles may disappear in the future, we should be cautious about inventing too many heavyweight mechanisms around opaque handles. Nevertheless, this proposal still allows us to relax this requirement in the future if it becomes clear that such a feature is valuable to our users.

## Conclusion

This proposal introduces a standard way to achieve the bindless parameter passing idiom on current graphics platforms.
Standardizing the way bindless parameter binding code is written is essential for creating reusable shader code
libraries. The convenience language features around the `Bindless<T>` type should also make shader code easier to write
and to maintain. Finally, by using Slang's link-time specialization feature,
this proposal avoids dictating one specific way of passing
the actual resource handles to the shader code, and allows the user to customize how the conversion from integer handle
to resource handle is done in a way that best suits the application's design.
23 changes: 23 additions & 0 deletions examples/CMakeLists.txt
@@ -1,4 +1,6 @@
function(example dir)
cmake_parse_arguments(ARG "WIN32_EXECUTABLE" "" "" ${ARGN})

set(debug_dir ${CMAKE_CURRENT_BINARY_DIR}/${dir})

file(
@@ -30,6 +32,22 @@ function(example dir)
)
endif()

# Libraries providing a main function that prints stack traces on exceptions
if(CMAKE_SYSTEM_NAME MATCHES "Windows")
# On Windows we have two different versions: main for "console applications" and
# WinMain for normal Windows applications.
if(${ARG_WIN32_EXECUTABLE})
set(main_wrapper_libraries example-winmain)
else()
set(main_wrapper_libraries example-main)
endif()
# Add stack printing support
set(main_wrapper_libraries ${main_wrapper_libraries} stacktrace-windows)
set(main_wrapper_libraries ${main_wrapper_libraries} dbghelp.lib)
else()
set(main_wrapper_libraries example-main)
endif()

slang_add_target(
${dir}
EXECUTABLE
@@ -42,7 +60,9 @@ function(example dir)
gfx-util
platform
$<$<BOOL:${SLANG_ENABLE_CUDA}>:CUDA::cuda_driver>
${main_wrapper_libraries}
EXTRA_COMPILE_DEFINITIONS_PRIVATE
SLANG_EXAMPLE_NAME=${dir}
$<$<BOOL:${SLANG_ENABLE_XLIB}>:SLANG_ENABLE_XLIB>
REQUIRED_BY all-examples
OPTIONAL_REQUIRES ${copy_assets_target} copy-prebuilt-binaries
@@ -68,6 +88,9 @@ if(SLANG_ENABLE_EXAMPLES)
$<$<BOOL:${SLANG_ENABLE_CUDA}>:CUDA::cuda_driver>
FOLDER examples
)
slang_add_target(example-main STATIC FOLDER examples)
slang_add_target(example-winmain STATIC FOLDER examples EXCLUDE_FROM_ALL)
slang_add_target(stacktrace-windows STATIC FOLDER examples EXCLUDE_FROM_ALL)

add_custom_target(
all-examples
2 changes: 1 addition & 1 deletion examples/autodiff-texture/main.cpp
@@ -823,4 +823,4 @@ struct AutoDiffTexture : public WindowedAppBase
}
};

-PLATFORM_UI_MAIN(innerMain<AutoDiffTexture>)
+EXAMPLE_MAIN(innerMain<AutoDiffTexture>);
2 changes: 1 addition & 1 deletion examples/cpu-com-example/main.cpp
@@ -175,7 +175,7 @@ static SlangResult _innerMain(int argc, char** argv)
return SLANG_OK;
}

-int main(int argc, char** argv)
+int exampleMain(int argc, char** argv)
{
return SLANG_SUCCEEDED(_innerMain(argc, argv)) ? 0 : -1;
}
2 changes: 1 addition & 1 deletion examples/cpu-hello-world/main.cpp
@@ -217,7 +217,7 @@ static SlangResult _innerMain(int argc, char** argv)
return SLANG_OK;
}

-int main(int argc, char** argv)
+int exampleMain(int argc, char** argv)
{
return SLANG_SUCCEEDED(_innerMain(argc, argv)) ? 0 : -1;
}
13 changes: 13 additions & 0 deletions examples/example-base/example-base.h
@@ -10,6 +10,19 @@
void _Win32OutputDebugString(const char* str);
#endif

#define SLANG_STRINGIFY(x) #x
#define SLANG_EXPAND_STRINGIFY(x) SLANG_STRINGIFY(x)

#ifdef _WIN32
#define EXAMPLE_MAIN(innerMain) \
extern const char* const g_logFileName = \
"log-" SLANG_EXPAND_STRINGIFY(SLANG_EXAMPLE_NAME) ".txt"; \
PLATFORM_UI_MAIN(innerMain);

#else
#define EXAMPLE_MAIN(innerMain) PLATFORM_UI_MAIN(innerMain)
#endif // _WIN32

struct WindowedAppBase : public TestBase
{
protected:
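
The two-level `SLANG_STRINGIFY` / `SLANG_EXPAND_STRINGIFY` pair in this hunk is the usual preprocessor idiom for turning a macro's value, not its name, into a string literal. A minimal standalone sketch follows; `DEMO_EXAMPLE_NAME` and the `k...` constants are hypothetical stand-ins for the `SLANG_EXAMPLE_NAME` definition that CMake passes in.

```cpp
#include <cassert>
#include <cstring>

#define STRINGIFY(x) #x
#define EXPAND_STRINGIFY(x) STRINGIFY(x)

// Stand-in for a definition passed on the compiler command line.
#define DEMO_EXAMPLE_NAME triangle

// `#` applies to the argument exactly as written, so a single level
// yields the macro's name: "DEMO_EXAMPLE_NAME".
static const char* const kUnexpanded = STRINGIFY(DEMO_EXAMPLE_NAME);

// The extra level lets the argument expand first, and adjacent string
// literals are concatenated: "log-triangle.txt".
static const char* const kLogFileName =
    "log-" EXPAND_STRINGIFY(DEMO_EXAMPLE_NAME) ".txt";
```

The resulting file name follows the same `log-<name>.txt` pattern that the CI script looks for after running each sample.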
32 changes: 32 additions & 0 deletions examples/example-main/main.cpp
@@ -0,0 +1,32 @@
#include "../stacktrace-windows/common.h"

#include <stdio.h>
#include <stdlib.h>

extern int exampleMain(int argc, char** argv);

#if defined(_WIN32)

#include <windows.h>

int main(int argc, char** argv)
{
__try
{
return exampleMain(argc, argv);
}
__except (exceptionFilter(stdout, GetExceptionInformation()))
{
::exit(1);
}
}

#else // defined(_WIN32)

int main(int argc, char** argv)
{
// TODO: Catch exception and print stack trace also on non-Windows platforms.
return exampleMain(argc, argv);
}

#endif
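
The non-Windows branch leaves exception handling as a TODO. One portable option, sketched below under the assumption that only C++ exceptions need to be reported, is a plain `try`/`catch` wrapper. This is not the structured exception handling used in the Windows branch: it does not see hardware faults such as access violations, and it prints no stack trace. `runGuarded` and the two stand-in entry points are hypothetical names.

```cpp
#include <cassert>
#include <cstdio>
#include <exception>
#include <stdexcept>

using ExampleEntry = int (*)(int argc, char** argv);

// Runs `entry` and converts any escaping C++ exception into exit code 1,
// writing a short message to `log` first.
int runGuarded(ExampleEntry entry, int argc, char** argv, std::FILE* log)
{
    try
    {
        return entry(argc, argv);
    }
    catch (const std::exception& e)
    {
        std::fprintf(log, "unhandled exception: %s\n", e.what());
    }
    catch (...)
    {
        std::fprintf(log, "unhandled non-standard exception\n");
    }
    std::fflush(log);
    return 1;
}

// Two stand-in entry points used to exercise the wrapper.
int succeedingExample(int, char**)
{
    return 0;
}

int throwingExample(int, char**)
{
    throw std::runtime_error("device lost");
}
```

A non-Windows `main` could then forward to `runGuarded(exampleMain, argc, argv, stderr)`; printing a stack trace would still need a platform-specific mechanism.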
28 changes: 28 additions & 0 deletions examples/example-winmain/main.cpp
@@ -0,0 +1,28 @@
#include "../stacktrace-windows/common.h"

#include <stdio.h>
#include <stdlib.h>
#include <windows.h>

extern int exampleMain(int argc, char** argv);
extern const char* const g_logFileName;

int WinMain(
HINSTANCE /* instance */,
HINSTANCE /* prevInstance */,
LPSTR /* commandLine */,
int /*showCommand*/)

{
FILE* logFile = fopen(g_logFileName, "w");
__try
{
int argc = 0;
char** argv = nullptr;
return exampleMain(argc, argv);
}
__except (exceptionFilter(logFile, GetExceptionInformation()))
{
::exit(1);
}
}
2 changes: 1 addition & 1 deletion examples/gpu-printing/main.cpp
@@ -152,7 +152,7 @@ struct ExampleProgram : public TestBase
}
};

-int main(int argc, char* argv[])
+int exampleMain(int argc, char** argv)
{
ExampleProgram app;
if (SLANG_FAILED(app.execute(argc, argv)))