Vulkan raytracing plumbing #99119

Open
wants to merge 1 commit into base: master

Fahien
Contributor

@Fahien Fahien commented Nov 12, 2024

Here's a bunch of code adding some Vulkan raytracing stuff to the rendering device:

  • Vulkan implementations in RenderingDeviceDriverVulkan
  • Raytracing instruction list in RenderingDeviceGraph
  • Functions to create acceleration structures and raytracing pipelines in RenderingDevice

There's more in the fahien/raytracing-test branch, with code handling raygen, miss, and closest-hit shaders, but it's a hack on top of the forward clustered renderer, and it needs some "guided" refactoring.

Here's a sample which uses GDScript to drive the renderer: raytracing-gdscript-demo

I hope these changes prove useful for jump-starting raytracing support.

Relevant for godotengine/godot-proposals#5162

@Fahien Fahien requested a review from a team as a code owner November 12, 2024 12:21
@Chaosus
Member

Chaosus commented Nov 12, 2024

Commits need to be squashed (see https://docs.godotengine.org/en/latest/contributing/workflow/pr_workflow.html).

@Chaosus Chaosus added this to the 4.4 milestone Nov 12, 2024
@Fahien Fahien force-pushed the fahien/raytracing-base branch from d5f4b02 to 4ff001d Compare November 12, 2024 13:24
@DarioSamo
Contributor

DarioSamo commented Nov 12, 2024

I had a very brief look and it looks pretty good to me, although I'd hold off on making changes that add a new rendering method. Are you planning on sticking to only supporting the hit shader workflow or would you also like to add ray query support as well?

If you'd like to test your rendering without adding a new rendering method, keep in mind you can also use GDScript to drive the renderer and it should be pretty sufficient to test it out. This is also a simple way you can provide us with a demo project to test it out.

@octanejohn

I see that you also started on a renderer in that repo. I think your renderer could be tested against, or based on, the Godot ideas:

https://gist.github.com/reduz/c5769d0e705d8ab7ac187d63be0099b5

@Fahien Fahien force-pushed the fahien/raytracing-base branch 3 times, most recently from 3eea43f to 2e1d952 Compare November 16, 2024 09:14
@Fahien Fahien requested review from a team as code owners November 16, 2024 09:14
@Fahien Fahien force-pushed the fahien/raytracing-base branch from 2e1d952 to 61a9d1c Compare November 16, 2024 15:16
@Fahien Fahien requested a review from a team as a code owner November 16, 2024 15:16
@Fahien Fahien force-pushed the fahien/raytracing-base branch 2 times, most recently from b456ef5 to a784c7d Compare November 17, 2024 14:23
@Fahien
Contributor Author

Fahien commented Nov 17, 2024

@DarioSamo: I had a very brief look and it looks pretty good to me, although I'd hold off on making changes that add a new rendering method. Are you planning on sticking to only supporting the hit shader workflow or would you also like to add ray query support as well?

Just focusing on the hit shader for now.

@DarioSamo: If you'd like to test your rendering without adding a new rendering method, keep in mind you can also use GDScript to drive the renderer and it should be pretty sufficient to test it out. This is also a simple way you can provide us with a demo project to test it out.

There you go: https://github.com/Fahien/godot-raytracing-gdscript-demo

@Fahien Fahien force-pushed the fahien/raytracing-base branch 2 times, most recently from de229f2 to 2474b4d Compare November 17, 2024 17:06
@a-johnston
Contributor

a-johnston commented Nov 19, 2024

This is very cool to see! Last night I took a look at how hard it might be to add Metal support. While it's definitely out of my wheelhouse, here's a commit that at least gets it building on macOS and adds a new method to guard user code from invoking raytracing when it isn't supported (it also includes some pre-commit changes): a-johnston@a220083

ed: I have some (probably very misguided and) incomplete changes on https://github.com/a-johnston/godot/tree/raytracing-metal but I'm giving up where it is. Currently it blows up with spirv_cross::CompilerError: A memory declaration object must be used in TraceRayKHR. when in the raytracing renderer, but there are other issues past that point. Also notable: this has a lot of conflicts with the hddagi branch, although that's to be expected if one or the other is going to be merged soon.

@AThousandShips AThousandShips modified the milestones: 4.4, 4.x Nov 20, 2024
@Fahien Fahien force-pushed the fahien/raytracing-base branch 2 times, most recently from f03f6a2 to ef1d75e Compare December 6, 2024 10:20
doc/classes/RDShaderSource.xml (outdated, resolved)
doc/classes/RenderingDevice.xml (outdated, resolved)
doc/classes/RenderingDevice.xml (outdated, resolved)
@Fahien Fahien force-pushed the fahien/raytracing-base branch 6 times, most recently from 2246469 to 502748c Compare December 9, 2024 09:22
@DarioSamo
Contributor

Thanks to @Fahien for making the changes we agreed to. I won't have much time to review this until I get back in a few weeks but I'd like to know if you're willing to extend this PR to cover the other backends or if the intention is only for Vulkan to be merged first.

If you wish to tackle the other APIs, I can leave you some relevant documentation on how to replicate the driver's functionality in the meantime.

@Fahien
Contributor Author

Fahien commented Dec 9, 2024

@DarioSamo: I'd like to know if you're willing to extend this PR to cover the other backends or if the intention is only for Vulkan to be merged first.

I can try, but I wonder if it would be better to do this in subsequent PRs.

@Fahien Fahien force-pushed the fahien/raytracing-base branch 2 times, most recently from 355db76 to 4e5315f Compare December 9, 2024 18:05
@Fahien Fahien force-pushed the fahien/raytracing-base branch 4 times, most recently from 77d44b4 to c1adab9 Compare December 14, 2024 09:06
@Fahien
Contributor Author

Fahien commented Dec 14, 2024

@Calinou: I wonder if this should also be exposed as a shader preprocessor define in the long run (for Godot's shader language), so that you can perform different shader compilations depending on support.

Done

@Bonkahe
Contributor

Bonkahe commented Dec 16, 2024

Sorry if I'm a little inexperienced, but out of curiosity I pulled the branch, built it, and ran the example GDScript project. The demo was out of date with the latest version of the branch, so I replaced the render() function in raytracing.gd with the one found in the raytracing_list_begin description, adding two final lines that fetch the texture and assign it to the viewport. Unfortunately, this is the result:
image

Any idea why this is occurring? I assume I missed something but it's not currently throwing any errors.

The full script is here:

extends Node3D

@onready
var rd := RenderingServer.create_local_rendering_device()
@onready
var screen_texture := get_node("TextureRect")

var raytracing_texture: RID
var shader: RID
var raytracing_pipeline: RID
var vertex_buffer: RID
var vertex_array: RID
var index_buffer: RID
var index_array: RID
var transform_buffer: RID
var blas: RID
var tlas: RID
var uniform_set: RID

func _cleanup():
	if rd == null:
		return

	rd.free_rid(uniform_set)
	rd.free_rid(tlas)
	rd.free_rid(blas)
	rd.free_rid(transform_buffer)
	rd.free_rid(index_array)
	rd.free_rid(index_buffer)
	rd.free_rid(vertex_array)
	rd.free_rid(vertex_buffer)
	rd.free_rid(raytracing_pipeline)
	rd.free_rid(shader)
	rd.free_rid(raytracing_texture)
	rd.free()
	rd = null

func _notification(what: int):
	if what == NOTIFICATION_PREDELETE:
		_cleanup()

func _ready():
	if rd.raytracing_is_supported():
		_initialise_screen_texture()
		_initialize_raytracing_texture()
		_initialize_scene()
		_initialize_raytracing_pipeline()

func _process(_delta):
	if rd.raytracing_is_supported():
		_render()

func _initialize_raytracing_texture():
	# Create texture for raytracing rendering.
	var texture_format := RDTextureFormat.new()
	texture_format.texture_type = RenderingDevice.TEXTURE_TYPE_2D
	texture_format.format = RenderingDevice.DATA_FORMAT_R32G32B32A32_SFLOAT
	texture_format.width = get_viewport().size.x
	texture_format.height = get_viewport().size.y
	# It needs storage bit for the raytracing pipeline and can copy from for the presentation graphics pipeline.
	texture_format.usage_bits = RenderingDevice.TEXTURE_USAGE_CAN_COPY_FROM_BIT | RenderingDevice.TEXTURE_USAGE_STORAGE_BIT
	var texture_view := RDTextureView.new()
	raytracing_texture = rd.texture_create(texture_format, texture_view)

func _initialise_screen_texture():
	var image_size = get_viewport().size
	var image = Image.create(image_size.x, image_size.y, false, Image.FORMAT_RGBAF)
	var image_texture = ImageTexture.create_from_image(image)
	screen_texture.texture = image_texture

func _set_screen_texture_data(data: PackedByteArray):
	var image_size = get_viewport().size
	var image := Image.create_from_data(image_size.x, image_size.y, false, Image.FORMAT_RGBAF, data)
	screen_texture.texture.update(image)

func _initialize_raytracing_pipeline():
	# Create raytracing shaders.
	var shader_file := load("res://ray.glsl")
	var shader_spirv: RDShaderSPIRV = shader_file.get_spirv()
	shader = rd.shader_create_from_spirv(shader_spirv)
	raytracing_pipeline = rd.raytracing_pipeline_create(shader)

	var image_uniform := RDUniform.new()
	image_uniform.uniform_type = RenderingDevice.UNIFORM_TYPE_IMAGE
	image_uniform.binding = 0
	image_uniform.add_id(raytracing_texture)
	
	var as_uniform := RDUniform.new()
	as_uniform.uniform_type = RenderingDevice.UNIFORM_TYPE_ACCELERATION_STRUCTURE
	as_uniform.binding = 1
	as_uniform.add_id(tlas)
	
	uniform_set = rd.uniform_set_create([image_uniform, as_uniform], shader, 0)

func _initialize_scene():
	# Vertex buffer for a triangle
	# Prepare our data. We use floats in the shader, so we need 32 bit.
	var points := PackedFloat32Array([
			 0.0, -0.7, 1.0,
			 0.5, -0.7, 1.0,
			 0.0,  0.5, 1.0,
			-0.5, -0.7, 1.0,
			 0.5, -0.7, 1.0,
			-0.5,  0.5, 1.0,
		])
	var point_bytes := points.to_byte_array()
	vertex_buffer = rd.vertex_buffer_create(point_bytes.size(), point_bytes, true)
	var vertex_desc := RDVertexAttribute.new()
	vertex_desc.format = RenderingDevice.DATA_FORMAT_R32G32B32_SFLOAT
	vertex_desc.location = 0
	vertex_desc.stride = 4 * 3
	var vertex_format := rd.vertex_format_create([vertex_desc])
	vertex_array = rd.vertex_array_create(points.size() / 3, vertex_format, [vertex_buffer], [3*3*4])

	# Index buffer
	var indices := PackedInt32Array([0, 2, 1])
	var index_bytes := indices.to_byte_array()
	index_buffer = rd.index_buffer_create(indices.size(), RenderingDevice.INDEX_BUFFER_FORMAT_UINT32, index_bytes)
	index_array = rd.index_array_create(index_buffer, 0, indices.size())

	# Transform buffer
	var transform_matrix := PackedFloat32Array([
		1.0, 0.0, 0.0, 0.0,
		0.0, 1.0, 0.0, 0.0,
		0.0, 0.0, 1.0, 0.0,
	])
	var transform_bytes := transform_matrix.to_byte_array()
	transform_buffer = rd.storage_buffer_create(transform_bytes.size(), transform_bytes, RenderingDevice.STORAGE_BUFFER_USAGE_SHADER_DEVICE_ADDRESS | RenderingDevice.STORAGE_BUFFER_USAGE_ACCELERATION_STRUCTURE_BUILD_INPUT_READ_ONLY)

	# Create a BLAS for a mesh
	blas = rd.blas_create(vertex_array, index_array, transform_buffer)
	# Create TLAS with BLASs.
	tlas = rd.tlas_create([blas])

func _render():
	# Build acceleration structures.
	rd.acceleration_structure_build(blas)
	rd.acceleration_structure_build(tlas)
	
	var raylist = rd.raytracing_list_begin()
	
	# Bind pipeline and uniforms.
	rd.raytracing_list_bind_raytracing_pipeline(raylist, raytracing_pipeline)
	rd.raytracing_list_bind_uniform_set(raylist, uniform_set, 0)
	
	# Trace rays.
	var width = get_viewport().size.x
	var height = get_viewport().size.y
	rd.raytracing_list_trace_rays(raylist, width, height)
	
	rd.raytracing_list_end()
	
	var byte_data := rd.texture_get_data(raytracing_texture, 0)
	_set_screen_texture_data(byte_data)

@Fahien
Contributor Author

Fahien commented Dec 16, 2024

@Bonkahe, thank you for testing this, and don't worry about your experience level: we all started from zero and improved with time and dedication :)

I suppose you expected to see the geometries in the scene, but the demo is currently not able to fetch them from GDScript to build raytracing acceleration structures. If you take a look at the _initialize_scene() function, you'll notice that it creates the vertices and indices for a simple geometry "manually", which is the triangle you see. The raytracing shaders in the demo do not do anything fancy: green when a ray hits a geometry, red when it does not. So, everything works as expected!
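To make the hit/miss colouring concrete, here is a minimal CPU sketch in C++. It is not the demo's GLSL: it uses the Möller-Trumbore ray-triangle test as a stand-in for the hardware traversal, and the names `Vec3`, `Color`, `ray_hits_triangle`, and `shade` are hypothetical. The triangle is the one referenced by the indices (0, 2, 1) in `_initialize_scene()`; the ray origin and directions used with it are assumptions for illustration only.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Color { float r, g, b; };

static Vec3 sub(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
static float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
static Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Möller-Trumbore test: true if the ray hits the triangle in front of the origin.
bool ray_hits_triangle(Vec3 origin, Vec3 dir, Vec3 v0, Vec3 v1, Vec3 v2) {
    const float eps = 1e-6f;
    Vec3 e1 = sub(v1, v0);
    Vec3 e2 = sub(v2, v0);
    Vec3 p = cross(dir, e2);
    float det = dot(e1, p);
    if (std::fabs(det) < eps) {
        return false; // Ray is parallel to the triangle plane.
    }
    float inv_det = 1.0f / det;
    Vec3 t = sub(origin, v0);
    float u = dot(t, p) * inv_det;
    if (u < 0.0f || u > 1.0f) {
        return false;
    }
    Vec3 q = cross(t, e1);
    float v = dot(dir, q) * inv_det;
    if (v < 0.0f || u + v > 1.0f) {
        return false;
    }
    return dot(e2, q) * inv_det > eps; // Hit distance must be positive.
}

// Mirrors the demo's behaviour as described: green on hit, red on miss.
Color shade(Vec3 origin, Vec3 dir) {
    // Vertices 0, 2, 1 of the demo's vertex buffer.
    const Vec3 v0 = { 0.0f, -0.7f, 1.0f };
    const Vec3 v1 = { 0.0f, 0.5f, 1.0f };
    const Vec3 v2 = { 0.5f, -0.7f, 1.0f };
    if (ray_hits_triangle(origin, dir, v0, v1, v2)) {
        return { 0.0f, 1.0f, 0.0f };
    }
    return { 1.0f, 0.0f, 0.0f };
}
```

With these assumed rays, a ray from the origin towards (0.1, 0, 1) lands inside the triangle and comes back green, while one towards (-0.3, 0, 1) passes to its left and comes back red.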

P.S. I have been experimenting with a compositor effect in other branches. There are still some issues to solve, but that is a story for another PR:

- Vulkan implementations in `RenderingDeviceDriverVulkan`
- Raytracing instruction list in `RenderingDeviceGraph`
- Functions to create acceleration structures and raytracing pipelines
  in `RenderingDevice`
- Raygen, Miss, and ClosestHit shader stages support
- GDScript bindings
- Update classes documentation
- Include a-johnston RenderingDeviceDriverMetal changes
- Unimplemented placeholders for Metal and D3D12.
- Apply Mickeon docs suggestions
- Build acceleration structure command
- Expose a shader preprocessor define
- Align build scratch address
- Separate buffer memory barriers
@Fahien Fahien force-pushed the fahien/raytracing-base branch from 38b8932 to 54b3c85 Compare December 28, 2024 14:56
@DarioSamo
Contributor

I'll see if I can get an in-depth review of this soon.

Contributor

@DarioSamo DarioSamo left a comment

I've left a large batch of comments during the review.

One major thing I've avoided repeatedly mentioning throughout is that there are two stages currently missing from hit groups: any hit and intersection shaders.

While intersection might not be used very often and I'd argue it can be ignored, any hit is absolutely essential to implementing RT renderers in a performant way, so we can't skip on them. Shadow rays gain massive performance benefits from relying on any hit and transparency sorting is possible and very performant when using it.

There's already a very good framework in place on this PR for adding the new shader stages, so I think Any Hit being added will result in a very natural extension.
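As a rough illustration of why any hit matters for shadow rays, here is a CPU model in C++. It is only a sketch under stated assumptions: `Candidate`, `AnyHitResult`, `shadow_any_hit`, and `trace_shadow_ray` are hypothetical names, the 0.5 alpha cutoff is arbitrary, and real hardware reports candidate intersections in an unspecified order rather than as a sorted list.

```cpp
#include <cstdint>
#include <vector>

// A candidate intersection reported during traversal.
struct Candidate {
    float distance;
    float alpha; // Surface opacity at the hit point.
};

enum class AnyHitResult { Ignore, AcceptAndTerminate };

// Stand-in for an any-hit shader on a shadow ray: skip cutout texels,
// and stop the whole traversal at the first opaque occluder.
AnyHitResult shadow_any_hit(const Candidate &c) {
    if (c.alpha < 0.5f) {
        return AnyHitResult::Ignore;
    }
    return AnyHitResult::AcceptAndTerminate;
}

// Returns true if the light is occluded. `visited` counts how many
// candidates were examined before traversal stopped.
bool trace_shadow_ray(const std::vector<Candidate> &candidates, uint32_t &visited) {
    visited = 0;
    for (const Candidate &c : candidates) {
        ++visited;
        if (shadow_any_hit(c) == AnyHitResult::AcceptAndTerminate) {
            return true;
        }
    }
    return false;
}
```

A closest-hit-only workflow would have to consider every candidate to find the nearest one; here the traversal ends as soon as one opaque surface is found, which is all a shadow ray needs to know.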

Some extra reminders:

  • Given that the shader binary contents have changed, make sure to bump the shader binary version whenever you introduce such changes, so that users' shader caches are invalidated completely.
  • There's a misconception regarding transform buffer usage in BLAS and TLAS. Ideally, you want to embed the transform in the TLAS, as it gives you the highest flexibility of only rebuilding the TLAS with updated transforms when entities in the scene move. BLAS rebuilding is often avoided unless the mesh is skinned or has a dynamic vertex shader. While BLAS APIs allow you to pass through a transform, that is mostly just for pre-applying a transform to what is intended to be a static structure for the most part. If Godot implements a path traced renderer, it is very likely it'd only create and build a BLAS for a mesh during loading and never again unless the vertices themselves change.
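The BLAS/TLAS split described in the second reminder can be sketched as follows. This is a toy C++ model, not the PR's API: `Blas`, `Tlas`, `TlasInstance`, `build_blas`, `build_tlas`, and `move_instance` are hypothetical names, and the "build" functions only count invocations instead of recording GPU work.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Row-major 3x4 transform, the same shape as the demo's transform buffer.
using Transform3x4 = std::array<float, 12>;

struct Blas {
    uint32_t build_count = 0; // Geometry only; no per-entity transform.
};

struct TlasInstance {
    const Blas *blas;
    Transform3x4 transform; // The transform lives on the TLAS instance.
};

struct Tlas {
    std::vector<TlasInstance> instances;
    uint32_t build_count = 0;
};

void build_blas(Blas &blas) { ++blas.build_count; }
void build_tlas(Tlas &tlas) { ++tlas.build_count; }

// Moving an entity touches only its TLAS instance; the BLAS is left alone.
void move_instance(Tlas &tlas, size_t index, const Transform3x4 &transform) {
    assert(index < tlas.instances.size());
    tlas.instances[index].transform = transform;
    build_tlas(tlas);
}
```

In this model a mesh's BLAS is built once at load time, and every subsequent movement costs only a TLAS rebuild, which is the flexibility the reminder is after.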

<param index="1" name="uniform_set" type="RID" />
<param index="2" name="set_index" type="int" />
<description>
Binds the [param uniform_set] to this [param raytracing_list]. Godot ensures that all textures in the uniform set have the correct Vulkan access masks. If Godot had to change access masks of textures, it will raise a Vulkan image memory barrier.
Contributor

I imagine this is probably just a comment that was lifted from other documentation, but we shouldn't be referring to Vulkan barriers in the scope of RenderingDevice, as it's agnostic to the underlying driver API being used.

<method name="raytracing_list_begin">
<return type="int" />
<description>
Starts a list of raytracing drawing commands created with the [code]draw_*[/code] methods. The returned value should be passed to other [code]raytracing_list_*[/code] functions.
Contributor

The documentation refers to draw_ prefixed methods which are non-existent.

cmd_buf_info->cmd_list->SetGraphicsRoot32BitConstants(0, p_data.size(), p_data.ptr(), p_dst_first_index);
} else {
// TODO
ERR_FAIL_MSG("Unimplemented!");
Contributor

On D3D12, you can share the code with the compute branch.

Comment on lines +5960 to +5962
/********************/
/**** RAYTRACING ****/
/********************/
Contributor

I realize most of this is unimplemented, so I'll just leave some future reference on what the relevant classes are for these operations.

Comment on lines +5967 to +5968
// TODO
ERR_FAIL_V_MSG(AccelerationStructureID(), "Unimplemented!");
Contributor

D3D12_RAYTRACING_GEOMETRY_DESC, D3D12_BUILD_RAYTRACING_ACCELERATION_STRUCTURE_INPUTS, GetRaytracingAccelerationStructurePrebuildInfo.

ERR_FAIL_COND_V(!vertex_formats.has(vertex_array->description), RID());
vertex_format = vertex_formats[vertex_array->description].driver_id;
}
_check_transfer_worker_vertex_array(vertex_array);
Contributor

@DarioSamo DarioSamo Jan 6, 2025

These are currently misplaced, as the check must be performed during the actual BLAS/TLAS building, not during the creation of the BLAS/TLAS driver resource. That way the check is deferred as much as possible until the acceleration structure is actually built.

// VK_PIPELINE_STAGE_*_SHADER_BIT stages except VK_PIPELINE_STAGE_RAY_TRACING_SHADER_BIT_KHR

uint32_t acceleration_structure_barrier_count = 0;
LocalVector<uint32_t> acceleration_structure_barrier_indices;
Contributor

Same reasoning as everywhere else: thread_local or ALLOCA must be used on hot paths such as these (preferably TLS, as it won't blow up if it runs out of stack memory).
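A minimal sketch of the thread_local variant, assuming `std::vector` as a stand-in for Godot's `LocalVector` and a hypothetical function name `collect_barrier_indices`:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical hot-path helper: gathers the indices that need a barrier.
// The scratch vector is thread_local, so its heap allocation is made once
// per thread and then reused by every later call on that thread.
uint32_t collect_barrier_indices(const std::vector<bool> &needs_barrier) {
    thread_local std::vector<uint32_t> indices;
    indices.clear(); // Drops the elements but keeps the capacity.

    for (uint32_t i = 0; i < needs_barrier.size(); ++i) {
        if (needs_barrier[i]) {
            indices.push_back(i);
        }
    }
    return static_cast<uint32_t>(indices.size());
}
```

Unlike a stack allocation, the scratch storage lives on the heap, so an unusually large batch grows the vector instead of overflowing the stack.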

0, nullptr,
acceleration_structure_barrier_count, as_barriers,
0, nullptr);
}
Contributor

Considering the awkwardness of this implementation, I think it'd be warranted that we make up a fake new type of barrier specifically for acceleration structures.

From the driver's abstraction side, we identify buffers, textures and AS as separate entities. Therefore it'd make perfect sense to add a new type of barrier for AS specifically and add support to the graph to track these separately. That'd allow each back-end to have an easier time implementing the synchronization primitives required for it.
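One possible shape for such a barrier type, as a C++ sketch only: every struct, field, and function name below is hypothetical and does not reflect the existing RenderingDeviceDriver declarations.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical barrier descriptions, one per kind of resource.
struct BufferBarrier {
    uint64_t buffer_id;
    uint32_t src_access;
    uint32_t dst_access;
};

struct TextureBarrier {
    uint64_t texture_id;
    uint32_t src_access;
    uint32_t dst_access;
};

// The proposed addition: acceleration structures get their own barrier
// type instead of being smuggled through buffer barriers.
struct AccelerationStructureBarrier {
    uint64_t acceleration_structure_id;
    uint32_t src_access;
    uint32_t dst_access;
};

// What the graph would hand to a backend for one synchronization point.
struct BarrierBatch {
    std::vector<BufferBarrier> buffers;
    std::vector<TextureBarrier> textures;
    std::vector<AccelerationStructureBarrier> acceleration_structures;
};

// The graph tracks acceleration structures separately and queues their barriers here.
void add_acceleration_structure_barrier(BarrierBatch &batch, uint64_t id, uint32_t src_access, uint32_t dst_access) {
    batch.acceleration_structures.push_back({ id, src_access, dst_access });
}
```

Each backend would then be free to translate the third list however its API requires, without having to filter acceleration structures out of the buffer barriers first.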

Comment on lines +5730 to +5747
memcpy(sbt_data, handles_ptr + handle_index * handle_size, handle_size);
++handle_index;

sbt_data = sbt_ptr + shader_info->regions.raygen.size;
for (uint32_t i = 0; i < shader_info->regions.miss_count; ++i) {
memcpy(sbt_data, handles_ptr + handle_index * handle_size, handle_size);
sbt_data += shader_info->regions.miss.stride;
++handle_index;
}

sbt_data = sbt_ptr + shader_info->regions.raygen.size + shader_info->regions.miss.size;
for (uint32_t i = 0; i < shader_info->regions.closest_hit_count; ++i) {
memcpy(sbt_data, handles_ptr + handle_index * handle_size, handle_size);
sbt_data += shader_info->regions.closest_hit.stride;
++handle_index;
}

buffer_unmap(shader_info->sbt_buffer);
Contributor

If this data is filled out once and remains static during the lifetime of the shader, then it must be mapped and filled during creation, and only then. If it varies per dispatch call, then each call will completely overwrite the data used by the previous call. This seems to indicate to me that this is in the wrong spot.

As far as I can see, it currently seems to indicate that this depends on the pipeline itself, which means the buffer might belong there and not in the ShaderInfo and can probably be created and filled right after the raytracing pipeline is created.
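A sketch of filling the shader binding table once, at pipeline creation, written in plain C++ with a byte vector standing in for the mapped buffer. The struct and function names, and the idea of passing the alignments in as parameters, are assumptions for illustration; a real driver would take handle size and alignments from the device's raytracing pipeline properties.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

static uint32_t align_up(uint32_t value, uint32_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Hypothetical pipeline object that owns its shader binding table.
struct RaytracingPipeline {
    std::vector<uint8_t> sbt; // Stand-in for the mapped SBT buffer.
    uint32_t stride = 0;
    uint32_t miss_offset = 0;
    uint32_t hit_offset = 0;
};

// `handles` holds the group handles back to back: raygen, then misses, then hits.
RaytracingPipeline create_raytracing_pipeline(const std::vector<uint8_t> &handles,
        uint32_t handle_size, uint32_t handle_alignment, uint32_t base_alignment,
        uint32_t miss_count, uint32_t hit_count) {
    assert(handles.size() >= size_t(1 + miss_count + hit_count) * handle_size);

    RaytracingPipeline pipeline;
    pipeline.stride = align_up(handle_size, handle_alignment);
    pipeline.miss_offset = align_up(pipeline.stride, base_alignment);
    pipeline.hit_offset = align_up(pipeline.miss_offset + miss_count * pipeline.stride, base_alignment);
    pipeline.sbt.assign(pipeline.hit_offset + hit_count * pipeline.stride, 0);

    const uint8_t *src = handles.data();
    uint8_t *dst = pipeline.sbt.data();

    // Raygen record at offset 0.
    memcpy(dst, src, handle_size);
    src += handle_size;

    for (uint32_t i = 0; i < miss_count; ++i) {
        memcpy(dst + pipeline.miss_offset + i * pipeline.stride, src, handle_size);
        src += handle_size;
    }
    for (uint32_t i = 0; i < hit_count; ++i) {
        memcpy(dst + pipeline.hit_offset + i * pipeline.stride, src, handle_size);
        src += handle_size;
    }
    return pipeline; // Nothing is rewritten at trace time.
}
```

Because the table is written exactly once and owned by the pipeline, successive trace calls cannot overwrite each other's records.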

vkCmdBindDescriptorSets((VkCommandBuffer)p_cmd_buffer.id, VK_PIPELINE_BIND_POINT_RAY_TRACING_KHR, shader_info->vk_pipeline_layout, p_set_index, 1, &usi->vk_descriptor_set, 0, nullptr);
}

void RenderingDeviceDriverVulkan::command_raytracing_trace_rays(CommandBufferID p_cmd_buffer, RaytracingPipelineID p_pipeline, ShaderID p_shader, uint32_t p_width, uint32_t p_height) {
Contributor

The need for the pipeline as an argument can be skipped by just storing the last bound pipeline passed through command_bind_raytracing_pipeline and asserting for its existence.
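A sketch of that suggestion in isolation, with hypothetical names (`CommandBufferState`, `TraceCommand`, and both functions) and plain integers standing in for the driver's ID types:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-command-buffer state kept by the driver.
struct CommandBufferState {
    uint64_t bound_raytracing_pipeline = 0; // 0 means nothing is bound.
};

// What a recorded trace would carry; a real driver would record the
// command instead of returning it.
struct TraceCommand {
    uint64_t pipeline;
    uint32_t width;
    uint32_t height;
};

void bind_raytracing_pipeline(CommandBufferState &state, uint64_t pipeline_id) {
    state.bound_raytracing_pipeline = pipeline_id;
}

// No pipeline argument: the last bound pipeline is used, and its existence is asserted.
TraceCommand trace_rays(const CommandBufferState &state, uint32_t width, uint32_t height) {
    assert(state.bound_raytracing_pipeline != 0 && "No raytracing pipeline bound.");
    return { state.bound_raytracing_pipeline, width, height };
}
```

This keeps the trace call's signature down to the dispatch dimensions and makes a missing bind fail loudly rather than silently.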

@DarioSamo
Contributor

I've also left comments for a potential D3D12 implementation. If you feel that is out of the scope of the PR, feel free to ignore them and I can potentially take over that task when the time arrives.


10 participants