Buffer and pipeline creation performance differs greatly between DX12 and Vulkan #2921

Closed
ghost opened this issue Jul 29, 2022 · 2 comments


ghost commented Jul 29, 2022

In my tests, DX12 is about 10 times slower than Vulkan on my GPU.

What can I do to improve DX12 performance?

The logs follow:

============================== DX12 ==============================

  1. Adapter info = AdapterInfo { name: "NVIDIA GeForce RTX 3050 Ti Laptop GPU", vendor: 4318, device: 9696, device_type: DiscreteGpu, backend: Dx12 }

  2. Device features = POLYGON_MODE_LINE

  3. Device Limits { max_texture_dimension_1d: 16384, max_texture_dimension_2d: 16384, max_texture_dimension_3d: 2048, max_texture_array_layers: 256, max_bind_groups: 4, max_dynamic_uniform_buffers_per_pipeline_layout: 8, max_dynamic_storage_buffers_per_pipeline_layout: 0, max_sampled_textures_per_shader_stage: 16, max_samplers_per_shader_stage: 16, max_storage_buffers_per_shader_stage: 0, max_storage_textures_per_shader_stage: 0, max_uniform_buffers_per_shader_stage: 11, max_uniform_buffer_binding_size: 16384, max_storage_buffer_binding_size: 0, max_vertex_buffers: 8, max_vertex_attributes: 16, max_vertex_buffer_array_stride: 255, max_push_constant_size: 0, min_uniform_buffer_offset_alignment: 256, min_storage_buffer_offset_alignment: 256, max_inter_stage_shader_components: 60, max_compute_workgroup_storage_size: 0, max_compute_invocations_per_workgroup: 0, max_compute_workgroup_size_x: 0, max_compute_workgroup_size_y: 0, max_compute_workgroup_size_z: 0, max_compute_workgroups_per_dimension: 0, max_buffer_size: 268435456 }

Test Buffer 1: create_buffer_init, size = 65508 B, time = 642.9µs
Vertex Buffer: create_buffer, size = 576 B, time = 264.1µs
Vertex Buffer: write_buffer, offset = 0 B, size = 576 B, time = 282.1µs
Vertex Buffer: create_buffer_init, size = 576 B, time = 515.9µs
Index Buffer: create_buffer_init, size = 72 B, time = 409.6µs
Index Buffer: create_buffer, size = 72 B, time = 244.4µs
Index Buffer: write_buffer, offset = 0 B, size = 72 B, time = 250.4µs
Test Buffer 1: create_buffer_init, size = 65508 B, time = 450.6µs
Test Buffer 1: create_buffer, size = 65508 B, time = 187.4µs
Test Buffer 1: write_buffer, offset = 0 B, size = 65508 B, time = 204µs
Test Buffer 2: create_buffer_init, size = 65508 B, time = 397.5µs
Test Buffer 2: create_buffer, size = 65508 B, time = 217.7µs
Test Buffer 2: write_buffer, offset = 0 B, size = 65508 B, time = 178.8µs
Test Buffer 3: create_buffer_init, size = 65508 B, time = 380.7µs
Test Buffer 3: create_buffer, size = 65508 B, time = 212.1µs
Test Buffer 3: write_buffer, offset = 0 B, size = 65508 B, time = 242.1µs
Test Buffer 4: create_buffer_init, size = 65508 B, time = 390.4µs
Test Buffer 4: create_buffer, size = 65508 B, time = 220.2µs
Test Buffer 4: write_buffer, offset = 0 B, size = 65508 B, time = 218.2µs
Test Buffer 5: create_buffer_init, size = 65508 B, time = 399.5µs
Test Buffer 5: create_buffer, size = 65508 B, time = 236.8µs
Test Buffer 5: write_buffer, offset = 0 B, size = 65508 B, time = 176.9µs
Test Buffer 6: create_buffer_init, size = 65508 B, time = 421.2µs
Test Buffer 6: create_buffer, size = 65508 B, time = 245.1µs
Test Buffer 6: write_buffer, offset = 0 B, size = 65508 B, time = 186µs
Test Buffer 7: create_buffer_init, size = 65508 B, time = 380.8µs
Test Buffer 7: create_buffer, size = 65508 B, time = 231µs
Test Buffer 7: write_buffer, offset = 0 B, size = 65508 B, time = 194.5µs
Test Buffer 8: create_buffer_init, size = 65508 B, time = 416.7µs
Test Buffer 8: create_buffer, size = 65508 B, time = 200.1µs
Test Buffer 8: write_buffer, offset = 0 B, size = 65508 B, time = 240.1µs
Test Buffer 9: create_buffer_init, size = 65508 B, time = 388.6µs
Test Buffer 9: create_buffer, size = 65508 B, time = 289.6µs
Test Buffer 9: write_buffer, offset = 0 B, size = 65508 B, time = 195.5µs
Test Buffer 10: create_buffer_init, size = 65508 B, time = 424.2µs
Test Buffer 10: create_buffer, size = 65508 B, time = 218.2µs
Test Buffer 10: write_buffer, offset = 0 B, size = 65508 B, time = 227.4µs
World_Matrix: create_buffer_init, size = 64 B, time = 411.6µs
bind_group_camera create_bind_group, time = 79.5µs
bind_group_cube create_bind_group, time = 6.1µs
Shader: create_shader_module, time = 743.7µs
create_render_pipeline, time = 23.2472ms
pipeline_wire create_render_pipeline, time = 13.9877ms
Avg frame time: 0.73338383 ms
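
The timing values in this log have the shape that Rust's `Debug` formatting of `std::time::Duration` produces (for example `642.9µs`), so they were presumably collected by wrapping each call in an `Instant` measurement. A minimal sketch of such a wrapper follows. The `timed` helper and the `Vec` allocation standing in for a wgpu call are illustrative assumptions, not the code used in this issue.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: run `f`, print the elapsed wall-clock time under
/// `label`, and return the closure's result together with the duration.
fn timed<T>(label: &str, f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    let elapsed = start.elapsed();
    // `{:?}` on a Duration picks a unit automatically (ns, µs, ms, s).
    println!("{}, time = {:?}", label, elapsed);
    (out, elapsed)
}

fn main() {
    // Stand-in workload (assumption). In a wgpu program the closure would
    // wrap a call such as buffer or pipeline creation instead.
    let (buf, _elapsed) = timed("Test Buffer: allocate, size = 65508 B", || {
        vec![0u8; 65508]
    });
    assert_eq!(buf.len(), 65508);
}
```

A wrapper like this measures only the CPU-side wall-clock time of the call itself, so it says nothing about when the GPU finishes any work the call schedules.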

============================== Vulkan ==============================

  1. Adapter info = AdapterInfo { name: "NVIDIA GeForce RTX 3050 Ti Laptop GPU", vendor: 4318, device: 9696, device_type: DiscreteGpu, backend: Vulkan }

  2. Device features = POLYGON_MODE_LINE

  3. Device Limits { max_texture_dimension_1d: 32768, max_texture_dimension_2d: 32768, max_texture_dimension_3d: 16384, max_texture_array_layers: 256, max_bind_groups: 4, max_dynamic_uniform_buffers_per_pipeline_layout: 8, max_dynamic_storage_buffers_per_pipeline_layout: 0, max_sampled_textures_per_shader_stage: 16, max_samplers_per_shader_stage: 16, max_storage_buffers_per_shader_stage: 0, max_storage_textures_per_shader_stage: 0, max_uniform_buffers_per_shader_stage: 11, max_uniform_buffer_binding_size: 16384, max_storage_buffer_binding_size: 0, max_vertex_buffers: 8, max_vertex_attributes: 16, max_vertex_buffer_array_stride: 255, max_push_constant_size: 0, min_uniform_buffer_offset_alignment: 256, min_storage_buffer_offset_alignment: 256, max_inter_stage_shader_components: 60, max_compute_workgroup_storage_size: 0, max_compute_invocations_per_workgroup: 0, max_compute_workgroup_size_x: 0, max_compute_workgroup_size_y: 0, max_compute_workgroup_size_z: 0, max_compute_workgroups_per_dimension: 0, max_buffer_size: 268435456 }

Test Buffer 1: create_buffer_init, size = 65508 B, time = 351.2µs
Vertex Buffer: create_buffer, size = 576 B, time = 8.6µs
Vertex Buffer: write_buffer, offset = 0 B, size = 576 B, time = 68.4µs
Vertex Buffer: create_buffer_init, size = 576 B, time = 10.1µs
Index Buffer: create_buffer_init, size = 72 B, time = 9.9µs
Index Buffer: create_buffer, size = 72 B, time = 8.6µs
Index Buffer: write_buffer, offset = 0 B, size = 72 B, time = 7.1µs
Test Buffer 1: create_buffer_init, size = 65508 B, time = 11.2µs
Test Buffer 1: create_buffer, size = 65508 B, time = 3.6µs
Test Buffer 1: write_buffer, offset = 0 B, size = 65508 B, time = 8.8µs
Test Buffer 2: create_buffer_init, size = 65508 B, time = 16.3µs
Test Buffer 2: create_buffer, size = 65508 B, time = 12.1µs
Test Buffer 2: write_buffer, offset = 0 B, size = 65508 B, time = 25.5µs
Test Buffer 3: create_buffer_init, size = 65508 B, time = 9.2µs
Test Buffer 3: create_buffer, size = 65508 B, time = 8.5µs
Test Buffer 3: write_buffer, offset = 0 B, size = 65508 B, time = 10.2µs
Test Buffer 4: create_buffer_init, size = 65508 B, time = 11µs
Test Buffer 4: create_buffer, size = 65508 B, time = 3.5µs
Test Buffer 4: write_buffer, offset = 0 B, size = 65508 B, time = 9µs
Test Buffer 5: create_buffer_init, size = 65508 B, time = 9.5µs
Test Buffer 5: create_buffer, size = 65508 B, time = 4µs
Test Buffer 5: write_buffer, offset = 0 B, size = 65508 B, time = 11.7µs
Test Buffer 6: create_buffer_init, size = 65508 B, time = 10.7µs
Test Buffer 6: create_buffer, size = 65508 B, time = 16.9µs
Test Buffer 6: write_buffer, offset = 0 B, size = 65508 B, time = 13.1µs
Test Buffer 7: create_buffer_init, size = 65508 B, time = 16µs
Test Buffer 7: create_buffer, size = 65508 B, time = 7µs
Test Buffer 7: write_buffer, offset = 0 B, size = 65508 B, time = 12µs
Test Buffer 8: create_buffer_init, size = 65508 B, time = 19.6µs
Test Buffer 8: create_buffer, size = 65508 B, time = 3.4µs
Test Buffer 8: write_buffer, offset = 0 B, size = 65508 B, time = 11.2µs
Test Buffer 9: create_buffer_init, size = 65508 B, time = 9.5µs
Test Buffer 9: create_buffer, size = 65508 B, time = 13.9µs
Test Buffer 9: write_buffer, offset = 0 B, size = 65508 B, time = 7.9µs
Test Buffer 10: create_buffer_init, size = 65508 B, time = 8.7µs
Test Buffer 10: create_buffer, size = 65508 B, time = 3.5µs
Test Buffer 10: write_buffer, offset = 0 B, size = 65508 B, time = 5.1µs
bind_group_layout_camera create_bind_group_layout, time = 73.1µs
bind_group_layout_cube create_bind_group_layout, time = 16.6µs
pipeline_layout create_pipeline_layout, time = 258.8µs
texture: create_texture, size = Extent3d { width: 256, height: 256, depth_or_array_layers: 1 }, time = 78.4µs
texture: create_view, time = 57.8µs
texture: write_texture, size = 256, time = 100.3µs
View_Matrix: create_buffer_init, size = 64 B, time = 12.2µs
Project_Matrix: create_buffer_init, size = 64 B, time = 8.8µs
World_Matrix: create_buffer_init, size = 64 B, time = 8.9µs
bind_group_camera create_bind_group, time = 309.8µs
bind_group_cube create_bind_group, time = 28.5µs
Shader: create_shader_module, time = 545.5µs
create_render_pipeline, time = 3.8281ms
pipeline_wire create_render_pipeline, time = 1.3159ms
Avg frame time: 0.34925503 ms
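
To put a number on the gap between the two logs, the per-operation times can be averaged and compared. The sketch below is one way to do that. The function names are hypothetical, the parser only understands the `µs` and `ms` units that appear above, and `main` uses just the three "Test Buffer 2" lines from each log as sample input.

```rust
/// Parse the `time = ...` suffix of a log line into microseconds.
/// Returns None for lines without a `time = ` field or with an unknown unit.
fn parse_time_us(line: &str) -> Option<f64> {
    let (_, value) = line.split_once("time = ")?;
    let value = value.trim();
    if let Some(n) = value.strip_suffix("µs") {
        n.trim().parse::<f64>().ok()
    } else if let Some(n) = value.strip_suffix("ms") {
        n.trim().parse::<f64>().ok().map(|ms| ms * 1000.0)
    } else {
        None
    }
}

/// Mean time in microseconds of all lines of the form `<label>: <op>, ...`.
/// The trailing comma keeps `create_buffer` from matching `create_buffer_init`.
fn mean_time_us(log: &str, op: &str) -> Option<f64> {
    let needle = format!(": {},", op);
    let times: Vec<f64> = log
        .lines()
        .filter(|l| l.contains(needle.as_str()))
        .filter_map(parse_time_us)
        .collect();
    if times.is_empty() {
        None
    } else {
        Some(times.iter().sum::<f64>() / times.len() as f64)
    }
}

fn main() {
    // Sample input: the "Test Buffer 2" lines copied from each log.
    let dx12 = "Test Buffer 2: create_buffer_init, size = 65508 B, time = 397.5µs\n\
                Test Buffer 2: create_buffer, size = 65508 B, time = 217.7µs\n\
                Test Buffer 2: write_buffer, offset = 0 B, size = 65508 B, time = 178.8µs";
    let vulkan = "Test Buffer 2: create_buffer_init, size = 65508 B, time = 16.3µs\n\
                  Test Buffer 2: create_buffer, size = 65508 B, time = 12.1µs\n\
                  Test Buffer 2: write_buffer, offset = 0 B, size = 65508 B, time = 25.5µs";

    for op in ["create_buffer_init", "create_buffer", "write_buffer"] {
        if let (Some(d), Some(v)) = (mean_time_us(dx12, op), mean_time_us(vulkan, op)) {
            println!("{}: DX12 {:.1}µs, Vulkan {:.1}µs, ratio {:.1}x", op, d, v, d / v);
        }
    }
}
```

Feeding the full logs into `mean_time_us` instead of the excerpt gives a steadier estimate, since individual calls vary noticeably from one buffer to the next.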

cwfitzgerald (Member) commented Jul 29, 2022

A good chunk of the difference in buffer creation time should be due to #2720.

The pipeline issues will likely be partially solved by #2722, as fxc is very slow, but the long-term fix is for us to have a DXIL backend.

ghost (Author) commented Jul 29, 2022

Thank you very much.

ghost closed this as completed Jul 29, 2022