-
Notifications
You must be signed in to change notification settings - Fork 849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SPIR-V: prune unreachable merge block and continue target #1943
Conversation
Existing tests affected: [==========] 1372 tests from 54 test suites ran. (4038 ms total) |
I generated the test changes locally (easier to include them and have them here to see and comment on). Most look okay, but some are losing test cases now:
Would like confirmation that this change should not cross the line into being an actual optimization. I think it's not, because at the physical CFG level there actually were no branches to the beginning of the code that disappeared. Yes? It will be required though to change the test sources a bit such that they don't have most their results eliminated by this change. I think I identified the necessary changes above. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Just a couple questions and then nits about MSVC and indentation.
SPIRV/SpvPostProcess.cpp
Outdated
Block* merge = merge_and_header.first; | ||
merge->forceDeadMerge(); | ||
} | ||
for (auto continue_and_header : headerForUnreachableContinue) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This :
is one of those features not initially supported in MSVC with C++11. There is another PR wanting to add = default
, which is a similar situation. It's pretty trivial to not do this, but then maybe it's time to try again and see if anyone still complains.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack. I'll fix in a subsequent patch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in the next commit by using iterator iteration.
SPIRV/spvIR.h
Outdated
// The different reasons for reaching a block in the inReadableOrder traversal. | ||
typedef enum ReachReason { | ||
// Reachable from the entry block via transfers of control, i.e. branches. | ||
ReachViaControlFlow = 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Four-space indent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed in the next commit.
SPIRV/spvIR.h
Outdated
// as necessary. A canonical dead merge block has only an OpLabel and an | ||
// OpUnreachable. | ||
void forceDeadMerge() { | ||
assert(localVariables.empty()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks like 7-space indentation; probably you intended 8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in the next commit.
return; | ||
o = 3; | ||
} | ||
// This is considered reachable since we don't assume |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Saying it stronger: we should be quite clear this PR is not an optimization and has no smarts, other than taking away something that was not hooked up.
Is that all true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's the intent.
I added an option to get back to the old behaviour. |
I think this is ready for review now. |
I really hope this is non-modal. There should be the one right way to do this, with one set of test results. The amount of change seemed fine earlier (haven't caught up with the latest round of changes). |
Point taken. I confirmed that for the top-of-tree master branch of Vulkan CTS, all the shaders compile and pass SPIR-V validation. I'm continuing to investigate. |
The Vulkan CTS tests has had targeted OpUnreachable tests since November 2015, or before Vulkan 1.0. See this version of the SPIR-V assembly-based tests: https://github.com/KhronosGroup/VK-GL-CTS/blob/6632073b4f79df50f93b27c3d2ed005d21f2c367/external/vulkancts/modules/vulkan/spirv_assembly/vktSpvAsmInstructionTests.cpp#L315 This interesting different cases include:
I'm attaching a .csv file that analyzes the test diffs in the current version of the patch.
glslang-pruning-dead-code-cases.csv.txt I'm pretty happy with the existing test coverage. So I'll update the PR to make the new behaviour the one and only behaviour. That will simplify the code and testing. |
More aggressively prune unreachable code as follows. When no control flow edges reach a merge block or continue target: - delete their contents so that: - a merge block becomes OpLabel, then OpUnreachable - a continue target becomes OpLabel, then an OpBranch back to the loop header - any basic block which is dominated by such a merge block or continue target is removed as well. - decorations targeting the removed instructions are removed. Enables the SPIR-V builder post-processing step the GLSLANG_WEB case.
I think this is ready review, and that questions have hopefully addressed. |
We should bump up the Also, tag, etc. I will do these, before merging. |
// based on the resulting SPIR-V. | ||
// Note: WebGPU code generation must have the opportunity to aggressively |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This turned on lots of code having to do with types and capabilities that should not be relevant to WebGPU.
Don't we need to just turn on the CFG topology part?
I can look at that too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, just reply with the specific subset needed by WebGPU, and I'll do the experiment/pruning of other things and verify how size is impacted.
(First, I'll merge this, and then do that as a second step to get space back for WebGPU.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we need to just turn on the CFG topology part?
In principle, yes.
But when I looked more deeply at Builder::postProcess, I became worried that the WEB path was missing functionality, particularly around adding per-instruction capabilities and extensions.
Looking again, at the moment those additional things are only affecting:
- 8bit and 16bit storage
- physical storage
Neither of these are in the WEB variant for now, so that could be saved.
I can take on getting that code space back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I posted #1968 to reclaim the codespace wasted by this part of the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks like it solves it, thanks.
@@ -268,10 +299,24 @@ class Block { | |||
bool unreachable; | |||
}; | |||
|
|||
// The different reasons for reaching a block in the inReadableOrder traversal. | |||
typedef enum ReachReason { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the purpose of typedef
here without also declaring a name? I'm getting warnings on this have no purpose. Does it work just as well to remove the typedef
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, I removed the typedef
.
I'm going to push this now, with some modifications, including new test results due to bumping up the generator version number. |
The SPIR-V post-processing to discover capabilities and extensions does not apply to WebGPU compilation. So don't include that code. This reclaims some of the code space added by KhronosGroup#1943
Within a function, only emit a block if:
Includes tests showing intersting cases.
This is WIP. I haven't updated the existing tests that are affected by this change.