Support f32-dwconv-unipass with `channel_tile = 8v`. Note that we use the tail-undisturbed (`tu`) variants of the intrinsics, because we observed that the original tail-agnostic (`ta`) intrinsics caused register spilling.

Our experimental observations:

Here is the code using the original intrinsics and the code using the `_tu` variant intrinsics:

And here is part of the disassembly of `f32-dwconv-9p8vc-minmax-rvv.c.o`:
`.github/workflows/build.yml`.
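For readers unfamiliar with the two tail policies mentioned above, here is a minimal scalar sketch of how they differ semantically. It is not the kernel code from this PR and it does not reproduce or explain the register spilling. The function names `model_fmacc_tu` and `model_fmacc_ta` are hypothetical, and plain arrays stand in for vector registers. Under tail undisturbed, destination lanes at index `vl` and above keep their previous values. Under tail agnostic, the RVV specification allows each of those lanes either to keep its value or to be overwritten with all 1s; the model below picks the all-1s outcome to make the difference visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a vector multiply-accumulate,
 * acc[i] += a[i] * b[i], applied to the first vl of vlmax lanes.
 * Tail-undisturbed policy: lanes [vl, vlmax) of acc are left untouched. */
void model_fmacc_tu(float *acc, const float *a, const float *b,
                    size_t vl, size_t vlmax) {
  assert(vl <= vlmax);
  for (size_t i = 0; i < vl; i++) {
    acc[i] += a[i] * b[i];
  }
  /* Nothing is written to the tail lanes, so they keep their old values. */
}

/* Same operation under the tail-agnostic policy. The specification permits
 * each tail lane either to keep its value or to become all 1s; this model
 * chooses all 1s (an assumption made purely for illustration). */
void model_fmacc_ta(float *acc, const float *a, const float *b,
                    size_t vl, size_t vlmax) {
  assert(vl <= vlmax);
  for (size_t i = 0; i < vl; i++) {
    acc[i] += a[i] * b[i];
  }
  const uint32_t all_ones = UINT32_MAX;
  for (size_t i = vl; i < vlmax; i++) {
    /* Overwrite the lane's bit pattern with all 1s. */
    memcpy(&acc[i], &all_ones, sizeof all_ones);
  }
}
```

With `vl = 2` and `vlmax = 4`, the `tu` model leaves `acc[2]` and `acc[3]` exactly as they were, while the `ta` model leaves them with unspecified contents from the caller's point of view, so code that reads the tail afterwards is only correct with the `tu` policy.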