Enable libsleefdft and libsleefquad on riscv64 #503
Enabling inline headers will come next: rivosinc#4. It refactors this code a little bit, but I assume it's better to draw a line under this before moving on.
@blapie could we please get a review on this? IMO it should be ready for RISC-V. It's also passing on CI: https://github.com/rivosinc/sleef/actions/runs/7596179815
Hi! Will have a look this week! Thanks for the contribution already!
Looks sensible to me, but might need to have a second look because atm I'm mostly just trusting you on the technicalities.
Soliciting review from @ericlove and @GlassOfWhiskey. I understand you were contributors at the beginning of this PR.
I'm not sure about some of the cast/truncate implementations here. vint is a different size from vmask (not sure why, is that ok?), and when casting between the two there are a mix of different policies for which bits to throw away. Unfortunately, if I change the implementation the tests still pass. What's the expectation for functions that go between vint and vmask? This has come up where I've extended the logic for quad support, where I did indeed get test failures, but only in RVVM2 tests, where the truncation frees a whole register which the compiler can reuse. In RVVM1, bugs can be hidden because vint uses half a register, and if you truncate and un-truncate, the data can survive.
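To make the concern above concrete, here is a minimal scalar sketch in plain C, not the RVV helper code itself. It assumes a 128-bit register holding two 64-bit mask lanes, and a vint made of two 32-bit lanes. All names (`reg128`, `narrow_lanes`, `truncate_reg`, `round_trip`) are hypothetical and do not appear in SLEEF. It shows two plausible "which bits to throw away" policies giving different results, and how a reinterpreting round trip can look lossless as long as the unused part of the register is not reused.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: one 128-bit register viewed as four 32-bit words. */
typedef struct { uint32_t w[4]; } reg128;

/* Policy A (assumed): narrow each 64-bit mask lane to its low 32 bits. */
static void narrow_lanes(const uint64_t mask[2], uint32_t out[2]) {
    for (int i = 0; i < 2; i++) out[i] = (uint32_t)mask[i];
}

/* Policy B (assumed): reinterpret the register as 32-bit words and keep
   only the first two, i.e. the low and high halves of mask lane 0. */
static void truncate_reg(const uint64_t mask[2], uint32_t out[2]) {
    out[0] = (uint32_t)mask[0];
    out[1] = (uint32_t)(mask[0] >> 32);
}

/* Round trip by reinterpretation. 'clobber' models the compiler reusing
   the part of the register that the truncated value no longer covers. */
static reg128 round_trip(reg128 r, int clobber) {
    if (clobber) { r.w[2] = 0; r.w[3] = 0; }
    return r; /* without clobbering, the stale upper words survive */
}
```

Under this model, a test that only exercises the non-clobbering case cannot tell a correct conversion from one that relies on stale register contents, which would be consistent with failures showing up only in one configuration.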
What do you mean by "there are a mix of different policies"? Do you mean cross vector extensions in SLEEF?
I'm not aware of any particular expectation, as long as the choice is consistent, but my understanding of the dft part of the library is limited.
Sorry for taking so long to join this conversation, but just thought it might be useful to chime in with my memory of why The Meanwhile, Hopefully one of these constraints I described is unnecessary, and can be relaxed to simplify both the original RVV design, and your work on the DFT extensions.
To clarify what I mean about 'different policies', here's the list of conversions to or from the vmask type that I found (supposing a configuration where m1 represents the minimum 128 bits):
Here I think Though that's in my local work tree after I've refactored a bit. Comparing to the version in the current pull request, I see two implementations for
But all the tests pass. I'll try rationalising the size of
The way it's used in sleefsimdsp.c, it looks like it's just creating pairs of complementary floats, so it's not unreasonable to cast it back to a 32-bit type having the same VECTLENSP as float32 operations. Then it works out to be the same bit width for the DP and SP cases. It just needs a temporary 64-bit splat while the pair of floats is unrolled. Working that change through piece by piece, it looks like all (or just most?) of the compatibility workarounds between SP and DP disappear. So I guess I'll do that.
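As an illustration of the idea described above, the following plain C sketch models packing a pair of floats into one 64-bit element, splatting that element, and then viewing the result as 32-bit lanes. It is a scalar model under stated assumptions, not the RVV implementation: `VECTLENSP` is fixed at 4 here, the low half of each 64-bit element is assumed to map to the even lane, and `splat_pair`, `float_bits` and `bits_float` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VECTLENSP 4 /* assumed: four 32-bit lanes in a 128-bit register */

/* Bit-exact conversions between float and its 32-bit pattern. */
static uint32_t float_bits(float f) { uint32_t u; memcpy(&u, &f, sizeof u); return u; }
static float bits_float(uint32_t u) { float f; memcpy(&f, &u, sizeof f); return f; }

/* Fill 'out' with the repeating pattern lo, hi, lo, hi, ... */
static void splat_pair(float lo, float hi, float out[VECTLENSP]) {
    /* Pack the pair into one 64-bit element (lo in the low half). */
    uint64_t packed = ((uint64_t)float_bits(hi) << 32) | float_bits(lo);

    /* Temporary 64-bit splat: VECTLENSP / 2 copies of the packed pair. */
    uint64_t tmp[VECTLENSP / 2];
    for (int i = 0; i < VECTLENSP / 2; i++) tmp[i] = packed;

    /* Unroll each 64-bit element back into two 32-bit float lanes. */
    for (int i = 0; i < VECTLENSP / 2; i++) {
        out[2 * i]     = bits_float((uint32_t)tmp[i]);
        out[2 * i + 1] = bits_float((uint32_t)(tmp[i] >> 32));
    }
}
```

For example, `splat_pair(1.0f, -1.0f, out)` fills `out` with alternating `1.0f` and `-1.0f`. The result has the same lane count as any other single-precision vector, so in this model nothing downstream needs to know a 64-bit type was involved.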
605cb26 to be6106e
Looks like the first step in rebasing automatically closed this PR? Anyway, I've rebased it now, and fixed or understood everything that I was worried about.
760f34a to 75c1178
I've carved out some of this patch into #520 and #521. I'm not sure how GitHub works here, but I assume that this patch now depends on those PRs going in first, and that when they do, this diff will shrink? #522 to follow. Diff from #521: rivosinc/sleef@dev/shosie/fix-rvv-warnings...dev/shosie/fix-rvv-dft
75c1178 to 5bd14f3
5bd14f3 to a5ab642
Ok, all 4 PRs make sense for the most part. Really appreciate you linking to the diffs, although my eyes started to burn a bit after 2 PRs...
a5ab642 to 82411c1
@sh1boot This one needs a rebase now.
Co-Authored-By: Simon Hosie
82411c1 to b1c49df
@blapie I've rebased it and updated this PR. The diff between before and after the rebase is the same: https://github.com/shibatch/sleef/compare/82411c1f1883a528471db11c7439b56857038de8..b1c49df4ebe7163f6ab6c553586d0840a46f627e
Continuation of #496
Mostly just the in-fill of src/arch/helperrvv.h.