In the context of #17 (Vector extension: Carry-less Multiply) - How best to handle needing to reverse bits-in-bytes to get the input in the correct form?
In Markku's analysis for the scalar implementation of GCM, he used the Bitmanip "grev" instruction to reverse the bits in every byte of the input. I can't see an existing instruction in the V-spec which does this efficiently. Would it be reasonable to add an instruction like
vbrev.vv vrt // Bit reverse, destructive.
which reverses the bits in every element based on SEW, and to require support for SEW=8 in the crypto extension?
There are other (non-obvious) ways to do bit reversal quickly which look amenable to vectorising, which we can fall back on.
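To make the proposed semantics concrete, here is a minimal scalar reference model in C. It is only a sketch of how I read the proposal: the names `brev_element` and `vbrev_sew8_model` are hypothetical and exist only for illustration, and the array stands in for a vector register group with `vl` active elements at SEW=8.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the low `sew` bits of x, where sew is the element width
 * in bits (8, 16, 32 or 64). Bits above `sew` are discarded. */
static uint64_t brev_element(uint64_t x, unsigned sew)
{
    uint64_t r = 0;
    for (unsigned i = 0; i < sew; i++) {
        r = (r << 1) | (x & 1); /* move the current low bit of x into r */
        x >>= 1;
    }
    return r;
}

/* Hypothetical model of the proposed instruction at SEW=8:
 * every byte element is bit-reversed in place (destructive form). */
static void vbrev_sew8_model(uint8_t *v, size_t vl)
{
    for (size_t i = 0; i < vl; i++)
        v[i] = (uint8_t)brev_element(v[i], 8);
}
```

For example, under this model a byte element holding `0xE1` becomes `0x87`, and the position of each byte within the vector is unchanged.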
The best alternative method I have seen so far, adapted for RVV from one of the methods linked in the main post, uses masking and shifting. It takes 15 base vector extension instructions to reverse the order of the 8 bits in every byte of every element in a vector (not counting loads and other set-up overhead).
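As a rough illustration of the masking-and-shifting idea, here is a scalar C sketch that reverses the bits within each byte of a 64-bit word. This is the well-known swap-nibbles, swap-pairs, swap-bits technique and is an assumption about the general shape of the approach, not necessarily the exact sequence counted above; the function name `brev8_swar` is hypothetical.

```c
#include <stdint.h>

/* Reverse the bits inside each byte of x; byte order is unchanged.
 * Each stage is two ANDs, two shifts and one OR. */
static uint64_t brev8_swar(uint64_t x)
{
    /* Swap the two nibbles of every byte. */
    x = ((x & 0x0F0F0F0F0F0F0F0FULL) << 4) | ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    /* Swap adjacent bit pairs within every nibble. */
    x = ((x & 0x3333333333333333ULL) << 2) | ((x >> 2) & 0x3333333333333333ULL);
    /* Swap adjacent single bits within every pair. */
    x = ((x & 0x5555555555555555ULL) << 1) | ((x >> 1) & 0x5555555555555555ULL);
    return x;
}
```

Written this way, each of the three stages maps to five element-wise operations (AND, shift left, shift right, AND, OR), so a direct vector translation would also come to 15 instructions, assuming the masks are supplied as scalar operands. None of the shifts crosses a byte boundary after masking, so the same sequence works at any SEW.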