chore(interpreter): optimisation for BYTE, SHL, SHR and SAR #1418

jpgonzalezra · 2024-05-13T19:38:30Z

Hey guys, this is my first contribution here, so nice to meet you. I have been working on issue #1251.

Motivation:
Investigate the performance of the BYTE, SHL, SHR and SAR opcodes and try to improve it.

Solution:
The improvements I've tried to apply in the resulting assembly were:

reduced branching
optimized memory access
fewer instructions

I'm going to attach the before and after assembly for the opcode functions.

DaniPopes · 2024-05-13T20:17:16Z

crates/interpreter/src/instructions/bitwise.rs

-    };
+    let o1 = as_usize_saturated!(op1) % 32;
+    let byte_value = op2.byte(31 - o1);
+    *op2 = U256::from(byte_value);


I don't think this is right, but also not sure if there is a test for this in statetests

This way, when 'o1' exceeds 31, the function returns ZERO. To make sure the implementation works as expected, can you suggest any specific tests we should run, just in case?

I don't think so? 33 % 32 = 1, so it will read byte 31 - 1 = 30 instead of returning zero.

Yes, you're right. Sometimes we have things right in front of our eyes and they escape us. If you have any ideas on how to improve the method, please let me know, I am going to continue thinking about it ^^
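To make the objection concrete, here is a minimal sketch that models the word as a plain big-endian `[u8; 32]` rather than the `U256` type used in the PR. The names `evm_byte`, `evm_byte_wrapping` and `sample_word` are hypothetical and exist only for this illustration. With a big-endian array, index `o1` corresponds to the `31 - o1` position passed to `byte(..)` in the diff.

```rust
/// EVM BYTE semantics on a 32-byte big-endian word: index 0 is the most
/// significant byte, and any index of 32 or more yields zero.
fn evm_byte(index: usize, word: &[u8; 32]) -> u8 {
    if index < 32 {
        word[index]
    } else {
        0
    }
}

/// The variant questioned in the review: reducing the index modulo 32 wraps
/// out-of-range indices back into the word instead of returning zero.
fn evm_byte_wrapping(index: usize, word: &[u8; 32]) -> u8 {
    word[index % 32]
}

/// Hypothetical test word whose byte at position i holds the value i + 1,
/// so every in-range byte is non-zero and easy to recognise.
fn sample_word() -> [u8; 32] {
    let mut word = [0u8; 32];
    for (i, b) in word.iter_mut().enumerate() {
        *b = i as u8 + 1;
    }
    word
}
```

For index 33, `evm_byte` returns zero while `evm_byte_wrapping` returns the byte at position 1, which is the mismatch pointed out above.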

We can shift op2 right by the calculated number of bits to align the desired byte and isolate it with a bitmask, but I don't think that's better than byte(..).

something like this:

*op2 = if o1 < 32 { let shift = (31 - o1) * 8; U256::from((*op2 >> shift) & U256::from(0xFF)) } else { U256::ZERO };

What do you think?

I believe byte is already optimal, since it will simply index into the value as a byte array. Thanks for the effort anyway.

Thanks, I'm going to leave this method as it was =)
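The two approaches compared in this thread can be sketched side by side. This is only an illustration under a simplifying assumption: it uses `u128` (16 bytes) as a stand-in for the 256-bit word, and the function names `byte_by_index` and `byte_by_shift` are hypothetical. Both compute the same result; the indexing version avoids the wide shift and mask.

```rust
/// Read the byte at a big-endian index by viewing the value as a byte array.
/// Out-of-range indices yield zero, mirroring BYTE semantics.
fn byte_by_index(index: usize, value: u128) -> u8 {
    if index < 16 {
        value.to_be_bytes()[index]
    } else {
        0
    }
}

/// Read the same byte by shifting it down to the lowest position and masking.
fn byte_by_shift(index: usize, value: u128) -> u8 {
    if index < 16 {
        let shift = (15 - index) * 8;
        ((value >> shift) & 0xFF) as u8
    } else {
        0
    }
}
```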

DaniPopes

This looks fine to me, I have not verified all assembly but it looks like they would be improved.

rakita · 2024-05-16T08:14:20Z

crates/interpreter/src/instructions/bitwise.rs

 }

 /// EIP-145: Bitwise shifting instructions in EVM
 pub fn shr<H: Host + ?Sized, SPEC: Spec>(interpreter: &mut Interpreter, _host: &mut H) {
    check!(interpreter, CONSTANTINOPLE);
    gas!(interpreter, gas::VERYLOW);
    pop_top!(interpreter, op1, op2);
-    *op2 >>= as_usize_saturated!(op1);
+    let shift = as_usize_saturated!(op1);
+    *op2 = if shift < 256 {


I was looking at what ruint does:
https://github.com/recmo/uint/blob/b041f09be7035bb61f8e52e39194eb838e832483/src/bits.rs#L353-L356

shift < 256 is better.

Nice. So, are we on the right track?
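For reference, EIP-145 defines SHR to yield zero when the shift amount is 256 or more, which is what the `shift < 256` guard expresses. The sketch below shows that guard on a hypothetical 256-bit word stored as four little-endian `u64` limbs. The `Word` alias and `shr_guarded` are assumptions made for illustration, not the `U256` implementation from ruint or the code merged in this PR.

```rust
/// Hypothetical 256-bit word: four u64 limbs, least significant limb first.
type Word = [u64; 4];

/// Logical right shift with EIP-145 semantics: a shift of 256 or more
/// produces zero, which also covers a saturated shift amount.
fn shr_guarded(value: Word, shift: usize) -> Word {
    if shift >= 256 {
        return [0; 4];
    }
    let limb_shift = shift / 64; // whole limbs to drop
    let bit_shift = (shift % 64) as u32; // remaining bits within a limb
    let mut out = [0u64; 4];
    for i in 0..(4 - limb_shift) {
        let src = i + limb_shift;
        let mut limb = value[src] >> bit_shift;
        // Pull in the low bits of the next higher limb, if there is one.
        if bit_shift > 0 && src + 1 < 4 {
            limb |= value[src + 1] << (64 - bit_shift);
        }
        out[i] = limb;
    }
    out
}
```

Checking the bound once up front keeps the limb arithmetic free of out-of-range shifts, since `bit_shift` is always below 64 inside the loop.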

rakita

lgtm

chore(interpreter): optimisation for BYTE, SHL, SHR and SAR

87f4eec

DaniPopes requested changes May 13, 2024

added previus comment in byte function

617d880

jpgonzalezra requested a review from DaniPopes May 15, 2024 13:50

DaniPopes approved these changes May 15, 2024

updated pr comments

fe3d78c

rakita reviewed May 16, 2024

rakita approved these changes May 24, 2024

rakita merged commit ff2dcf5 into bluealloy:main May 24, 2024
25 checks passed

rakita mentioned this pull request May 24, 2024

Review optimisation for BYTE, SHL, SHR and SAR #1251

Closed

This was referenced May 24, 2024

chore: release #1431

Closed

chore: release #1449

Closed

chore: release #1453

Closed

chore: release #1456

Closed

chore: release #1463

Closed

This was referenced May 31, 2024

chore: release #1474

Closed

chore: release #1475

Closed

chore: release #1485

Closed

chore: release #1492

Closed

This was referenced Jun 8, 2024

chore: release #1497

Closed

chore: release #1509

Closed

This was referenced Jun 17, 2024

chore: release #1537

Closed

chore: release #1538

Closed

chore: release #1548

Merged
