Use single lookup and binary search for log2 #1

Merged

cairoeth commented Oct 16, 2024

We can implement log2 more simply and efficiently: a binary search narrows the most significant bit down to a 4-bit nibble, and a single small lookup table resolves the final 4 bits.

It's cheaper because it avoids the multiple lookup tables and the modulo operations.
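As a rough illustration of the idea, here is a minimal Python sketch, assuming a 256-bit unsigned input and the convention that log2(0) returns 0. The names `NIBBLE_LOG2` and `log2_floor` are made up for this example and are not taken from the PR.

```python
# floor(log2(i)) for every 4-bit value i; entry 0 is 0 by convention.
NIBBLE_LOG2 = [0, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3]

def log2_floor(x: int) -> int:
    """floor(log2(x)) for a 256-bit unsigned integer (returns 0 for x == 0)."""
    if not 0 <= x < (1 << 256):
        raise ValueError("x must fit in 256 bits")
    r = 0
    # Binary search: at each step, check whether the upper half of the
    # remaining window contains a set bit and, if so, skip the lower half.
    for width in (128, 64, 32, 16, 8, 4):
        if (x >> r) >> width:
            r += width
    # At this point x >> r is below 16, so one table lookup finishes the job.
    return r + NIBBLE_LOG2[x >> r]
```

Six comparisons plus one lookup cover the full 256-bit range, with no modulo involved.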

To generate the table lookup value:

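def msb_position(n):
    # Index of the most significant set bit; 0 maps to 0 by convention
    return (n.bit_length() - 1) if n else 0

def generate_lookup_table():
    # One entry per 4-bit value: floor(log2(i)) for i in 0..15
    table = [msb_position(i) for i in range(16)]

    # Pad the table to 32 bytes
    padded_table = table + [0] * 16

    # Convert to a single 32-byte value; entry i lands in byte i,
    # counting from the most significant byte (the order BYTE uses)
    lookup_value = 0
    for i, value in enumerate(padded_table):
        lookup_value |= value << (8 * (31 - i))

    return f"0x{lookup_value:064x}"

print(generate_lookup_table())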
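To sanity-check the generated constant, the following sketch rebuilds the same 32-byte word and reads it back with a Python model of the EVM `BYTE` opcode. `LOOKUP` and `evm_byte` are illustrative names, and Python integers stand in for 256-bit words.

```python
# Byte i of the word holds floor(log2(i)) for i in 0..15; the last 16 bytes are padding.
LOOKUP = int.from_bytes(bytes([0, 0, 1, 1, 2, 2, 2, 2] + [3] * 8 + [0] * 16), "big")

def evm_byte(i: int, word: int) -> int:
    """Model of BYTE: byte i of a 32-byte word, counted from the most significant byte."""
    return (word >> (8 * (31 - i))) & 0xFF if i < 32 else 0

# The word has the same 64-hex-digit form that the generator prints.
assert f"0x{LOOKUP:064x}" == "0x" + "00000101020202020303030303030303" + "00" * 16

# Each nibble value maps to the position of its most significant bit.
for nibble in range(16):
    assert evm_byte(nibble, LOOKUP) == (nibble.bit_length() - 1 if nibble else 0)
```

Because the two most significant bytes are zero, the constant can be pushed with a 30-byte push instead of a full 32-byte one.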

Inspired by https://www.chessprogramming.org/BitScan#De_Bruijn_Multiplication_2

Lohann (Owner) commented Oct 17, 2024

@cairoeth Great work! Your solution is indeed simpler, cheaper, and generates a smaller binary 👏.
I rewrote it in pure EVM assembly to see what the optimal code looks like (it is hard to make Solidity generate optimal code). The optimal code for your solution consumes 156 gas:

PUSH32 <value_here>

// Lookup Table
PUSH30 0x010102020202030303030303030300000000000000000000000000000000

// If value has upper 128 bits set, log2 result is at least 128
DUP2
PUSH16 0xffffffffffffffffffffffffffffffff
LT
PUSH1 7
SHL

// If upper 64 bits of 128-bit half set, add 64 to result
DUP3
DUP2
SHR
PUSH8 0xffffffffffffffff
LT
PUSH1 6
SHL
OR

// If upper 32 bits of 64-bit half set, add 32 to result
DUP3
DUP2
SHR
PUSH4 0xffffffff
LT
PUSH1 5
SHL
OR

// If upper 16 bits of 32-bit half set, add 16 to result
DUP3
DUP2
SHR
PUSH2 0xffff
LT
PUSH1 4
SHL
OR

// If upper 8 bits of 16-bit half set, add 8 to result
DUP3
DUP2
SHR
PUSH1 0xff
LT
PUSH1 3
SHL
OR

// If upper 4 bits of 8-bit half set, add 4 to result
DUP3
DUP2
SHR
PUSH1 0x0f
LT
PUSH1 2
SHL
OR

// Table lookup
SWAP2
DUP3
SHR
BYTE
OR
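The 156 gas figure can be cross-checked with a small tally. This sketch assumes the static cost of 3 gas for every opcode family used in the listing (PUSH, DUP, SWAP, LT, SHL, SHR, OR, BYTE); `GAS`, `LISTING`, and `gas_used` are illustrative names, and push operands are omitted because they do not affect the cost.

```python
# Assumed static gas cost per opcode family.
GAS = {"PUSH": 3, "DUP": 3, "SWAP": 3, "LT": 3, "SHL": 3, "SHR": 3, "OR": 3, "BYTE": 3}

# Opcodes of the listing above, one line per commented step.
LISTING = """
PUSH32 PUSH30
DUP2 PUSH16 LT PUSH1 SHL
DUP3 DUP2 SHR PUSH8 LT PUSH1 SHL OR
DUP3 DUP2 SHR PUSH4 LT PUSH1 SHL OR
DUP3 DUP2 SHR PUSH2 LT PUSH1 SHL OR
DUP3 DUP2 SHR PUSH1 LT PUSH1 SHL OR
DUP3 DUP2 SHR PUSH1 LT PUSH1 SHL OR
SWAP2 DUP3 SHR BYTE OR
"""

def gas_used(listing: str) -> int:
    """Sum the static gas cost of every opcode in a whitespace-separated listing."""
    total = 0
    for op in listing.split():
        family = op.rstrip("0123456789")  # PUSH32 -> PUSH, DUP3 -> DUP, SWAP2 -> SWAP
        total += GAS[family]
    return total

print(gas_used(LISTING))  # 52 opcodes at 3 gas each
```

Under that assumption the 52 opcodes add up to 156 gas, matching the number quoted above.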

Lohann (Owner) commented Oct 17, 2024

In comparison, the optimal code for my solution consumes 162 gas and generates a larger binary: 127 bytes versus 121 bytes for yours. Thank you very much for this contribution @cairoeth 🎉:

PUSH32 <value_here>

// Round down to the nearest power of two
DUP1
PUSH1 1
SHR
OR

DUP1
PUSH1 2
SHR
OR

DUP1
PUSH1 4
SHR
OR

DUP1
PUSH1 8
SHR
OR

DUP1
PUSH1 16
SHR
OR

DUP1
PUSH1 32
SHR
OR

DUP1
PUSH1 64
SHR
OR

DUP1
PUSH1 128
SHR
OR

PUSH1 1
SHR
PUSH1 1
ADD

// log2(n) and log2(x / n) lookup tables
PUSH31 0x08101820283038404850586068707880889098a0a8b0b8c0c8d0d8e0e8f0f8
PUSH30 0x010002040007030605000000000000000000000000000000000000000000

// log2(n) table lookup
PUSH1 11
PUSH1 255
DUP5
MOD
MOD
BYTE

// log2(x / n) lookup
SWAP2
DUP3
SHR
MUL
PUSH1 248
SHR

// log2(n) + log2(x / n)
ADD
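For readers who want to follow the modulo-based listing above, here is an approximate Python transcription. It is a sketch, not code from the PR: `evm_byte`, `log2_modulo`, `T_HIGH`, and `T_LOW` are hypothetical names, and Python integers masked to 256 bits stand in for EVM words.

```python
MASK256 = (1 << 256) - 1

# The two lookup constants from the listing (PUSH31 and PUSH30).
T_HIGH = 0x08101820283038404850586068707880889098a0a8b0b8c0c8d0d8e0e8f0f8
T_LOW = 0x010002040007030605000000000000000000000000000000000000000000

def evm_byte(i: int, word: int) -> int:
    """Model of BYTE: byte i of a 32-byte word, counted from the most significant byte."""
    return (word >> (8 * (31 - i))) & 0xFF if i < 32 else 0

def log2_modulo(x: int) -> int:
    """floor(log2(x)) via bit smearing and two modulo-indexed table lookups."""
    assert 0 <= x <= MASK256
    # Smear the most significant bit into every lower position.
    for shift in (1, 2, 4, 8, 16, 32, 64, 128):
        x |= x >> shift
    # Round down to a power of two: n == 2**k, where k is the answer.
    n = (x >> 1) + 1
    # 2**k mod 255 == 2**(k mod 8), and mod 11 maps those eight values to
    # distinct table indices, so this lookup yields k mod 8.
    low = evm_byte((n % 255) % 11, T_LOW)
    # n >> low == 2**(8 * (k // 8)); multiplying moves byte k // 8 of T_HIGH
    # into the top byte, which holds 8 * (k // 8).
    high = ((T_HIGH * (n >> low)) & MASK256) >> 248
    return high + low
```

The two `%` operations and the multiplication correspond to the MOD and MUL opcodes, which cost more than the comparisons and shifts used by the binary-search version.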

Lohann merged commit 4521ed4 into Lohann:lohann/efficient-log2-algorithm on Oct 17, 2024