Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Logarithm instruction mlog needs discussion #150

Closed
nfurfaro opened this issue Jun 15, 2022 · 5 comments · Fixed by #240
Closed

Logarithm instruction mlog needs discussion #150

nfurfaro opened this issue Jun 15, 2022 · 5 comments · Fixed by #240
Assignees
Labels
bug Something isn't working fuel-vm Related to the `fuel-vm` crate.

Comments

@nfurfaro
Copy link

nfurfaro commented Jun 15, 2022

I would expect that assert(1000.logarithm(10) == 3); would pass, but it does not.
Instead, logging 1000.logarithm(10) returns 2,

The testing I've done uses this implementation in sway:

impl Logarithm for u64 {
    fn logarithm(self, base: Self) -> Self {
        asm(r1: self, r2: base, r3) {
            mlog r3 r1 r2;
            r3: Self
        }
    }
}

Note that other cases are working as expected, i.e:

assert(4.logarithm(2) == 2);
assert(16.logarithm(2) == 4);
assert(9.logarithm(3) == 2);
assert(100.logarithm(10) == 2);
@nfurfaro nfurfaro added the bug Something isn't working label Jun 15, 2022
@nfurfaro nfurfaro moved this to Todo in Fuel Network Jun 15, 2022
@vlopes11
Copy link
Contributor

vlopes11 commented Jun 15, 2022

Its a problem in Rust stdlib:

fn main() {
    let b = 1000u64;
    let c = 10u64;
    let x = (b as f64).log(c as f64).floor() as u64;
    
    // Truncate incorrectly to `2`
    assert_eq!(2, x);
    
    let b = 1000f64;
    let c = 10f64;
    let x = b.log(c);
    
    // Result is some approximation to 3; 2.999~
    assert!(2.9 < x && x < 3.0);
    
    let b = 1000f64;
    let x = b.log10();
    
    // Bases 2 and 10 are more precise as in the language specs
    assert_eq!(3.0, x);
}

As I see, we have some alternatives

  1. restrict the functionality domain to provide only log for bases 2 and 10 to not compromise correctness
  2. round instead of floor
  3. always add epsilon to the result, before trunc:
let b = 1000f64;
let c = 10f64;
let x = b.log(c) + f64::EPSILON;

assert_eq!(3.0, x);

I favor option 3 since this convenience constant will always be the machine epsilon, regardless of the target arch

@vlopes11 vlopes11 self-assigned this Aug 15, 2022
@Voxelot Voxelot removed this from Fuel Network Sep 29, 2022
@Voxelot
Copy link
Member

Voxelot commented Oct 23, 2022

We should find a solution that avoids f64 if at all possible due to imprecision of floating point numbers. While epsilon fixes one edge case, there may be others it doesn't fix due to f64 precision.

@ControlCplusControlV
Copy link
Contributor

I do want to add context that one major use for MLOG is computing ticks in Concentrated Liquidity, where log2 (number) is computed to find an integer tick. Can be seen here, so getting closer to a real integer rather than just cutting down could be very helpful for that usecase

@Dentosal
Copy link
Member

Dentosal commented Oct 24, 2022

I think the best solution would be to use the unstable #![feature(int_log)] Rust feature, see tracking issue 70887. It's nearing it's final comment period with no unresolved issues, and I do not expect any breaking changes to it anymore. The functionality is available on nightly. If we do not want to depend on nightly, the implementation can be copied from the Rust corelib source code.

The implementation behaves correctly for our use cases. For instance, it gives 1000u64.checked_ilog(10) == Some(3) while still returning zero for log{3,4,5,...}(2). The performance is reasonable, although likely slower than the floating point implementation.

The impl boils down to some sanity checks for values, and then

let mut n = 0;
let mut r = exp;
while r >= base {
    r /= base;
    n += 1;
}
return n;

As we can see, worst case would likely be some small-but-hard-to-divide-with number. In the worst theoretical case, i.e. when division takes constant cycles, base is 2 and exp is u64::MAX, this still takes only 64 loops, the whole function taking estimated 211 = 2048 cpu cycles. Compared to FYL2X instruction used by the floating point, I'd approximate this is around one magnitude slower (looking at the Agner Fog instruction tables for the fp instruction timings). This should still mean at most 1 microsecond per MLOG instruction.

Benchmark

The ilog-based code is a bit faster for small values, but even 16x slower for large exponent and small base. We can as the compiler to specialize for some common cases, making them them even faster. Also, if we special-case the smallest bases, those already reduce the worst-case execution time as well. This gives 5x worst case compared to the floating point, while making many common cases faster than the float impl.

Results

Running with cargo run --release on a M1 macbook air. Results for 10_000_000 rounds.

Exponent Base ilog timing custom-impl timing float timing ilog/float timing custom/float timing
2 2 14.271ms 15.967ms 61.436ms 0.2323 0.2599
2 3 7.029ms 9.382ms 56.127ms 0.1252 0.1671
2 7 7.030ms 9.698ms 57.405ms 0.1224 0.1689
2 10 7.036ms 9.497ms 56.705ms 0.1241 0.1675
2 16 7.044ms 9.366ms 57.820ms 0.1218 0.1620
2 1337 7.027ms 9.372ms 57.017ms 0.1232 0.1644
2 2147483647 7.025ms 9.368ms 58.398ms 0.1203 0.1604
3 2 9.368ms 12.492ms 62.111ms 0.1508 0.2011
3 3 9.368ms 12.516ms 56.683ms 0.1653 0.2208
3 7 7.027ms 9.659ms 57.271ms 0.1227 0.1687
3 10 7.031ms 9.369ms 56.885ms 0.1236 0.1647
3 16 7.094ms 9.414ms 61.845ms 0.1147 0.1522
3 1337 7.113ms 9.703ms 62.175ms 0.1144 0.1560
3 2147483647 7.049ms 9.398ms 56.396ms 0.1250 0.1666
1000 2 56.467ms 40.878ms 56.422ms 1.0008 0.7245
1000 3 37.603ms 31.359ms 56.503ms 0.6655 0.5550
1000 7 23.078ms 27.576ms 66.298ms 0.3481 0.4159
1000 10 22.573ms 19.896ms 58.515ms 0.3858 0.3400
1000 16 18.978ms 13.279ms 58.155ms 0.3263 0.2283
1000 1337 7.052ms 9.377ms 60.087ms 0.1174 0.1561
1000 2147483647 7.597ms 9.466ms 60.761ms 0.1250 0.1558
1337 2 63.355ms 44.388ms 62.326ms 1.0165 0.7122
1337 3 37.589ms 31.238ms 56.629ms 0.6638 0.5516
1337 7 21.910ms 21.885ms 57.130ms 0.3835 0.3831
1337 10 22.012ms 18.740ms 56.751ms 0.3879 0.3302
1337 16 18.828ms 12.519ms 57.186ms 0.3292 0.2189
1337 1337 9.550ms 9.966ms 58.041ms 0.1645 0.1717
1337 2147483647 7.408ms 10.454ms 75.479ms 0.0981 0.1385
2147483647 2 302.645ms 106.395ms 57.864ms 5.2304 1.8387
2147483647 3 201.175ms 95.195ms 57.607ms 3.4923 1.6525
2147483647 7 80.972ms 81.369ms 57.036ms 1.4197 1.4266
2147483647 10 69.989ms 37.727ms 58.637ms 1.1936 0.6434
2147483647 16 43.864ms 44.326ms 60.382ms 0.7264 0.7341
2147483647 1337 18.805ms 12.961ms 58.278ms 0.3227 0.2224
2147483647 2147483647 9.400ms 9.815ms 57.973ms 0.1621 0.1693
18446744073709551615 2 916.004ms 209.287ms 56.259ms 16.2822 3.7201
18446744073709551615 3 550.023ms 260.707ms 64.415ms 8.5389 4.0474
18446744073709551615 7 218.680ms 219.390ms 61.748ms 3.5415 3.5530
18446744073709551615 10 180.620ms 92.285ms 57.404ms 3.1465 1.6077
18446744073709551615 16 116.527ms 101.520ms 57.653ms 2.0212 1.7609
18446744073709551615 1337 38.670ms 38.569ms 63.863ms 0.6055 0.6039
18446744073709551615 2147483647 18.832ms 12.551ms 58.532ms 0.3217 0.2144

Code

#![feature(test, int_log)]

extern crate test;

use std::time::Instant;

const ROUNDS: usize = 10_000_000;

#[inline(always)]
const fn custom_log(exp: u64, base: u64) -> u32 {
    let mut n = 0;
    let mut r = exp;
    while r >= base {
        r /= base;
        n += 1;
    }
    return n;
}

const fn custom(exp: u64, base: u64) -> Option<u32> {
    if exp <= 0 || base <= 1 {
        None
    } else {
        Some(match base {
            2 => custom_log(exp, 2),
            3 => custom_log(exp, 3),
            4 => custom_log(exp, 4),
            5 => custom_log(exp, 5),
            10 => custom_log(exp, 10),
            n => custom_log(exp, n),
        })
    }
}

fn main() {
    print!("Exponent             | ");
    print!("Base                 | ");
    print!("ilog timing          | ");
    print!("custom-impl timing   | ");
    print!("float timing         | ");
    print!("ilog/float timing  |");
    println!("custom/float timing");
    println!("{}", "---------------------:|".repeat(7).strip_prefix("-").unwrap().strip_suffix(":|").unwrap());

    for exp in [2, 3, 1_000, 1337, 2_147_483_647, u64::MAX] {
        for base in [2, 3, 7, 10, 16, 1337, 2_147_483_647] {
            assert_eq!(exp.checked_ilog(base), custom(exp, base));

            let a_start = Instant::now();
            for _ in 0..ROUNDS {
                let b: u64 = test::black_box(exp);
                let c: u64 = test::black_box(base);
        
                test::black_box(b.checked_ilog(c));
            }
        
            let a_duration = a_start.elapsed();
        

            let b_start = Instant::now();
            for _ in 0..ROUNDS {
                let b: u64 = test::black_box(exp);
                let c: u64 = test::black_box(base);
        
                test::black_box(custom(b, c));
            }
        
            let b_duration = b_start.elapsed();
            let c_start = Instant::now();
            for _ in 0..ROUNDS {
                let b: u64 = test::black_box(exp);
                let c: u64 = test::black_box(base);
        
                test::black_box((b as f64).log(c as f64).floor());
            }
        
            let c_duration = c_start.elapsed();

            let ratio1 = (a_duration.as_micros() as f64) / (c_duration.as_micros() as f64);
            let ratio2 = (b_duration.as_micros() as f64) / (c_duration.as_micros() as f64);

            println!("{exp:<20} | {base:<20} | {a_duration:>20.3?} | {b_duration:>20.3?}| {c_duration:>20.3?} | {ratio1:>16.4} | {ratio2:>16.4}");
        }
    }
}

Alternatives

If this is not sufficiently fast and we still want the correct solution, maybe we could implement some proven-efficient algorithm, e.g. this one https://hal.archives-ouvertes.fr/hal-01227877/document

@Voxelot
Copy link
Member

Voxelot commented Oct 24, 2022

Taking a performance hit here is acceptable since precision errors can accumulate and this may be used in financial contexts. We must prioritize numerical precision and correctness over performance.

Taking a dependency on nightly will break too much tooling and external processes, and since we don't expect ilog to stay in nightly long I don't think it's worth the lift to make the process change now. Given it is likely temporary, I'm ok with copying the Rust impl for now until it has stabilized (unless there's some integer math library out there that has already done the same thing).

Regarding alternate algorithms, extra perf is always nice but let's get a vetted implementation in place first and then optimize later.

Regarding gas pricing, we'll need to bench this operation based on the worst case as variably pricing the log op based on the inputs could add more overhead and complicate things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working fuel-vm Related to the `fuel-vm` crate.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants