Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate apfloat "round with ties to nearest even" logic #1656

Open
mikex-oss opened this issue Oct 9, 2024 · 0 comments
Open

Consolidate apfloat "round with ties to nearest even" logic #1656

mikex-oss opened this issue Oct 9, 2024 · 0 comments
Labels
cleanup Tech debt reduction, factoring, consolidation, rework, etc. dslx DSLX (domain specific language) implementation / front-end

Comments

@mikex-oss
Copy link
Collaborator

During code review of f885388, we discussed that there would now be multiple (different) implementations of "should we round up (with ties to nearest even)" for apfloats. Specifically, we have:

// Round to nearest, ties to even (aka roundTiesToEven).
// if truncated bits > halfway bit: round up.
// if truncated bits < halfway bit: round down.
// if truncated bits == halfway bit and lsb bit is odd: round up.
// if truncated bits == halfway bit and lsb bit is even: round down.
fn rne<FRACTION_SZ: u32, LSB_INDEX_SZ: u32 = {std::clog2(FRACTION_SZ)}>
(fraction: uN[FRACTION_SZ], lsb_idx: uN[LSB_INDEX_SZ]) -> bool {
let lsb_bit_mask = uN[FRACTION_SZ]:1 << lsb_idx;
let halfway_idx = lsb_idx as uN[FRACTION_SZ] - uN[FRACTION_SZ]:1;
let halfway_bit_mask = uN[FRACTION_SZ]:1 << halfway_idx;
let trunc_mask = (uN[FRACTION_SZ]:1 << lsb_idx) - uN[FRACTION_SZ]:1;
let trunc_bits = trunc_mask & fraction;
let trunc_bits_gt_half = trunc_bits > halfway_bit_mask;
let trunc_bits_are_halfway = trunc_bits == halfway_bit_mask;
let to_fraction_is_odd = (fraction & lsb_bit_mask) == lsb_bit_mask;
let round_to_even = trunc_bits_are_halfway && to_fraction_is_odd;
let round_up = trunc_bits_gt_half || round_to_even;
round_up
}

// Extract the bits of the input fraction used to decide the direction of rounding.
let lsb = f.fraction[lsb_index+:u1];
let round = f.fraction[lsb_index - u32:1+:u1];
let sticky = std::or_reduce_lsb(f.fraction, lsb_index - u32:1);
let truncated_fraction = f.fraction[lsb_index as s32:FROM_FRACTION_SZ as s32];
// L R S
// X 0 X --> Round down (less than half)
// 0 1 0 --> Round down (half, already even)
// 1 1 0 --> Round up (half, to even)
// X 1 1 --> Round up (greater than half)
let round_up = (round && sticky) || (round && lsb);

We left it as is because of a few considerations:

  1. The implementations are different, and it would be nice to compare QoR.
  2. Neither implementation is part of the public API.
  3. rne (unintentionally?) adds behavior for lsb_idx overflowing the input fraction size (
    assert_eq(rne(u5:0b11111, u3:0b111), true); // overflow lsb index.
    ), while the latter knows this will never happen by construction.
  4. rne may benefit from renaming since it doesn't do any "rounding" as implied by the name.
@mikex-oss mikex-oss added cleanup Tech debt reduction, factoring, consolidation, rework, etc. dslx DSLX (domain specific language) implementation / front-end labels Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cleanup Tech debt reduction, factoring, consolidation, rework, etc. dslx DSLX (domain specific language) implementation / front-end
Projects
Status: No status
Development

No branches or pull requests

1 participant