Consolidate apfloat "round with ties to nearest even" logic #1656

mikex-oss · 2024-10-09T16:35:39Z

During code review of f885388, we discussed that there would now be multiple (different) implementations of "should we round up (with ties to nearest even)" for apfloats. Specifically, we have:

xls/xls/dslx/stdlib/apfloat.x

Lines 280 to 298 in 53f30d0

    
    // Round to nearest, ties to even (aka roundTiesToEven).
    // if truncated bits > halfway bit: round up.
    // if truncated bits < halfway bit: round down.
    // if truncated bits == halfway bit and lsb bit is odd: round up.
    // if truncated bits == halfway bit and lsb bit is even: round down.
    fn rne<FRACTION_SZ: u32, LSB_INDEX_SZ: u32 = {std::clog2(FRACTION_SZ)}>
        (fraction: uN[FRACTION_SZ], lsb_idx: uN[LSB_INDEX_SZ]) -> bool {
        let lsb_bit_mask = uN[FRACTION_SZ]:1 << lsb_idx;
        let halfway_idx = lsb_idx as uN[FRACTION_SZ] - uN[FRACTION_SZ]:1;
        let halfway_bit_mask = uN[FRACTION_SZ]:1 << halfway_idx;
        let trunc_mask = (uN[FRACTION_SZ]:1 << lsb_idx) - uN[FRACTION_SZ]:1;
        let trunc_bits = trunc_mask & fraction;
        let trunc_bits_gt_half = trunc_bits > halfway_bit_mask;
        let trunc_bits_are_halfway = trunc_bits == halfway_bit_mask;
        let to_fraction_is_odd = (fraction & lsb_bit_mask) == lsb_bit_mask;
        let round_to_even = trunc_bits_are_halfway && to_fraction_is_odd;
        let round_up = trunc_bits_gt_half || round_to_even;
        round_up
    }

xls/xls/dslx/stdlib/apfloat.x

Lines 794 to 806 in 53f30d0

    
    // Extract the bits of the input fraction used to decide the direction of rounding.
    let lsb = f.fraction[lsb_index+:u1];
    let round = f.fraction[lsb_index - u32:1+:u1];
    let sticky = std::or_reduce_lsb(f.fraction, lsb_index - u32:1);
    let truncated_fraction = f.fraction[lsb_index as s32:FROM_FRACTION_SZ as s32];
    //  L R S
    //  X 0 X   --> Round down (less than half)
    //  0 1 0   --> Round down (half, already even)
    //  1 1 0   --> Round up (half, to even)
    //  X 1 1   --> Round up (greater than half)
    let round_up = (round && sticky) || (round && lsb);
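
As a rough sanity check that the two snippets above agree on the rounding decision, here is a minimal Python model of both. It is a sketch under stated assumptions, not the XLS implementation: the names `shl`, `rne_model`, `lrs_model`, and `count_mismatches` are hypothetical. It assumes that a DSLX left shift by the bit width or more yields zero, that unsigned subtraction wraps around, and that `std::or_reduce_lsb(value, n)` ORs together the `n` least significant bits. It compares the two only for `1 <= lsb_idx < FRACTION_SZ`, and it says nothing about QoR.

```python
def shl(value: int, amount: int, width: int) -> int:
    """Left shift truncated to `width` bits; overshift gives 0 (assumed DSLX behavior)."""
    return (value << amount) & ((1 << width) - 1) if amount < width else 0


def rne_model(fraction: int, lsb_idx: int, width: int) -> bool:
    """Mirrors the mask-and-compare steps of `rne` for a `width`-bit fraction."""
    mask = (1 << width) - 1
    lsb_bit_mask = shl(1, lsb_idx, width)
    halfway_idx = (lsb_idx - 1) & mask  # unsigned subtraction wraps around
    halfway_bit_mask = shl(1, halfway_idx, width)
    trunc_mask = (shl(1, lsb_idx, width) - 1) & mask
    trunc_bits = trunc_mask & fraction
    trunc_bits_gt_half = trunc_bits > halfway_bit_mask
    trunc_bits_are_halfway = trunc_bits == halfway_bit_mask
    to_fraction_is_odd = (fraction & lsb_bit_mask) == lsb_bit_mask
    round_to_even = trunc_bits_are_halfway and to_fraction_is_odd
    return trunc_bits_gt_half or round_to_even


def lrs_model(fraction: int, lsb_index: int) -> bool:
    """Mirrors the L/R/S (lsb, round, sticky) formulation; needs lsb_index >= 1."""
    lsb = bool((fraction >> lsb_index) & 1)
    round_bit = bool((fraction >> (lsb_index - 1)) & 1)
    # Sticky: OR of all bits below the round bit.
    sticky = (fraction & ((1 << (lsb_index - 1)) - 1)) != 0
    return (round_bit and sticky) or (round_bit and lsb)


def count_mismatches(width: int) -> int:
    """Counts inputs on which the two models disagree, over all in-range indices."""
    return sum(
        rne_model(fraction, idx, width) != lrs_model(fraction, idx)
        for idx in range(1, width)
        for fraction in range(1 << width)
    )


if __name__ == "__main__":
    for width in (5, 8, 10):
        print(f"width={width}: mismatches={count_mismatches(width)}")
```

The exhaustive loop is only practical for small widths. For real fraction sizes, a DSLX `#[quickcheck]` test or a formal equivalence check would be the more natural tool.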

We left it as is because of a few considerations:

- The implementations are different, and it would be nice to compare their QoR (quality of results).
- Neither implementation is part of the public API.
- rne (unintentionally?) defines behavior for an lsb_idx that overflows the input fraction size, while the second implementation knows this will never happen by construction. See xls/xls/dslx/stdlib/apfloat.x, line 311 in 53f30d0:

      assert_eq(rne(u5:0b11111, u3:0b111), true); // overflow lsb index.

- rne may benefit from renaming, since it doesn't do any "rounding" as its name implies.
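
To see why the overflow test quoted above passes, the intermediate values of `rne` can be traced by hand. The Python below is a sketch that assumes a DSLX left shift by the bit width or more yields zero and that unsigned subtraction wraps around. The helper `shl` and the constants `FRACTION_SZ` and `MASK` are hypothetical names introduced for this trace.

```python
FRACTION_SZ = 5
MASK = (1 << FRACTION_SZ) - 1


def shl(value: int, amount: int) -> int:
    """Left shift truncated to FRACTION_SZ bits; overshift gives 0 (assumed DSLX behavior)."""
    return (value << amount) & MASK if amount < FRACTION_SZ else 0


# Inputs from the quoted test: rne(u5:0b11111, u3:0b111).
fraction, lsb_idx = 0b11111, 0b111

lsb_bit_mask = shl(1, lsb_idx)                    # index 7 is past bit 4, so the mask is 0
halfway_bit_mask = shl(1, (lsb_idx - 1) & MASK)   # index 6 is also past bit 4, so 0
trunc_mask = (shl(1, lsb_idx) - 1) & MASK         # 0 - 1 wraps to all ones
trunc_bits = trunc_mask & fraction                # the whole fraction counts as truncated

trunc_bits_gt_half = trunc_bits > halfway_bit_mask
trunc_bits_are_halfway = trunc_bits == halfway_bit_mask
to_fraction_is_odd = (fraction & lsb_bit_mask) == lsb_bit_mask
round_up = trunc_bits_gt_half or (trunc_bits_are_halfway and to_fraction_is_odd)

if __name__ == "__main__":
    print(f"trunc_mask={trunc_mask:#07b} halfway_bit_mask={halfway_bit_mask} round_up={round_up}")
```

Under these assumptions, both masks collapse to zero and every fraction bit is treated as truncated, so any nonzero fraction compares greater than the "halfway" value and reports round up. This behavior appears to fall out of the shift semantics rather than an explicit check.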


Labels: cleanup, dslx (added by mikex-oss on Oct 9, 2024)
