Avoid 64bit division #291

jannic · 2022-02-10T19:17:31Z

While looking at the code generated by #288, I wondered why there were references to 64bit division code (compiler_builtins::int::specialized_div_rem::u64_div_rem) for a basically empty firmware.

It turned out that init_clocks_and_plls() contained 64bit integer divisions.

As the code to do those divisions takes about 1kB even in fully optimized builds, I changed the clock calculation code to use 32bit divisions instead.

To keep the code small, I accepted some minimal additional rounding error: In case the remainder of the division is bigger than 2^24, the lower 8 bits will be thrown away, causing a relative rounding error of at most 2^-16.
(To be exact, instead of the expected rounding-down of normal integer division, the value might get rounded up. The resulting difference from the true value might be even smaller than before.)
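
For readers following along, here is a minimal sketch of the idea described above. It is not the merged code: the function names are made up for illustration, and the choice to drop the low 8 bits of the denominator (rather than of some other operand) is an assumption that happens to match the "might get rounded up" behaviour. Both functions assume a non-zero denominator.

```rust
/// Illustrative sketch (hypothetical name): approximates
/// (numerator << 8) / denominator using only 32-bit divisions.
/// Returns None if the result would not fit in a u32.
/// Assumes denominator != 0.
pub fn fractional_div_sketch(numerator: u32, denominator: u32) -> Option<u32> {
    let div_int = numerator / denominator;
    // The integer part must leave room for 8 fractional bits.
    if div_int >= 1 << 24 {
        return None;
    }
    // Remainder of the integer division; always smaller than denominator.
    let rem = numerator - div_int * denominator;
    let div_frac = if rem < 1 << 24 {
        // Small remainder: shifting left by 8 cannot overflow, result is exact.
        (rem << 8) / denominator
    } else {
        // Large remainder (so denominator > 2^24): shift the denominator
        // right instead. Its low 8 bits are lost, so the quotient may come
        // out one higher than the exact rounded-down value.
        rem / (denominator >> 8)
    };
    Some((div_int << 8) + div_frac)
}

/// 64-bit reference for comparison (hypothetical name). This is the kind
/// of calculation that needs a u64 division routine at runtime.
pub fn fractional_div_reference(numerator: u32, denominator: u32) -> Option<u32> {
    let q = ((numerator as u64) << 8) / (denominator as u64);
    if q > u32::MAX as u64 {
        None
    } else {
        Some(q as u32)
    }
}
```

For example, `fractional_div_sketch(125_000_000, 48_000_000)` takes the large-remainder branch and returns `Some(666)`, the same value as the 64-bit reference. In general the sketch should differ from the reference by at most 1 in the lowest fractional bit.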

9names · 2022-02-13T12:04:17Z

I tested this after rebasing onto the new main, since the new intrinsics code saves a decent chunk of code space too.

On a trivial project I tested, it's clearly a saving (240 bytes in dev, 944 bytes in release).
With a more complex project (defmt enabled, lots of printing, approx 22KB release and 27KB dev), it saves 288 bytes in the dev build but adds 272 bytes in release (because it's no longer using the u64_div_rem that was already included due to math done elsewhere).

It's a trade-off. 272 bytes isn't a big chunk out of the 2MB on the Pico but it is something.
On the other hand, 944 bytes is a huge amount for a small firmware - my small project went from 5KB to 4KB - that might be a big deal if you were running entirely from RAM.

jannic · 2022-02-13T13:41:44Z

Those 272 bytes are a real concern, as this change is all about saving a few bytes of flash.

How common is 64 bit division in embedded software? Having used ATmega before, even 32bit arithmetic sometimes feels like a luxury. 😄 If most real-world code ends up using it, the proposed optimization would be counterproductive.

In theory one could implement both approaches and let the user choose, but most users would not care at all, so having this choice would only be confusing.

thejpster

Some unit tests / doctests on fractional_div would be useful I think, to test all the corner cases.

thejpster · 2022-02-14T21:34:17Z

I think I'm happy with this though. 64 bit maths is fairly uncommon so most people benefit.

I wonder, though, if we can use const fn to make this all go away? I mean, the crystal frequency is always known and the sysclk is almost always known at compile time.

jannic · 2022-02-14T22:58:35Z

> Some unit tests / doctests on fractional_div would be useful I think, to test all the corner cases.

I agree. I did some tests myself by pasting both the old and the new code into the same binary, throwing millions of random numbers at them, and comparing the results - and caught some subtle bugs in earlier versions of that patch.
Probably too much for a unit test, but having some tests is definitely a good idea.

> I wonder, though, if we can use const fn to make this all go away?

I thought the same yesterday, when I saw embassy-rs/embassy@640ddc9.
But init_clocks_and_plls()/configure_clock() have side effects and so can't be a const fn. And inside those functions, the frequency is no longer a constant known in advance.
Some API like this could work, but would be ugly:

  const fn calculate_divider(freq: u32, source_freq: u32) -> u32 { ... }

  const DIVIDER: u32 = calculate_divider( ... );
  [...]
  some_clock.configure( DIVIDER );

For init_clocks_and_plls one would need many dividers, so that would become even more ugly.
Perhaps the BSPs (where the crystal frequency can be assumed as constant) could provide some kind of init_clocks_and_plls_to_max_frequency() method which doesn't take any clock frequency parameters at all, so everything could be precalculated?
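
To make the const fn idea a bit more concrete, here is a hedged sketch. The body of calculate_divider, the 8-fractional-bit divider format, and the example frequencies are all assumptions for illustration, not part of the HAL. Because the constant is evaluated by the compiler, the 64-bit division used here should not need a runtime division routine for this value.

```rust
/// Hypothetical helper: divider with 8 fractional bits, rounded down.
/// The result is truncated to 32 bits if source_freq / freq >= 2^24.
pub const fn calculate_divider(freq: u32, source_freq: u32) -> u32 {
    // 64-bit math is harmless in a const context: the compiler evaluates it,
    // and a zero `freq` becomes a compile-time error instead of a panic.
    (((source_freq as u64) << 8) / (freq as u64)) as u32
}

// Example values, assumed for illustration only.
pub const SOURCE_HZ: u32 = 125_000_000;
pub const TARGET_HZ: u32 = 48_000_000;

// Evaluated at compile time; only the resulting number ends up in the binary.
pub const DIVIDER: u32 = calculate_divider(TARGET_HZ, SOURCE_HZ);
```

With these example values, DIVIDER is 666. Note that calling calculate_divider with runtime values would still pull in the 64-bit division code, which is why the call site has to be a const.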

thejpster · 2022-02-24T07:47:46Z

Latest changes look good

9names

This looks ready to merge to me.
Are you okay with this being merged @jannic or are you still thinking about updating it further?

jannic · 2022-02-26T07:44:00Z

It can be merged. I don't have actionable ideas on how to improve this further, at the moment.

9names · 2022-02-26T12:56:32Z

Would you mind resolving the merge conflict on the changelog?

This saves about 1kB of flash by removing compiler_builtins::int::specialized_div_rem::u64_div_rem if no other code uses u64 divisions.

jannic · 2022-02-26T13:35:38Z

Ok, done

thejpster · 2022-02-26T13:56:25Z

Looks like clippy has a sad

jannic · 2022-02-26T16:49:24Z

> Looks like clippy has a sad

Yes - but not related to this pull request.
Fixed the clippy warnings in #304

thejpster reviewed Feb 14, 2022

jannic force-pushed the avoid-64bit-division branch from 66d9b13 to eb83c3d on February 22, 2022 23:17

9names approved these changes Feb 26, 2022

jannic added 4 commits February 26, 2022 13:33

Use u32 instead of u64 division in clock calculations

402b7f1

This saves about 1kB of flash by removing compiler_builtins::int::specialized_div_rem::u64_div_rem if no other code uses u64 divisions.

Derive several traits for ClockError

fecde70

Add test cases for fractional_div()

b46ddd7

Actually run fractional_div test case from CI

00b49d5

jannic force-pushed the avoid-64bit-division branch from eb83c3d to 00b49d5 on February 26, 2022 13:34

9names approved these changes Feb 26, 2022

9names merged commit 111654f into rp-rs:main Feb 26, 2022
