Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

To chars #160

Merged
merged 6 commits into from
May 2, 2024
Merged

To chars #160

merged 6 commits into from
May 2, 2024

Conversation

CrustyAuklet
Copy link
Member

As discussed in #158 , this is a merge request exploring the best option to eliminate the dependence on snprintf, since the capability of printf family of functions may vary depending on implementation. Specifically in the most common embedded ARM compiler newlib-nano won't print large integers.

The printing stuff is annoying. Are you able to use <charconv>? Or <format>? There's an issue and branch already adding support for std::to_chars from <charconv>, but it is somewhat blocked by the poor STL support for floating point (only available in very recent STD libs, which would limit adoption). This could be added as a configurable option for the "fast" (runtime) append():

1. use `std::snprintf` (current status quo)

2. use `std::to_chars` ([Use `std::to_chars` instead of `std::snprintf` for converting numbers to string #18](https://github.com/snitch-org/snitch/issues/18); faster, better behavior, but poorer STL coverage)

3. use _snitch_ `append_constexpr()` (explored in [Use `std::to_chars` instead of `std::snprintf` for converting numbers to string #18](https://github.com/snitch-org/snitch/issues/18) as an alternative; slower in Debug, but fully portable)

Option 3 is the simplest, since the code is already there. It is just not as optimised as the STL stuff, so we currently branch out to the "fast" algorithm at runtime when possible for maximum speed. Option 2 would be ideal, but I don't think it has broad enough compiler support; so perhaps a mix of option 2 and 3 would be optimal: use std::to_chars if available, else fall back to append_constexpr().

This change currently goes with option 2, falling back on snitch append_constexpr() function for floating point based on standard library version. standard library detection is using macros for libstdc++, libc++, and MSVC. A "cleaner" solution in CMake could be to set a configuration based on a test compile. Not sure how to do the same in meson.

Should this have an option to override the detected capability?

@CrustyAuklet
Copy link
Member Author

Interesting discovery, the std::to_char implementation (at least in GCC) is extremely large! I started working on the tests I am adding to our production pipeline and all of a sudden the tests blew up in size and wouldn't link. I had to erase most tests to get the size down so I could use bloaty and found this:

37.5%   104Ki    [1344 Others]
  26.4%  73.4Ki    (anonymous namespace)::ryu::POW10_SPLIT_2
  10.3%  28.7Ki    (anonymous namespace)::ryu::POW10_SPLIT
   8.6%  24.0Ki    [section .stack]
   4.7%  13.0Ki    [section .text]
   2.0%  5.47Ki    test_fun_1()
   1.8%  5.11Ki    (anonymous namespace)::pool_buffer_
   1.3%  3.75Ki    snitch::tests
   0.7%  2.03Ki    snitch::impl::(anonymous namespace)::parse_color_options(int, char const* const*)
   0.6%  1.80Ki    __aeabi_dadd
   0.5%  1.50Ki    snitch::impl::(anonymous namespace)::expected_args
   0.5%  1.40Ki    __aeabi_dmul
   0.5%  1.33Ki    snitch::impl::(anonymous namespace)::parse_arguments(int, char const* const*, snitch::small_vector<snitch::impl::(anonymous namespace)::expected_argument, 32u> const&, snitch::impl::(anonymous namespace)::parser_settings const&)
   0.4%  1.09Ki    board::i2c_init()
   0.4%  1.04Ki    (anonymous namespace)::ryu::d2exp_buffered_n(double, unsigned long, char*, int*)
   0.4%    1024    seal::core::host_stack_
   0.3%     972    (anonymous namespace)::ryu::d2fixed_buffered_n(double, unsigned long, char*)

All the (anonymous namespace)::ryu::* symbols I didn't recognize so I searched and found this issue.

I've been using the charconv functions and been happy so far, but I never really deal with floating point since all our values are in fixed point representations. So maybe this implies that the flag should be over-ridable.

So if floating point std::to_char isn't available would you prefer to fall back to your constexpr function, snprintf, or something else?

Copy link

codecov bot commented Apr 25, 2024

Codecov Report

Attention: Patch coverage is 97.50000% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 93.91%. Comparing base (ccd3afa) to head (4096a6c).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #160      +/-   ##
==========================================
+ Coverage   93.87%   93.91%   +0.04%     
==========================================
  Files          27       27              
  Lines        1664     1660       -4     
==========================================
- Hits         1562     1559       -3     
+ Misses        102      101       -1     
Files Coverage Δ
include/snitch/snitch_append.hpp 96.15% <100.00%> (+0.02%) ⬆️
src/snitch_append.cpp 94.11% <94.73%> (+1.80%) ⬆️

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ccd3afa...4096a6c. Read the comment docs.

@cschreib
Copy link
Member

cschreib commented Apr 27, 2024

So if floating point std::to_char isn't available would you prefer to fall back to your constexpr function, snprintf, or something else?

I think falling back to the constexpr functions is simplest and most portable. The issue with the large size of std::to_chars is interesting, and makes me wonder if it isn't simpler to use the constexpr functions all the time. That might negatively impact the benchmarks a little, but serialisation is unlikely to be the main bottleneck of test execution.

I'll see if I can rerun the benchmarks with that solution, to test the impact. Thanks for looking into this.

Otherwise, it seems the Windows CI is broken, I don't think that's related to your PRs. I'll investigate separately. Edit: Yes, it's #140 again, with a newer version of MSVC. We need to disable the offending test with that version. I'll make a separate PR.

@cschreib
Copy link
Member

The CI failures weren't related to this PR and have been fixed in main.

Otherwise, I have benchmarked using append_constexpr() to implement append_fast(), the changes are in the append branch.

Debug:

Before After
Build framework 3.7s 3.7s
Build tests 67s 67s
Build all 70s 70s
Run tests 42ms 40ms
Library size 7.6MB 7.7MB
Executable size 35.7MB 35.7MB

Release:

Before After
Build framework 5.0s 5.2s
Build tests 142s 142s
Build all 147s 147s
Run tests 26ms 26ms
Library size 1.3MB 1.3MB
Executable size 10.1MB 10.1MB

--> no significant change, so that would be a viable solution.

@CrustyAuklet
Copy link
Member Author

I think in the end it is your decision but since there is minimal performance difference I would lean towards using the constexpr version so that those functions get more use and implicit testing. At least until C++23 wide support and you could just use to_char for everything integral if you wanted. to_char for floating point seems like a non-starter for embedded at least.

tocic
tocic previously approved these changes Apr 28, 2024
tests/runtime_tests/string_utility.cpp Outdated Show resolved Hide resolved
@cschreib
Copy link
Member

cschreib commented Apr 28, 2024

I think in the end it is your decision but since there is minimal performance difference I would lean towards using the constexpr version so that those functions get more use and implicit testing. At least until C++23 wide support and you could just use to_char for everything integral if you wanted. to_char for floating point seems like a non-starter for embedded at least.

I think I'd be happy with either of these two solutions:

  • use append_constexpr all the time -- simple, small, but not optimal runtime performance.
  • use either std::to_chars or append_constexpr conditionally, using a config toggle like SNITCH_APPEND_TO_CHARS -- a bit more complex, but lets people opt-out of the cost of std::to_chars if they don't need the optimal speed (or simply if their compiler/STL can't handle it). I think it's probably better to use one or the other, but not mix the two, so the option is simply "use std::to_chars or not", rather than "use it for integers all the time but not always for floats".

src/snitch_append.cpp Outdated Show resolved Hide resolved
Copy link
Member

@cschreib cschreib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments above :)

use std::to_char for formatting, fallback to constexpr formatting if needed
and the compiler supports it. std::to_char is available widley except for
floating point. Updated tests to expect the same value from runtime and
constexpr format tests.

std::to_char for floating point is the real edge case since it has poor
compiler support. Where it is supported it seems most implementations are
using a space-inefficient algorithm that can increase code size by about
1.2kB on ARM thumb architectures. This makes it unusable on embedded
platforms.
@CrustyAuklet
Copy link
Member Author

Looking at the performance measurements in #18, I went with option 2. I don't think it's reasonable to give up the 5x performance increase of std::to_chars in the basic case. For embedded the SNITCH_APPEND_TO_CHARS option can be set to OFF to avoid the huge space cost.

I moved the standard library detection macro logic to the config header, and disable the setting if a library version is detected that doesn't support std::to_char for floating point, or std::to_char at all, SNITCH_APPEND_TO_CHARS is undefined and defined as 0. similar to the SNITCH_CONSTEXPR_FLOAT_USE_BITCAST setting.

I like to restrict macro logic to one location so I re-named the functions in the anonymous namespace to append_to since they don't necessarily use std::to_char. I pulled in the append_constexpr changes from the append branch, and made the conversion base a template parameter in append_to to match those signatures.

@CrustyAuklet
Copy link
Member Author

The single MacOS failure is a little baffling to me. I think apple clang, in Release mode, is printing out an extra character of precision and therefore the acceptance test regex isn't converting the time to a "*" as expected.

Not sure why this wouldn't happen in debug, or normal clang.. it does look like libc++ doesn't define the feature test macro for std::to_char even though they have it implemented.

so format will fall back to the constexpr versions.

src/snitch_append.cpp Outdated Show resolved Hide resolved
@cschreib
Copy link
Member

Not sure why this wouldn't happen in debug, or normal clang.

The offending number that fails the regex (1e-6, it seems) is the duration of a test (in seconds), as measured in the CI environment. That number is likely to be highly variable, so it would not be a surprise if the other runs (e.g. in Debug, or with other OS/compilers) ended up with a different value. If the problem is real (i.e. not a compiler bug) and only affects specific values, then it's possible that the other runs were just lucky and didn't trigger the issue.

And it seems indeed the issue is real, and I can reproduce it on my local branch. Adding the following line to the "append floats" test triggers a failure if using append_contexpr:

CONSTEXPR_CHECK(a(1e-6f) == ae{"1.000000e-06"sv, true});

So that's a bug in append_constexpr. I traced it down to set_precision(), which didn't account for the number of digits potentially staying the same when doing a divide-and-round (in this case, rounding 99999999/10 to 10000000). Fix is in 52ff3e8.

cschreib and others added 4 commits April 30, 2024 13:14
constexpr append changes to take into account the base of string conversion
so that pointers can be printed out in hex format and with padding zeros.
If the compiler doesn't support std::to_char forfloating point, then the
compile time option should disable use of std::to_char for all conversions.
This avoids mixing and matching implementations. Either append_constexpr
is used for all conversion, or std::to_char is used for all conversions.

This slightly complicates the testing as there is now a two variable truth
table as to the expected results when formatting -0.0.
On the 32-bit platforms in CI (windows and emscripten) large_int_t and
large_uint_t seem to alias each other in function definitions. Fix this by
using signed/unsigned concepts for the parameter. That way the incorrect
function will not be considered.

This required small bug fix in snitch::concepts. signed_integral and
unsigned_integral concepts need to check for sign and also make sure the
type is integral so that floating point or other types are not considered.
CMakeLists.txt Outdated Show resolved Hide resolved
meson_options.txt Outdated Show resolved Hide resolved
tests/runtime_tests/string_utility.cpp Outdated Show resolved Hide resolved
tests/runtime_tests/string_utility.cpp Outdated Show resolved Hide resolved
tests/runtime_tests/string_utility.cpp Outdated Show resolved Hide resolved
tests/runtime_tests/string_utility.cpp Outdated Show resolved Hide resolved
tests/runtime_tests/string_utility.cpp Outdated Show resolved Hide resolved
@cschreib
Copy link
Member

cschreib commented May 1, 2024

Looking good! I'm happy to approve it once the small changes requested above are made.

Explicit base template parameter on num_digits in set_precision function.
Also change all instances of `std::to_char` to the correct spelling
`std::to_chars`.
Copy link
Member

@cschreib cschreib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for contributing! If you are interested in joining us as maintainer of the library, please have a look here. This comes with no strings attached.

@cschreib cschreib merged commit 81cce09 into snitch-org:main May 2, 2024
43 checks passed
@CrustyAuklet CrustyAuklet deleted the to_chars branch May 2, 2024 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants