Non-zero padded exponent in float string representation after Ryu implementation #14682

franciscoadasme · 2024-06-10T15:38:07Z

As discussed in the forum post of the same name, there was an unintended change in the scientific notation for single-digit exponents, where they are no longer zero-padded after PR #14084 included in v1.11:

# before
printf("%E", 123.45) # => 1.234500E+02
printf("%E", 123.45e15) # => 1.234500E+17
# after
printf("%E", 123.45) # => 1.234500E+2 (note the missing leading zero in the exponent)
printf("%E", 123.45e15) # => 1.234500E+17

This no longer follows the C99 standard, which most languages adhere to. Furthermore, the official Ryu implementation and other Ryu implementations (Ryu is the algorithm used in #14084) also print zero-padded exponents. Padding produces nicely aligned numbers, which is useful when writing files with hundreds or thousands of lines of floating-point numbers, such as those used in computational chemistry, my field of work.
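To illustrate the alignment point, here is a minimal sketch in Python, which the thread lists among the languages that follow the C convention for `%E`. The helper name `format_sci` and the sample values are illustrative assumptions, not anything from Crystal's standard library.

```python
def format_sci(value: float) -> str:
    """Format a float with C-style %E (six fractional digits, padded exponent)."""
    return "%E" % value


# Sample values chosen for illustration only.
values = [123.45, 123.45e15, 1.5e-7, 9.87e100]

for value in values:
    print(format_sci(value))
# 1.234500E+02
# 1.234500E+17
# 1.500000E-07
# 9.870000E+100
```

Every value whose exponent fits in two digits comes out with the same width, so columns line up. The exponent only grows to three digits when it has to.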

I ask that the format be reverted to include the leading zero.

beta-ziliani · 2024-06-11T14:42:51Z

I mildly agree with reverting this change in formatting:

As mentioned in the forum, the original reason to drop the leading zero was to make it consistent with normal printing (e.g., 1e-6.to_s returns "1.0e-6").

So reverting it means breaking again the internal consistency, in favor of some external consistency and backward consistency.

I would like to point out, though, that the argument drawn was to ensure numbers will be parsed correctly. This sounds a bit sketchy, because while Python, Ruby, OCaml, and, of course, C do seem to agree on this formatting, .NET and Haskell don't. .NET pads the exponent to three digits. The C99 standard says (bold is mine):

The exponent always contains at least two digits, and only as many more digits as necessary to represent the exponent.

Technically speaking, if a parser fails to parse a number whose exponent lacks the "%2d" format, it might just as well break on .NET's "%3d".
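As a rough check of the parsing argument, the sketch below feeds the same value with one-, two-, and three-digit exponents to Python's built-in `float` parser. It only shows that this one parser accepts all three widths. It says nothing about parsers in other tools, and the sample strings are illustrative.

```python
# The same number written in three exponent styles:
# Crystal 1.11 / Haskell, C99, and .NET-like three-digit padding.
samples = ["1.0e-6", "1.0e-06", "1.0e-006"]

parsed = [float(text) for text in samples]

# All three strings parse to the same value.
print(all(value == 1e-6 for value in parsed))  # True
```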

Haskell does what Crystal 1.11 does.

ghci> Text.Printf.printf "%e\n" 1e-6
1.0e-6

Sija · 2024-06-11T16:31:49Z

So reverting it means breaking again the internal consistency, in favor of some external consistency and backward consistency.

@beta-ziliani I wouldn't call C99 spec some external consistency. It is a spec after all.

And so, according to the part of the spec you've quoted, both .NET and Haskell are simply wrong (and a minority among the languages). Following it means, in this case, reverting to the previous behaviour.

ysbaddaden · 2024-06-11T16:55:49Z

@Sija they're not wrong, they're free to not follow an external spec perfectly. Lang X printf doesn't have to be an exact C printf implementation.

ysbaddaden · 2024-06-13T07:11:25Z

That being said, the "at least two digits" rule likely didn't come from nowhere, and it may have some practical use, maybe just to improve readability, and maybe improve interoperability with other languages (the format is consistent).

beta-ziliani · 2024-06-13T14:03:59Z

Or maybe it's just historical?

franciscoadasme · 2024-06-13T17:04:38Z

Hey everyone, thank you for your input on this minor issue. I think the main problem is that #14084 introduced the change without documenting it, as @straight-shoota suggested in the forum post. From your discussion, there is no agreement on which format should be used, so I think it is better to revert to the previous behavior for now. If it's later decided to drop the leading zero, that should be clearly documented in the PR and release notes.

IMHO, being consistent with the C99 standard and most other languages is important to avoid surprises in a heterogeneous environment (multiple languages), especially in scientific software (I believe Crystal is an excellent language for this use case). I think there should be a very strong reason to go against a widely used spec. The issue of internal consistency may be resolved by following this standard as well when the normal printing of floats uses scientific notation.
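For output that already has an unpadded exponent, one possible workaround is to post-process the string. A minimal sketch of the idea in Python: `pad_exponent` is a hypothetical helper, not part of any library, and it assumes the exponent sits at the very end of the string.

```python
import re

# Exponent marker, optional sign, and digits at the end of the string.
_EXPONENT = re.compile(r"([eE])([+-]?)(\d+)$")


def pad_exponent(text: str, min_digits: int = 2) -> str:
    """Zero-pad a trailing exponent to at least `min_digits` digits."""

    def _pad(match: re.Match) -> str:
        marker, sign, digits = match.groups()
        return f"{marker}{sign}{digits.zfill(min_digits)}"

    return _EXPONENT.sub(_pad, text)


print(pad_exponent("1.234500E+2"))   # 1.234500E+02
print(pad_exponent("1.234500E+17"))  # 1.234500E+17 (already two digits)
print(pad_exponent("1.0e-6"))        # 1.0e-06
```

Strings without an exponent pass through unchanged, so the helper can be applied to every number in a column.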

franciscoadasme added the kind:bug label Jun 10, 2024

Blacksmoke16 added the topic:stdlib:text and kind:regression labels and removed the kind:bug label Jun 10, 2024

straight-shoota mentioned this issue Jun 11, 2024

Restore leading zero in exponent for printf("%e") and printf("%g") #14695

Merged

straight-shoota closed this as completed in #14695 Jun 26, 2024
