Allow ties in floating literals #866

Merged 1 commit on Oct 2, 2021
8 changes: 5 additions & 3 deletions docs/design/expressions/implicit_conversions.md
@@ -121,8 +121,8 @@ An integer constant can be implicitly converted to any type `iM`, `uM`, or `fM`
in which that value can be exactly represented. A floating-point constant can be
implicitly converted to any type `fM` in which that value is between the least
representable finite value and the greatest representable finite value
-(inclusive), and does not fall exactly half-way between two representable
-values, and converts to the nearest representable finite value.
+(inclusive), and converts to the nearest representable finite value, with ties
+broken by picking the value for which the mantissa is even.

The above conversions are also precisely those that C++ considers non-narrowing,
except:
@@ -294,4 +294,6 @@ types.

- [Implicit conversions in C++](https://en.cppreference.com/w/cpp/language/implicit_conversion)
- Proposal
-[#820: implicit conversions](https://github.com/carbon-language/carbon-lang/pull/820).
+[#820: Implicit conversions](https://github.com/carbon-language/carbon-lang/pull/820).
+- Proposal
+[#866: Allow ties in floating literals](https://github.com/carbon-language/carbon-lang/pull/866).
29 changes: 4 additions & 25 deletions docs/design/lexical_conventions/numeric_literals.md
@@ -14,7 +14,6 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
- [Details](#details)
- [Integer literals](#integer-literals)
- [Real number literals](#real-number-literals)
-- [Ties](#ties)
- [Digit separators](#digit-separators)
- [Divergence from other languages](#divergence-from-other-languages)
- [Alternatives considered](#alternatives-considered)
@@ -87,7 +86,7 @@ example, `3e10` is not a valid literal.

When a real number literal is interpreted as a value of a real number type, its
value is the representable real number closest to the value of the literal. In
-the case of a [tie](#ties), the conversion to the real number type is invalid.
+the case of a tie, the nearest value whose mantissa is even is selected.

The decimal real number syntax allows for any decimal fraction to be expressed
-- that is, any number of the form _a_ x 10<sup>-_b_</sup>, where _a_ is an
@@ -100,29 +99,6 @@ decimal equivalent that is known to convert to the intended value. Hexadecimal
real number literals are provided in order to permit values of binary floating
or fixed point real number types to be expressed directly.

-#### Ties
-
-As described above, a real number literal that lies exactly between two
-representable values for its target type is invalid. Such ties are extremely
-unlikely to occur by accident: for example, when interpreting a literal as
-`Float64`, `1.` would need to be followed by exactly 53 decimal digits (followed
-by zero or more `0`s) to land exactly half-way between two representable values,
-and the probability of `1.` followed by a random 53-digit sequence resulting in
-such a tie is one in 5<sup>53</sup>, or about
-0.000000000000000000000000000000000009%. For `Float32`, it's about
-0.000000000000001%, and even for a typical `Float16` implementation with 10
-fractional bits, it's around 0.00001%.
-
-Ties are much easier to express as hexadecimal floating-point literals: for
-example, `0x1.0000_0000_0000_08p+0` is exactly half way between `1.0` and the
-smallest `Float64` value greater than `1.0`, which is `0x1.0000_0000_0000_1p+0`.
-
-Whether written in decimal or hexadecimal, a tie provides very strong evidence
-that the developer intended to express a precise floating-point value, and
-provided one bit too much precision (or one bit too little, depending on whether
-they expected some rounding to occur), so rejecting the literal is preferred
-over making an arbitrary choice between the two possible values.
-
### Digit separators

If digit separators (`_`) are included in literals, they must meet the
@@ -165,9 +141,12 @@ cases for the goal of not leaving room for a lower level language:
- [Decimal literals](/proposals/p0143.md#decimal-literals)
- [Case sensitivity](/proposals/p0143.md#case-sensitivity)
- [Real number syntax](/proposals/p0143.md#real-number-syntax)
+- [Disallow ties](/proposals/p0866.md)
- [Digit separator syntax](/proposals/p0143.md#digit-separator-syntax)

## References

- Proposal
[#143: Numeric literals](https://github.com/carbon-language/carbon-lang/pull/143)
+- Proposal
+[#866: Allow ties in floating literals](https://github.com/carbon-language/carbon-lang/pull/866)
114 changes: 114 additions & 0 deletions proposals/p0866.md
@@ -0,0 +1,114 @@
# Allow ties in floating literals

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/866)

<!-- toc -->

## Table of contents

- [Problem](#problem)
- [Background](#background)
- [Proposal](#proposal)
- [Details](#details)
- [Rationale based on Carbon's goals](#rationale-based-on-carbons-goals)
- [Alternatives considered](#alternatives-considered)

<!-- tocstop -->

## Problem

Proposal [#143](https://github.com/carbon-language/carbon-lang/pull/143)
suggested that we do not allow ties in floating-point literals. That is, given a
literal whose value lies exactly half way between two representable values, we
should reject the literal rather than arbitrarily pick one of the two
possibilities.

However, the [statistical argument](p0143.md#ties) presented in that proposal
misses an important fact: the distribution of the values that are exactly half
way between representable values includes several values of the form A x
10<sup>B</sup>, where A and B are small integers.

For example, the current rule rejects this reasonable-looking code:

```
var v: f32 = 9.0e9;
```

... because 9 x 10<sup>9</sup> lies exactly half way between the nearest two
representable values of type `f32`, namely 8999999488 and 9000000512. Similar
examples exist for larger floating-point types:

```
// Error, half way between two exactly representable values.
var w: f64 = 5.0e22;
```
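
Both tie claims above can be checked with exact rational arithmetic. The following is a minimal Python sketch, not part of the proposal: the helper names `bracket` and `is_tie` are hypothetical, and it assumes 24 and 53 significand bits for `f32` and `f64` while ignoring exponent limits and subnormals.

```python
from fractions import Fraction
import math

F32_BITS = 24  # significand bits (including the implicit bit) assumed for f32
F64_BITS = 53  # significand bits assumed for f64


def bracket(value: Fraction, precision: int) -> tuple[Fraction, Fraction]:
    """Return the representable values at or below and at or above `value`.

    Only positive values are handled; exponent limits are ignored.
    """
    # k = floor(log2(value)), computed with exact integer arithmetic.
    k = value.numerator.bit_length() - value.denominator.bit_length()
    if Fraction(2) ** k > value:
        k -= 1
    # Spacing between representable values in the binade [2**k, 2**(k + 1)).
    unit = Fraction(2) ** (k - (precision - 1))
    return math.floor(value / unit) * unit, math.ceil(value / unit) * unit


def is_tie(value: Fraction, precision: int) -> bool:
    """True if `value` lies exactly half way between two representable values."""
    low, high = bracket(value, precision)
    return low != high and value - low == high - value


low, high = bracket(Fraction(9 * 10**9), F32_BITS)
print(int(low), int(high))                       # 8999999488 9000000512
print(is_tie(Fraction(9 * 10**9), F32_BITS))     # True
print(is_tie(Fraction(5 * 10**22), F64_BITS))    # True
```

Using `Fraction` rather than `float` keeps the check independent of how the host language itself rounds literals.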

We would also reject an attempted workaround such as:

```
var w: f64 = 5 * 1.0e22;
```

... because the literal arithmetic would be performed exactly, resulting in the
same tie. A workaround such as

```
var w1: f64 = 5.0e22 + 1.0;
var w2: f64 = 5.0e22 - 1.0;
```

... to request rounding upwards and downwards, respectively, would work.
However, these seem cumbersome and burden the Carbon developer with
floating-point minutiae about which they very likely do not care.
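
The effect of these workarounds can be illustrated by modeling exact literal arithmetic with Python's `Fraction`. This is a sketch under stated assumptions, not Carbon semantics as implemented anywhere: `classify` is a hypothetical helper, 5 x 10<sup>22</sup> is treated as the 53-bit (`f64`) tie from the earlier example, and exponent limits are ignored.

```python
from fractions import Fraction
import math


def classify(value: Fraction, precision: int) -> str:
    """Describe how a positive exact value relates to its representable neighbors."""
    # k = floor(log2(value)), computed exactly.
    k = value.numerator.bit_length() - value.denominator.bit_length()
    if Fraction(2) ** k > value:
        k -= 1
    unit = Fraction(2) ** (k - (precision - 1))  # spacing of representable values
    gap_below = value - math.floor(value / unit) * unit
    if gap_below == 0:
        return "exact"
    if gap_below * 2 == unit:
        return "tie"
    return "rounds down" if gap_below * 2 < unit else "rounds up"


F64_BITS = 53  # significand bits assumed for f64

# Literal arithmetic is modeled as exact rational arithmetic.
print(classify(5 * Fraction("1.0e22"), F64_BITS))   # tie
print(classify(Fraction("5.0e22") + 1, F64_BITS))   # rounds up
print(classify(Fraction("5.0e22") - 1, F64_BITS))   # rounds down
```

Multiplying exactly reproduces the tie, while adding or subtracting 1 moves the exact value off the midpoint, so ordinary round-to-nearest picks a side.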
Comment on lines +57 to +67
Contributor

FWIW, I actually think math between floating point literals is even more important than just removing another way to write this... It seems likely to make ties much more likely to be encountered. It may not be reasonable to somehow change arithmetic so that it does not produce values exactly between two representable values.

Anyways, not disagreeing, just suggesting it's even better motivated.

Contributor Author

Yeah, I agree that working around such things will be awkward. On the other hand, we'd still need to refute the statistical argument from #143, that such exact ties are unlikely to come up by chance if your computation doesn't put heavy skew into the distribution of resultant mantissas. I don't think it's going to be all that hard to contrive a computation where that'd happen, but I think such easy-to-find examples are likely to amount to computing one of the simple base-10 values that hit this problem or a variant of one of those.

Contributor

(discussed, and fine to move forward as is for now)


## Background

For background on the ties-to-even rounding rule, see
[this Wikipedia article](https://en.wikipedia.org/wiki/Rounding#Round_half_to_even).
The ties-to-even rule is the default rounding mode specified by ISO/IEC 60559 /
IEEE 754.

## Proposal

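Instead of rejecting exact ties, we use the default IEEE 754 rounding mode,
round to nearest with ties to even: a literal that lies exactly half way between
two representable values converts to the one whose mantissa is even.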
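
As an illustration of the rule, here is a minimal Python sketch of round-to-nearest, ties-to-even over exact rationals. It is not taken from the proposal or from any Carbon implementation: `round_ties_to_even` is a hypothetical name, only positive values are handled, and exponent range, subnormals, and overflow are ignored.

```python
from fractions import Fraction
import math


def round_ties_to_even(value: Fraction, precision: int) -> Fraction:
    """Round a positive rational to `precision` significant bits, ties to even."""
    if value <= 0:
        raise ValueError("this sketch only handles positive values")
    # k = floor(log2(value)), computed exactly.
    k = value.numerator.bit_length() - value.denominator.bit_length()
    if Fraction(2) ** k > value:
        k -= 1
    unit = Fraction(2) ** (k - (precision - 1))  # spacing of representable values
    scaled = value / unit
    mantissa = math.floor(scaled)
    remainder = scaled - mantissa
    half = Fraction(1, 2)
    # Round up when above the midpoint, or at the midpoint with an odd mantissa.
    if remainder > half or (remainder == half and mantissa % 2 == 1):
        mantissa += 1
    return mantissa * unit


print(round_ties_to_even(Fraction(9 * 10**9), 24))    # 8999999488
print(round_ties_to_even(Fraction(5 * 10**22), 53))   # 49999999999999995805696
print(round_ties_to_even(Fraction(2**24 + 3), 24))    # 16777220
```

The first two calls are the ties from the Problem section, which both resolve to the lower neighbor because its mantissa is even. The third is a tie that resolves upward for the same reason.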

## Details

See design changes.

## Rationale based on Carbon's goals

- [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write)
    - This improves the ease of both reading and writing floating-point
      literals that would result in ties.
    - This improves the language consistency, by performing the same rounding
      when converting literals as is performed by default when converting
      runtime values.
- [Practical safety and testing mechanisms](/docs/project/goals.md#practical-safety-and-testing-mechanisms)
    - It is unlikely that making an arbitrary but consistent rounding choice
      will harm safety or program correctness.
- [Fast and scalable development](/docs/project/goals.md#fast-and-scalable-development)
- [Modern OS platforms, hardware architectures, and environments](/docs/project/goals.md#modern-os-platforms-hardware-architectures-and-environments)
- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code)
    - This rule, likely because it is the IEEE default rounding mode, already
      appears to be used by major C++ compilers such as Clang, GCC, MSVC, and
      ICC.

## Alternatives considered

We could round to even only for decimal floating-point literals, and still
reject ties for hexadecimal floating-point literals. In the latter case, a tie
means that too many digits were specified, and the trailing digits were exactly
`80000...`.

However, because arithmetic on literals produces other literals, this would mean
that whether a literal was originally written in hexadecimal would form part of
its value and thereby part of its type. There would also be problems with
literals produced by arithmetic. The complexity involved here far outweighs any
perceived benefit of diagnosing mistyped literals.
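
To make the `80000...` pattern concrete, the sketch below evaluates the hexadecimal tie from the removed "Ties" section of `numeric_literals.md`, `0x1.0000_0000_0000_08p+0`, using exact rational arithmetic in Python. The parser `hex_literal_value` is a hypothetical helper written for this illustration. It accepts only the simple `0x<digits>.<digits>p<exponent>` shape and is not Carbon's lexer.

```python
from fractions import Fraction


def hex_literal_value(text: str) -> Fraction:
    """Exact value of a hexadecimal real literal such as 0x1.8p+1."""
    body, exponent = text.replace("_", "").lower().removeprefix("0x").split("p")
    whole, _, frac = body.partition(".")
    mantissa = Fraction(int(whole + frac, 16), 16 ** len(frac))
    return mantissa * Fraction(2) ** int(exponent)


tie = hex_literal_value("0x1.0000_0000_0000_08p+0")
below = hex_literal_value("0x1.0000_0000_0000_0p+0")
above = hex_literal_value("0x1.0000_0000_0000_1p+0")

# The literal has one hex digit more than a 53-bit significand holds, and it is 8.
print(tie - below == above - tie)      # True: an exact tie
# Scaled by 2**52 the neighbors are consecutive integers; the even one is selected.
print(int(below * 2**52) % 2 == 0)     # True: 1.0 has the even mantissa
```

Under the ties-to-even rule adopted by this proposal, this literal would therefore convert to `1.0` for a 53-bit significand.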