Add f16b floating-point type for native support of bfloat16 #2690
- Feature Name: `f16b`
- Start Date: 2019-04-17
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Add the floating-point type `f16b` to Rust, providing native support for the
`bfloat16` 16-bit floating-point format.

# Motivation
[motivation]: #motivation

The [`bfloat16` floating-point format](https://arxiv.org/abs/1711.10374)
provides a memory-dense format for floating-point values, supported by various
hardware platforms and software libraries. This format allows storing twice as
many values in the same amount of memory or cache compared to `f32`, making
better use of memory bandwidth and storage. The `bfloat16` representation is a
truncation of `f32` that discards the 16 low-order bits of the mantissa, making
conversions between `f32` and `bfloat16` trivial and allowing platforms to
easily use `f32` for computation when needed.
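As a minimal sketch of why the conversions are trivial (the function names here
are illustrative, not part of this RFC): truncating to `bfloat16` is a single
16-bit shift of the `f32` bit pattern, and widening back is exact.

```rust
/// Truncate an `f32` to its `bfloat16` bit pattern (round-toward-zero).
fn f32_to_bf16_bits(x: f32) -> u16 {
    (x.to_bits() >> 16) as u16
}

/// Widen a `bfloat16` bit pattern back to `f32`; this direction is exact.
fn bf16_bits_to_f32(bits: u16) -> f32 {
    f32::from_bits((bits as u32) << 16)
}

fn main() {
    // Values representable in 8 effective mantissa bits round-trip exactly.
    assert_eq!(bf16_bits_to_f32(f32_to_bf16_bits(1.5)), 1.5);
    // 257.0 needs 9 significant bits, so truncation loses the low bit.
    assert_eq!(bf16_bits_to_f32(f32_to_bf16_bits(257.0)), 256.0);
}
```

A production conversion would round rather than truncate, but the shift is the
essential operation either way.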

The `bfloat16` format serves particularly well in handling large matrices or
vectors of numbers, for which it allows denser memory and cache usage; in
particular, `bfloat16` sees widespread use in machine learning / neural network
applications as a format for storing weights during training and inference.

This RFC proposes adding the `bfloat16` floating-point format to Rust as the
type `f16b`, with a full complement of standard mathematical operations and
functions.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

After this RFC, we could explain this as follows:

In addition to the `f32` and `f64` types, Rust provides the `f16b` type for
16-bit floating-point operations. The `f16b` type corresponds to the
[`bfloat16` floating-point format](https://arxiv.org/abs/1711.10374). This type
provides a 1-bit sign, 8-bit exponent, and 7-bit mantissa (effectively 8-bit
with implicit leading 1), by contrast with the 23-bit (effectively 24-bit)
mantissa of `f32`. Rust supports all the same operations and constants for
`f16b` that it does for `f32` and `f64`.
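The 1/8/7 field layout above can be checked directly on a bit pattern (a quick
illustration, not proposed API; the helper name is made up for this sketch):

```rust
/// Split a `bfloat16` bit pattern into its (sign, exponent, mantissa) fields:
/// bit 15 is the sign, bits 14..7 the exponent, bits 6..0 the mantissa.
fn bf16_fields(bits: u16) -> (u16, u16, u16) {
    let sign = bits >> 15;
    let exponent = (bits >> 7) & 0xFF;
    let mantissa = bits & 0x7F;
    (sign, exponent, mantissa)
}

fn main() {
    // -1.5 is -1.1 (binary) x 2^0: sign 1, biased exponent 127,
    // mantissa 0b1000000 (the implicit leading 1 is not stored).
    let bits = ((-1.5f32).to_bits() >> 16) as u16; // 0xBFC0
    assert_eq!(bf16_fields(bits), (1, 127, 0b100_0000));
}
```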

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

The `f16b` type should always have size 2 and alignment 2 on all platforms,
even if the target platform does not have native support for `f16b`. This
allows all platforms to use `f16b` as a memory-dense storage format.

You may use `f16b` on all platforms. Some platforms provide native support for
operations on `f16b` values; other platforms will map `f16b` operations to
`f32` or `f64` operations and convert back to `f16b` for storage.
(Implementations of Rust using a code generation backend without native
`bfloat16` support may use a software implementation that converts to and from
`f32`.)

You may declare literals of type `f16b` by suffixing them with `f16b`, such as
`1.0f16b`, by analogy with `f32` and `f64`. An unsuffixed floating-point
literal may resolve to the `f16b` type through inference.

The `f16b` type implements all the same operations and typeclasses as other
floating-point types, including:

- Add, Sub, Mul, Div, Rem, and the Assign variants
- Neg
- Copy and Clone
- Display and Debug
- Default
- LowerExp and UpperExp
- FromStr
- PartialEq and PartialOrd
- Sum and Product
- All built-in methods common to the `f32` and `f64` types.
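As a rough sketch of how a few of the impls above could be backed by the `f32`
fallback (the `Bf16` newtype and its helpers are hypothetical stand-ins, not
the proposed primitive, and use plain truncation rather than proper rounding):

```rust
use std::fmt;
use std::ops::Add;

/// Hypothetical stand-in for the proposed primitive, stored as raw bits.
#[derive(Copy, Clone, Debug)]
struct Bf16(u16);

impl Bf16 {
    fn from_f32(x: f32) -> Self {
        Bf16((x.to_bits() >> 16) as u16) // truncation; a real impl would round
    }
    fn to_f32(self) -> f32 {
        f32::from_bits((self.0 as u32) << 16)
    }
}

impl Add for Bf16 {
    type Output = Bf16;
    fn add(self, rhs: Bf16) -> Bf16 {
        Bf16::from_f32(self.to_f32() + rhs.to_f32())
    }
}

impl fmt::Display for Bf16 {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        self.to_f32().fmt(f) // delegate formatting to the widened value
    }
}

fn main() {
    // 1.5, 0.25, and 1.75 are all exactly representable in bfloat16.
    let x = Bf16::from_f32(1.5) + Bf16::from_f32(0.25);
    assert_eq!(x.to_f32(), 1.75);
}
```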

The `f16b` type does not implement `From` for any integral type, as any such
conversion could potentially overflow.

`f32` and `f64` provide `impl From<f16b>`.

A new module `std::f16b` will provide `f16b` versions of all the same constants
as `std::f32` and `std::f64`, including their inner `consts` modules.
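For illustration, the values such constants would take follow directly from the
8-bit-exponent/7-bit-mantissa layout; the computation below derives two of them
from first principles (the constant names mirror `std::f32` and are an
assumption about the final API):

```rust
fn main() {
    // EPSILON: the gap between 1.0 and the next representable value,
    // i.e. 2^-7 for a 7-bit stored mantissa.
    let epsilon = f32::from_bits(((127 - 7) as u32) << 23);
    assert_eq!(epsilon, 0.0078125);

    // MAX: exponent field 0xFE, mantissa all ones, widened to f32;
    // (2 - 2^-7) x 2^127, roughly 3.39e38 (same order of magnitude as f32::MAX).
    let max = f32::from_bits(0x7F7F_0000);
    assert!(max > 3.38e38 && max < 3.39e38);
}
```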

The `f16b` type is FFI-safe, and may be used in foreign-function calls to
C-compatible functions expecting a `bfloat16` value.

Rust's `primitive_docs` will need an update to document the `f16b` type.

A few external crates will need updates to support the new type,
including `serde` and `num-traits`.

# Drawbacks
[drawbacks]: #drawbacks

This adds another floating-point type for developers to learn and select from,
and increases the compiled size of the standard library (though the size
observed by user code will remain the same if that code does not use the new
functionality).

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

We could use another name for `f16b`, such as `bfloat16` or `bf16`. The name
`f16b` represents an attempt to align with existing floating-point types in
Rust, and allow for other future types.

We could make this type `f16`; however, there are two common 16-bit
floating-point formats (`binary16` and `bfloat16`). This RFC does not preclude
choosing one of those formats as `f16` in the future, but chooses to avoid
pre-determining the answer to that question.

We could support `f16b` exclusively on platforms with native hardware support.
However, this would generate substantial conditional code within software
wanting to use this type for memory-efficient storage and reduced memory
bandwidth usage. Rather than forcing reimplementation of that conditional code,
we can supply implementations on all platforms and allow code to use it
unconditionally. As precedent, note that Rust supports `i128` and `u128` on all
platforms, whether or not those platforms have hardware support for 128-bit
registers or 128-bit mathematical operations.

We could support `bfloat16` via a separate crate, with no native support in
Rust. However, native support would allow for native code generation (in LLVM
or future backends), which a separate crate could not take advantage of. A
separate crate could provide a fallback implementation via `f32` or `f64`, but
not a native one.

We should also provide hardware intrinsics for platforms with native `bfloat16`
support. However, such intrinsics do not obviate the need for native code
generation support in the compiler. Intrinsics let code request specific
operations explicitly, rather than supporting high-level
generation of SIMD instructions from natural-looking loops or iterators.
Intrinsics also force the use of platform-specific code paths. Thus, support
for `bfloat16` should not occur exclusively through intrinsics, but rather
should support both intrinsics and native code generation in the compiler.

In the course of implementing `f16b`, we may end up using some combination of
native code generation in LLVM via a lang item, Rust code invoking portable
LLVM built-in functions, or both. A lang item would require an implementation
that ships with Rust; code invoking portable LLVM built-ins could either ship
with Rust or in a separate library, as long as Rust provided stable versions of
the necessary portable LLVM built-ins.

# Prior art
[prior-art]: #prior-art

See the [`bfloat16` Wikipedia
article](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format) for many
links to other software and hardware with support for `bfloat16`.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

Prior to stabilization, we should have a full implementation of `f16b`
generically for all Rust platforms (based on `f32`), as well as an
implementation of `f16b` for at least one platform with native hardware
support, to shake out any potential corner-cases.

Will allowing an unsuffixed floating-point literal to become an `f16b` through
inference lead to any errors? If so, we could drop this requirement, but that
would reduce the convenience of using `f16b`.

# Future possibilities
[future-possibilities]: #future-possibilities

This RFC does not preclude the possibility of introducing other 16-bit
floating-point formats in the future, such as the IEEE `binary16` format (which
provides a smaller range and higher precision). This RFC proposes not defining
any type as `f16`, and instead unambiguously using suffixes on `f16` to
distinguish different 16-bit floating-point types. For instance, `f16h` could
represent the IEEE `binary16` "half-float" format supported by some CPUs and
GPUs, which has a larger mantissa, smaller exponent, and smaller range than
`f16b`.
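To make the range trade-off concrete, the maximum finite value of each format
follows from its layout (a quick arithmetic check, not RFC text):

```rust
fn main() {
    // Largest finite bfloat16 value: (2 - 2^-7) x 2^127, roughly 3.39e38.
    let bf16_max = (2.0f64 - 2f64.powi(-7)) * 2f64.powi(127);
    // Largest finite IEEE binary16 value: (2 - 2^-10) x 2^15 = 65504.
    let binary16_max = (2.0f64 - 2f64.powi(-10)) * 2f64.powi(15);

    assert_eq!(binary16_max, 65504.0);
    assert!(bf16_max > 3.3e38);
}
```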