Math support in core #2505
Comments
Previous discussion of this: rust-lang/rust#50145

I don’t understand the difference between option (a) and (b), they seem to be effectively the same. Stabilizing an intrinsic is typically done by adding a stable (and safe) wrapper function. The existing support in `std` is a stable wrapper function/method.

> `std` makes math like `sqrt` feel built-in because the functionality is provided as inherent methods

For what it’s worth we have other precedent of inherent methods not being present in libcore, for example `[T]::to_vec`. |
Seems like this would be a non-problem with a portability lint thing. In that case the floating point functions could be cfg'd away like atomics are.
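Roughly, `core` could gate the float methods the way the atomic APIs are gated today. This is only a sketch: `target_has_fpu` is a made-up cfg name here, while atomics use the real `target_has_atomic = "..."` cfg.

```rust
// Hypothetical: hide the float math methods on targets without an FPU,
// the same way `#[cfg(target_has_atomic = "64")]` hides AtomicU64.
#[cfg(target_has_fpu)]
impl f32 {
    pub fn sqrt(self) -> f32 {
        // sqrtf32 is the (unstable) intrinsic that std's f32::sqrt wraps today
        unsafe { core::intrinsics::sqrtf32(self) }
    }
}
```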
|
A cranelift backend for rustc already needs to gracefully handle a long list of rustc intrinsics. In this case always translating the rustc intrinsic to a |
They are pretty similar but they are not exactly the same. In option (b) […]

Thinking about it some more, I think option (a) will also run into the problems that option (b) has. Also, I originally thought (but didn't comment above) that having option (a) would be good because […].

TL;DR Option (b) (still) sounds best to me.
I think that's the only inherent method that involves a type that's not built into the language […].

I cc-ed T-portability people because they also look into issues that aim to close the gap between […].

I think math functions should not be cfg-ed away. Even if the target doesn't have an FPU, software […].

Right, other backends will have to port other Rust intrinsics as things stand today. Perhaps, 30 […].

Finally, this is a bit of speculation because I have not tested it but I think that putting the math […] |
Currently there are 21 inherent methods defined on primitive types outside of `core` (they live in liballoc; see the listing below).

This doesn't invalidate this issue, I think having math support in libcore would be good. I'm only saying it wouldn't entirely remove the weirdness of inherent methods being "magically" added just by adding a dependency on a non-`core` crate.

#[lang = "slice_alloc"]
impl<T> [T] {
pub fn sort(&mut self) where T: Ord {…}
pub fn sort_by<F>(&mut self, mut compare: F) where F: FnMut(&T, &T) -> Ordering {…}
pub fn sort_by_key<K, F>(&mut self, mut f: F) where F: FnMut(&T) -> K, K: Ord {…}
pub fn sort_by_cached_key<K, F>(&mut self, f: F) where F: FnMut(&T) -> K, K: Ord {…}
pub fn to_vec(&self) -> Vec<T> where T: Clone {…}
pub fn into_vec(self: Box<Self>) -> Vec<T> {…}
pub fn repeat(&self, n: usize) -> Vec<T> where T: Copy {…}
}
#[lang = "slice_u8_alloc"]
impl [u8] {
pub fn to_ascii_uppercase(&self) -> Vec<u8> {…}
pub fn to_ascii_lowercase(&self) -> Vec<u8> {…}
}
#[lang = "str_alloc"]
impl str {
pub fn into_boxed_bytes(self: Box<str>) -> Box<[u8]> {…}
pub fn replace<'a, P: Pattern<'a>>(&'a self, from: P, to: &str) -> String {…}
pub fn replacen<'a, P: Pattern<'a>>(&'a self, pat: P, to: &str, count: usize) -> String {…}
pub fn to_lowercase(&self) -> String {…}
pub fn to_uppercase(&self) -> String {…}
pub fn escape_debug(&self) -> String {…}
pub fn escape_default(&self) -> String {…}
pub fn escape_unicode(&self) -> String {…}
pub fn into_string(self: Box<str>) -> String {…}
pub fn repeat(&self, n: usize) -> String {…}
pub fn to_ascii_uppercase(&self) -> String {…}
pub fn to_ascii_lowercase(&self) -> String {…}
} |
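To make the "magically added" point concrete (a small illustration, not from the original comment): the very same method call only resolves once the `alloc` crate is in the dependency graph.

```rust
#![no_std]
extern crate alloc;

use alloc::vec::Vec;

pub fn copy(xs: &[u32]) -> Vec<u32> {
    // `to_vec` is an inherent method on `[T]`, but it is defined in liballoc,
    // not libcore; in a crate that only depends on core it does not exist.
    xs.to_vec()
}
```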
Back on topic: @japaric, I think I don’t quite understand how LLVM intrinsics, the libm crate, and system/toolchain-provided libm implementations fit together here. Do you mean that the libm crate would call the intrinsic? Then what would provide the […]? (Note that rust-lang/rust#27823 moved a number of […].)

Regardless, I think this is largely independent of what user-facing API we want to stabilize, which at first I thought was what your (a) v.s. (b) was about:
I think the latter API is obviously superior, assuming identical implementations. |
CC @alexcrichton who discussed this in rust-lang/rust#32110 (comment). |
@japaric is sqrtf32 the only intrinsic that needs to be stabilized, or is that just an example and there are others for other math functions? |
I do sort of agree with @japaric about the goal here of basically moving everything to libcore, but my main point of hesitation would be performance and accuracy of these intrinsics vs various libm implementations. @japaric would it be possible to collect some data about the efficiency of the various implementations in the Rust libm vs some native libm implementations? I'm less worried about things like |
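A rough way to gather such numbers (a sketch, not a rigorous benchmark; it assumes the `libm` crate as the pure-Rust implementation and compares it against the `std` methods):

```rust
// Crude timing comparison: std's intrinsic-backed methods vs the pure-Rust libm crate.
use std::time::Instant;

fn bench(name: &str, mut f: impl FnMut(f64) -> f64) {
    let start = Instant::now();
    let mut acc = 0.0;
    for i in 0..10_000_000u32 {
        acc += f(1.0 + f64::from(i));
    }
    // Print `acc` so the loop is not optimized away.
    println!("{name}: {:?} (acc = {acc})", start.elapsed());
}

fn main() {
    bench("std sin  ", |x| x.sin());
    bench("libm sin ", |x| libm::sin(x));
    bench("std sqrt ", |x| x.sqrt());
    bench("libm sqrt", |x| libm::sqrt(x));
}
```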
@jethrogb It’s an example. This is what’s in `std` for `f64` today:

#[lang = "f64_runtime"]
impl f64 {
pub fn floor(self) -> f64 {…} // intrinsics::floorf64
pub fn ceil(self) -> f64 {…} // intrinsics::ceilf64
pub fn round(self) -> f64 {…} // intrinsics::roundf64
pub fn trunc(self) -> f64 {…} // intrinsics::truncf64
pub fn abs(self) -> f64 {…} // intrinsics::fabsf64
pub fn signum(self) -> f64 {…} // intrinsics::copysignf64
pub fn mul_add(self, a: f64, b: f64) -> f64 {…} // intrinsics::fmaf64
pub fn powi(self, n: i32) -> f64 {…} // intrinsics::powif64
pub fn powf(self, n: f64) -> f64 {…} // intrinsics::powf64
pub fn sqrt(self) -> f64 {…} // intrinsics::sqrtf64
pub fn exp(self) -> f64 {…} // intrinsics::expf64
pub fn exp2(self) -> f64 {…} // intrinsics::exp2f64
pub fn ln(self) -> f64 {…} // intrinsics::logf64
pub fn log2(self) -> f64 {…} // intrinsics::log2f64
pub fn log10(self) -> f64 {…} // intrinsics::log10f64
pub fn sin(self) -> f64 {…} // intrinsics::sinf64
pub fn cos(self) -> f64 {…} // intrinsics::cosf64
pub fn abs_sub(self, other: f64) -> f64 {…} // cmath::fdim
pub fn cbrt(self) -> f64 {…} // cmath::cbrt
pub fn hypot(self, other: f64) -> f64 {…} // cmath::hypot
pub fn tan(self) -> f64 {…} // cmath::tan
pub fn asin(self) -> f64 {…} // cmath::asin
pub fn acos(self) -> f64 {…} // cmath::acos
pub fn atan(self) -> f64 {…} // cmath::atan
pub fn atan2(self, other: f64) -> f64 {…} // cmath::atan2
pub fn exp_m1(self) -> f64 {…} // cmath::expm1
pub fn ln_1p(self) -> f64 {…} // cmath::log1p
pub fn sinh(self) -> f64 {…} // cmath::sinh
pub fn cosh(self) -> f64 {…} // cmath::cosh
pub fn tanh(self) -> f64 {…} // cmath::tanh
// Based on other methods, but not directly on intrinsics or cmath
pub fn log(self, base: f64) -> f64 {…}
pub fn fract(self) -> f64 {…}
pub fn div_euc(self, rhs: f64) -> f64 {…}
pub fn mod_euc(self, rhs: f64) -> f64 {…}
pub fn sin_cos(self) -> (f64, f64) {…}
pub fn asinh(self) -> f64 {…}
pub fn acosh(self) -> f64 {…}
pub fn atanh(self) -> f64 {…}
}

Where those intrinsics map to LLVM intrinsics as follows:

$ grep "llvm.*f64" src/librustc_codegen_llvm/intrinsic.rs
"sqrtf64" => "llvm.sqrt.f64",
"powif64" => "llvm.powi.f64",
"sinf64" => "llvm.sin.f64",
"cosf64" => "llvm.cos.f64",
"powf64" => "llvm.pow.f64",
"expf64" => "llvm.exp.f64",
"exp2f64" => "llvm.exp2.f64",
"logf64" => "llvm.log.f64",
"log10f64" => "llvm.log10.f64",
"log2f64" => "llvm.log2.f64",
"fmaf64" => "llvm.fma.f64",
"fabsf64" => "llvm.fabs.f64",
"copysignf64" => "llvm.copysign.f64",
"floorf64" => "llvm.floor.f64",
"ceilf64" => "llvm.ceil.f64",
"truncf64" => "llvm.trunc.f64",
"rintf64" => "llvm.rint.f64",
"nearbyintf64" => "llvm.nearbyint.f64",
"roundf64" => "llvm.round.f64",
|
However, note that (to the best of my knowledge) the vast majority of those intrinsics are lowered to libcalls rather than single instructions on most or all architectures. They are still useful to LLVM for optimizations (constant folding, code motion and dead code elimination based on the fact that they don't access memory). |
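For instance, with optimizations enabled LLVM will fold a call like this away entirely (a small illustration, not from the quoted discussion):

```rust
// Compiles to just `return 3.0`; no sqrt instruction or libcall remains,
// because llvm.sqrt.f32 of a constant is folded at compile time.
pub fn three() -> f32 {
    9.0f32.sqrt()
}
```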
The (observable) behavior of the `sqrtf32` intrinsic is the following:

// user writes
fn my_sqrt(x: f32) -> f32 {
intrinsics::sqrtf32(x)
}
// For ARM Cortex-M4F, LLVM lowers `my_sqrt` to
fn my_sqrt(x: f32) -> f32 {
let y;
unsafe {
asm!("vsqrt $0, $1" : "=w"(y) : "w"(x));
}
y
}
// For targets that don't have an instruction for the sqrt operation, LLVM lowers `my_sqrt` to
fn my_sqrt(x: f32) -> f32 {
extern "C" {
fn sqrtf(_: f32) -> f32;
}
unsafe {
sqrtf(x)
}
}

On targets like x86_64 Linux […]
Yes, the libm crate would use the intrinsic.

The `libm` crate would then look something like this:

impl F32Ext for f32 {
fn sqrt(x: f32) -> f32 {
unsafe {
intrinsics::sqrtf32(x)
}
}
}
#[no_mangle]
pub extern "C" fn sqrtf(x: f32) -> f32 {
// Software implementation
}
We can, but we don't have to force our […]:

// crate: core
impl f32 {
fn sqrt(self) -> Self {
unsafe { intrinsics::sqrtf32(self) }
}
}
// crate: compiler-builtins
#[cfg(any(target_os = "none", target_os = "unknown"))]
#[no_mangle]
pub fn sqrtf(x: f32) -> f32 {
// Software implementation
}

Targets like […]. Also, right off the bat, I can tell you that most of the […] |
@japaric So your (a) proposal is adding APIs to libcore that, when used, add a dependency on a symbol being provided externally somehow. The precedent of rust-lang/rust#27823 and rust-lang/rust#32110 (comment) seems to be that libcore should avoid precisely this. |
Agree with japaric that plan (b) is the way to go. Normally I [very much! Haha] want things to be stable code, and hopefully move to a nursery crate, but the case of polyfilling code that may just be generated instead is clearly a special case, where the compiler coupling is inherent to the problem. And tying ourselves to LLVM in stable interfaces is definitely no good.

Let me also throw out that on the general front of the […] |
True! We don't have a great way of adding |
hmm, I believe that if the intrinsic is marked as #[inline] the undefined symbol would end up in the libm crate and not in the libcore crate.
Having core inject -lm sounds wrong; core should not depend on C libraries being present on the host. However, I don't think we would need to have core pass the -lm flag to the linker. Keeping the current behavior of having std pass -lm to the linker would be sufficient: even if a #[no_std] crate uses the math support in core, the crate will end up being used in a binary that links to std, so the -lm requirement would be satisfied.

I don't know of any use case of #[no_std] executables for […]. Do we even officially support #![no_std] programs on […]? |
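For reference, a crate that wants the system libm can already request it itself with a one-line build script (a sketch; nothing here is rustc-specific):

```rust
// build.rs: ask Cargo to pass `-lm` to the linker for this crate.
fn main() {
    println!("cargo:rustc-link-lib=m");
}
```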
From the linker’s point of view, sure. But does it matter? From a user’s point of view there would be an API in libcore that, when used, might cause undefined symbol errors. |
@SimonSapin Quite a few of the functions in core::intrinsics have the same behavior: they can produce undefined symbol errors. If you mean to say that we should not stabilize API that has such behavior, I would agree. That policy would also eliminate option (a). |
rust-lang/rust#27823 and rust-lang/rust#32110 (comment) suggest that, at least so far, libcore (or the subset of it reachable through its stable public API) is intended to be "dependency-free". But then based on some of the discussion in #2480, maybe we should rethink the whole libcore / |
The […]. Ideally, all math intrinsics provided would not only work on […]. That would mean, however, that […] |
I assume that the problem described in the starting issue comment for |
What if we switch LLVM and bootstrap a new transpiler? |
This should also work with rust-lang/rust#57241 const fn support, i.e. ideally writing […] |
If anyone is interested, I'm developing a set of core sin, cos, ln, exp etc. for the SIMD library here: […]

They are very simple, mostly a single polynomial evaluation, and probably significantly faster […]. With a little extra compiler support, we could also make them const-able - great for FFT evaluation […].

They are generated using the doctor_syn crate which extends syn to enable arbitrary […] |
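To give a flavour of the "single polynomial evaluation" approach, here is a rough sketch of my own (not code from that library):

```rust
/// Crude `no_std`-friendly sine: a truncated Taylor polynomial evaluated with
/// Horner's scheme. Only reasonable for |x| <= pi/4; a real implementation
/// would do range reduction first and use minimax coefficients.
pub fn sin_poly(x: f32) -> f32 {
    let x2 = x * x;
    // sin(x) ≈ x - x^3/3! + x^5/5! - x^7/7!
    x * (1.0 + x2 * (-1.0 / 6.0 + x2 * (1.0 / 120.0 + x2 * (-1.0 / 5040.0))))
}
```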
This seems obvious and simple to me, so why hasn't this been done yet? Math functions like ceil() and floor() are very simple yet are not available in core. I don't understand what the hold-up is. If some subset of the functions are problematic, then just leave those out. Getting the trivial functions into core seems like a priority. |
It is not obvious and simple, because |
The discussion in this thread fizzled out 5 years ago. Are we still in the same place? From what I understand, core pulling in either -lm or the libm crate implicitly is not desired. Can we just make the user do that, like with -lc and compiler-builtins? This would improve the ergonomics of writing Rust code, and would move the issue of doing no_std math from library authors to binary builders. |
Why do we need to rely on external system libraries? What's the problem with Rust implementing these directly? |
@mlindner That’s approach (b) from the issue description. One possible issue might be that […].

It would be nice to make that fallback official, but having explicit features might be better for reproducibility. |
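For context, the explicit-feature pattern crates use in the wild looks roughly like the sketch below (the feature names are just the common convention, not anything official):

```rust
// Opt-in fallback: with `std` use the inherent methods, otherwise route the
// call through the pure-Rust `libm` crate.
#[cfg(feature = "std")]
#[inline]
fn sqrt(x: f64) -> f64 {
    x.sqrt()
}

#[cfg(all(not(feature = "std"), feature = "libm"))]
#[inline]
fn sqrt(x: f64) -> f64 {
    libm::sqrt(x)
}
```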
If I could provide a little context: the […]. Until our own libm is given much more attention, it's unlikely to be put into the default compilation mix. |
Thanks for the context. As a workaround, could we add (if it's not already there) a feature flag that says to use Rust's built-in math intrinsics and thus by that means allow them into no-std? |
The problem here is the use of llvm intrinsics to implement maths functions. For example https://github.com/rust-lang/rust/blob/master/library/std/src/f32.rs#L610-L612

This is a pragmatic choice as no work needs to be done on many platforms, but LLVM […]. This is very much the B-grade choice as calling any function has a high cold code overhead […].

I'm planning to highlight this in an upcoming book on rust code performance, feedback is welcome.

A better choice would be to at least bite the bullet and use x86/ARM specific intrinsics for those platforms […]. It would be interesting to see what […].

Answers on a postcard. |
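The "platform specific intrinsics" route would look roughly like this (a sketch of mine, x86_64 only, using the stable `core::arch` SSE intrinsics):

```rust
// sqrt via the SSE scalar instruction; SSE2 is part of the x86_64 baseline,
// so no runtime feature detection is needed on that target.
#[cfg(target_arch = "x86_64")]
pub fn sqrt_f32(x: f32) -> f32 {
    use core::arch::x86_64::{_mm_cvtss_f32, _mm_set_ss, _mm_sqrt_ss};
    unsafe { _mm_cvtss_f32(_mm_sqrt_ss(_mm_set_ss(x))) }
}
```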
if you enable appropriate target features, |
@programmerjake is correct. When a modern target is enabled, it works well, but the default is always disappointing. https://rust.godbolt.org/z/Kqsf3YG7K

It would be lovely if […]. Much of the SIMD group's excellent work is hard to use without this option unless you use […].

I'm not a regular reader of Rust discussions, but would imagine this has been discussed before. |
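Concretely, "enable appropriate target features" means building with something like `RUSTFLAGS="-C target-cpu=native"`, or opting in per function (illustration below, assuming x86_64):

```rust
// With the `fma` feature enabled for this function, the fused multiply-add
// lowers to a single vfmadd instruction instead of a libcall.
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "fma")]
unsafe fn mul_add_fast(a: f64, b: f64, c: f64) -> f64 {
    a.mul_add(b, c)
}
```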
How it was done
===============

First step was to replace the references to the `std` crate by `core` or `alloc` when possible. Which was always the case, except for the `Error` trait which isn't available in `core` on stable[1].

Another change, that may impact the performance a bit, was to replace the usage of Hash{Map,Set} by the B-Tree variant, since the hash-based ones aren't available in core (due to a lack of a secure random number generator). There should be no visible impact on default builds, since we're still using hashtables when `std` is enabled (which is the default behavior). Maybe I should give a shot to `hashbrown` to bring back the hashtable version into `no-std` as well.

The last change was a bit more annoying (and surprising): most of the math functions aren't available in core[2]. So I've implemented a fallback using `libm` when `std` isn't enabled. Note that due to a Cargo limitation[3] `libm` is always pulled as a dependency, but it won't be linked unless necessary thanks to conditional compilation.

Known limitations
=================

Cannot use the `geo` feature without `std`, mainly because the `geo` crate itself isn't `no-std` (as well as a bunch of its dependencies I guess).

--------
[1]: rust-lang/rust#103765
[2]: rust-lang/rfcs#2505
[3]: rust-lang/cargo#1839

Closes: #19
Background

Currently the `core` crate doesn't provide support for mathematical functions like `sqrt` or `sin`. To do math in a `#![no_std]` program one has the following options:

- Link to a C implementation of libm, i.e. `libm.a`. This is cumbersome as the programmer needs to obtain a compiled version of libm for their target, or compile libm themselves which implies a C cross toolchain when the target system and the build system are not the same architecture / OS.

- Use a pure Rust implementation of libm, like the `libm` crate. On stable, (a) the performance of such implementation won't be on par with a C implementation, or (b) to achieve the same performance the user would require a C (cross) toolchain.

To elaborate on (a) and (b). Consider the following contrived program that computes the square root of a number:
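A minimal stand-in for that program, assuming the square root comes from the pure-Rust `libm` crate:

```rust
#![no_std]

// Pure-Rust libm; no C toolchain or system libm involved.
use libm::sqrtf;

#[no_mangle]
pub fn my_sqrt(x: f32) -> f32 {
    sqrtf(x)
}

#[panic_handler]
fn panic(_: &core::panic::PanicInfo) -> ! {
    loop {}
}
```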
When compiled for the `thumbv7em-none-eabihf` target it produces the following machine code:

This is extremely inefficient machine code because the target has a hardware FPU that supports computing the square root in a single instruction. Ideally, the program should compile down to the following machine code:

If the target had access to the standard library the program would compile down to that machine code because the implementation of `f32.sqrt` in `std` looks like this:

`sqrtf32` is an unstable, thin wrapper around an LLVM intrinsic that either compiles down to a hardware implementation of square root if the target architecture supports it in its instruction set, or it produces a call to the `sqrtf` routine if it doesn't (*). `std` makes use of 30+ of such LLVM intrinsics for performance of math functions.

(*) The `llvm.sqrt.*` LLVM intrinsic, which `sqrtf32` wraps, is not quite specified like that but that's the observable effect.

The `libm` crate can't make use of this intrinsic on stable because it's unstable and feature gated. However, the `libm` crate could replicate the behavior of the `sqrtf32` intrinsic using conditional compilation and external assembly files as shown below:

But this would heavily complicate the implementation of the `libm` crate, which would likely introduce bugs. Also, as it's not possible to use inline assembly (`asm!`) on stable the `vsqrt.f32` instruction would have to be invoked via FFI and an external assembly file. External assembly files mean that the user would require a C (cross) toolchain to build the crate, negating the main benefit of using a pure Rust implementation of libm.
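To make the "conditional compilation and external assembly files" point concrete, a rough sketch of what the `libm` crate would have to do on stable might look like this (the cfg and symbol names are made up for illustration):

```rust
// Public entry point: pick an implementation at compile time.
pub fn sqrtf(x: f32) -> f32 {
    imp::sqrtf(x)
}

// Hardware path: targets with a single-instruction sqrt. The routine lives in
// an external assembly file (essentially `vsqrt.f32 s0, s0; bx lr`), because
// inline `asm!` was not stable at the time.
#[cfg(all(target_arch = "arm", target_feature = "vfp4"))]
mod imp {
    extern "C" {
        fn __libm_vsqrtf(x: f32) -> f32;
    }
    pub fn sqrtf(x: f32) -> f32 {
        unsafe { __libm_vsqrtf(x) }
    }
}

// Software path: everything else gets the pure-Rust implementation.
#[cfg(not(all(target_arch = "arm", target_feature = "vfp4")))]
mod imp {
    pub fn sqrtf(x: f32) -> f32 {
        // bit-level initial guess + Newton-Raphson refinement, omitted here
        unimplemented!()
    }
}
```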
Possible solutions

I see two options for improving the situation here:

a. We stabilize the family of `sqrtf32` LLVM intrinsics. This way crates like `libm` can achieve the performance of the `std` implementation on stable without requiring complex conditional compilation and C toolchains. Or,

b. We move all the existing math support from `std` to `core`. For the user this means that e.g. `f32.sqrt` will also work in `#![no_std]` programs.

Option (a) is kind of bad (maybe?) for alternative backends like cranelift as they would have to support / implement these LLVM intrinsics to be on parity with the `rustc+LLVM` compiler.

Option (b) requires us (*) to provide an implementation of math functions (symbols) like `sqrtf` for targets that do not link to libm by default. If we don't do this those targets will hit "undefined reference to `sqrtf`" linker errors when using math methods like `f32.sqrt`.

(*) "us" as in: we must provide symbols like `sqrtf` in the `compiler-builtins` crate. Note that we are already providing such symbols for the `wasm32-unknown-unknown` target, and we are using the `libm` crate to do that.

If we go ahead with option (b) we must be careful to not provide the math symbols in `compiler-builtins` for targets that are currently using system libm (e.g. `x86_64-unknown-linux-gnu`). Because if we do provide the symbols then all existing programs will start using the `libm` crate implementation instead of the system libm implementation -- this is due to how we invoke the linker: `libcompiler_builtins.rlib` appears before `-lm` in the linker arguments -- and that may degrade performance in some cases where system libm has architecture optimized implementations of some functions.

With option (b) I believe that `#![no_std]` programs that are currently linking to some C implementation of libm for math support will end up using the `libm` crate implementation as a side effect. I don't see a way to avoid this: even if we mark the math symbols in `compiler-builtins` as weak the way we invoke the linker will cause the program to use the `libm` crate implementation.

Final thoughts

IMO, math support should be in the `core` crate as it doesn't depend on OS, or I/O, abstractions like other `std`-only API does (e.g. `std::fs`, `std::net`). Also, `std` makes math like `sqrt` feel built-in because the functionality is provided as inherent methods -- it feels weird that such "built-in" functionality is not available in `#![no_std]`.

Thoughts? Should we do (a) or (b)? Or is there some other solution? Or should we leave math out of core?

cc @SimonSapin (T-libs), @jethrogb @Ericson2314 (T-portability), @joshtriplett @korken89 (some stakeholders)