Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experimentally add ffi_const and ffi_pure extern fn attributes #58327

Closed
wants to merge 11 commits into from

Conversation

gnzlbg
Copy link
Contributor

@gnzlbg gnzlbg commented Feb 9, 2019

These attributes correspond to the const and pure C attributes.

Closes #46046 (the attributes can then be added to jemalloc-sys).

@rust-highfive
Copy link
Collaborator

r? @michaelwoerister

(rust_highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 9, 2019
@LifeIsStrange
Copy link

LifeIsStrange commented Feb 9, 2019

By curiosity, why not allow those attributes for rust code too (not only ffi)?
Also, why not expose all common llvm attributes to rust?
There is a lot of missed advanced features, and missed optimisations in rust by not being able to use compiler attributes unlike C/C++. This make rust far less competitive.

@jonas-schievink
Copy link
Contributor

Also, why not expose all common llvm attributes to rust?

Because that would tie rustc to using LLVM as a backend more than necessary. The plan is to eventually allow using other backends like cranelift.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Feb 9, 2019

By curiosity, why not allow those attributes for rust code too (not only ffi)?

A couple of reasons. First, by restricting them to C FFI we can get away with just saying "they do whatever they do in C". This adds instant value and would result in a pretty uncontroversial RFC. Other languages can interpret C header files and use these attributes (e.g. C++, D, ...) but Rust C FFI wrappers need a way to express them.

Second, we can always allow these in Rust in the future in a backwards compatible way. We would then have to specify what these attributes "mean" for Rust code (Rust is not C), what Rust functions using these attributes are exactly allowed to do, etc. We might also need to give these some other names, since pure and const would be overloaded for Rust. The RFC process for this would be significantly more complicated.

Also, why not expose all common llvm attributes to rust?
There is a lot of missed advanced features, and missed optimisations in rust by not being able to use compiler attributes unlike C/C++. This make rust far less competitive.

We don't want to tie Rust to LLVM by exposing "LLVM-isms" (readnone, readonly, etc.). This PR makes effort in exposing C attributes for C FFI: pure and const. In LLVM they happen to generate readonly and readnone attributes, but if we had a different backed (e.g. GCC) they would generate something else.

If there are some C attributes that you want to use in C FFI, PRs welcome I guess. This PR shows pretty much how to add them. Libraries like bindgen can start using them on nightly as soon as they land. Initially I had more hopes that cross-language inlining would solve all those problems, but there is always going to be foreign code that we want to just link against that won't benefit from that.

src/test/codegen/ffi-const.rs Outdated Show resolved Hide resolved
src/test/codegen/ffi-pure.rs Outdated Show resolved Hide resolved
@nagisa
Copy link
Member

nagisa commented Feb 9, 2019

By curiosity, why not allow those attributes for rust code too (not only ffi)?

While the reasons by @jonas-schievink and @gnzlbg are convincing already, I’d like to point that this attribute is simply not necessary for Rust code, because LLVM is capable of deriving these attributes as part of optimisation process on its own.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Feb 10, 2019

I’d like to point that this attribute is simply not necessary for Rust code, because LLVM is capable of deriving these attributes as part of optimisation process on its own.

Right.

I imagine that, in some cases, it might be useful for Rust code to guarantee some of these properties to their callers. But that would mean that for Rust code, the attribute / language feature goal would be to express this guarantee, e.g., failing to compile if the implementation of the API changes in such a way that these properties don't hold anymore. Since we already have const fn, justifying another similar but not quite feature won't be trivial.

For FFI, Rust can't derive these properties, so the purpose of these attributes is more clear.

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:02cb5120:start=1549796652095474631,finish=1549796724170395726,duration=72074921095
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-6.0
---

[00:03:32] travis_fold:start:tidy
travis_time:start:tidy
tidy check
[00:03:33] tidy error: /checkout/src/test/codegen/ffi-pure.rs: too many trailing newlines (3)
[00:03:34] some tidy checks failed
[00:03:34] 
[00:03:34] 
[00:03:34] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/tidy" "/checkout/src" "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "--no-vendor" "--quiet"
[00:03:34] 
[00:03:34] 
[00:03:34] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test src/tools/tidy
[00:03:34] Build completed unsuccessfully in 0:00:46
[00:03:34] Build completed unsuccessfully in 0:00:46
[00:03:34] make: *** [tidy] Error 1
[00:03:34] Makefile:68: recipe for target 'tidy' failed
The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:2c5fb81f
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Sun Feb 10 11:09:07 UTC 2019
---
travis_time:end:24ed987c:start=1549796948309624766,finish=1549796948314278739,duration=4653973
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:209c6b58
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:06859262
travis_time:start:06859262
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:00f96935
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@nagisa nagisa added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Feb 10, 2019
@nagisa
Copy link
Member

nagisa commented Feb 10, 2019

The implementation side of things looks good to me, so r+ on that. Somebody from T-lang should sign off on this as well, I think.

@LifeIsStrange
Copy link

LifeIsStrange commented Feb 11, 2019

Sorry, my answer is a little late.

You all raised interesting valid points that I didn't consider first!
"First, by restricting them to C FFI we can get away with just saying "they do whatever they do in C". This adds instant value and would result in a pretty uncontroversial RFC.etc..." absolutely, adding C ffi attributes is a quick, uncontroversial improvment.

"I’d like to point that this attribute is simply not necessary for Rust code, because LLVM is capable of deriving these attributes as part of optimisation process on its own."
Well, while this diminish the needs of optimisations hints through attributes this does not obscolete them. Because compiler are not perfect, rust already has Inline and cold attributes for helping the compiler to make more informed decisions.
There are a lot more attributes for this that are today impossible to do in rust.
E.g alloc_size could be used in theory for jemalloc_sys too.

To make the question "should rust support more llvm attributes (as it already support a few) ?" more actionable, I've found sorting attributes in categories is a great help (ideally one would visualize through Venn diagram).

There is an uncontroversial, low hanging fruit category of attributes to support in rust code.
Those are the few attributes that can't have bad consequences if unsupported by the compiler backend.
That is: most of compiler optimisations hints have this property, and also debug/diagnostics attributes to an extent.
For example gcc support the hot attribute (not the same as Inline) and the malloc attribute, they are not supported to my knowledge by llvm. Yet the code is behaving exactly the same in llvm and gcc, it's just that gcc will be able to be faster and use less memory. Why not implement those seemingly uncontroversial attributes in rust (including pure and const) ?
Yes it must be assured that those attributes have compatible meaning between c/c++ and rust but most of them if not all are language agnostic!
The second category is for attributes that are supported both by gcc and llvm.
Actually their is a big intersection of attributes between the two compiler and this is a goal from clang, to quote them : "aims to support a broad range of GCC extensions." and as you can see in https://clang.llvm.org/docs/AttributeReference.html they document the very rare cases where an llvm attributes behave differently than on GCC.
So for attributes that increase Langage expressiveness and thus change how the code will behave, like return twice for example or alignment constraints or making a whole function volatile but that they are identically supported by both llvm and gcc I think that would be an improvement to support them.
For some, it's even mandatory requirements to be able to use rust.
Yes, others backends that doesn't support the de facto standard intersection of attributes between llvm and gcc would have issues. But by quantifying the pro and cons, it seems that the new langage expressiveness, guarantees and optimisations enabled outweight the user choosen eventual loss of compatibility with backends that will most likely always stay niches. Still, as only advanced users for feature/performance critical code will use this category of attributes, by far most rust codes will still be able to be compatible with those niches backends.
Finally, after all those backends must implement codegen for each rust features, I don't see why those attributes would be a different difficulty for those backend to support than the difficulty of enabling backend support for a new usual rust feature !

And there is the last category, code behavior changing llvmisms not supported by GCC, while I don't consider realistic from rust to switch from llvm to GCC (because of human ressource management efficiency (that would unnecessarily split optimisations work and increase limitations for what advantage other than ironically supporting the expressiveness of GCC only attributes?)) I agree this category is a controversial high hanging fruit.

Note : I guess that would be nice to open a meta issue for increasing rust attributes support, prioritizing inconsequential new compiler space/time optimisations hints.
I propose that the syntax would differentiate universal attributes (the inconsequentials if unsupported by the backend, from the code behavior changing attributes) through a prefix.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Feb 11, 2019

I guess that would be nice to open a meta issue for increasing rust attributes support,

That would be better, discussing it here is turning a bit offtopic.

@michaelwoerister
Copy link
Member

I agree with @nagisa -- the implementation looks good but this should be OK'ed by the lang team first.

src/librustc/hir/mod.rs Outdated Show resolved Hide resolved
src/librustc/hir/mod.rs Outdated Show resolved Hide resolved
src/librustc/hir/mod.rs Outdated Show resolved Hide resolved
src/librustc/hir/mod.rs Outdated Show resolved Hide resolved
@Centril
Copy link
Contributor

Centril commented Feb 11, 2019

I agree with @nagisa -- the implementation looks good but this should be OK'ed by the lang team first.

Since we already have const fn, justifying another similar but not quite feature won't be trivial.

For FFI, Rust can't derive these properties, so the purpose of these attributes is more clear.

I'm wary of the potential for unbounded increases in complexity for the purposes of FFI, especially when it comes to optimization rather than things which are impossible otherwise. I also wonder what role const fn could have here. Potentially, a subset of functions benefiting from ffi_const could also benefit from const fn. However, these musings and concerns should not stop the gathering of data. To this end I propose that #[ffi_pure/const] be added to the compiler experimentally without any commitments to stabilizing one day. I expect an RFC to be filed eventually that includes gathered data (including performance numbers) and other considerations.

@rfcbot merge

@rfcbot
Copy link

rfcbot commented Feb 11, 2019

Team member @Centril has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Feb 11, 2019
@Centril Centril changed the title Add ffi_const and ffi_pure extern fn attributes Experimentally add ffi_const and ffi_pure extern fn attributes Feb 11, 2019
@michaelwoerister
Copy link
Member

I'm not quite clear on what the status of this is. Did @rust-lang/lang sign off on it? I think some boxes still need checking.

@Centril Centril added S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 21, 2019
@gnzlbg
Copy link
Contributor Author

gnzlbg commented Feb 21, 2019

Did @rust-lang/lang sign off on it? I think some boxes still need checking.

I don't think so, no.

cc @eddyb taking a look at these, I briefly thought about whether we could use them to implement the bench_input / bench_output APIs that you proposed in the black-box RFC, e.g.,

#[c_ffi_const]
fn bench_input<T>(x: T) -> T {
    asm!("" : "=*m"(&x) : "*m"(&x) : : :);
    x
}

but I would be surprised if telling LLVM that a function that it can prove as "not const" is const would be undefined behavior. So I'd rather keep this as C FFI attributes for now, and if we ever need to do that, add a #[rustc_llvm_attr(const)] or similar to specifically do that, clearly stating that such thing is perma-unstable and aiming for LLVM semantics explicitly, instead of reusing a C FFI attribute.

@bors
Copy link
Contributor

bors commented Feb 24, 2019

☔ The latest upstream changes (presumably #58315) made this pull request unmergeable. Please resolve the merge conflicts.

@Centril
Copy link
Contributor

Centril commented Mar 30, 2019

Ping from triage, @aturon @withoutboats @pnkfelix @nikomatsakis

@rfcbot rfcbot added the final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. label Apr 24, 2019
@rfcbot
Copy link

rfcbot commented Apr 24, 2019

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Apr 24, 2019
@rfcbot
Copy link

rfcbot commented May 4, 2019

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this PR / Issue. and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels May 4, 2019
@Centril
Copy link
Contributor

Centril commented May 31, 2019

r? @rkruppe

Copy link
Contributor

@hanna-kruppe hanna-kruppe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code and tests look good, but I have concerns about the (documented) semantics of these attributes.

not affected by changes to the observable state of the program.

The behavior of calling a `#[c_ffi_const]` function that violates these
requirements is undefined.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should be UB to even have a wrongly annotated declaration, saying it's UB if the function is called doesn't justify speculatively executing it (e.g., hoisting out of branches) .

depend on the values of the function parameters and/or global memory.

The behavior of calling a `#[c_ffi_pure]` function that violates these
requirements is undefined.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As for const: it should be UB to have a declaration wrongly annotated c_ffi_pure, even if it's not called.


That is, `#[c_ffi_pure]` functions shall have no effects except for its return
value, which shall not change across two consecutive function calls and can only
depend on the values of the function parameters and/or global memory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't sound right. GCC docs specifically state pure functions can read arbitrary non-volatile memory. I don't know for sure what you mean by "global memory" is but certainly pure functions can be passed pointers to a caller's local variables and read them through those pointers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A pure function can read statics, including statics that are pointers, and read memory through these pointers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it can do all that and even more. And I don't think this paragraph properly conveys that.

Copy link
Contributor Author

@gnzlbg gnzlbg Jun 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have any suggestion?

Being able to read caller local variables via pointers passed to them is AFAICT already covered by "depend on the values of the function parameters", and by global memory I mean any memory that is globally accessible, e.g., via an static, a magic address, etc. (not necessarily something that's passed to the function as a parameter).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Sorry for the delay.)

I think language like in the GCC docs (arbitrary memory) is better. The phrase "values of the function parameters" is not clear for pointer arguments because it can be read as "only depend on the pointer itself, not the contents of the memory it points to" and indeed that's sort of the case for c_ffi_const functions. For example, IIUC uintptr_t tag_pointer(void *, int tag) that does arithmetic on the address can be c_ffi_const and work with pointers to non-constant memory, but of course reading from the pointed-to memory would generally make it non-c_ffi_const.

since any other function could modify global memory read by `#[c_ffi_pure]`
functions, altering their result. The `#[c_ffi_const]` attribute allows
sub-expression elimination regardless of any operations in between the function
calls.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see what you're getting at here but this wording is misleading. It is harder for the optimizer to apply CSE because it has to show that no relevant state changed between two calls it wants to CSE, but "only apply across successive invocations" only communicates that if it means "directly adjacent without anything/much in between", which is too strong as stated (e.g., it might be known that the pure function doesn't read certain memory, or that another function called in between doesn't write memory).

I think I'd prefer to weaken this note to just briefly describe the difficulties imposed on the optimizer by the weaker rules of pure vs const and not get into the hairy business of predicting what that actually means for when the optimization fires or not.

A `pure` function that returns unit has no effect on the abstract machine's
state.

A diverging and `pure` C or C++ function is unreachable. Diverging via a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same point about termination as with const.

without side-effects is undefined behavior in C++ and not possible in C. In C++,
the behavior of infinite loops without side-effects is undefined, while in C
these loops can be assumed to terminate. This would cause a diverging function
to return, invoking undefined behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This paragraph confuses me, I think it might incidentally mislead users too, and I don't even think it is technically correct

  • C's "assumed to terminate" is indistinguishable from "UB if non-terminating", so that last sentence is superfluous (and communicates a wrong picture of UB)
  • C doesn't actually say this about all infinite loops, it has an exception carved out for loops controlled by constant expressions, such as while (1) { ... }

I think we should just say as a separate axiom that functions marked with this attribute must not diverge. GCC's docs don't say this as far as I can see (at least explicitly -- arguably non-termination is an observable effect), but it's kind of the safer assumption because who knows what LLVM's implicit assumptions around readnone/readonly functions are at the moment. It also seems useful to assume termination of const/pure functions to be able to execute them speculatively.

@Centril Centril added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-team Status: Awaiting decision from the relevant subteam (see the T-<team> label). labels Jun 13, 2019
@JohnCSimon
Copy link
Member

Hello from triage - @gnzlbg This PR has not seen any activity in a month.

@JohnCSimon
Copy link
Member

@gnzlbg Triage: This PR looks like it's almost done. Giving it one more week before closing for inactivity.

@JohnTitor
Copy link
Member

Ping from triage: @gnzlbg
Unfortunately, we haven't heard from you on this in a while, so I'm closing the PR to keep things tidy. Don't worry though, if you have time again in the future please reopen this PR, we'll be happy to review it again! Thanks for taking the time to contribute.

@JohnTitor JohnTitor closed this Aug 4, 2019
@JohnTitor JohnTitor added S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 4, 2019
@neocturne
Copy link
Contributor

I'm interested in this feature as well. I've started to rebase it onto master and intend to take care of the docs and open a new PR soon.

RalfJung added a commit to RalfJung/rust that referenced this pull request May 21, 2020
Experimentally add `ffi_const` and `ffi_pure` extern fn attributes

Add FFI function attributes corresponding to clang/gcc/... `const` and `pure`.

Rebased version of rust-lang#58327 by @gnzlbg with the following changes:

- Switched back from the `c_ffi_const` and `c_ffi_pure` naming to `ffi_const` and `ffi_pure`, as I agree with rust-lang#58327 (comment) and this nicely aligns with `ffi_returns_twice`
- (Hopefully) took care of all of @hanna-kruppe's change requests in the original PR

r? @hanna-kruppe
bors added a commit to rust-lang-ci/rust that referenced this pull request May 21, 2020
Experimentally add `ffi_const` and `ffi_pure` extern fn attributes

Add FFI function attributes corresponding to clang/gcc/... `const` and `pure`.

Rebased version of rust-lang#58327 by @gnzlbg with the following changes:

- Switched back from the `c_ffi_const` and `c_ffi_pure` naming to `ffi_const` and `ffi_pure`, as I agree with rust-lang#58327 (comment) and this nicely aligns with `ffi_returns_twice`
- (Hopefully) took care of all of @hanna-kruppe's change requests in the original PR

r? @hanna-kruppe
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. S-inactive Status: Inactive and waiting on the author. This is often applied to closed PRs. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

liballoc_jemalloc nallocx should be ReadOnly