Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stabilize disjoint capture in closures (RFC 2229) #88126

Closed
2 tasks done
nikomatsakis opened this issue Aug 17, 2021 · 16 comments
Closed
2 tasks done

stabilize disjoint capture in closures (RFC 2229) #88126

nikomatsakis opened this issue Aug 17, 2021 · 16 comments
Labels
A-edition-2021 Area: The 2021 edition disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Aug 17, 2021

RFC 2229 Stabilization Report

Links

Impl blockers

  • Fixing various ICEs and small bugs that arose from public testing period
  • Fixing behavior of #[nonexhaustive] captures across crates

Summary

This feature adjusts the rules for closure captures starting in Rust 2021. In Rust 2018 and before, closures always captured entire local variables. Under these new rules, they capture more precise paths. These paths are always some prefix of the precise paths that are references within the closure body.

As a simple example, in Rust 2018, || println!("{}", a.y) captures a reference to a and not just a.y. Capturing a in its entirety prevents mutation or moves from other fields of a, so that code like this does not compile:

let a = SomeStruct::new();
drop(a.x); // Move out of one field of the struct
println!("{}", a.y); // Ok: Still use another field of the struct
let c = || println!("{}", a.y); // Error: Tries to capture all of `a`
c();

In Rust 2021, however, this closure would only capture a reference to a.y, and hence the code above would compile.

Disjoint capture was proposed as part of RFC 2229 and the RFC contains details about the motivation.

Capture algorithm, guide description

Ref closures

Ref closures generally capture the precise paths that you use within the closure, subject to a few limitations:

  • Unsafe parts of paths, such as the deref of a raw pointer or an access to the field of a union, always occur in the closure. The closure will therefore capture the prefix of the path that stops right before the unsafe operation (e.g., it would capture the raw pointer, not the data the raw pointer points at).
  • When moving particular paths, we only capture data that was present on the stack frame of the closure creator. So if a closure moves data from (*b).f, where b: Box<SomeType>, we would capture all of b. This ensures that closures never capture deceptively large amounts of data that were intentionally being passed via a box.

Move closures

Move closures generally try to take ownership of all their captures. However, in the event that the capture they are taking ownership of is borrowed data, they may borrow the data instead:

  • When the closure only uses a path by reference, and the path passes through a reference, then the path is captured by reference.
    • e.g., a call like x.0.split(0) where x: &([u32], String) will capture x.0 by reference, even though this is a move closure (note that in Rust 2018, we would have captured the reference x itself, which is ery similar).
  • Otherwise, all paths are moved, but as above we only capture data that was present on the stack frame of the closure creator. So if a closure moves data from (*b).f, where b: Box<SomeType>, we would capture all of b. This ensures that closures never capture deceptively large amounts of data that were intentionally being passed via a box.

Optimizations

In addition to the rules above, the compiler also performs some optimizations to reduce closure size, but these do not affect which programs compile or do not compile, or the order of destructor execution, so they are not relevant to understanding how the program works.

Examples

struct Foo { x: i32 }

fn box_mut() {
    let bx = Box::new([0; 1024]);
    let c = move || println!("{:?}", *bx);
    // Closure captures `bx`, even though the body uses `*bx`,
    // because of the rule that we always capture data that resides
    // on the creator's stack frame.
}
struct Foo { x: i32 }

fn box_mut() {
    let mut s = Foo { x: 0 } ;
    
    let px = &mut s;
    let bx = Box::new(px);
    
    
    let c = move || bx.x += 10;
    // Closure captures `&mut (*(*bx)).x`, even though this is a move
    // closure, because of the rule that "moves" of borrowed data
    // ultimately cature by mutable reference.
}

Capture algorithm, precise description

The precise capture algorithm is described here. The high-level idea is as follows:

  1. Collect the set C of all "captures" that appear in the closure body. A capture consists of a place P that is referenced and a mode M:
    • The place P is something like (*a).b.c
    • The mode M is move, ref, or ref mut
    • For example, a closure like || x.0.truncate(0), where x: &mut (String, String), would capture the place (*x).0 with the mode ref mut.
      • This is because x.0.truncate(0) is shorthand for str::truncate(&mut (*x).0, 0)
    • Note that the mode M is independent of whether this is a move closure or not
  2. Truncate the "captures" (P, M) in the C by applying various transforms:
    • unsafe_check(P, M) -> (P, M) truncates the place P so that any unsafe operations occur in the closure, not the creator
      • e.g., instead of capturing (*x.f).m, if x.f: *mut T, we would just capture x.f
    • ref_opt(P, M) -> (P, M) truncates dereferences of &T references so that we capture the entire &T instead of capturing some subfield. This is an optimization that minimizes the size of closures without affecting the set of things that compile.
      • e.g., given x: &(String, String) and || foo(&x.0, &x.1), we would minimize the places (*x).0 and (*x).1 to just x, thus capturing a single reference instead of two
    • For a non-move closure:
      • ref_xform: If this is a "by-value" mode capture, then truncate to the first pointer dereference
        • e.g., given b: Box<[u8; 1024]> and a by-value capture of *b, truncate at b so that we capture the box, not its contents
    • For a move closure:
      • move_xform, which performs 3 truncations:
        • If this is a ref mut mode capture, and the place contains a deref of an &mut reference, then leave it unchanged.
        • If this is a ref mode capture, and the place contains a deref of an & reference, then leave it unchanged.
        • Else, change to a a by-value mode capture, then truncate to the first pointer dereference
  3. Minimization: If the set C contains two places (P1, M1) and (P2, M2) where P1 is a prefix of P2, then just capture (P1, max(M1, M2))
    • e.g., given || { &x; &mut x.y; } we capture x (the prefix of x and x.y) but with the mode ref mut (the greater of the two modes).

What data does a closure use?

Closures are considered to reference data under the following conditions:

  • Shared borrows of a place (&a.b.c) is a ref mode access to the place a.b.c; shared borrows are often introduced via calls to methods like a.b.c.split(0)
  • Mutable borrows of a place (&mut a.b.c) is a ref mut mode access to the place a.b.c; mutable borrows are often introduced via calls to methods like a.b.c.truncate(0)
  • By-value accesses to a place (drop(x)) of Copy type are considered mode ref of the place x
  • By-value of a place (drop(x)) not of Copy type are considered a move mode access to the place x
  • Pattern matching against an enum variant Enum::Variant is considered a read of the enum if:
    • the enum has multiple variants
    • or the enum is marked as #[non_exhaustive] and is defined in another crate (FIXME)

Implementation strategy

The implementation strategy is to modify the desugaring of closures. Whereas Rust 2018 closures like || foo(&mut a.b, &mut c.d, &mut c.e) desugars to creating an anonymous struct with two fields, one for a and one for c:

ClosureStruct { a: &mut a, c: &mut c }

The Rust 2021 struct would contain a field for each unique capture:

ClosureStruct { a_b: &mut a.b, c_d: &mut c.d, c_e: &mut c.e }

This change is focused on MIR construction and does not affect the borrow checker in any significant way (there were changes to the diagnostics code).

Size measurements

We have measured the size of closures on the compiler and other code bases. A broader crater run is in progress. Preliminary measurements suggest that closures are slightly larger in Rust 2021, but not substantially: the average is that closures grow by less than 1 byte. This table breaks down measurements by the size of the original closure:

Rust 2018 min size Rust 2018 max size Number of closures Average size in 2018 Average growth in 2021 in bytes Percentage growth
0 16 17039 3.88 0.22 5.56%
16 32 4105 17.90 0.76 4.26%
32 64 1700 40.79 0.86 2.11%
64 9999999 850 99.40 -0.26 -0.27%
0 9999999 23694 12.39 0.34 2.74%

How the RFC's unresolved questions were resolved

The RFC posed the following unresolved questions:

How to optimize pointers. Can borrows that all reference parts of the same
object be stored as a single pointer? How should this optimization be
implemented (e.g. a special repr, refinement typing)?

We use a naive algorithm apart from a single optimization which captures an & reference instead of creating shared references to the individual fields. Our measurements found that this optimization is sufficient to ensure that closures are generally the same size or smaller in Rust 2018 and avoided any complex changes to the borrow checker.

How to signal that a function is pure. Is this even needed/wanted? Any other
places where the language could benefit?

We do not signal that functions are pure, which means that overloaded derefs and deref-mut implementations capture the base pointer. The one exception is Box, as described in the next question.

Should Box be special?

Yes, we treat Box specially. This is consistent with the borrow checker which genreally treats Box as special. This could in the future be generalized with some sort of "deref pure" trait but we left that for future work.

Drop order can change as a result of this RFC, is this a real stability
problem? How should this be resolved?

This was resolved via edition migrations. Migratons were also used to address some other cases that arose, such as the fact that closures may not implement different auto traits as a result of capturing different things.

Test cases and testing

  • Public testing of the 2021 Edition and migration
  • Unit tests in the 2229_closure_analysis directory:
    • The internal #[rustc_capture_analysis] annotation causes rustc to dump debugging output about what is captured and why, allowing us to test the model directly.
    • For example, this test covers an edge case around the &T capture optimization, and the stderr file identifies the precise path that is captured.
@nikomatsakis nikomatsakis added the T-lang Relevant to the language team, which will review and decide on the PR/issue. label Aug 17, 2021
@nikomatsakis
Copy link
Contributor Author

cc @rust-lang/lang @rust-lang/wg-rfc-2229

@nikomatsakis
Copy link
Contributor Author

nikomatsakis commented Aug 17, 2021

@rfcbot fcp merge

Per discussion in the @rust-lang/lang meeting today, I propose to merge this. We are trying to keep the wheels moving here to try and meet the 1.56 branch deadline for Rust 2021.

@rfcbot
Copy link

rfcbot commented Aug 17, 2021

Team member @nikomatsakis has proposed to merge this. The next step is review by the rest of the tagged team members:

No concerns currently listed.

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. labels Aug 17, 2021
@jackh726
Copy link
Member

Congrats to @arora-aman, @roxelo, and others who worked on this. Great work.

@matklad
Copy link
Member

matklad commented Aug 20, 2021

Found an interesting interaction between disjoint captures and AssertUnwindSafe.

The following program compiles on 2018 but fails on 2021:

use std::panic::AssertUnwindSafe;

struct UnwindUnsafe<'a>(&'a mut u32);

fn main() {
    let mut x = 0;
    let uu = UnwindUnsafe(&mut x);
    let aus = AssertUnwindSafe(uu);
    let _ = std::panic::catch_unwind(move || {
        let AssertUnwindSafe(uu) = aus;
        drop(uu)
    });
}

The issue here is that the closure doesn't actually capture the AssertUnwindSafe wrapper, so it becomes !UnwindSafe. let _ = &aus; fixes that. I would prefer that the original code just work, because it clearly intends to wrap/unwrap the value, but this isn't a big deal for me personally.

The example comes from a real-world code in rust-lang/rust-analyzer#9961

@arora-aman
Copy link
Member

arora-aman commented Aug 20, 2021

Found an interesting interaction between disjoint captures and AssertUnwindSafe.

The following program compiles on 2018 but fails on 2021:

use std::panic::AssertUnwindSafe;

struct UnwindUnsafe<'a>(&'a mut u32);

fn main() {
    let mut x = 0;
    let uu = UnwindUnsafe(&mut x);
    let aus = AssertUnwindSafe(uu);
    let _ = std::panic::catch_unwind(move || {
        let AssertUnwindSafe(uu) = aus;
        drop(uu)
    });
}

The issue here is that the closure doesn't actually capture the AssertUnwindSafe wrapper, so it becomes !UnwindSafe. let _ = &aus; fixes that. I would prefer that the original code just work, because it clearly intends to wrap/unwrap the value, but this isn't a big deal for me personally.

The example comes from a real-world code in rust-analyzer/rust-analyzer#9961

Thank you for bring this up. We are aware of this and this case gets handled by the 2229 migration.

Not sure if these needs to get documented in the report but here's the issue for reference: rust-lang/project-rfc-2229#29

@m-ou-se
Copy link
Member

m-ou-se commented Aug 23, 2021

This FCP needs to start today to make it in time for 1.56.

@m-ou-se
Copy link
Member

m-ou-se commented Aug 23, 2021

@cramertj @pnkfelix @scottmcm First one to click their checkbox wins a free emoji.

@rfcbot
Copy link

rfcbot commented Aug 23, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. and removed proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. labels Aug 23, 2021
@m-ou-se
Copy link
Member

m-ou-se commented Aug 23, 2021

@cramertj Here's your free emoji:

🐊

@m-ou-se m-ou-se mentioned this issue Aug 30, 2021
11 tasks
@rfcbot rfcbot added the finished-final-comment-period The final comment period is finished for this PR / Issue. label Sep 2, 2021
@rfcbot
Copy link

rfcbot commented Sep 2, 2021

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

@rfcbot rfcbot added to-announce Announce this issue on triage meeting and removed final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. labels Sep 2, 2021
@m-ou-se
Copy link
Member

m-ou-se commented Sep 3, 2021

What's the status of the second blocker mentioned in this issue?

  • Fixing behavior of #[nonexhaustive] captures across crates

@arora-aman
Copy link
Member

What's the status of the second blocker mentioned in this issue?

  • Fixing behavior of #[nonexhaustive] captures across crates

That's been fixed by #88280

@jackh726
Copy link
Member

jackh726 commented Sep 3, 2021

What's the status of the second blocker mentioned in this issue?

  • Fixing behavior of #[nonexhaustive] captures across crates

That's been fixed by #88280

I ticked the checkbox for this then

@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Sep 18, 2021
@ehuss
Copy link
Contributor

ehuss commented Oct 2, 2021

Since this is now stabilized, should this issue be closed?

@nikomatsakis
Copy link
Contributor Author

Yes.

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 28, 2023
…ler-errors

Remove the `capture_disjoint_fields` feature

As best I can tell, this was stabilized for Edition 2021 in rust-lang#88126 but the feature was never removed.
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Feb 28, 2023
…ler-errors

Remove the `capture_disjoint_fields` feature

As best I can tell, this was stabilized for Edition 2021 in rust-lang#88126 but the feature was never removed.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Feb 28, 2023
…ler-errors

Remove the `capture_disjoint_fields` feature

As best I can tell, this was stabilized for Edition 2021 in rust-lang#88126 but the feature was never removed.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Mar 1, 2023
…ler-errors

Remove the `capture_disjoint_fields` feature

As best I can tell, this was stabilized for Edition 2021 in rust-lang#88126 but the feature was never removed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-edition-2021 Area: The 2021 edition disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. T-lang Relevant to the language team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

8 participants