Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify deferred destruction #9

Merged
4 commits merged into from Sep 6, 2017
Merged

Simplify deferred destruction #9

4 commits merged into from Sep 6, 2017

Conversation

ghost
Copy link

@ghost ghost commented Aug 30, 2017

Proposal:

  • Remove Scope::defer_free.
  • Optimize Scope::defer.

Rendered

@ghost ghost requested review from arthurprs, jeehoonkang, schets and Vtec234 August 30, 2017 19:57
@arthurprs
Copy link

arthurprs commented Sep 1, 2017

Thanks for the writeup.

I'm not sure I share the sentiment. My impression is that it's not a net-gain as it'll make the most frequent use cases less ergonomic.

The extra Ptr/Owned functions are probably useful on their own, to avoid dirty transmutes and all that.

@ghost
Copy link
Author

ghost commented Sep 2, 2017

@arthurprs

Do you at least agree about removing defer_free, since we have ManuallyDrop now? I thought that was an easy one to sell. :)

My impression is that it's not a net-gain as it'll make the most frequent use cases less ergonomic.

If we leave defer_drop alone, would the RFC then be a net gain for you? -- I'd be okay with that.

This makes `defer_free` unnecessary.
This is the kind of problem `ManuallyDrop` was created to solve.

### Do we need `defer_free`?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe defer_drop?


# Detailed design

### `Deferred`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the title here doesn't match with the content.

In most cases `defer` will be used to just deallocate memory or execute simple destruction
routines. The passed `FnOnce()` closure will almost always fit into, say, 3 words.

I have an [implementation](https://gist.github.com/stjepang/00d63b8febc07b297eae3480e80d0e91)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it can be inlined here so that this RFC is self-contained.

of `Deferred` ready, which is able to store a small closure within itself, falling back
to heap allocation for big closures.

**TL;DR:** `Deferred` is to `FnOnce() + Send + 'static` what `SmallVec` is to `Vec`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very helpful analogy :)


Strictly speaking, this code is possibly incorrect because an `Owned` and `Ptr` pointing to the
same thing exist at the same time. This is okay because the `Owned` will not be used until the
closure executes, but **@RalfJung**'s new unsafe code checker might reject the code as unsound.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the proposed solution is hacky for the reason that the meaning of Owned or Ptr<'static> is unclear. I think it would be great if the meaning can be explained in RustBelt's lifetime logic.

Also, I doubt that the defer* family of functions is safe. For example, the client should guarantee that the defer_drop()ed object is no longer pointed to by a memory cell. This knowledge cannot be expressed in Rust's type system, so it should be unsafe.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding conversion from Ptr<'scope> to Ptr<'static>, that is definitely hacky. But I think what we actually want here is conversion to Ptr<'unsafe>. Unsafe lifetime was proposed in this RFC, but in the end it was postponed.

Regarding safety, defer_free and defer_drop are definitely unsafe and this RFC doesn't aim to change that. However, defer is currently unsafe, but this RFC proposes to make it a safe function. I suppose you're okay with that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, yes. I am okay with pub safe fn defer().

We'll introduce two new functions (`Owned::from_ptr` and `Ptr::to_static`). But at the same time
we'll also remove two functions (`Scope::defer_free` and `Scope::defer_drop`).

There might be very small differences in performance when using `defer_free` and `defer_drop` versus
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @schets may have an opinion on the performance. cf: crossbeam-rs/crossbeam-epoch#3 (comment)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

@arthurprs
Copy link

Is it reasonable to assume that defer_drop is the most frequent use case? I can get behind removing defer_free.

@ghost
Copy link
Author

ghost commented Sep 4, 2017

I reduced the scope of the RFC. Now it only proposes to remove defer_free and optimize defer.

Copy link

@arthurprs arthurprs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna go ahead and approve it. I'll defer more specific implementation details to the implementation PR.

@Vtec234
Copy link
Member

Vtec234 commented Sep 4, 2017

I tacked Deferred onto the Garbage implementation here and then ran these benchmarks on it to compare it with the previous implementation (EPOCH). The results are as follows:

EPOCH:
test bench::bench_defer       ... bench:         487 ns/iter (+/- 19)
test bench::bench_defer_multi ... bench:     139,416 ns/iter (+/- 12,116)
test bench::bench_defer_n     ... bench:      20,005 ns/iter (+/- 314)
test bench::bench_drop        ... bench:         518 ns/iter (+/- 19)
test bench::bench_drop_multi  ... bench:     128,332 ns/iter (+/- 10,473)
test bench::bench_drop_n      ... bench:      42,413 ns/iter (+/- 727)
test bench::bench_free        ... bench:         514 ns/iter (+/- 10)
test bench::bench_free_multi  ... bench:     127,119 ns/iter (+/- 11,145)
test bench::bench_free_n      ... bench:      40,721 ns/iter (+/- 1,411)

DEFERRED:
test bench::bench_defer       ... bench:         687 ns/iter (+/- 14)
test bench::bench_defer_multi ... bench:     186,959 ns/iter (+/- 17,204)
test bench::bench_defer_n     ... bench:      20,119 ns/iter (+/- 1,056)
test bench::bench_drop        ... bench:         692 ns/iter (+/- 13)
test bench::bench_drop_multi  ... bench:     193,229 ns/iter (+/- 15,993)
test bench::bench_drop_n      ... bench:      47,453 ns/iter (+/- 1,640)
test bench::bench_free        ... bench:         688 ns/iter (+/- 18)
test bench::bench_free_multi  ... bench:     192,716 ns/iter (+/- 13,700)
test bench::bench_free_n      ... bench:      45,798 ns/iter (+/- 2,760)

Unfortunately, it seems that in threaded benchmarks this version is noticeably slower, presumably due to slower migration of garbage. I might've made some silly mistake and written a nonoptimal implementation, but if not, we have to investigate the causes of this.

@jeehoonkang
Copy link
Contributor

jeehoonkang commented Sep 4, 2017

On defer_free

My first impression is that defer_free might be faster for some cases. However, now I think defer_free is a strange method since we usually don't want to free an object without dropping it. It would be good to have defer_drop only.

For performance, we may need to find a way to determine whether a given type has a trivial drop impl (cf crossbeam-rs/crossbeam#57 (comment)). But we don't have a stable way to do it.

I agree with the proposed change, and want Rust to stabilize fn needs_drop().

On Deferred

I like the idea, but want to see the benchmark result. @Vtec234's result looks bad...

@Vtec234
Copy link
Member

Vtec234 commented Sep 4, 2017

I changed my mind and removed a previous comment.

With regards to the API, we have to notice that removing defer_free imposes the requirement of T: Send + 'static on all objects to be removed, even those with trivial destructors, while defer_free allowed recollection of objects without this restriction. I think the Send part is perfectly fine - all types used in threaded contexts should be Send anyway.

The 'static part is slightly more troublesome, and a problem we currently have no satisfactory solution to. For now I can't see a usecase where one would have a trivially destructible non-'static type and where this change would prevent the usecase from being implemented. However, if one exists, this change would have to be reconsidered.

As the RFC mentions, defer_drop could also be removed in favour of defer, but that would require adding a few unsafe conversion methods to satisfy trait requirements. This might be a good solution, since it makes explicit the invariants of lifetime that the user has to guarantee, but on the other hand it does make the API slightly more cumbersome to use.

In general, I'm in favour of this RFC, but only if we resolve the performance problems I mentioned in the comment above.

@ghost
Copy link
Author

ghost commented Sep 4, 2017

@Vtec234

Thank you for writing these benchmarks! I can confirm the performance issues.

It turns out that the size of Deferred (as in std::mem::size_of) is larger than the size of Garbage, which makes it a bit slower. After tweaking the layout of Deferred a bit to make it smaller, the issue is resolved:

EPOCH
test bench::bench_defer       ... bench:         474 ns/iter (+/- 28)
test bench::bench_defer_multi ... bench:     132,339 ns/iter (+/- 56,620)
test bench::bench_defer_n     ... bench:      21,147 ns/iter (+/- 2,083)
test bench::bench_drop        ... bench:         479 ns/iter (+/- 14)
test bench::bench_drop_multi  ... bench:     124,968 ns/iter (+/- 77,483)
test bench::bench_drop_n      ... bench:      39,859 ns/iter (+/- 4,453)
test bench::bench_free        ... bench:         489 ns/iter (+/- 16)
test bench::bench_free_multi  ... bench:     117,083 ns/iter (+/- 34,990)
test bench::bench_free_n      ... bench:      38,173 ns/iter (+/- 1,925)

DEFERRED
test bench::bench_defer       ... bench:         476 ns/iter (+/- 10)
test bench::bench_defer_multi ... bench:     118,214 ns/iter (+/- 36,568)
test bench::bench_defer_n     ... bench:      16,636 ns/iter (+/- 2,985)
test bench::bench_drop        ... bench:         534 ns/iter (+/- 128)
test bench::bench_drop_multi  ... bench:     114,980 ns/iter (+/- 81,091)
test bench::bench_drop_n      ... bench:      40,418 ns/iter (+/- 5,847)
test bench::bench_free        ... bench:         496 ns/iter (+/- 13)
test bench::bench_free_multi  ... bench:     115,312 ns/iter (+/- 86,908)
test bench::bench_free_n      ... bench:      37,695 ns/iter (+/- 3,172)

I've pushed the updated implementation of Deferred to this RFC.

This is the kind of problem `ManuallyDrop` was created to solve.

# Can we make `defer` safe and faster?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe three sharps?

Copy link
Member

@Vtec234 Vtec234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After tweaking the layout of Deferred a bit to make it smaller, the issue is resolved

That's great! If it was this simple to improve performance, surely it will also be possible to tweak it to specific hardware. I think it's fine if we leave the inline implementation of Deferred in this RFC as a working PoC and then modify it further in the actual crate code.

@ghost
Copy link
Author

ghost commented Sep 6, 2017

The 'static part is slightly more troublesome, and a problem we currently have no satisfactory solution to. For now I can't see a usecase where one would have a trivially destructible non-'static type and where this change would prevent the usecase from being implemented. However, if one exists, this change would have to be reconsidered.

This is a good point and a real concern. I'll open a new issue to discuss how to deal with those lifetime constraints.

@ghost ghost merged commit 7919cbe into crossbeam-rs:master Sep 6, 2017
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants