Garbage collection #415
Comments
I don't think forcing libraries to worry about tracing is worth it. It will add a significant amount of complexity, and with that comes new memory safety issues. The need to add overhead to trait objects is unacceptable, as is forcing more bloat into every crate. Features that impose a cost whether or not you use them are not a good fit with the language. Rust has been steadily dropping features like segmented stacks and green threads for not adhering to pay-for-what-you-use.
Niche features with a performance cost should be opt-in at compile-time, and anyone who wants them can build a new set of standard libraries with them enabled. I still don't think the complexity would be worth it even in that scenario. I'm strongly against adding any form of tracing to the language / libraries and I intend to build a lot of community resistance against these costly, complex features. If it ends up being added, then it's going to be more great ammunition for a fork of the language.
@thestinger In either case it would be possible to avoid any kind of overhead from garbage collection support for code that doesn't want it (at least how I would do things; can't speak for others). That was actually one of my foremost priorities. Then it mainly boils down to the question of opt-in vs. opt-out. (But even in the opt-out case, it would be possible to opt out.) Basically, in one universe garbage collection support is provided by default and you write … and … to disallow the given types from containing managed data, and thereby avoid any overhead from tracing support (including having to consider the possibility in …), and in the other universe you write … to enable tracing support, and thereby allow storing managed data. Obviously you would prefer the latter. (I don't personally have a preference yet.) But once the infrastructure is in place (which is the same in either case), there would be lots of room to figure out the best way to expose it, and plenty of time to litigate the opt-in vs. opt-out debate. (Again, I'm speaking only for myself here and have no idea what anybody else, not least the core team, wants to do.)
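For concreteness, here is a minimal sketch of the two universes being contrasted. The names (NoManaged, Trace) and the marker-trait shape are placeholders assumed purely for illustration, not anything settled in this thread:

```rust
// Sketch only: `NoManaged` and `Trace` are placeholder names for the two
// directions (opt-out vs. opt-in), not concrete proposals.

// Opt-out universe: GC support is the default; a type opts out by asserting
// that it can never contain managed data, and thereby skips all tracing.
unsafe trait NoManaged {}

struct PlainBuffer { bytes: Vec<u8> }
unsafe impl NoManaged for PlainBuffer {}

// Opt-in universe: types are untraced unless they explicitly ask for
// tracing support, and only then are they allowed to hold managed data.
trait Trace {}

struct Scene { object_ids: Vec<u32> }
impl Trace for Scene {} // hand-written stand-in for a compiler derive
```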
If the standard libraries support it, then it imposes overhead on everyone.
Simply outputting the metadata by default slows down compiles and results in more bloated binaries. If it's not opt-in via a compiler switch, then you're forcing costs on everyone. Either way, it forces a huge amount of complexity on the standard libraries because they need to cope with tracing. It will decrease the quality of the code for the common case where the niche feature isn't used.
@thestinger, if it's opt-in (which it probably should be), the compile-time overhead in the don't-use case should be no more than that of any other unused trait with many impls. The runtime overhead should be none whatsoever. By "should be" I mean something that I feel is a mandatory goal shared by just about everybody interested, and an attainable goal too.
@Ericson2314: That's not at all true, as I explained above.
@thestinger I have read everything you wrote, and I am not convinced.
The standard library need not support GC types from the get-go. It seems reasonable to try to nail down the GC abstractions first, and then merge them into the standard library. The problem of making a lot more functions generic occurs ONLY when the abstractions are used pervasively in the standard library. This problem is also triggered by making those functions allocator-agnostic without GC. My solution is to speculatively compile generic functions instantiated with their defaults in rlibs. This would mean that if your program uses jemalloc and no GC (the default arguments), compile times would be similar to today.
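As a rough illustration of "instantiated with their defaults", here is a sketch using default type parameters. The Allocator trait and Jemalloc type below are invented stand-ins for the example, not the actual allocator design:

```rust
// Illustrative only: `Allocator` and `Jemalloc` are stand-ins for whatever
// the real allocator (and, later, GC) parameters would look like.
trait Allocator {
    fn allocate(&mut self, size: usize) -> *mut u8;
}

struct Jemalloc;

impl Allocator for Jemalloc {
    fn allocate(&mut self, _size: usize) -> *mut u8 {
        std::ptr::null_mut() // placeholder; a real impl would call into jemalloc
    }
}

// The collection is generic over its allocator, with `Jemalloc` as the
// default. The idea above is that an rlib could speculatively compile the
// `Arena<Jemalloc>` instantiation ahead of time, so a program sticking to
// the default sees compile times similar to today, while a program that
// plugs in a custom (or GC-aware) allocator pays for its own instantiation.
struct Arena<A: Allocator = Jemalloc> {
    alloc: A,
}

impl<A: Allocator> Arena<A> {
    fn reserve(&mut self, size: usize) -> *mut u8 {
        self.alloc.allocate(size)
    }
}
```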
@glaebhoerl With the dynamic registration of stack variables as you propose (which, because a pointer to them is registered, I think will prevent the variables from being kept in registers), I'm hopeful that a rough prototype could be made without any rustc or LLVM support. Do you agree?
Yeah, that's how confirmation bias works.
So you didn't actually read my comments, because you're ignoring the problems with trait objects. You're also not countering the point about the increase in metadata at all.
If the standard library ever supports garbage collection, it will add unacceptable overhead in terms of metadata and bloat.
You're drawing a false equivalence here. Allocator support on collections would not result in bloated metadata, bloated code or slower compile times. It would be a pay-for-what-you-use feature, as it would only generate extra code for custom allocators. I don't see how speculative compilation is a good idea, considering that types like collections need to be instantiated for each set of type parameters. Since nearly all of the code is supposed to be inlined, there's very little that can actually be reused in any case.
No, adding metadata will significantly slow down compile times.
It's amusing that people are unable to have an honest debate about this. I've had productive debates about it with @pnkfelix and he never felt the need to deny that there are costs to supporting tracing. The only way of completely avoiding a runtime / code size cost is making it a compile-time option and not building any of the standard libraries with it enabled by default. It will still introduce a significant amount of complexity into the standard libraries and get in the way of implementing optimizations. The compile-time switch would result in there being 4 dialects of Rust to test and support (tracing is one bit of diversity, unwinding is another - and surely there will be more proposals for costly, complex niche features).
Doing it without rustc support seems like a tall order, but maybe at the "rough prototype" level something might be possible (after all, the Servo folks already did something vaguely similar). @huonw also had a prototype back at the discussion in the other repository. But yes, although I'm not a GC expert, unless I'm missing something, avoiding having to rely on LLVM seems like it should be possible (and probably advisable, at least in the short term).
@glaebhoerl I think it would be an interesting thing to make, if for nothing else than to demonstrate that at least tracing can be done without any cost to non-users. @thestinger If you find this conversation unproductive, I am sorry. I value your insistence on features not costing non-users. If the bloat imposed by GC is as unavoidable and significant as you claim it is, then I will agree with you that GC shouldn't be added. I don't mean to be deceptive -- if @pnkfelix admits there will be some cost, perhaps you both are aware of something I am missing.
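To sketch what such a rough prototype could look like without rustc or LLVM support, here is a purely library-level version of the register/unregister idea: a guard pushes the address of a stack slot into a thread-local root set when created and removes it when dropped. All names and representations here are invented for illustration, and, as noted above, taking the slot's address forces the variable into memory rather than a register.

```rust
use std::cell::RefCell;

// Thread-local registry of addresses of stack slots that may hold managed
// pointers; a collector would walk this list instead of using stack maps.
thread_local! {
    static ROOTS: RefCell<Vec<*const usize>> = RefCell::new(Vec::new());
}

// Guard that keeps a stack slot registered as a root for its lifetime.
struct Rooted<'a> {
    slot: &'a usize, // stand-in for "a word that may hold a managed pointer"
}

impl<'a> Rooted<'a> {
    fn new(slot: &'a usize) -> Rooted<'a> {
        ROOTS.with(|r| r.borrow_mut().push(slot as *const usize));
        Rooted { slot }
    }
}

impl<'a> Drop for Rooted<'a> {
    fn drop(&mut self) {
        // Unregister on scope exit; roots are pushed and popped LIFO.
        ROOTS.with(|r| {
            let mut roots = r.borrow_mut();
            let addr = self.slot as *const usize;
            if let Some(pos) = roots.iter().rposition(|&p| p == addr) {
                roots.remove(pos);
            }
        });
    }
}

fn main() {
    let maybe_managed: usize = 0; // pretend this word holds a managed pointer
    let _root = Rooted::new(&maybe_managed);
    // A collection triggered here would find `maybe_managed` through ROOTS.
}
```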
Just to be sure, I searched for "trait object" and I got your sentence: …
and @glaebhoerl's sentence: …
The bloat you are referencing, I assume, is the extra trace method in every vtable -- and to be clear, I consider that bloat too. My previous understanding, which is what I thought @glaebhoerl followed up with, was that this was due to trace being opt-out in his original comment. If we make it opt-in, then while …
Again, what metadata? I absolutely agree stack maps are extra metadata that would clutter up the rlibs. But in @glaebhoerl's proposal, at least for the first iteration, there are no stack maps. Either the registering of roots would be fully explicit, or it would exist 1-1 with the explicit calls to create or clone a GC root pointer, which is the next best thing. Both options are very explicit about costs, and would seem not to impact those that don't use GC. Even if/when stack maps are added, I'd assume they can be enabled/disabled without affecting the semantics of code that does not use them. So while yes, there is another build target, there is no new dialect of Rust.
… and …
As illustrated above, the only metadata and bloat I am aware of are stack maps and the trace method in vtables. I have tried to explain my reasoning leading me to believe that both can be avoided in programs that do not use GC without changing the semantics of Rust / forking a new dialect.
If I remember correctly, this concern is not my own, but something I read elsewhere, perhaps in meeting minutes. Perhaps my recollection is wrong, and there is no problem. The concern is that right now, Rust only compiles the monomorphizations of generic code that are actually used. The problem is that if one has a library where everything takes a type parameter, one effectively gains nothing from compiling the library separately from the program it is used in, because nothing in the library is instantiated with a "concrete" type. If all the libraries the application developer uses have a high proportion of generic code, the developer is forced to basically rebuild everything every time. In the long run, I think this is just yet another reason why all compilers / build systems should support much more fine-grained caching---on individual functions, even. In the short run, speculatively compiling code instantiated with its default parameters seems like an adequate solution. Allocators (with or without GC) are just one example of a feature that might make a far higher percentage of code polymorphic.
re. opt-in vs opt-out: IMO, having GC is fine but then it should be opt-in…
As of 2023 this seems like an unlikely direction for Rust to go in. Given that there has been no discussion in years, I'm taking the initiative to close this. Anyone is welcome to reopen if you want to explore this possibility again (also, I haven't looked, but I'm pretty sure there are some more detailed proposals along these lines that already exist, either in this repo or on https://internals.rust-lang.org).
We want to add support for garbage collection at some point. We had a really long discussion about this back on the rust repository here. It also implicates the design for allocators.
My own belief is that the best plan would be precise tracing piggybacked off the existing trait and trait object system, i.e. compiler-derived trace routines (Trace impls) for each type, as outlined in my comment here. This would likely be very performant and avoid the need for any kind of headers on allocations, except for existentials (trait objects), which could/would have a Trace vtable pointer similarly to how Drop is currently done, i.e. this would also "just fall out" of the trait-based mechanism. By avoiding headers, we could also avoid imposing any costs on code which doesn't use GC.
(I am also not sure that we need to involve LLVM in any way, at least in the first round. My suspicion is that via the borrow checker and the type system (at least once we have static drops), we already have more information than LLVM would. Instead of stack maps, at least in the first iteration, in GC-using code we could have the compiler insert calls to register/unregister stack variables which may potentially contain managed data with the GC, based on borrow checker information.)
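To make the piggybacking idea a bit more concrete, the following is a minimal sketch of what a compiler-derived trace routine and a trace-capable trait object might look like. The Trace/Tracer names and the trait's exact shape are assumptions for illustration, not the proposal's actual API; in the real design the impls would be derived by the compiler rather than written by hand.

```rust
// Sketch only: `Trace` and `Tracer` are placeholder names, and these impls
// stand in for what the compiler would derive automatically.
struct Tracer; // stand-in for the collector's visitor

trait Trace {
    // Report every field that may contain a managed pointer.
    fn trace(&self, tracer: &mut Tracer);
}

// What a derived impl for a plain struct would amount to:
struct Point { x: f64, y: f64 }

impl Trace for Point {
    fn trace(&self, _tracer: &mut Tracer) {
        // No managed fields, so nothing to do -- and no header is needed on
        // the allocation, because tracing knowledge lives in the impl.
    }
}

// For existentials (trait objects), the trace entry simply sits in the
// vtable, analogous to how Drop glue is reached today:
fn trace_erased(obj: &dyn Trace, tracer: &mut Tracer) {
    obj.trace(tracer); // dynamic dispatch through the vtable's trace slot
}
```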