Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#[derive(Default)] on enum variants with fields #3683

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

estebank
Copy link
Contributor

@estebank estebank commented Aug 25, 2024

Support the following:

#[derive(Default)]
enum Foo {
    #[default]
    Bar {
        x: Option<i32>,
        y: Option<i32>,
    },
    Baz,
}

Rendered

Support the following:

```rust
enum Foo {
    #[default]
    Bar {
        x: Option<i32>,
        y: Option<i32>,
    },
    Baz,
}
```
@estebank
Copy link
Contributor Author

Follow up to #3107. Noticed as an easy gap to fill during implementation of #3681.

@Lokathor
Copy link
Contributor

Under drawbacks, I'd argue that this actually reduces complexity of the standard library.

People know about derive(Default) giving some sort of basic Default impl. When there's less special cases when we have to say "except it won't work here, here, and here", that means there's less complexity in the standard library.

@estebank
Copy link
Contributor Author

@Lokathor I'm unsure if I'm reading you right, but that sounds like a positive to me 😄

@Lokathor
Copy link
Contributor

Yes, exactly my point. The RFC text says this is a drawback, to have increased complexity. That might be true from a compiler internals perspective, or whatever. However, from an end user perspective, things that "just work" without needing exceptions listed for when they won't work in rare cases, those things are less complex.

So, in other words, I'd delete that part of the Drawbacks section. This doesn't increase complexity from a user's perspective.

Copy link
Contributor

@jplatte jplatte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the reason this was excluded in the previous RFC that added #[default] is that its interactions with generics aren't obvious - this RFC should likely be updated to address that.

text/0000-derive-default-enum-with-fields.md Outdated Show resolved Hide resolved
@Lokathor
Copy link
Contributor

@jplatte that's listed in the Drawbacks section i think, could you be more specific about what you expect the RFC to say about generics?

@jplatte
Copy link
Contributor

jplatte commented Aug 25, 2024

I was wondering what the generated bounds would be on a generic enum that uses this feature.

Currently, #[derive(Default)] on a generic enum never emits any bounds: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=efec38bf6b014835f97773e9cc30067c

Let's consider this

#[derive(Default)]
enum MyEnum<T, U> {
    #[default]
    Foo {
        field: Option<T>,
    }
    Bar(U),
}

I guess there wouldn't be any bounds on U, otherwise that would be inconsistent with the existing behavior. But would there be a T: Default bound, or an Option<T>: Default bound? I think the latter could be justified with enums not supporting private fields, unlike structs.

@Lokathor
Copy link
Contributor

There would have to be a bound on Option<T> being default, because the expansion is to have the code be Foo{ field: Default::default() }, and field has the Option<T> type. That's the minimum amount of bounding.

It's possible that additional bounds might end up in the mix, because we don't always have perfect derives, as mentioned in the Drawbacks section. For example, the "current behavior" could arguably only apply to unit variants of enums being the default, and field-having variants would (accidentally) drag in all the generics.

@jplatte
Copy link
Contributor

jplatte commented Aug 25, 2024

There would have to be a bound on Option<T> being default, because the expansion is to have the code be Foo{ field: Default::default() }

I don't think there would have to be. The more "obvious" non-perfect-derive expansion would be T: Default, just like for structs.

@Lokathor
Copy link
Contributor

Lokathor commented Aug 25, 2024

Well, T being Default has nothing really to do with if Option<T>, the actual type of the field, is also Default.

Or to be more general, if we have a field type using a generic, F<T> say, then we only care that F<T> is Default, not T itself.

  • It could be the case that F<T> is not Default even when T is. (eg &T)
  • It could be the case that F<T> is always Default even when T is not. (eg Option<T>)
  • It could be the case that F<T> is Default or not depending on some bounds for T. (eg [T; 3])

And the Proc Macro doesn't know which case it is when it generating the impl. So, just generating a bound based on just T itself is almost fundamentally incorrect, and will generate bad impls in extremely trivial cases.

@jplatte
Copy link
Contributor

jplatte commented Aug 25, 2024

Or to be more general, if we have a field type using a generic, F<T> say, then we only care that F<T> is Default, not T itself.

(emph mine)

I know all about this, but the "we only care" statement does not apply to any derives in std currently. That's why I think this RFC should make an explicit choice as to whether the Default derive should care (for enums) and emit bounds like [T; 3]: Default when a field of type [T; 3] exists in the enum variant, breaking consistency with existing derive logic but likely being useful in more places.

Might also be worth discussing potential interaction with private enum fields (not sure whether there's an RFC), as IIRC one of the things perfect derive has been criticized for is that it adds more cases of internal changes (adding a private field to a struct) possibly leaking out into public interfaces (bounds).

@estebank
Copy link
Contributor Author

One thing I realized while I was out is that #3681 would allow us to side-step the lack of perfect derive. The only way of doing perfect derive is with type system access. The other two alternatives are to add a way to specify bounds with an attribute, or force the struct to specify the bounds, which would make them more restrictive than necessary. But with default field values, if a field is given an explicit default then Default::default() on it doesn't need to be called. Which means that we can determine, purely with the AST, which type parameters might need the Default bound. So you could write

#[derive(Default)]
struct S<T> {
    a: Option<T> = None,
}

and the expansion would need no bounds on T. So that RFC would reduce the pressure to get perfect derives for a bit. This of course doesn't need to be part of that RFC, it can be "future work" because it would purely make more code compile.

@jplatte
Copy link
Contributor

jplatte commented Aug 26, 2024

The only way of doing perfect derive is with type system access.

How so? You could generate where Option<T>: Default as a bound if that's the field type, no?

But with default field values, if a field is given an explicit default then Default::default() on it doesn't need to be called

Unfortunately this doesn't seem true to me. I would actually argue that #3681 makes things more complicated: Imagine if Option had a const fn default_some() -> Self where T: Default associated fn that returns Some(T::default()). There would be no way for the Default derive to know that field: Option<T> = Option::default_some() requires the bound. I guess for the = None case, things are indeed simpler though because that's not a function call so it can't have any bounds of its own.

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Aug 26, 2024
@estebank
Copy link
Contributor Author

How so? You could generate where Option<T>: Default as a bound if that's the field type, no?

The only issue that I could think of when doing that blindly is the case of a recursive type, like the following case:

#[derive(Clone)]
struct A<T> {
    x: Option<Box<A<T>>,
}

Currently the expansion in that case is

impl<T: Clone> Clone for A<T> {
    fn clone(&self) -> A<T> {
        A {
            self.x.clone(),
        }
    }
}

The naïve expansion of this with an attempt at perfect derives could be

impl<T> Clone for A<T> where A<T>: Clone {
    fn clone(&self) -> A<T> {
        A {
            self.x.clone(),
        }
    }
}

which would be accepted by the compiler, but never allow anyone to call <A<T> as Clone>::clone for any T. A simple side-step might be to identify that A is used in any of the fields and then revert to the current behavior of imperfect derives. It would make #[derive(X)] imperfect for any recursive type (even if unneeded), but allow for more accurate derives in every other case.

Imagine if Option had a const fn default_some() -> Self where T: Default

FWIW, that still requires Default to be const, which it isn't now, but point taken. I need to think more about that case, but I believe that the obligations are actually correctly propagated: you couldn't have an expression that has an implicit transitive bound that references a type parameter without that type parameter having that bound in the item definition.

@Lokathor
Copy link
Contributor

I'm not sure that the fact that the derive might do a weird thing in a small fraction of cases is even particularly a problem, because people are never required to use the derive for Default.

Slightly separately i also think less cases would be weird with Default than with the Clone example you give.

@Lokathor
Copy link
Contributor

@ehuss I just noticed that you tagged this T-lang, but isn't this more like a T-libs-api concern?

@ehuss ehuss added the T-libs-api Relevant to the library API team, which will review and decide on the RFC. label Aug 26, 2024
@ehuss
Copy link
Contributor

ehuss commented Aug 26, 2024

Attributes are handled by the lang team. I added t-libs-api (similar to #3107). In general, teams should feel welcome to add themselves or other teams (and remove themselves if they want). I also expect teams to exercise good judgement in case other teams should be involved.

@estebank
Copy link
Contributor Author

After a quick conversation with niko, he mentioned that the general version of perfect derives is blocked on resolving the cycle case when it is indirect:

struct A<T> {
    x: Option<Box<B<T>>,
}
struct B<T> {
    x: Option<Box<A<T>>,
}
impl<T> Clone for A<T> where B<T>: Clone {
    fn clone(&self) -> A<T> {
        A {
            self.x.clone(),
        }
    }
}
impl<T> Clone for B<T> where A<T>: Clone {
    fn clone(&self) -> B<T> {
        B {
            self.x.clone(),
        }
    }
}

This case is not detectable purely from the token tree/AST. @compiler-errors you're looking at the trait resolution side of this, right?

@compiler-errors
Copy link
Member

@estebank: Not sure why I got a ping directly lol, but this is blocked on coinduction which is blocked on the new trait solver.

@matthieu-m
Copy link

As mentioned by @Lokathor above, reducing the number of special-cases is generally a win with regard to complexity (for the end-user).

Following on this, I would suggest moving on with the imperfect derive, as a struct would. Or any other derive than Default on an enum. Specifically, I would expand:

#[derive(Default)]
enum MyEnum<T, U> {
    #[default]
    Foo {
        field: Option<T>,
    }
    Bar(U),
}

to

impl<T: Default, U: Default> Default for MyEnum<T, U> {
    fn default() -> Self {
        Self::Foo { field: Default::default() }
    }
}

It's very much imperfect:

  • U is dragged in even though it doesn't even appear in this variant.
  • T is dragged in even though Option<T> doesn't require T: Default to be Default.

But it's consistent, and while consistency may be the hobgoblin of little minds, it does make it easier to learn and use the language.

It's simple enough to write the implementation manually in the few cases where it matters, and it's not like we all aren't used to it. A later RFC/feature can take care of handling smarter derives for all.

Deriving `Clone` on enums.
#[derive(Clone)]
pub enum Enum<T> {
    Foo(PhantomData<T>),
}

Expands to:

pub enum Enum<T> { Foo(PhantomData<T>), }

#[automatically_derived]
impl<T: ::core::clone::Clone> ::core::clone::Clone for Enum<T> {
    #[inline]
    fn clone(&self) -> Enum<T> {
        match self {
            Enum::Foo(__self_0) =>
                Enum::Foo(::core::clone::Clone::clone(__self_0)),
        }
    }
}

When PhantomData is cloneable regardless of whether its generic type is. That's how it is...

@jplatte
Copy link
Contributor

jplatte commented Aug 29, 2024

@matthieu-m It wouldn't be consistent with how #[derive(Default)] works on generic enums right now. I think the least surprising / most consistent option would rather be to add Default bounds for all generic parameters that are mentioned within the #[default] variant's field types.

@Lokathor
Copy link
Contributor

Lokathor commented Aug 29, 2024

Just for reference, and to expand on what jplatte said, when you have a unit variant as the default you get a default impl that doesn't depend on all the generics being Default, because the generics aren't fields of the unit type. https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=be6e8ae45184fddb99d916ec6a237a25

So yes, a default bound on all generics regardless of where they're used is definitely too much bounding, and would not be consistent.

But again, I don't think we need a bound on the generics used by the default variant, what we would need is a bound on the exact field types used in the variant. The way that the generic is wrapped in other types is important. i32 is Default, but &i32 is not, so if a field is &T then a bound on T: Default is the wrong bound.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

- Should we wait until [perfect derives] are addressed first?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But again, I don't think we need a bound on the generics used by the default variant, what we would need is a bound on the exact field types used in the variant. The way that the generic is wrapped in other types is important. i32 is Default, but &i32 is not, so if a field is &T then a bound on T: Default is the wrong bound.

@Lokathor but it is the same wrong bound we have for structs. What you want is perfect derives, which is explicitly called out in the text: do we wait for perfect derives or land support for the same imperfect behavior we have today for structs.

The reason we can't do Option<T>: Default is because we only have AST information on these macros, and the only way to provide those bounds correctly is to detect cycles, both direct (we can detect those) and indirect (we can't detect those unless we have resolve/typeck info).

As I'm writing this though, I'm thinking that it could be that we could change the desugaring of derive to use the more accurate where clauses, and have a late lint check for cycles after the fact and tell people to instead write an explicit impl instead 🤔

But that would be its own RFC for "get closer to perfect derives without trait solver changes", and I'm gonna be off-line for several weeks, so I can't drive that now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we wait for perfect derives or land support for the same imperfect behavior we have today for structs.

If we can have it be better than before, I don't see a reason to not make it better than before. If it means that enum can (temporarily) derive more precisely than struct that seems completely fine to me.

I mean we could start with a bound on all the field types and also on the generics (that's what structs do now), but we literally can't have a bound just on the generics, without the bound on the field types. That actually won't work, which is my point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty confused, what do you mean when you say we can't do it / it doesn't work? It is what we have for structs now, why couldn't we do the same thing here? Yes, it's not going to always generate an implementation that passes typecheck. But the same is true for structs today, and AFAIK this is not considered a major problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is basically what i mean, yes. That the code will have a build error.

@matthieu-m
Copy link

@matthieu-m It wouldn't be consistent with how #[derive(Default)] works on generic enums right now. I think the least surprising / most consistent option would rather be to add Default bounds for all generic parameters that are mentioned within the #[default] variant's field types.

It's another possibility yes.

It's inconsistent with the derivation of every other trait, but no other trait has a #[default] annotation on a specific variant, so maybe it's good enough.

@RalfJung
Copy link
Member

@estebank the enum in the PR description has no #[derive(Default)]. ;)

@RalfJung
Copy link
Member

But again, I don't think we need a bound on the generics used by the default variant, what we would need is a bound on the exact field types used in the variant. The way that the generic is wrapped in other types is important. i32 is Default, but &i32 is not, so if a field is &T then a bound on T: Default is the wrong bound.

That's true, but it's not how our derives work, and this was discussed above.

I would find it quite surprising if we had two fundamentally different ways to generate code for derives of "things with fields". If we have a way to generate better derives (adding bounds on the fields, not the generic types), we should do that consistently.

The "enum variant without field" case is fairly harmless, but once there are arbitrary fields this is just going to look like more special cases, that work better in some situations but worse in others.

@PoignardAzur
Copy link

PoignardAzur commented Oct 3, 2024

But with default field values, if a field is given an explicit default then Default::default() on it doesn't need to be called. Which means that we can determine, purely with the AST, which type parameters might need the Default bound.

Note that we have a discussion right now in #3681 about whether this is even possible.

Also, I think people are assuming that only the fields in the default branch would generate a default bound, but existing derive macros have no mechanism for being that specific.

That is:

Let's consider this

#[derive(Default)]
enum MyEnum<T, U> {
    #[default]
    Foo {
        field: Option<T>,
    }
    Bar(U),
}

I guess there wouldn't be any bounds on U, otherwise that would be inconsistent with the existing behavior. But would there be a T: Default bound, or an Option<T>: Default bound?

You guessed wrong. It wouldn't be either of those, it would be T: Default, U: Default. (See discussion in #3681 for details.)

I think this distinction should be made clear in the RFC, because I think a lot of people would make the same assumption as the one in the quote.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC. T-libs-api Relevant to the library API team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants