Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

discriminant_value intrinsic -- tracking issue for 639 #24263

Closed
2 of 5 tasks
nikomatsakis opened this issue Apr 10, 2015 · 82 comments
Closed
2 of 5 tasks

discriminant_value intrinsic -- tracking issue for 639 #24263

nikomatsakis opened this issue Apr 10, 2015 · 82 comments
Labels
B-RFC-implemented Blocker: Approved by a merged RFC and implemented. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. P-medium Medium priority T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Comments

@nikomatsakis
Copy link
Contributor

nikomatsakis commented Apr 10, 2015

Tracking issue for rust-lang/rfcs#639.

Some things to nail down before stabilization:

  • prototype implementation
  • signature needs to have a <T:Reflect> bound
  • documentation should reflect the inherent instability in this value and describe valid uses
  • perhaps signature should require input T to actually be an enum instance?
  • do we want to adjust the return type in any way to enforce the above rules?
@pnkfelix
Copy link
Member

The prototype implementation at #20907 is very easy to rebase (I have already done so). I plan to land that prototype as part of work towards #15523

pnkfelix pushed a commit to pnkfelix/rust that referenced this issue Apr 10, 2015
Implements an intrinsic for extracting the value of the discriminant
enum variant values. For non-enum types, this returns zero, otherwise it
returns the value we use for discriminant comparisons. This means that
enum types that do not have a discriminant will also work in this
arrangement.

This is (at least part of) the work on Issue rust-lang#24263
@steveklabnik steveklabnik added the B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. label Apr 10, 2015
@pyfisch
Copy link
Contributor

pyfisch commented Apr 26, 2015

How is the progress here? I want to use it in a package, but I don't think it will be 1.0, right? Will it be included in 1.1?

@alexcrichton alexcrichton added T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. labels Aug 11, 2015
@alexcrichton alexcrichton added B-unstable Blocker: Implemented in the nightly compiler and unstable. and removed B-RFC-approved Blocker: Approved by a merged RFC but not yet implemented. labels Feb 18, 2016
@briansmith
Copy link
Contributor

RE the documentation issue: The value wouldn't be unstable for the case of a C-like enum, so the statement about the value being unstable should be qualified as such. And, it would also be useful to document that this feature can/should be used instead of as for safer conversions of C-like enums. A new lint that does that would be great.

However, the RFC says that the type is always u64 which means that I'd have to use as in my code anyway, as I need u8-typed discriminant values (e.g. for parsing ASN.1). Accordingly, I think that the part of the design where the return type is fixed at u64 should be fixed.

@durka
Copy link
Contributor

durka commented Jun 13, 2016

I previously suggested a way to solve the return type problem: introduce a trait instead of (or in addition to) the intrinsic function.

trait MagicEnumTrait {
    type Repr;
    fn value(&self) -> Self::Repr;
}

This trait would be auto-implemented for every enum and Repr defined to be the type of the discriminant (in case the discriminant is packed away, Repr is the smallest integer type with enough bits). Also, Repr corresponds to the signedness of the enum in the case of things like #[repr(i8)] which makes it easier to compare discriminants for ordering.

@SimonSapin
Copy link
Contributor

For Stylo we want to build with a stable Rust compiler. For Servo’s style crate, the only thing missing is discriminant_value. (servo/servo#11816)

We use it with an enum that represents a property declaration: there’s one variant per supported CSS property, and each variants has a field with the property value. (The Rust types for values can be different for each property.) During the cascade, we get an sequence of declarations ordered by precedence, and we want to only keep one per property. We use discriminant_value to determine which property a given enum value represents.

There’s already some code generation going on in this crate, so we could write a function or method that behaves similarly to discriminant_value and uses match internally, but I don’t know if this would optimize as well. (This is used in a hot loop.)

@ranma42
Copy link
Contributor

ranma42 commented Jun 22, 2016

@SimonSapin if the results of the internal function are exactly the same as discriminant_value, LLVM should be able to optimise away the match completely and replace it with the same memory load that would be performed by discriminant_value. This has worked for me with several C-like enums (hand-checking the assembly output), so this might be a viable alternative if you do not want to wait for the stabilisation. Nonetheless I would recommend to verify if LLVM optimisations also work as intended in your case.

@SimonSapin
Copy link
Contributor

These enums are not C-like though, otherwise I could use enum_value as usize. I’ll try it anyway.

@pnkfelix
Copy link
Member

My questions at this point:

  • has the signature of the intrinsic gotten a bound, as listed as a work item in the description by Niko?
    • my understanding is that the answer here is "no."
  • should we provide a stable function that is merely a shim over the existing intrinsic, or should we hold off stabilization in order to attempt to implement some way to vary the return type based on the particular enum (which has been discussed previously, including in the comments here).
    • I'm personally ambivalent here; I can see the motivation to stabilize the thing we have, but I also respect people concerned about prematurely stabilizing before exploring the alternatives.
  • if we decide to hold off stabilization to explore an alternative API: do such alternatives require a new RFC (or at least RFC amendment)?
    • I think the answer here is "yes."

@aturon
Copy link
Member

aturon commented Jun 23, 2016

Given that we can always deprecate an API in favor of an improved version later, and that the functionality is working fine (and is something we definitely want to provide), I see little downside to stabilizing a wrapper function right now. (Lang team discussed this and all feel the same way.)

@brson
Copy link
Contributor

brson commented Jun 23, 2016

OK, so the concrete next step here is to write a wrapper function with the same signature and tag it stable. What module is it exported from? What's the process? Does it need an FCP period?

@brson brson added the E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. label Jun 23, 2016
@briansmith
Copy link
Contributor

Also, Repr corresponds to the signedness of the enum in the case of things like #[repr(i8)]

In that case, I would expect Repr to be exactly the time mentioned in #[repr(...)]. For example, #[repr(i16)] would have Repr = u16 even if a smaller representation could be used. This would provide some level of ABI stability in the case the enum is to be shared with non-Rust code and the enum is expanded in the future (e.g. from 250 to 300 choices).

@durka
Copy link
Contributor

durka commented Jun 23, 2016

@briansmith agreed. Does rustc sometimes shrink enums even if you specify
the repr with the attribute? I would expect it not to.
On Jun 23, 2016 19:48, "Brian Smith" [email protected] wrote:

Also, Repr corresponds to the signedness of the enum in the case of things
like #[repr(i8)]

In that case, I would expect Repr to be exactly the time mentioned in
#[repr(...)]. For example, #[repr(i16)] would have Repr = u16 even if a
smaller representation could be used. This would provide some level of ABI
stability in the case the enum is to be shared with non-Rust code and the
enum is expanded in the future (e.g. from 250 to 300 choices).


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#24263 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAC3n2un5WmfnGg9JMv_eyuD44-yOmxAks5qOxthgaJpZM4D96D5
.

@briansmith
Copy link
Contributor

@briansmith agreed. Does rustc sometimes shrink enums even if you specify
the repr with the attribute? I would expect it not to.

I don't know. I just read your comment above as suggesting that the smallest type should be used, insted of the type mentioned in #[repr(...)], and I just wanted to clarify that the smallest type shouldn't be used if the type wa explicitly given in #[repr(...)].

@nikomatsakis
Copy link
Contributor Author

@brson

What's the process? Does it need an FCP period?

Usually unstable things undergo a "release cycle"-long FCP period, yes.

@nikomatsakis
Copy link
Contributor Author

has the signature of the intrinsic gotten a bound, as listed as a work item in the description by Niko?

This seems like an important point. Now, in the meantime since we added this intrinsic, we also adopted specialization, which may make the need for such a bound moot. Do we all agree we "just don't care" about a bound?

@durka
Copy link
Contributor

durka commented Jun 27, 2016

Well, if there's no bound, what happens when you call the function on a
value that isn't an enum variant? Is it UB, unspecified, defined to return
0 or -1 or...? Currently it seems you get 0: https://is.gd/odCsAO

I'll try to write an RFC for my trait idea because I think being able to
find out the repr type of an enum is valuable.

On Mon, Jun 27, 2016 at 2:15 PM, Niko Matsakis [email protected]
wrote:

has the signature of the intrinsic gotten a bound, as listed as a work
item in the description by Niko?

This seems like an important point. Now, in the meantime since we added
this intrinsic, we also adopted specialization, which may make the need for
such a bound moot. Do we all agree we "just don't care" about a bound?


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#24263 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAC3n6rGfeK1Aa0IppTD5U7CSK1nGpCfks5qQBNHgaJpZM4D96D5
.

@SimonSapin
Copy link
Contributor

SimonSapin commented Jun 27, 2016

Servo update: TL;DR we’re not blocked on stabilization anymore, but it’d still be nice to have it eventually for debug build performance.


In servo/servo#11884 I’ve worked around this being unavailable on stable with generated code like this:

impl PropertyDeclaration {
    fn discriminant_value(&self) -> usize {
        match *self {
            PropertyDeclaration::AlignContent(..) => 0,
            PropertyDeclaration::AlignItems(..) => 1,
            PropertyDeclaration::AlignSelf(..) => 2,
            // ...
            PropertyDeclaration::ZIndex(..) => 147,
        }
    }
}

And made a test crate to look at the generated code:

#![feature(core_intrinsics)]
extern crate style;
use std::intrinsics::discriminant_value;

pub fn with_match(decl: &style::properties::PropertyDeclaration) -> usize {
    decl.discriminant_value()
}

pub fn with_intrinsic(decl: &style::properties::PropertyDeclaration) -> usize {
    unsafe { discriminant_value(decl) as usize }
}

With cargo rustc --release -- --emit=llvm-ir, this giant match compiles to something very close to discriminant_value:

; ModuleID = 'style_stable.0.rs'
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

%"11.style::properties::PropertyDeclaration" = type { i64, [0 x i64], [60 x i64] }

; Function Attrs: norecurse nounwind readonly uwtable
define i64 @_ZN12style_stable10with_match17h5ee51730fa1d4ed4E(%"11.style::properties::PropertyDeclaration"* noalias nocapture readonly dereferenceable(488)) unnamed_addr #0 {
entry-block:
  %.idx = getelementptr %"11.style::properties::PropertyDeclaration", %"11.style::properties::PropertyDeclaration"* %0, i64 0, i32 0
  %.idx.val = load i64, i64* %.idx, align 8
  ret i64 %.idx.val
}

; Function Attrs: norecurse nounwind readonly uwtable
define i64 @_ZN12style_stable14with_intrinsic17h38553922905ee296E(%"11.style::properties::PropertyDeclaration"* noalias nocapture readonly dereferenceable(488)) unnamed_addr #0 {
entry-block:
  %1 = getelementptr inbounds %"11.style::properties::PropertyDeclaration", %"11.style::properties::PropertyDeclaration"* %0, i64 0, i32 0
  %2 = load i64, i64* %1, align 8, !range !0
  ret i64 %2
}

attributes #0 = { norecurse nounwind readonly uwtable }

!0 = !{i64 0, i64 129}

But without --release the .ll file is 6843 lines long, so I’m guessing the giant match expression takes O(n variants) time in debug mode. Since this is used in a hot function, this might still affect performance of debug builds (which are already slow).

@ranma42
Copy link
Contributor

ranma42 commented Jun 27, 2016

@SimonSapin your comment seems to be truncated

@brson brson removed the E-easy Call for participation: Easy difficulty. Experience needed to fix: Not much. Good first issue. label Jun 28, 2016
@rfcbot rfcbot removed the proposed-final-comment-period Proposed to merge/close by relevant subteam, see T-<team> label. Will enter FCP once signed off. label Aug 9, 2017
@durka
Copy link
Contributor

durka commented Aug 9, 2017

There's an open ICE: #43398. Possibly just an assertion in trans needs to be removed, but I don't know anything about that code.

However, as a larger point, currently Discriminant is only 64 bits wide. Assuming we want to support 128-bit discriminants, it should be expanded or made into an enum or something, which would change the size. (If we say it's an OS-specific type (sort of a stretch) then that doesn't matter.)

@sfackler
Copy link
Member

sfackler commented Aug 9, 2017

There are like 3 types about which we provide guarantees around their size (e.g. Option<extern fn()> is pointer sized). We can totally expand the type's internal size in the future.

@rfcbot
Copy link

rfcbot commented Aug 19, 2017

The final comment period is now complete.

@durka
Copy link
Contributor

durka commented Sep 1, 2017

New summary comment

Quoting from the original checklist:

Some things to nail down before stabilization:

  • prototype implementation
  • signature needs to have a <T:Reflect> bound

Reflect is gone.

  • documentation should reflect the inherent instability in this value and describe valid uses
  • perhaps signature should require input T to actually be an enum instance?

We decided against this.

  • do we want to adjust the return type in any way to enforce the above rules?

IMO the return type is currently "opaque enough".


Concerns raised during FCP:


Let's get this stabilized!

@durka
Copy link
Contributor

durka commented Sep 2, 2017

Given that we can always add an impl of PartialOrd and the other two checkboxes have to do with #[repr(i128)], itself unstable, I don't see any roadblocks. Did I miss anything?

bors added a commit that referenced this issue Sep 3, 2017
@bors bors closed this as completed in d516434 Sep 4, 2017
@cramertj
Copy link
Member

cramertj commented Sep 7, 2017

I know this issue has already been closed, and the function is on track for stabilization, but I thought I should bring this up anyway since it seems odd to me: mem::discriminant returns 0 for all non-enums, including enum constructors. That is, the following code prints 0 and then 1:

use std::mem;

enum MyEnum {
    First,
    Second(i32)
}

fn main() {
    println!("{:?}", mem::discriminant(&MyEnum::Second)); // prints `0`
    println!("{:?}", mem::discriminant(&MyEnum::Second(2))); // prints `1`
}

IMO, it'd be nice to have the discriminant of the enum constructor match the discriminant of the enum value. I'm not sure if I've missed the boat here, but this seems like it would be really valuable for getting the discriminant without having to fully construct a value (this doesn't work for struct constructors, but that's something we could revisit).

@durka
Copy link
Contributor

durka commented Sep 7, 2017 via email

@nikomatsakis
Copy link
Contributor Author

Hmm. I agree with @durka that at first it surprised me (because I think of them as a function), though really the type of a variant is (I believe...) a zero-sized type that uniquely identifies the enum variant. (Unless of course you coerce it into a function.) So it seems like it should be possible to make it work, and I do find @cramertj's use case compelling.

@nikomatsakis
Copy link
Contributor Author

But what about "struct" variants?

enum Foo { ... Bar { ... } }`

@eddyb
Copy link
Member

eddyb commented Sep 7, 2017

@nikomatsakis We employ the nuclear option (Foo::Bar is a function which you call with braces instead of parens and any function can be defined like that - named arguments!).

@cramertj
Copy link
Member

cramertj commented Sep 7, 2017

@eddyb I suggested that exact thing on IRC in the lang channel:

fn foo { a: i32, b: i32, c: i32 } -> i32 { a + b + c }

assert_eq!(foo { a: 1, b: 2, c: 3 }, 6);

@cramertj
Copy link
Member

cramertj commented Sep 21, 2017

We didn't ever close this conversation, but mem::discriminant is set to stabilize in 1.21. Can we at least adjust the docs to say that the return value for non-enums is unspecified?

@durka
Copy link
Contributor

durka commented Sep 21, 2017 via email

@cramertj
Copy link
Member

@durka Ah, I was referring to your comment above where you said

In theory we can change that since the docs say the return value for
non-enums is unspecified.

@durka
Copy link
Contributor

durka commented Sep 21, 2017

We could try to head off the mistake by writing "If T is not an enum (including tuple variant constructors!)" or including your example from above.

@joshtriplett
Copy link
Member

@cramertj @durka Is there a fundamental reason we can't produce a compiler error if calling it on a non-enum (including an enum "constructor")?

@durka
Copy link
Contributor

durka commented Sep 21, 2017

@joshtriplett see previous discussion.

@Twey
Copy link

Twey commented Oct 3, 2017

The current Discriminant<T> contains a PhantomData<*const T>. This means it's non-Send and non-Sync, but I can't think of a good reason it should be. Is there one?

@sfackler
Copy link
Member

sfackler commented Oct 3, 2017

Nope, we should impl those traits.

@omaskery
Copy link

Hullo, does anybody know if any discussion/work is ongoing in line with @cramertj 's suggestion regarding calling std::mem::discriminant on an enum "constructor"?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
B-RFC-implemented Blocker: Approved by a merged RFC and implemented. B-unstable Blocker: Implemented in the nightly compiler and unstable. C-tracking-issue Category: A tracking issue for an RFC or an unstable feature. final-comment-period In the final comment period and will be merged soon unless new substantive objections are raised. P-medium Medium priority T-lang Relevant to the language team, which will review and decide on the PR/issue. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests