Implement a 'discriminant_value' intrinsic #639

Aatch · 2015-01-21T06:38:17Z

Implement an intrinsic for extracting the discriminant from a value.

sfackler · 2015-01-21T06:49:53Z

text/0000-discriminant-intrinsic.md

+
+# Alternatives
+
+* More strongly specify the values returned. This would allow for a broader range of uses, but


We could also return an opaque type that implements PartialOrd, Ord, PartialEq and Eq, but doesn't expose the raw discriminant value. This gives us flexibility with the implementation, and avoids some pitfalls with returning an explicit u64, for example, what happens if the enum is #[repr(i64)]?

Since the return value is unrelated to ordering, it shouldn't implement (Partial)Ord. But it should implement Hash.

It is in fact related to ordering of variants of the enum: rust-lang/rust#15523.

I mean the RFC explicitly says "Does not allow for the value to be used as part of ordering.". Perhaps require impl<T: PartialOrd> PartialOrd for DiscriminantValue<T>. (That ordering is wrong if the enum uses a custom ordering though.)

Ah. @Aatch why couldn't the discriminant be usable in part of an Ord impl?

As the RFC is currently written, the only guarantee you have is that two values of the same type produce the same value, iff the variants are the same. While you could implement a ordering with this, I deliberately restricted such that the ordering would be effectively unspecified. The restriction could be lifted, but it would require a more rigorous specification of discriminant values.

If we were to take the route of specifying the ordering respecting the ordering of definition or possibly the ordering of the values of C-like enums, could you elaborate on some of the cons of that strategy? e.g. do you know of some optimizations that may block us from in the future?

+1 impl<T: PartialOrd> PartialOrd for DiscriminantValue<T>. Added benefit of not being able to mix discriminant values of different types.

sfackler · 2015-01-21T07:04:11Z

text/0000-discriminant-intrinsic.md

+
+# Unresolved questions
+
+* Should `#[derive]` use this intrinsic to improve derived implementations of traits? While


It seems pretty crucial that this is at least used by deriving, and preferably exposed with a #[stable] wrapper function. If it's not used inside of the stdlib and not stable, it seems kind of useless.

pyfisch · 2015-01-25T12:42:27Z

I try to write a language tag library in Rust, that uses big (8000 variants) enums to store all languages.
https://github.com/pyfisch/rust-language-tags The problem is that enums do not provide efficient ordering and comparison. It would be great if this intrinsic could be used to provide ordering and the efficient ordering would be added to std. Currently it takes really long to compile.

Ericson2314 · 2015-01-25T21:37:16Z

I think adding this a good idea whether or not the compile gets smarter. That said, has anyone tried telling LLVM that other possible discriminant values are all unreachable?

taralx · 2015-01-28T02:58:46Z

Can someone explain why an intrinsic is preferable to a compiler-generated trait implementation?

pnkfelix · 2015-01-28T03:30:46Z

@taralx assuming you mean a compiler-generated match that has an arm for each discriminant value, the main objection there (that I am aware of) is a mix of run-time and/or compile-time efficiency: You're either going to have that method call compiled out-of-line (slower than an intrinsic), or you will inline the method call, which will then cause code size blow up (with all of its associated effects), or will cause compile-time to become even slower as the LLVM optimizer struggles to recover the implementation that we could have just directly emitted (namely: extracting the discriminant value that is already directly embedded in the enum value).

taralx · 2015-01-28T03:55:53Z

No, not a generated match. I mean an actual trait:

#[lang="enumtrait"]
pub trait Enum {
  pub fn discriminant(&self) -> usize
}

This would allow type-level assertions of the form "T: Enum" as well.

rkjnsn · 2015-01-28T07:27:31Z

@taralx I think using an opaque return type, e.g., pub fn discriminant(&self) -> DiscriminantValue<Self>, would make more sense here, as it would allow uses of the discriminant to be statically limited to comparing for equality with another discriminant of the same type (and possibly some form of ordering, if that turns out to be desirable).

I do think it would beneficial to have such a trait (or possibly a free function) that can potentially be stabilized and called from safe code, as determining whether two enums of the same type are also of the same variant doesn't seem like it should require unsafe. A safe trait or function could certainly be implemented in the library using the intrinsic outlined in this RFC, but I think the intrinsic should be more of an implementation detail at that point, rather than something that would be used directly by a library like postgres.

kennytm · 2015-01-28T13:00:17Z

@taralx That trait method can be built on top of the intrinsic, just like how the TypeId struct is based on std::intrinsics::type_id.

taralx · 2015-01-28T16:50:49Z

Would it help to start a separate RFC for it?

Aatch · 2015-01-29T22:19:54Z

I'll probably update this RFC soon to address some of the points that have come up. One important question I'd like more feedback on is how well-specified the value returned from the intrinsic is. I went for the most conservative/least specified version, but I have no problem specifying it more if people think it is reasonable.

sfackler · 2015-01-29T22:23:49Z

I think we can at least specify that the discriminant ordering can be relied on for use in Ord impls. If you're going to keep the return value as a u64, I'd like to make sure that we have a good way of handling #[repr(i64)] enums. Might be best to return an opaque type instead.

Aatch · 2015-01-29T22:33:11Z

@sfackler I'm not sure how an opaque type would work though. How do you do anything with a type you know nothing about? It's not difficult to just cast from a u64 to an i64 if you need signed comparison.

This intrinsic is intended primarily for proper ADT enums, where you can't control the discriminant values anyway. C-Like enums can be casted directly to get their value.

sfackler · 2015-01-29T22:40:27Z

@Aatch it would implement Ord and Eq, but that's it. By "opaque", I just mean "hides the exact representation of the discriminant".

taralx · 2015-01-30T18:05:17Z

I was hoping to use this to implement EnumSet using BitvSet. Hence the trait suggestion.

With a trait you could make the type of the discriminant associated...

pnkfelix · 2015-02-05T14:22:40Z

@Aatch are there other revisions you're planning on making in the near future? (I haven't read the discussion carefully yet.)

aturon · 2015-03-05T06:18:14Z

ping @pnkfelix

pnkfelix · 2015-03-05T19:22:55Z

I have now read the discussion.

In principle I would prefer for the RFC to actually document some of the concrete trait-based alternatives that have been laid out over the course of the discussion (though I personally agree with @Aatch that such alternatives could be written atop the intrinisic as specified).

pnkfelix · 2015-03-05T19:28:30Z

However, having said that, I think the RFC is actually perfectly fine as written, and if presented the choice between merging it as-is versus not merging it at all, I would choose the former.

So I think we should merge it. A motivated individual can always submit a PR documenting the alternatives, if that is of interest.

aturon · 2015-03-05T19:36:07Z

@pnkfelix

However, having said that, I think the RFC is actually perfectly fine as written, and if presented the choice between merging it as-is versus not merging it at all, I would choose the former.

So I think we should merge it. A motivated individual can always submit a PR documenting the alternatives, if that is of interest.

Thanks! Given how long it's taken to move on this RFC, I don't think we should ask the author to do further revisions of that kind before we merge.

nikomatsakis · 2015-03-05T20:01:47Z

Sorry, I'm late to the party. I am basically in favor of this RFC, but I do want to note that the intrinsic as defined introduces some forwards-compatibility hazards. In particular, reordering variants can potentially break code even for enums that don't implement PartialOrd. I assume that guaranteeing relative ordering is needed precisely so that we can use when deriving PartialOrd -- are there other reasons?

That said, I feel good about this fn since it is going to be unsafe and in the documentation we can spell out valid and invalid use cases -- basically, using this on enums that you don't control is a Bad Idea and may introduce hazards if the maintainers of those enums make changes.

(On a related note, it is not clear to me whether one should consider PartialOrd to be preserved across minor version updates.)

Regarding staging, I think we should follow the policy we've been mostly doing for all RFCs. That is, land the new feature unstable, and make an express decision by core team to move it to stable. I of course agree with @sfackler that it'd be nice to make this available to deriving sooner rather than later though.

brson · 2015-03-11T21:22:03Z

This seems like a useful intrinsic but I don't find the motivating example in the RFC to be very convincing - replacing the obvious safe code with an unsafe intrinsic. That is such a simple piece of code surely there must be a way to write it safely that is performant.

sfackler · 2015-03-11T21:27:11Z

How is the intrinsic unsafe (other than the fact that all intrinsics are tagged as unsafe)? All it does is return a u64.

In any case, "surely there must be a better way" is not a very convincing counter-example, IMO. Comparing disciminants seems to me at least like the obvious way to make this kind of function performant.

lilyball · 2015-03-18T00:34:30Z

I'm in favor of this RFC, and I'm in favor of updating #[deriving] to use it. Since #[derive] is compiler-generated, there's no worry about the intrinsic changing out from under it, and if we decide to expose a public safe wrapper then #[derive] can easily transition to that.

That is such a simple piece of code surely there must be a way to write it safely that is performant.

You mean like using an intrinsic? The intrinsic seems to actually be completely safe to call on all inputs, it just requires unsafe because all intrinsics do. It would be trivial to expose a safe wrapper that merely calls the intrinsic (but I don't think that's necessary to move forward with this RFC, or required for #[deriving] to use it).

The only other approach I can think of is teaching Rust to try to optimize match statements such that it can detect when a bunch of match arms are equivalent to simply comparing the discriminant, but that seems like a lot of work that's only necessary because we aren't allowing ourselves to look at the discriminant, and it's work that would negatively impact compilation time for probably no benefit at all on code outside of #[deriving].

As for u64 vs opaque type, this is an intrinsic we're talking about here. I think it's fine to let it return u64 (and in the case of #[repr(i64)] it would just be the u64 with the same bit pattern as the i64 value). If we expose a safe wrapper function, then it's worth considering the opaque type idea. But trying to have an unsafe intrinsic return a safe opaque type seems pointless.

sfackler · 2015-03-18T00:41:55Z

If we have it return a u64 with the same bit-pattern as the i64 discriminant in the #[repr(i64)] case seems like it'll complicate things a lot. How would #[derive(PartialOrd)] work in that case? It would need to have a special case for i64 discriminants but I'm not sure how it would even know if it's in that case or not.

sfackler · 2015-03-18T00:43:16Z

Another option would be to have the i64 case shift the discriminant up by 2^63 which would preserve ordering but not the raw value.

EDIT: same problem for all signed discriminant types, really.

mahkoh · 2015-03-18T00:45:40Z

Why not just use (sign: i8, abs: u64)?

sfackler · 2015-03-18T00:48:14Z

Oh, yeah, that'd be a decent way to deal with it.

mahkoh · 2015-03-18T00:49:51Z

Not quite. If sign is negative, then abs has to be u64::MAX - abs to preserve the ordering.

lilyball · 2015-03-18T00:58:04Z

Given that this RFC includes PartialOrd I think @sfackler is right, that using u64 is problematic. It's too bad we can't just use an LLVM i65.

In the meantime, how about (value: u64, sign: bool). For unsigned types the value is obviously the value (and the sign is false). For signed types, the value is the signed value with the sign bit set to 0 (and the sign is the sign bit). Since we use 2's-complement integers, this preserves the ordering.

lilyball · 2015-03-18T01:12:16Z

On the "alternative ideas" front, if we're ok with having the compiler define traits for us, we could have the trait Enum idea, but give it an associated type Repr that is the numeric type of the discriminant. That way we'd have something like

#[lang="enumtrait"]
pub trait Enum {
    type Repr;
    fn discriminant(&self) -> Self::Repr;
}

This way we don't have to worry about signed vs unsigned, and code can compare discriminant values correctly.

We could also then consider doing things like making this unsafe instead and adding a second trait for C-like enums that declares the safe version:

#[lang="enumtrait"]
pub trait Enum {
    type Repr;
    unsafe fn unsafe_discriminant(&self) -> Self::Repr;
}

#[lang="clikeenumtrait"]
pub trait CLikeEnum: Enum {
    fn discriminant(&self) -> Self::Repr;
    fn from_discriminant(value: Self::Repr) -> Option<Self>;
}

That way #[derive] can use unsafe_discriminant() in order to efficiently implement comparison on enums like the example one that has a single variant with an associated item and a lot of variants without. And user code can use the methods of CLikeEnum to do conversions between enums and their repr.

All that said, I think having an intrinsic is a good first step. And if we're planning on writing a safe wrapper around it, then it doesn't really matter very much what the actual return type is.

glaebhoerl · 2015-03-18T08:35:16Z

I'd like to have at least either the unsafe requirement or a trait bound requirement (e.g. @kballard's Enum). I don't like the potential this has for abstraction violations otherwise, when used on abstract types or (not explicitly Enum-bounded) type parameters. (I guess abstract types can't currently be enums, and I know any parametricity guarantees we might have had are already shot due to Any, but it still doesn't seem like a good idea to encourage this kind of thing.)

nikomatsakis · 2015-03-26T10:25:52Z

@glaebhoerl a good point, for some reason it took me a bit to recognize what you were saying, but it seems obvious now. To spell it out:

fn foo<T>(x: &T, y: &T) { ... }

is able to do a kind of "partial equality" comparison on x and y in terms of determining if they are the same variant or not. Perhaps the Reflect bound would be suitable here?

glaebhoerl · 2015-03-26T11:25:03Z

@nikomatsakis Yes, that might be a good fit. :)

pnkfelix · 2015-04-02T22:04:14Z

@nikomatsakis at this point I would like to see us move forward on this in some fashion, if only via a feature-gated intrinsic, because that would be very helpful in resolving rust-lang/rust#15523

pnkfelix · 2015-04-10T09:51:35Z

reassigning to @nikomatsakis so that he remembers to update its state (or merge it outright).

nikomatsakis · 2015-04-10T10:08:16Z

This RFC is provisionally accepted. I've included some unresolved concerns at the end of the RFC. These can be worked out over the course of implementation. Thanks all.

Add discriminant_value intrinsic RFC

ea48cbe

sfackler reviewed Jan 21, 2015
View reviewed changes

Add '_value'

9a12a51

sfackler reviewed Jan 21, 2015
View reviewed changes

nrc assigned pnkfelix Jan 22, 2015

alexcrichton mentioned this pull request Jan 27, 2015

Implement a discriminant_value intrinsic rust-lang/rust#20907

Closed

Specify values to allow for ordered comparison

3f8194f

pnkfelix mentioned this pull request Mar 8, 2015

enum variants aren't considered to be const rust-lang/rust#5873

Closed

pnkfelix assigned nikomatsakis and unassigned pnkfelix Apr 10, 2015

nikomatsakis mentioned this pull request Apr 10, 2015

discriminant_value intrinsic -- tracking issue for 639 rust-lang/rust#24263

Closed

5 tasks

nikomatsakis merged commit 3f8194f into rust-lang:master Apr 10, 2015

strega-nil mentioned this pull request Jul 31, 2016

[RFC accepted] add wrapper for discriminant_value intrinsic rust-lang/rust#34785

Closed

Centril added A-discriminant Discriminant related proposals & ideas A-intrinsic Proposals relating to intrinsics. labels Nov 23, 2018


		# Alternatives

		* More strongly specify the values returned. This would allow for a broader range of uses, but


		# Unresolved questions

		* Should `#[derive]` use this intrinsic to improve derived implementations of traits? While

Implement a 'discriminant_value' intrinsic #639

Implement a 'discriminant_value' intrinsic #639

Conversation

Aatch commented Jan 21, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pyfisch commented Jan 25, 2015

Ericson2314 commented Jan 25, 2015

taralx commented Jan 28, 2015

pnkfelix commented Jan 28, 2015

taralx commented Jan 28, 2015

rkjnsn commented Jan 28, 2015

kennytm commented Jan 28, 2015

taralx commented Jan 28, 2015

Aatch commented Jan 29, 2015

sfackler commented Jan 29, 2015

Aatch commented Jan 29, 2015

sfackler commented Jan 29, 2015

taralx commented Jan 30, 2015

pnkfelix commented Feb 5, 2015

aturon commented Mar 5, 2015

pnkfelix commented Mar 5, 2015

pnkfelix commented Mar 5, 2015

aturon commented Mar 5, 2015

nikomatsakis commented Mar 5, 2015

brson commented Mar 11, 2015

sfackler commented Mar 11, 2015

lilyball commented Mar 18, 2015

sfackler commented Mar 18, 2015

sfackler commented Mar 18, 2015

mahkoh commented Mar 18, 2015

sfackler commented Mar 18, 2015

mahkoh commented Mar 18, 2015

lilyball commented Mar 18, 2015

lilyball commented Mar 18, 2015

glaebhoerl commented Mar 18, 2015

nikomatsakis commented Mar 26, 2015

glaebhoerl commented Mar 26, 2015

pnkfelix commented Apr 2, 2015

pnkfelix commented Apr 10, 2015

nikomatsakis commented Apr 10, 2015