Proposal: "Closed" enum types #3179
Comments
If you add There must be a way to convert an integer to the closed enum (unsafe code). |
I imagine you could add |
Since
// if you can do this
enum struct Option<T> { Some(T), None }
// there's nothing to stop you from doing this
enum struct Bool { False, True }
I think this is more of a lowering strategy question - whether or not this should be emitted as a proper enum. If we go with proper enums, we can't manage external values, so I think we will need to emit it as a struct anyways. |
Why? I see no reason for closed enums to have any sort of fixed underlying value. If you want to convert, then use a switch expression:
Bool b = i switch {
    0 => False,
    _ => True
};
@alrz, However, isn't there still an issue pattern matching on those as I'd imagined DU pattern matching would be type based, ie I'd write something like:
var x = option switch {
    Some v => v,
    None => default
};
But with I can invent some syntax here:
enum struct ShapeName { Point, Rectangle, Circle }
enum struct Shape
{
    Point,
    Rectangle(double Width, double Length),
    Circle(double Radius)
}
and I'm happy that this is consistent: a closed enum is a DU where the values have no parameters. And I can imagine how I'd pattern match this stuff:
var area = shape switch {
    Point => 0,
    Rectangle { Width: var w, Length: var l } => w * l,
    Circle { Radius: var r } => Math.PI * r * r
};
But, I've no idea how this lot gets lowered by the compiler. For a struct, |
Does it mean that existing |
@dsaf, existing enums are already open. For:
enum Shape { Point, Rectangle, Shape }
But it would indeed be prudent to talk of them as being open enums to clarify the difference between existing enums and the closed variety. |
No they wouldn't be types (or members), if it's a struct DU, all parameters get emitted as flat fields into the struct, the rest is compiler magic. See what F# produces for your example as a value DU. Though I can imagine C# could do a better job utilizing runtime support for |
That's inefficient unless it's a trivial enum like the above Bool enum. Deserialization should be quick. It should be as simple as it is today, except the programmer must verify that it's a valid value (or it's undefined behavior):
int value = 1;
if (InvalidValue(value)) throw ...;
unsafe { return (MyEnum)value; } |
@0xd4d You can add it yourself, or have a generator to do it.
enum struct Bool {
    False, True;
    public static explicit operator Bool(int v)
        => v switch { 0 => False, 1 => True, _ => throw new InvalidCastException() };
}
I think unlike enums the corresponding "integer tag" is solely an implementation detail for DUs, so I don't think it's a good idea for the compiler to generate that out of the box. |
@alrz Your code is the same as @DavidArno's code, a big switch statement which generates a lot of code and causes extra CPU usage at runtime. My solution to use a cast in an unsafe block has the same perf and code size as code we can already use today. |
@alrz, "compiler magic" suits me. It's then @gafter's job to solve how to have struct-based DUs (including closed enums) work with
@0xd4d, it's very unlikely that the solution to serialization of closed enums and other DUs is to convert them to ints. I guess it all depends on whether closed enums are implemented as a completely different solution to DUs or not. Doing so would be a mistake in my view. So I'd argue that any solution to serializing
enum struct Bool { False, True }
also has to be able to serialize
enum struct Option<T> { Some(T), None } |
The only difference with the original proposal would be that False and True in that example wouldn't be compile-time constants so we can't use That is because an |
Because open is the default today, and we can't change that. |
@gafter would your intent be that adding a new value to a "closed" enum is a breaking change? Would this be a binary-compat breaking change, or just a case where a switch statement that must pick a value enters unknown territory by not finding a match, and so throws? E.g., what if you wanted to add trinary logic ("not provided by customer") to the Bool enum (bad naming example)? |
@AartBluestoke I believe it's true, false and filenotfound. |
@AartBluestoke I imagine that adding a new value to a closed enum would be a binary breaking change. Just like |
How would this be enforced without the C# compiler generating an under-the-hood default case? Something like:
public closed enum ObjectAccess
{
    Read,
    ReadWrite,
    Write
}
Consumer of Library A code in assembly B:
switch (objectAccess)
{
    case ObjectAccess.Read:
        return new ReadOnlyFooBar(...);
    default:
        return new ReadWriteFooBar(...);
}
I assume the C# generated default case would actually look like:
switch (objectAccess)
{
    case ObjectAccess.Read:
        return new ReadOnlyFooBar(...);
    default:
    {
        // Read is handled above, so the default arm covers the remaining members: ReadWrite and Write.
        if (objectAccess != ObjectAccess.ReadWrite && objectAccess != ObjectAccess.Write)
            throw new InvalidOperationException($"Unexpected value: {objectAccess:g}");
        return new ReadWriteFooBar(...);
    }
}
When the library author adds a new value to the closed enum:
public closed enum ObjectAccess
{
    Read,
    ReadWrite,
    Write,
    ReadWriteExclusive
}
This will indeed break binary compatibility, however So this feature can only work if the default clause is not allowed for closed enums, which I think would make the feature impractical for any enums with more than a few members. |
Other languages seem to get by just fine. In e.g. Rust, the Currently, if I have a With closed enums, I still need to have a |
I believe the following spec would match the desired behavior: The compiler will enforce that: No other compiler or runtime changes are needed. |
That seems unnecessary. What if the closed enum has 20 values but I only care about 2 of them? Why can't I have a default that returns some value instead? |
The intent was not to require that switches be exhaustive. We'd still permit a switch expression to be non-exhaustive, with a warning. And I don't think we'd add any requirements for switch statements, other than the fact that we can infer that a default case isn't needed (e.g. for definite assignment) if you have cases for every enumeration constant. |
If you are only using 2 out of 20 values, then you effectively have open use of a closed enum; you're already saying "I don't care if new values get added to this". If that is true then the enum shouldn't be closed. If it is true, you should handle the remaining values. Perhaps the contextual keyword "default" becomes "and the rest of the values" within the compiler, with the compiler generating a true default branch that throws if non-standard values get into the enum somehow? My thought when I was writing that was that if you add a new value to the closed enum, it has to explode; otherwise, "closing" has no meaning other than telling Roslyn "don't generate a warning for this one specific situation" and preventing implicit cast from enum to int. That doesn't seem like a necessary language feature; it feels like the job of an analyzer. |
This is really common in functional languages, and the reason is simple: in some cases you don't care about all the values, and in other cases you do. For example you might have a number of states a method can return. In some cases you want to deal with them all separately. In others you only care if it was successful or not. |
Sorry, @AartBluestoke, but that is so not true. If, within a particular context, I explicitly call out two of the twenty values, then that's because, in that context, I'm only interested in those two values. So I'll have a default handler for the rest. But in a situation where I explicitly list all of them (which is likely to occur elsewhere in the code) then I absolutely want a compiler error if a new value is added. And having it be a binary breaking change at that point too would be the icing on the cake. |
So a default clause for a closed enum will mean:
No default clause:
It sounds about right for me as a library consumer. There's an escape hatch I can use. However, I would still like to opt out of binary breaking changes, if those are added.
public closed enum Choice
{
    First,
    Second,
    Third
}
string AnalyzeChoice(Choice choice)
{
    switch (choice)
    {
        case Choice.First:
            return "Good";
        case Choice.Second:
            return "Average";
        case Choice.Third:
            return "Bad";
        missing:
            // Only reached when a value that is not part of the enum is passed (e.g. one from a future version of the assembly).
            // Exhaustiveness checking is still applied to all cases. Recompilation will give an error.
            return "Unknown";
    }
} |
This may do the job:
[StructLayout(LayoutKind.Explicit)]
public struct Shape
{
    public struct Point {}
    public struct Rectangle { public double Width; public double Length; }
    public struct Circle { public double Radius; }
    private closed enum ShapeType : short { Point, Rectangle, Circle }
    [FieldOffset(0)]
    private ShapeType shapeType;
    [FieldOffset(2)]
    private Point point;
    [FieldOffset(2)]
    private Rectangle rectangle;
    [FieldOffset(2)]
    private Circle circle;
    public bool Deconstruct(out Point p)
    {
        if (shapeType == ShapeType.Point) { p = point; return true; }
        else { p = default; return false; }
    }
    .....
}
However, it won't work that easily with reference types inside |
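The overlapping-field idea above can be tried in current C#. The following is only a sketch under stated assumptions: an ordinary enum stands in for the proposed closed enum, and the names ShapeKind, RectangleData, CircleData, TryGetRectangle and Area are hypothetical, chosen for illustration. The payloads are placed at offset 8 rather than 2 so that the doubles stay naturally aligned.

```csharp
using System;
using System.Runtime.InteropServices;

// Ordinary (open) enum used as the tag; the thread's proposal would make this closed.
public enum ShapeKind : short { Point, Rectangle, Circle }

public struct RectangleData { public double Width; public double Length; }
public struct CircleData { public double Radius; }

[StructLayout(LayoutKind.Explicit)]
public struct Shape
{
    // The tag sits at offset 0; both payloads share the same bytes at offset 8.
    [FieldOffset(0)] private ShapeKind kind;
    [FieldOffset(8)] private RectangleData rectangle;
    [FieldOffset(8)] private CircleData circle;

    public static Shape Point() => new Shape { kind = ShapeKind.Point };

    public static Shape Rectangle(double width, double length) => new Shape
    {
        kind = ShapeKind.Rectangle,
        rectangle = new RectangleData { Width = width, Length = length }
    };

    public static Shape Circle(double radius) => new Shape
    {
        kind = ShapeKind.Circle,
        circle = new CircleData { Radius = radius }
    };

    // Mirrors the bool-returning Deconstruct idea: succeeds only for the matching case.
    public bool TryGetRectangle(out RectangleData r)
    {
        if (kind == ShapeKind.Rectangle) { r = rectangle; return true; }
        r = default;
        return false;
    }

    public double Area() => kind switch
    {
        ShapeKind.Point => 0,
        ShapeKind.Rectangle => rectangle.Width * rectangle.Length,
        ShapeKind.Circle => Math.PI * circle.Radius * circle.Radius,
        // Still needed today because the tag enum is open.
        _ => throw new InvalidOperationException($"Unexpected tag: {kind}")
    };
}
```

As the comment above notes, this layout only works cleanly for payloads without reference-type fields; the runtime does not allow an object reference to overlap non-reference data.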
I think I have a bigger problem with the closed enum's behavior than with the syntax itself. The sole purpose of introducing the closed enum seems to be to influence For one, strict switch allows for more flexibility, because it can guard literally anything that is enumerable and finite (namely bool, bool?, all int and int? types and also enums). Also, sealed enum looks a little bit weird to me, because the place where it is declared and the place where it actually takes effect are different. You'd need to look up the type to know what the behavior in the code is, whereas in the case of strict switch it is obvious at first glance. Finally, you might want to be strict in a switch sometimes and sometimes not. With strict switch notation you can do that freely, while with sealed enum you are forced to be strict every time. |
Because the
integer switch {
    < 0 => -1,
    > 0 => 1
}
nullableBool switch {
    true => 1,
    false => 0
}
They produce these warnings:
(Note that you need to enable nullable reference types to get the warning about a nullable value type, which is weird, but by design per dotnet/roslyn#52714.) The problem only happens with |
I forgot about this issue and started a new duplicate discussion despite commenting here previously :) Can we please get this at some point? Not as some DU (like some have suggested), but as a good old
closed enum HorizontalAlignment
{
    Left = 1, Center = 2, Right = 3
}
var x = (HorizontalAlignment)0; // runtime error
// the above line is compiled to:
var x = Enum.ConvertToClosedEnum<HorizontalAlignment>(0); // this helper method would check and throw, if needed
x = (HorizontalAlignment)1; // OK
I would also be OK with a blanket prohibition on converting backing values to |
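For illustration, a checked conversion along these lines can be approximated in current C#. This is a sketch only: Enum.ConvertToClosedEnum does not exist in .NET, so the helper below lives in a hypothetical ClosedEnumHelper class, uses an ordinary enum, and assumes the enum is int-backed (Enum.IsDefined requires the value to have the enum's underlying type).

```csharp
using System;

// Ordinary enum standing in for the proposed closed enum.
public enum HorizontalAlignment { Left = 1, Center = 2, Right = 3 }

public static class ClosedEnumHelper
{
    // Hypothetical stand-in for the imagined Enum.ConvertToClosedEnum<T> helper.
    // Assumes TEnum is backed by int.
    public static TEnum ConvertToClosedEnum<TEnum>(int value) where TEnum : struct, Enum
    {
        // Reject any value that is not one of the declared members.
        if (!Enum.IsDefined(typeof(TEnum), value))
            throw new InvalidCastException(
                $"{value} is not a declared member of {typeof(TEnum).Name}.");

        return (TEnum)Enum.ToObject(typeof(TEnum), value);
    }
}
```

With this helper, ClosedEnumHelper.ConvertToClosedEnum<HorizontalAlignment>(1) yields Left, while passing 0 throws. The reflection-based check boxes its argument, so it does not address the performance concern raised earlier in the thread; a compiler-generated check could presumably be cheaper.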
@TahirAhmadov The issue is championed and is something the LDM is discussing. There is no need to repeatedly ask :) |
@TahirAhmadov, would you expect an InvalidCastException here:
closed enum HorizontalAlignment
{
    Left = 1, Center = 2, Right = 3
}
class C
{
    HorizontalAlignment M()
    {
        int[] ints = { 1, 2, 3, 4 };
        var alignments = (HorizontalAlignment[])(object)ints;
        return alignments[3];
    }
}
If the enum is not closed, then the CLR allows this cast at run time, even though the C# standard doesn't. If this must throw with a closed enum, then I think it can be implemented in several ways:
|
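The run-time behavior described above for ordinary enums can be observed today. The sketch below uses a regular int-backed enum and a hypothetical ArrayCastDemo class; it shows the array cast succeeding and handing back a value that is not a declared member.

```csharp
using System;

// Ordinary (open) int-backed enum.
public enum HorizontalAlignment { Left = 1, Center = 2, Right = 3 }

public static class ArrayCastDemo
{
    public static HorizontalAlignment LastElement()
    {
        int[] ints = { 1, 2, 3, 4 };

        // C# defines no int[] -> HorizontalAlignment[] conversion, but going through
        // object defers the check to the CLR, which accepts the cast because the
        // enum's underlying type is int.
        var alignments = (HorizontalAlignment[])(object)ints;

        // Element 3 holds 4, which is not a declared member.
        return alignments[3];
    }
}
```

Any closed-enum design that promises "only declared values" would have to account for this path as well as for direct casts.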
@KalleOlaviNiemitalo Yes, I would expect an exception to be raised there. Minor point - it should be a new exception type, like |
I gave this idea some thought and I'm still not convinced that this is the proper place to define an enum's "closed-ness". To wrap up:
I think it makes far more sense, and causes far fewer problems, to make something like 'strict switch', which kind of already exists in the form of the switch expression (except that not checking all possible cases there is marked with a warning only). It resolves all of the above issues:
And in addition:
I honestly fail to see a benefit in having closed enums. Yes, they will solve one problem of not handling all enum cases, but at the same time they may introduce a ton of other problems, ranging from backwards compatibility to standard library code changes to nuances with casting enums to int types and so on, and I'm really not convinced that this benefit is worth the problems. |
@wojciechsura I think there is a misunderstanding here. "Closed enums" prevent invalid values being assigned to them by way of casting a value of the backing type, such as
Regarding modifying existing enums to make them "closed" and potential breaking change, that's a valid question, but one that's unlikely to manifest itself in a
// new .NET version: "closed" is added
closed enum Orientation { Left = 1, Center = 2, Right = 3, Justify = 4 };
class Foo
{
    Orientation _orientation; // after .NET upgrade, this becomes an error, because 0 is invalid
    void Bar()
    {
        // however, even after .NET upgrade, none of the below changes in any way
        switch (this._orientation)
        {
            case Orientation.Left: .... break;
            case Orientation.Right: .... break;
            default:
                ...
                // here, before .NET upgrade, this._orientation could have been Center, Justify, or some invalid value like 5 or -2
                // after upgrade, it can only be Center or Justify. while it's a subtle behavior change,
                // given the context of the enum type, is extremely unlikely to actually be a breaking change
                break;
        }
    }
}
PS. I would also remind you that |
I think in this case a compile error can also be emitted. |
Is the idea still that a switch statement on a closed enum, with no default case, will give a compile error for a missing case?
|
@ericsampson I would think with a closed enum:
|
@TahirAhmadov that's what I would expect as well, but I just wanted to verify because I haven't seen this documented explicitly yet. |
Why? What if a value is added in the future, and you want to make sure your code is resilient to that? Closed may mean "at the time you compiled against me, these are all the known values", but it doesn't mean: these values can't even change in the future. |
@CyrusNajmabadi that depends on whether you're talking about the rebuild scenario or executable update scenario. If it's rebuilt, a new value will cause a compilation error, which will let you update the |
Yes. This is the case I would care a lot about. We should always assume that preexisting code will still be run (without recompile) on later versions of a library. |
It's certainly a breaking change in your API. But the question on the consumption side is: is it possible for me to write code that is resilient to breaking changes? I would hope so. Especially something as basic as "in case of something I don't understand, bail out". |
I updated my post above to correct that point. |
@TahirAhmadov I think that we'd want to warn on all cases that don't have a I think that would make your above list something like this:
I feel like this pretty much kills the entire point of this proposal, though. I wonder if we should ping Gafter and ask his thoughts 🙂 |
@Bosch-Eli-Black I don't think we need to go that far; the compiler will output a hidden To the extent that either reflection or new DLL can introduce an unexpected value, an argument can even be made that you still shouldn't define a PS. Another approach to consider would be an even stricter version: the binder refuses to work if a DLL with an unexpected enum option is copied to the executable folder; and verification code is emitted (not sure if it's possible during IL emit or JIT) whenever any value is set to a |
@TahirAhmadov Sorry, I just read through more of the thread, and I think now I understand things a bit better 🙂 Your list looks good to me! (#3179 (comment)) |
Closed enum for me is a tool for error handling. Error handling often requires additional information (messages, args, etc.). I propose a "constructor" syntax for those enums that's similar to the concept used in records
Random ass usage example:
|
Because even 'closed' enums can have values added, is it even physically possible to load a library "closed" value and have that make sense?
library v1 has
library v2 has
You compile against v1. Loading v2 is now a binary breaking change, where you now unexpectedly fall through the previously "Complete" switch. |
Probably jumping in to beat a dead horse, but it seems to me that much confusion has been caused by using the term "enum" here. C#'s "original sin" IMO was to borrow the C concept of a list of values backed by integers and to allow the use of bitfield operations (Flags). What I want almost all of the time when declaring an "enum" is actually a closed set of distinct tags that are strongly typed. It's usually desirable to be able to enumerate over that set. In other words, I want the strongly-typed enum pattern without confusing things by providing access to the backing type (which allows people to inject unrepresentable values). |
The int-backed C style enums (what is now |
Yes, I appreciate that enums are what we have now. I'm just pointing out that when people talk about "closed sets" they may be coming at it from a perspective that is inconsistent with the assumption that the set must be based on a numeric backing field, so perhaps it's better (or at least useful) to consider solving the general problem rather than 'just' patching up int-backed enums. For example, #2849 is effectively a proposal for a closed set of string-backed values. IOW, at some point maybe it's better not to call these things "enums". |
I don't have any strong opinions on naming here, however my main point is that, the int and string backed "enums" (or however we name them) are needed irrespective of other proposals, specifically because in some cases you want to assign an undefined backing value to it, such as, in many network communication scenarios. PS. Please see this comment of mine: #2849 (comment) |
Related to #485, it would be helpful to be able to declare an enumerated type that is known to the language to be closed. The syntax is an open question, but as a first proposal I'll suggest using closed as in #485.
For example, a closed enum
Could be used like this
This code would be considered OK by flow analysis, as the method returns for every possible value of the parameter b. Similarly a switch expression that handles every possible declared enumeration value of its switch expression of a closed enum type is considered exhaustive.
To ensure safety:
closed enum types are not defined. For example, there is no operator Bool +(Bool, int). Alternately, we might consider them defined but only usable in unsafe code.
Bool (or only in unsafe code)
Design meetings
https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-26.md#discriminated-unions
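For contrast with the flow-analysis behavior the proposal describes, here is how the same shape of method looks in current C# with an ordinary enum. This is a sketch, not the proposal's own example: the Bool enum name is borrowed from the operator example above, while the FlowDemo class and the method body are assumptions made for illustration.

```csharp
using System;

// Ordinary (open) enum: any int can be cast to it.
public enum Bool { False, True }

public static class FlowDemo
{
    public static int M(Bool b)
    {
        switch (b)
        {
            case Bool.False: return 0;
            case Bool.True: return 1;

            // With today's open enums this arm is required. Without it the compiler
            // reports CS0161 (not all code paths return a value), because (Bool)2
            // is a legal value of the type.
            default: throw new ArgumentOutOfRangeException(nameof(b));
        }
    }
}
```

Under the proposal, a closed version of the enum would let the compiler treat the two cases as covering every possible value, so the default arm could be dropped.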