Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: "Closed" enum types #3179

Open
gafter opened this issue Feb 10, 2020 · 71 comments
Open

Proposal: "Closed" enum types #3179

gafter opened this issue Feb 10, 2020 · 71 comments
Assignees
Milestone

Comments

@gafter
Copy link
Member

gafter commented Feb 10, 2020

Related to #485, it would be helpful to be able to declare an enumerated type that is known to the language to be closed. The syntax is an open question, but as a first proposal I'll suggest using closed as in #485.

For example, a closed enum

closed enum Bool { False, True }

Could be used like this

int M(Bool b)
{
    switch (b)
    {
        case Bool.False: return 0;
        case Bool.True: return 1;
    }
}

This code would be considered OK by flow analysis, as the method returns for every possible value of the parameter b. Similarly a switch expression that handles every possible declared enumeration value of its switch expression of a closed enum type is considered exhaustive.

To ensure safety:

  • The arithmetic operators on closed enum types are not defined. For example, there is no operator Bool +(Bool, int). Alternately, we might consider them defined but only usable in unsafe code.
  • There is no conversion from some underlying type to the type Bool (or only in unsafe code)

Design meetings

https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-26.md#discriminated-unions

@gafter gafter self-assigned this Feb 10, 2020
@0xd4d
Copy link

0xd4d commented Feb 10, 2020

If you add [Flags] to it, is that an error?

There must be a way to convert an integer to the closed enum (unsafe code).

@gafter
Copy link
Member Author

gafter commented Feb 10, 2020

I imagine you could add [Flags] to it, but it wouldn't have any effect other than possibly generating a warning. That attribute has no language meaning.

@alrz
Copy link
Member

alrz commented Feb 11, 2020

Since enum class is going to be the syntax for discriminated-unions, I vote enum struct for this. When there is no sub-member, it is practically a closed enum,

// if you can do this
enum struct Option<T> { Some(T), None }
// there's nothing to stop you from doing this
enum struct Bool { False, True }

I think this is more of a lowering strategy question - whether or not this should be emitted as a proper enum.

If we go with proper enums, we can't manage external values, so I think we will need to emit it as a struct anyways.

(some examples from F#)

@DavidArno
Copy link

@0xd4d,

There must be a way to convert an integer to the closed enum (unsafe code).

Why? I see no reason for closed enums to have any sort of fixed underlying value. If you want to convert, then use a switch expression:

Bool b = i switch {
    0 => False,
    _ => True
};

@alrz,
enum struct Option<T> { Some(T), None } is sort of how I envisioned DU's looking (and so agree that enum struct Bool { False, True } is a good choice for closed enums.

However, isn't there still an issue pattern matching on those as I'd imagined DU pattern matching would be type based, ie I'd write something like:

var x = option switch {
    Some v => v,
    None => default
};

But with enum struct Option<T> { Some(T), None }, Some isn't a subtype of Option.

I can invent some syntax here:

enum struct ShapeName { Point, Rectangle, Circle }

enum struct Shape 
{
    Point,  
    Rectangle(double Width, double Length),
    Circle(double Radius)
}

and I'm happy that this is consistent: a closed enum is a DU where the values have no parameters.

And I can imagine how I'd pattern match this stuff:

var area = shape switch {
    Point => 0,
    Rectangle { Width: var w, Length: var l } => w * l,
    Circle { Radius: var r } => Math.PI * r * r
};

But, I've no idea how this lot gets lowered by the compiler. For a struct, Shape, that contains a Circle "value", what does shape is Circle { Radius: var r } mean?

@dsaf
Copy link

dsaf commented Feb 11, 2020

Does it mean that existing enums will be updated to be somehow open? Otherwise the terminology is confusing.

@DavidArno
Copy link

@dsaf, existing enums are already open. For:

enum Shape { Point, Rectangle, Shape }

Shape can take any int value. Point, Rectangle, and Shape are little more than int constants.

But it would indeed be prudent to talk of them as being open enums to clarify the difference between existing enums and the closed variety.

@alrz
Copy link
Member

alrz commented Feb 11, 2020

@DavidArno

isn't there still an issue pattern matching on those as I'd imagined DU pattern matching would be type based

No they wouldn't be types (or members), if it's a struct DU, all parameters get emitted as flat fields into the struct, the rest is compiler magic. See what F# produces for your example as a value DU.

Though I can imagine C# could do a better job utilizing runtime support for FieldOffsetAttribute if possible (e.g. if all parameters are blittable) - minimizing the struct size.

@0xd4d
Copy link

0xd4d commented Feb 11, 2020

@0xd4d,

There must be a way to convert an integer to the closed enum (unsafe code).

Why? I see no reason for closed enums to have any sort of fixed underlying value. If you want to convert, then use a switch expression:

Bool b = i switch {
    0 => False,
    _ => True
};

That's inefficient unless it's a trivial enum like the above Bool enum. Deserialization should be quick.

It should be as simple as it is today, except the programmer must verify that it's a valid value (or it's undefined behavior)

int value = 1;
if (InvalidValue(value)) throw ...;
unsafe { return (MyEnum)value; }

@alrz
Copy link
Member

alrz commented Feb 11, 2020

@0xd4d You can add it yourself, or have a generator to do it.

enum struct Bool { 
  False, True;
  public static explicit operator Bool(int v)
    => v switch { 0 => False, 1 => True, _ => throw InvalidCastEx };
}

I think unlike enums the corresponding "integer tag" is solely an implementation detail for DUs, so I don't think it's a good idea for the compiler to generate that out of the box.

@0xd4d
Copy link

0xd4d commented Feb 11, 2020

@alrz Your code is the same as @DavidArno's code, a big switch statement which generates a lot of code and causes extra CPU usage at runtime. My solution to use a cast in an unsafe block has the same perf and code size as code we can already use today.

@DavidArno
Copy link

@alrz, "compiler magic" suits me. It's then @gafter's job to solve how to have struct-based DUs (including closed enums) work with a is X b, rather than mine 👍

@0xd4d, it's very unlikely that the solution to serialization of closed enums and other DUs is to convert them to ints. I guess it all depends on whether closed enums are implemented as a completely different solution to DUs or not. Doing so would be a mistake in my view. So I'd argue that any solution to serializing

enum struct Bool { False, True }

also has to be able to serialize

enum struct Option<T> { Some(T), None }

@alrz
Copy link
Member

alrz commented Feb 11, 2020

The only difference with the original proposal would be that False and True in that example wouldn't be compile-time constants so we can't use enum struct and enum interchangeably.

That is because an enum type as it exists today, is just a group of named constants with zero safety guarantees, I'm not sure if it makes sense to try to "fix" that or take the trade-off and favor struct DUs.

@333fred
Copy link
Member

333fred commented Feb 11, 2020

why not closed by default, and open as suggested syntax to open them?

Because open is the default today, and we can't change that.

@AartBluestoke
Copy link
Contributor

@gafter would your intent to be that adding a new value to a "closed" enum is a breaking change? would this be binary compat breaking change, or just a "switch statement which must pick a value entered unknown territory by not finding a match" exceptional territory. eg: what if you wanted to add trinary logic "not provided by customer" to the bool enum (bad naming example)

@DanFTRX
Copy link

DanFTRX commented Feb 11, 2020

@AartBluestoke I believe its true, false and filenotfound.

@gafter
Copy link
Member Author

gafter commented Feb 11, 2020

@AartBluestoke I imagine that adding a new value to a closed enum would be a binary breaking change. Just like b switch { false => 0, true => 1 } is translated to b == false ? 0 : 1, when b is a bool, I would expect b switch { Bool.False => 0, Bool.True => 1 } to be translated to b == Bool.False ? 0 : 1 when b is a Bool. But this design point could be debated and changed.

@popcatalin81
Copy link

popcatalin81 commented Feb 12, 2020

I imagine that adding a new value to a closed enum would be a binary breaking change.

How would this be enforced without the C# compiler generating an under the hood default case?
And also the user might add a default case himself in which case the compiler must generate code to check the range of values expected have not been modified.

Something like:
In library A

public closed enum ObjectAccess
{
     Read,
     ReadWrite,
     Write
}

Consumer of Library A code in assembly B:

switch objectAccess
{
    case ObjectAccess.Read:
           return new ReadOnlyFooBar(...);
    default :
           return new ReadWriteFooBar(...);
}

I asume the C# generated default case would actually look like:

switch objectAccess
{
    case ObjectAccess.Read:
           return new ReadOnlyFooBar(...);
    default :
    {
          if (objectAccess != ObjectAccess.ReadWrite && objectAccess != ObjectAccess.Read)
                throw new InvalidOperationException($"Unexpected value: {objectAccess:g}");
          return new ReadWriteFooBar(...);
    }
}

When library Author adds a new value the the closed enum:

public closed enum ObjectAccess
{
     Read,
     ReadWrite,
     Write,
     ReadWriteExclusive
}

This will indeed break binary compatibility, however
the problem is now that recompiling B will catch the new value in the default clause without the library author having the chance to have to treat the new value explicitly.

So this feature can only work if the default clause is not allowed for closed enums, which I think would make the feature impractical for any enums with more than a few members.

@canton7
Copy link

canton7 commented Feb 12, 2020

@popcatalin81

Other languages seem to get by just fine. In e.g. Rust, the default case doesn't add any extra checks: if there's a default case, then that will catch all other unhandled values, now and in the future.

Currently, if I have a switch that I want to be exhaustive, I need to have an explicit case for every possible enum member and a default which throws an exception. The downside is that I don't get a compiler warning/error if a new enum member is added, only a runtime exception.

With closed enums, I still need to have a case for each enum member, but I can avoid having a default at all, and I can get a compiler warning/error if someone adds a new enum member.

@AartBluestoke
Copy link
Contributor

AartBluestoke commented Feb 12, 2020

I believe the following spec would match the desired behavior:

The compiler will enforce that:
Switching on a closed enum must be exhaustive,
A switch on a closed enum must either not have a default case, or have a default case that guarantees an exception.
If a default case is not specified the compiler will auto-generate a default path that throws an exception if accessed

No other compiler or runtime changes are needed.

@HaloFour
Copy link
Contributor

@AartBluestoke

A switch on a closed enum must either not have a default case, or have a default case that guarantees an exception.

That seems unnecessary. What if the closed enum has 20 values but I only care about 2 of them? Why can't I have a default that returns some value instead?

@gafter
Copy link
Member Author

gafter commented Feb 13, 2020

The intent was not to require that switches be exhaustive. We'd still permit a switch expression to be non-exhaustive, with a warning. And I don't think we'd add any requirements for switch statements, other than the fact that we can infer that a default case isn't needed (e.g. for definite assignment) if you have cases for every enumeration constant.

@AartBluestoke
Copy link
Contributor

A switch on a closed enum must either not have a default case, or have a default case that guarantees an exception.

That seems unnecessary. What if the closed enum has 20 values but I only care about 2 of them? Why can't I have a default that returns some value instead?

if you are only using 2 out of 20 values, then you effectively have open use of a closed enum; you're already saying i don't care if new values get added to this. If that is true then the enum shouldn't be closed. If it is true, you should handle the remaining values.

Perhaps the contextual keyword "default" becomes "and the rest of the values" within the compiler, with the compiler generating a true default branch for exceptions for if non-standard values get into the enum somehow?

My thoughts when i was writing that was if you add a new value to the closed enum, it has to explode, otherwise, "closing" has no meaning other than telling roslyn "don't generate a warning for this one specific situation", and preventing implicit cast from enum to int. and that doesn't seem like a necessary language feature; feels like a job of an analyzer.

@YairHalberstadt
Copy link
Contributor

if you are only using 2 out of 20 values, then you effectively have open use of a closed enum; you're already saying i don't care if new values get added to this. If that is true then the enum shouldn't be closed. If it is true, you should handle the remaining values.

This is really common in functional languages, and the reason is simple: in some cases you don't care about all the values, and other cases you do.

For example you might have a number of states a method can return. In some cases you want to deal with them all seperately. In others you only care if it was successful or not.

@DavidArno
Copy link

... if you are only using 2 out of 20 values, then you effectively have open use of a closed enum; you're already saying i don't care if new values get added to this ...

Sorry, @AartBluestoke, but that is so not true. If, within a particular context, I explicitly call out two of the twenty values, then that's because - in that context - I'm only interested in those two values. So I'll have a default handler for the rest. But in a situation where I explicitly list all of them (which is likely to occur elsewhere in the code) then I absolutely want a compiler error is a new value is added. And having it be a binary breaking change at that point too would be the icing on the cake.

@popcatalin81
Copy link

popcatalin81 commented Feb 13, 2020

So a default clause for a closed enum will mean:

  • Don't bother me about exhaustiveness.

No default clause:

  • Error for missing cases, not handling all values.

It sounds about right for me as a library consumer. There's an escape hatch I can use.

However, I would still like to opt-out of binary breaking changes, if those are added.
It may be important to make certain libraries forward compatible with closed enums through some mechanism or another.
Example:

public closed enum Choice
{
        First,
        Second,
        Third
}

string AnalyzeChoice(Choice choice)
{
       switch choice
      {
            case Choice.First:
                  return "Good";
            case Choice.Second:
                  return "Average";
            case Choice.Third:
                  return "Bad";
            missing:
                return  "Unknown"; // Only when value not part of the enum is passed (like a version of the Enum in a future assembly). 
                //Exhaustiveness checking is still applied to all cases. Recompilation will give an error.
      }
}

@declard
Copy link

declard commented Mar 3, 2020

@DavidArno

And I can imagine how I'd pattern match this stuff:

var area = shape switch {
    Point => 0,
    Rectangle { Width: var w, Length: var l } => w * l,
    Circle { Radius: var r } => Math.PI * r * r
};

But, I've no idea how this lot gets lowered by the compiler. For a struct, Shape, that contains a Circle "value", what does shape is Circle { Radius: var r } mean?

This may do the job:

[StructLayout(LayoutKind.Explicit)]
public struct Shape
{
  public struct Point {}
  public struct Rectangle { public double Width; public double Length; }
  public struct Circle { public double Radius; }
  private closed enum ShapeType : short { Point, Rectangle, Circle }
  [FieldOffset(0)]
  private ShapeType shapeType;
  [FieldOffset(2)]
  private Point point;
  [FieldOffset(2)]
  private Rectangle rectangle;
  [FieldOffset(2)]
  private Circle circle;
  public bool Deconstruct(out Point p)
  {
    if (shapeType == ShapeType.Point) { p = point; return true; }
    else { p = default; return false; }
  }
.....
}

However, it won't work that easily with reference types inside

@wojciechsura
Copy link

wojciechsura commented Jun 9, 2022

I think I have bigger problem with closed enum's behavior more than the syntax itself. The sole purpose of introducing the closed enum seems to be to influence switch statement behavior, so why don't we work on switch, but instead on enum?

For once, strict switch allows for more flexibility, because it can guard literally anything, what is enumerable and finite (namely bool, bool?, all int and int? types and also enums). Also, sealed enum looks a little bit weird for me, because the place where it is declared and where it actually takes effect are different. You'd need to lookup the type to know what the behavior in the code is, whereas in case of strict switch it is obvious at first glance. Finally, you might want to be strict in switch sometimes and sometimes not. With strict switch notation you can do that freely, while with sealed enum you are forced to be strict every time.

@svick
Copy link
Contributor

svick commented Jun 9, 2022

@wojciechsura

Because the switch expression is effectively already strict. For example, consider these two switch expressions:

integer switch {
    < 0 => -1,
    > 0 => 1
}
nullableBool switch {
    true => 1,
    false => 0
}

They produce these warnings:

warning CS8509: The switch expression does not handle all possible values of its input type (it is not exhaustive). For example, the pattern '0' is not covered.
warning CS8655: The switch expression does not handle some null inputs (it is not exhaustive). For example, the pattern 'null' is not covered.

(Note that you need to enable nullable reference types to get the warning about a nullable value type, which is weird, but by design per dotnet/roslyn#52714.)

The problem only happens with enums, because the language disagrees with most of their usages regarding which values are valid.

@TahirAhmadov
Copy link

I forgot about this issue and started a new duplicate discussion despite commenting here previously :) Can we please get this at some point? Not as some DU (like some have suggested), but as a good old enum, just with a runtime check for "closeness".

closed enum HorizontalAlignment
{
  Left = 1, Center = 2, Right = 3
}
var x = (HorizontalAlignment)0; // runtime error
// the above line is compiled to:
var x = Enum.ConvertToClosedEnum<HorizontalAlignment>(0); // this helper method would check and throw, if needed
x = (HorizontalAlignment)1; // OK

[Flags] can be supported by creating a pool of all possible values by bit-wise OR - it's just a set of numbers and not much different from the options' numbers themselves.
Implementing this can be done in a few ways. One approach is to generate the set of possible values and the logic at runtime using IL emitting. Another would be generating a hidden static method at compile time and linking to said method with a hidden attribute on the enum type. There may be other hybrid approaches, as well.

I would also be OK with a blanket prohibition on converting backing values to closed enum type at compile time (as long as I can do it in reflection/IL emit).
Thank you!

@CyrusNajmabadi
Copy link
Member

Can we please get this at some point?

@TahirAhmadov THe issue is a championed and is something the LDM is discussing. There is no need to repeatedly ask :)

@KalleOlaviNiemitalo
Copy link

KalleOlaviNiemitalo commented Jul 25, 2022

@TahirAhmadov, would you expect an InvalidCastException here:

closed enum HorizontalAlignment
{
  Left = 1, Center = 2, Right = 3
}

class C
{
  HorizontalAlignment M()
  {
    int[] ints = { 1, 2, 3, 4 };
    var alignments = (HorizontalAlignment[])(object)ints;
    return alignments[3];
  }
}

If the enum is not closed, then the CLR allows this cast at run time, even though the C# standard doesn't. If this must throw with a closed enum, then I think it can be implemented in several ways:

  • Make HorizontalAlignment a type derived from System.Enum but mark it with some ClosedEnumAttribute. Change the CLR to recognise this attribute and disallow the cast. Would need wording for ECMA-335 augments.
  • Make HorizontalAlignment a type derived from System.ClosedEnum, which is in turn derived from System.Enum. Change the CLR to support this and disallow the cast. Again augment ECMA-335.
  • Make HorizontalAlignment a type derived from System.ValueType. Reflection would not treat it as an enum type.
  • Lower the cast to Enum.ConvertToClosedEnumArray<HorizontalAlignment[]>(object), which then throws if needed. The type argument would be the array type rather than the element type, to support multidimensional arrays. This might get complex if the code casts to some T[] where T is a type parameter and the C# compiler doesn't know whether the type argument will be a closed enum type.

@TahirAhmadov
Copy link

TahirAhmadov commented Jul 25, 2022

@KalleOlaviNiemitalo Yes, I would expect an exception to be raised there. Minor point - it should be a new exception type, like InvalidEnumValueException.
My original thought was an attribute like the one you mentioned - [ClosedEnum] - but now that you listed other approaches, I kind of like the ClosedEnum type solution more (if it's possible). It may even be that ClosedEnum does not inherit from Enum, or that Enum is changed to inherit from ClosedEnum; I can't say for sure what will work best.
I don't think "transpiling the cast" approach is feasible - what if generics are used and T is HorizontalAlignment[]?
PS. It would be cool if we could have a static abstract method on ClosedEnum type and it's then implemented by concrete closed enum types.

@wojciechsura
Copy link

wojciechsura commented Jul 26, 2022

I gave this idea a thought and I'm still not convinced, that this is a proper place to define enum's "closed-ness". To wrap up:

  1. How is that you know at the moment of definition, that enum should be closed? (ie. all of its members should be processed in switch statement)? Can you please tell me upfront: should enum Alignment { Left, Center, Right, Justify } be closed? How do you know right at this moment? (hint: sometimes it should be, sometimes it shouldn't)
  2. To take advantage of closed-ness of enum, it will affect all switch statements. Now when you want to use it as closed only once and as not closed in other places, you will have to write numerous, pointless default: break; statements.
  3. Should standard library enums be closed or not? Introducing their closedness may break existing code. But then why would we do that? How do we know at that point, that they should be closed or not?
  4. What if I want to use some enum as closed in some cases and as open in others? This may lead to people creating two versions of an enum, which will be harmful to the code quality.
  5. On top of 3, we'd need to modify the reflection mechanisms to know, if enum is closed or not.
  6. From the discussion I understand, that not exhausting all cases of a closed enum should cause an exception. Why not a compilation error, since compiler is able to check that? Why do we push error-checking (possibly) to the user instead of keeping it at the development level?

I think it makes far more sense and far less problems to make something like 'strict switch', which actually kind-of already exists in the form of switch statement (except that not checking all possible cases there is marked with a warning only).

It resolves all above issues:

  1. We decide if we have to process all cases in the place where it actually matters
  2. You can mix strict and non-strict switches however you like
  3. Standard library enums may stay as they are and still be used in a strict (or non-strict) switches
  4. Enum definitions do not change, you are free to use them as "open" or "closed", depending on whether you will do a strict or a non-strict switch.
  5. Reflection can stay as it is (though to be fair, expressions may need a change)
  6. Strict switch violation is a compilation error: it will never reach the user.

And in addition:

  1. Literally zero problems with casting whatsoever (the whole enum functionality is kept as it is)
  2. We may do a strict switch on top of all enumerable types (in addition to enums, also all int types - possibly with ranges, as well as bool, and including nullables)
  3. We may introduce strict switch statement as well, which will yield error instead of warning
  4. We will keep consistency between existing and new language structures
  5. All existing code will be unaffected in any way

I honestly fail to see a benefit in having closed enums. Yes, they will solve one problem of not handling all enum cases, but at the same time they may introduce a ton of other problems ranging from backwards compatibility to standard library code changes, to nuances with casting enums to int types and so on - and I'm really not convinced, that this benefit is worth the problems.

@TahirAhmadov
Copy link

TahirAhmadov commented Jul 26, 2022

@wojciechsura I think there is a misunderstanding here. "Closed enums" prevent invalid values being assigned to them by way of casting a value of the backing type, such as int, to the enum type, where the value does not have a corresponding option.
It doesn't have anything to do with switch statements/expressions; whether you choose to handle all enum options or just some subset, is indeed the prerogative of the handling site, not enum definition.
In this example, yes, Orientation should be closed, because (for example) Left = 1, ...., Justify = 4, and you do not want an invalid Orientation of 5 or -2. The meaning of said enum type is not congruent with non-option values. But, in the code which has to do something basing on Orientation, it can do something for Left and Right only, and ignore the Center and Justify options - that's fine. (Provided control flow analysis does not require you to handle all enum options, say, to ensure a method returns a value.)
There are other enums where non-option values should be allowed, for example, hypothetical MessageType enum - your implementation of a binary protocol recognizes certain messages, but another peer in your communication scenario can be on a newer version of the software with more message types supported, in which case you want to gracefully respond with "message type not supported" response instead of crashing.

Regarding modifying existing enums to make them "closed" and potential breaking change, that's a valid question, but one that's unlikely to manifest itself in a switch statement.

// new .NET version: "closed" is added
closed enum Orientation { Left = 1, Center = 2, Right = 3, Justify = 4 };

class Foo
{
  Orientation _orientation; // after .NET upgrade, this becomes an error, because 0 is invalid
  void Bar()
  {
    // however, even after .NET upgrade, none of the below changes in any way
    switch(this._orientation)
    {
      case Orientation.Left: .... break;
      case Orientation.Right: .... break;
      default:
        ...
        // here, before .NET upgrade, this._orientation could have been Center, Justify, or some invalid value like 5 or -2
        // after upgrade, it can only be Center or Justify. while it's a subtle behavior change,
        // given the context of the enum type, is extremely unlikely to actually be a breaking change
        break;
    }
  }
}

PS. I would also remind you that switch is just one scenario where you usually want to make sure your enum is a valid option; other scenarios include if statements, dictionaries/hashsets, serialization, etc.

@hez2010
Copy link

hez2010 commented Jul 26, 2022

var x = (HorizontalAlignment)0; // runtime error

I think in this case a compile error can also be emitted.

@ericsampson
Copy link

Is the idea still that a switch statement on a closed enum, with no default case, will give a compile error for a missing case?
ie switch on a closed enum:

  • has default case > non-exhaustive, warning if missing case
  • no default case > exhaustive, missing case means compile error

@TahirAhmadov
Copy link

TahirAhmadov commented Jul 26, 2022

@ericsampson I would think with a closed enum:

  • if it's exhaustive and has a default case, there should be a warning about unreachable code;
    • per @CyrusNajmabadi feedback below, it may be better not to warn here, to allow handling new values that are introduced if a DLL is replaced without project rebuild; and because it's probably impossible to prevent setting an invalid value using reflection.
  • if it's exhaustive and doesn't have a default case, there would be no errors/warnings;
  • if it's not exhaustive and has a default case, there would be no errors/warnings;
  • if it's not exhaustive and doesn't have a default case, there would be an error for a missing case.

@ericsampson
Copy link

@TahirAhmadov thats what I would expect as well, but I just wanted to verify because I haven't seen this documented explicitly yet.

@CyrusNajmabadi
Copy link
Member

if it's exhaustive and has a default case, there should be a warning about unreachable code;

Why? What if a value is added in the future, and you want to make sure your code is resilient to that? Closed may mean "at the time you compiled against me, these are all the known values", but it doesn't mean: these values can't even change in the future.

@TahirAhmadov
Copy link

TahirAhmadov commented Jul 26, 2022

@CyrusNajmabadi that depends on whether you're talking about the rebuild scenario or executable update scenario. If it's rebuilt, a new value will cause a compilation error, which will let you update the switch. If executable is updated without rebuild, then you are right - it probably makes sense to allow defining a default case without a warning, to allow for a more graceful handling of such a new value other than compiler-emitted exception.
I was just unsure whether that scenario - the executable replacement - is supported in general, in the context of both closed enums and DUs. With DUs, for example, especially struct-based ones, it would have to be very carefully designed to avoid bizarre outcomes. Or maybe DUs can have a different rule for this edge case? If so, it would introduce an inconsistency which may be unexpected when looking at the big picture.

@CyrusNajmabadi
Copy link
Member

If executable is updated without rebuild,

Yes. This is the case i would care a lot about. We should always assuem that preexisting code will still be run (without recompile) on later versions of a library.

@CyrusNajmabadi
Copy link
Member

With DUs, for example, especially struct-based ones, it would have to be very carefully designed to avoid bizarre outcomes.

It's certainly a breaking change in your API. But the question on the consumption side is: is it possible for me to write code that is resilient to breaking changes. I would hope so. Especially something as basic as "in case of something i don't understand, bail out".

@TahirAhmadov
Copy link

I updated my post above to correct that point.
Another reason may be that it's probably impossible to prevent setting an invalid value using reflection, and that's yet another reason to still be able to handle unexpected values (in certain codebases).

@Eli-Black-Work
Copy link

@TahirAhmadov I think that we'd want to warn on all cases that don't have a default: label, in case a new .dll gets dropped in that has new values in the enum.

I think that would make your above list something like this:

  • If has a default case, don't warn.
  • If it's exhaustive and doesn't have a default case, warn.
  • If it's not exhaustive and doesn't have a default case, error.

I feel like this pretty much kills the entire point of this proposal, though. I wonder if we should ping Gafter and ask his thoughts 🙂

@TahirAhmadov
Copy link

TahirAhmadov commented Aug 1, 2022

@Bosch-Eli-Black I don't think we need to go that far; the compiler will output a hidden default which throws something like UnexpectedEnumValueException, which is sufficient for 99% of use cases.

To the extent that either reflection or new DLL can introduce an unexpected value, an argument can even be made that you still shouldn't define a default case, and instead should just catch that exception with a try-catch. On the other hand, the try-catch is boilerplate, and it would be weird if default is specified but then purposefully not emitted and replaced with the built-in throw line. (Perhaps specifying a default in an exhaustive switch would be an error in that case.)

PS. Another approach to consider would be an even stricter version: the binder refuses to work if a DLL with an unexpected enum option is copied to the executable folder; and verification code is emitted (not sure if it's possible during IL emit or JIT) whenever any value is set to a closed enum type field/local/etc., making it "airtight". This is probably too expensive, though.

@Eli-Black-Work
Copy link

@TahirAhmadov Sorry, I just read through more of the thread, and I think now I understand things a bit better 🙂 Your list looks good to me! (#3179 (comment))

@Swiftly1
Copy link

Swiftly1 commented Nov 28, 2022

Closed enum for me is a tool for error handling.

Error handling often requires additional informations (messages, args, etc)

I propose "constructor" syntax for those enums that's similar to the concept used in records

closed enum ValidationError
{ 
	EmptyString,
	UnexpectedCharacter(int index),
}

closed enum UserServiceError 
{ 
	NotFound,
	NonOkHttpStatusCode(int statusCode),
	NetworkError(string debug_log)
}

Random ass usage example:

public (bool Success, ValidationError? Error) Validate(string s)
{
	if (string.IsNullOrEmpty(s))
		return (false, ValidationError.EmptyString);
		
	if (s[0] != "[")
		return (false, ValidationError.UnexpectedCharacter(0));
		
	return (true, null);
}

var error = Validate("s").Error.Value;

switch (error)
{
	case EmptyString:
	{
		Console.WriteLine("empty");
	}
	case UnexpectedCharacter context:
	{
		Console.WriteLine($"unexpected character at {context.index}");
	}
}

@HaloFour
Copy link
Contributor

@Swiftly1

See #113, as adding data elements definitely turns them into discriminated unions.

@AartBluestoke
Copy link
Contributor

because even 'closed' enums can have added, is it even physically possible to load a library "closed" value and have that make sense?
If so, it makes loading any library version with a different Enum list a breaking change, but in many circumstances a newer version of a library is pulled in via dependencies.

library v1 has
closed enum Error{
Missing =1,
Forbidden =2}

library v2 has
library v1 has
closed enum Error{
Missing =1,
Forbidden =2,
Corrupted =3}

you compile against v1. loading v2 is now a binary breaking change, where you now unexpectedly fall through the previously "Complete" switch.

@NeilMacMullen
Copy link

Probably jumping in to beat a dead horse, but it seems to me that much confusion has been caused by using the term "enum" here. C#'s "original sin" IMO was to borrow the C concept of a list of values backed by integers and to allow the use of bitfield operations (Flags).

What I want almost all of the time when declaring an "enum" is actually a closed set of distinct tags that are strongly typed. It's usually desirable to be able to enumerate over that set. In other words, I want the strongly-typed enum pattern without confusing things by providing access to the backing type (which allows people to inject unrepresentable values).

@TahirAhmadov
Copy link

Probably jumping in to beat a dead horse, but it seems to me that much confusion has been caused by using the term "enum" here. C#'s "original sin" IMO was to borrow the C concept of a list of values backed by integers and to allow the use of bitfield operations (Flags).

What I want almost all of the time when declaring an "enum" is actually a closed set of distinct tags that are strongly typed. It's usually desirable to be able to enumerate over that set. In other words, I want the strongly-typed enum pattern without confusing things by providing access to the backing type (which allows people to inject unrepresentable values).

The int-backed C style enums (what is now enum in C#) have a number of benefits: they are simple; performant; the bitfield operations are desirable in some cases; easy convertibility to the underlying integer is frequently needed.
Yes, the ability to convert any arbitrary number to enum is a downside, but that's exactly what this proposal aims to solve.

@NeilMacMullen
Copy link

NeilMacMullen commented Apr 6, 2023

Yes, I appreciate that enums are what we have now just pointing out that when people talk about "closed sets" they may be coming at it from a perspective that is inconsistent with the assumption that the set must be based on a numeric backing field so perhaps it's better (or at least useful) to consider solving the general problem rather than 'just' patching up int-backed enums. For example #2849 is effectively a proposal for closed set of string-backed values. IOW at some point maybe it's better not to call these things "enums".

@TahirAhmadov
Copy link

TahirAhmadov commented Apr 6, 2023

I don't have any strong opinions on naming here, however my main point is that, the int and string backed "enums" (or however we name them) are needed irrespective of other proposals, specifically because in some cases you want to assign an undefined backing value to it, such as, in many network communication scenarios.
Having said that, it's downright impossible to rename the backing-value enums at this point, so even if everybody agreed that "enum" is a bad term for them, there isn't much to be done about that....

PS. Please see this comment of mine: #2849 (comment)
It shows why, I think, current C# enum is a good feature to have - and not a legacy baggage.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests