Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design direction for sum types #157

Merged
merged 29 commits into from
Feb 10, 2021
Merged

Design direction for sum types #157

merged 29 commits into from
Feb 10, 2021

Conversation

geoffromer
Copy link
Contributor

@geoffromer geoffromer commented Sep 8, 2020

@geoffromer geoffromer added proposal A proposal WIP labels Sep 8, 2020
@googlebot googlebot added the cla: yes PR meets CLA requirements according to bot. label Sep 8, 2020
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
@geoffromer geoffromer added proposal rfc Proposal with request-for-comment sent out and removed WIP labels Sep 11, 2020
proposals/p0157.md Show resolved Hide resolved
proposals/p0157.md Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
…nsequently switch to using `Array(Storage)` instead of `StorageArray`
- Introduce `alternatives` blocks as a way of asserting that a set of alternatives is exhaustive and mutually exclusive.
- Largely eliminate discussion of "substitution principle" for pattern matching as a whole, and instead focus on a "mirroring requirement" that applies to pattern functions specifically. Add extensive discussion of the potential drawbacks of this requirement.
- Add discussion of sum types that are implemented with sentinel values rather than discriminators.
- More comprehensive discussion of evaluation order
proposals/p0157.md Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like I would be more enthusiastic about a proposal where:

  • Choice has a concise syntax, like most language's sum types (e.g. Rust/Swift enums). This construct would be made efficient in normal cases through some mechanism for identifying unused values of types. This construct would be encouraged for situations where the precise bit pattern doesn't need to be specified. I would also support expanding this construct to encompass C++ enum use cases (e.g. specifying values of constants).

  • Alternatives have even more power to represent strange representations, at the cost of: alternatives must be mutually exclusive, and a specific decoding function would be provided for the reverse direction which both determines both which alternative, and the values of the arguments. This would address order-of-operation concerns, in addition to allowing representations the compiler would not be able to compute an inverse. Example:
    encoding 0 <= size <= capacity in a single integer using (capacity * (capacity + 1) / 2) + size.

proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
Copy link
Contributor

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing urgent to respond to here -- below are just some random thoughts that occurred to me while going through the proposal in detail.

FWIW, I feel like there are somewhat two dimensions of interesting design space here.

First: how do we represent user-defined type patterns: a) pattern functions, b) a proxy choice type, or c) as named continuations in an interface. The current syntax is only loosely tied to which model we use here; the model seems like the most interesting decision.

Second: do we have construction syntax and pattern match syntax be constructively mirror each other, or are matching and construction objects independent (if potentially intentionally similar/overlapping) syntactic constructs?

I feel like we've been assuming a strong mirror in the second dimension, and exploring the resulting options. I have to say, I'm not overly persuaded by the results. But I'm also interested in others' perspective here.

If we do continue down the mirroring strategy, I have to say that tho continuation model seems very tempting. I think it largely allows us to defer all the discussions around copy semantics and references etc. because they end up rooted in the same semantics we end up using for function parameters. IMO, that is likely to let us approach them in highly orthogonal ways, and to be a sustainable model because it converges everything in the pattern matching space -- its all ultimately parameter binding.

We should probably have some (I'm guessing almost certainly more than one) fairly open-ended discussion sessions to figure out if we're happy with the second dimension I mentioned above, and then find a way to make the syntax compelling for whatever we end up with.

I do want to say, the current write up for me has been very helpful to explore exactly what pattern functions would look like, and what the proxy objects limitations are. Both of those were, for me, very underexplored areas.

proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
proposals/p0157.md Outdated Show resolved Hide resolved
Comment on lines 497 to 499
template specialization takes place. By the same token, `Match` cannot be
overloaded, because overload resolution is likewise driven by the concrete types
of the function arguments, which may not be known at that point.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we discussed overloading Match in Discord. The idea was that you would have two different interfaces, and it would pick which Match call based on what was present in the various cases. Presumably the interfaces would have to be sufficiently distinct that there was no ambiguity about which Match call to use (e.g. no overlapping names might be sufficient).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but on reflection I concluded it wasn't worth the trouble. A sum type that can be destructured along multiple axes seems theoretically possible, but I really can't think of any practical use cases. We can always add this later if we think it's important.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should say "we won't support overloading" rather than "cannot be overloaded" to clarify that this is a choice rather than a technical blockage?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done.

Max(Alignof(T), Alignof(Error))): storage;

interface MatchContinuation {
var Type:$$ ReturnType;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
var Type:$$ ReturnType;
var Type:$ ReturnType;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're probably right about this, but it's raising some non-trivial questions for me, so I think we should leave it in the form that the core team reviewed.

omitted. Similarly, I propose that patterns of the form `S.Alt` or
`S.Alt(<subpatterns>)`, where `S` is the type being matched against, the `S` can
be omitted. Note that both of these shorthands are allowed only at top level,
not as subexpressions or subpatterns.
Copy link
Contributor

@josh11b josh11b Jan 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other cases where we might want to use this syntax:

if (x == .None) { ... }
CallFunctionTakingOptionalArgs(.None, .Some(3));

These both seem forbidden as "subexpressions", which suggests these tasks will be quite verbose:

if (x == Optional(Int).None) { ... }
CallFunctionTakingOptionalArgs(Optional(Int).None, Optional(Int).Some(3));

The if could be changed into a pattern match (also verbose), but the other case might require some other trickery, like creating None and Some as names outside of Optional:

struct NoneType { }
var NoneType:$$ None;
impl[Type:$$ T] ConvertibleTo(Optional(T)) for NoneType { ... }
fn Some[Type:$$ T](T: x) -> Optional(T) { return .Some(x); }

but this means things like:

var auto: x = None;

will be a bit surprising. Which is to say, it will work until you try and assign a non-None value to it.

Copy link
Contributor

@josh11b josh11b Feb 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also assignment:

var Optional(Int): x = .None;
x = .Some(3);

This pushes me toward preferring the type-indexed approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One problem with not supporting convenient syntax for comparing choice values in an if condition is when the condition is a complicated compound expression that does not lend itself to a match statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in the decision meeting, I agree that there are likely to be problems here, but I think we should try it this way, see where the pain points are in practice, and then revisit the issue.


#### Different spelling for `choice`

The Rust and Swift counterparts of `choice` are spelled `enum`. I have avoided
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still strongly consider enum, regardless of whether it's really "explicitly enumerated", on the grounds of familiarity. This is basically identical to the Rust feature (and to the Swift feature, I'd guess), and it's a superset of the C++ feature. From the teaching perspective, there's a lot to say for stealing things from other language rather than being original.

proposals/p0157.md Show resolved Hide resolved
Comment on lines +753 to +754
- choice types cannot be default constructible, unless we provide a separate
mechanism for specifying which alternative is the default.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More relevant to Carbon is whether the choice type has an unformed state. It can have an unformed state if (a) any alternative has unformed state (including alternatives with no state), or (b) the descriminator can represent one additional state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, but it's harder to talk about that without a design for unformed states.

@sidney13 sidney13 added proposal accepted Decision made, proposal accepted and removed needs decision labels Feb 2, 2021
@sidney13
Copy link
Contributor

sidney13 commented Feb 2, 2021

Approval announcement

@sidney13 sidney13 removed the proposal rfc Proposal with request-for-comment sent out label Feb 2, 2021
sidney13 added a commit that referenced this pull request Feb 3, 2021
* Decision for #157: Design direction for sum types

* Update p0157_decision.md

* Update p0157_decision.md
proposals/p0157.md Show resolved Hide resolved
because they cannot be interpreted as function calls. However, this would
require us to have a syntax for matching alternatives that is disjoint from the
syntax for constructing alternatives. This would be at odds with existing
practice in languages like Rust, Swift, Haskell, and ML, all of which use the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something I'm a little surprised to see is that this is the only place where existing practice is mentioned in the context of dealing with the syntactic ambiguity. If all of those languages use the same syntax for matching alternatives as for constructing alternatives, how do they avoid the problems you're identified above? I'd be strongly biased in favor of an approach that is known to be acceptable in practice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To my knowledge, those languages don't support user-defined pattern matching, so they can provide a language-level guarantee that there's no observable difference between the pattern and expression semantics of a given type. Consequently, this ambiguity only arises as a hidden implementation detail within the compiler, so it can be resolved heuristically and doesn't need to be explicitly specified.

@geoffromer geoffromer merged commit efab91c into carbon-language:trunk Feb 10, 2021
@geoffromer geoffromer deleted the direction branch February 10, 2021 19:56
chandlerc pushed a commit that referenced this pull request Jun 28, 2022
* Decision for #157: Design direction for sum types

* Update p0157_decision.md

* Update p0157_decision.md
chandlerc pushed a commit that referenced this pull request Jun 28, 2022
Directional proposal for supporting sum types in Carbon
@JohnScience
Copy link

There's a design proposal on Rust internals regarding enums. The same ideas can apply to Carbon: https://internals.rust-lang.org/t/better-enums/16692

@josh11b
Copy link
Contributor

josh11b commented Jul 27, 2022

There's a design proposal on Rust internals regarding enums. The same ideas can apply to Carbon: https://internals.rust-lang.org/t/better-enums/16692

@JohnScience This PR was merged and closed some time ago, and is not a good venue for new discussion.

zygoloid added a commit to zygoloid/carbon-lang that referenced this pull request Sep 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes PR meets CLA requirements according to bot. proposal accepted Decision made, proposal accepted proposal A proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.