-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Basic Syntax #162
Basic Syntax #162
Conversation
jsiek
commented
Sep 19, 2020
•
edited by jonmeow
Loading
edited by jonmeow
- RFC topic
- Decision request
- Decision announcement
- Decision PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still want to go through in more detail, but I had a few minor notes I captured as we were looking at this during the meeting this week and didn't want them to get lost so posting them right away.
Co-authored-by: Dmitri Gribenko <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please take my comments as a starting point for discussion rather than demands. I think I have formed a picture in my head of where I thought the syntax was going and this PR has been great for (a) firming up those ideas, and (b) helping me examine the places where there are weaknesses. I fully expect there are others on the team who have different ideas on specific syntax points, and we have mostly been postponing resolving those differences until we have agreed on more of the rest of the design. I think this is a great forum where we can hash out things in a very concrete way.
The grammar rule | ||
|
||
```Bison | ||
expression: "fnty" tuple return_type | ||
``` | ||
|
||
is for function types. They are meant to play the role that function pointers | ||
play in C and that `std::function` plays in C++. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's premature to include this. If the rest of the proposal doesn't depend on it, can we remove function types for now and reintroduce them in a later proposal if needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with removing function types is that it significantly reduces the expressiveness of this little language. You would no longer be able to put functions inside of tuples and you would no longer be able to pass functions as arguments or return values of other functions. In other words, functions won't be "first class" values. That will reduce the scope of examples that we'll be able to handle with this subset. I'd prefer that we put functions types in right away, and then when we have a better alternative, it can be swapped in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with including them is that (from my perspective) we often care more about an overload set than a specific function signature. Having machinery about function types sets us down the path of re-introducing that mistake.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracking issue for this question: #191
precedence. Proposal 168 differs in that the operator groups are partially | ||
ordered instead of being totally ordered. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that necessarily follows. We won't be able to use bison's operator precedence parsing system. But you can implement operator precedence in other ways, for example using the precedence-climbing method (https://en.wikipedia.org/wiki/Operator-precedence_parser#Precedence_climbing_method) -- that's how the C++ specification describes operator precedence, for example -- and I would expect bison to be able to handle that. And I think the precedence-climbing method does support partial precedence orders such as the one suggested by #168. It also has the advantage of removing ambiguities from the formal grammar rather than doing so in a separate set of rules -- for that reason I'd prefer that our formal specification uses that formulation or one like it rather than a precedence table even if we don't use a partial precedence ordering.
struct Expression { | ||
ExpressionKind tag; | ||
union { | ||
struct { string* name; } variable; | ||
struct { Expression* aggregate; string* field; } get_field; | ||
struct { Expression* aggregate; Expression* offset; } index; | ||
struct { string* name; Expression* type; } pattern_variable; | ||
int integer; | ||
bool boolean; | ||
struct { vector<pair<string,Expression*> >* fields; } tuple; | ||
struct { | ||
Operator operator_; | ||
vector<Expression*>* arguments; | ||
} primitive_op; | ||
struct { Expression* function; Expression* argument; } call; | ||
struct { Expression* parameter; Expression* return_type;} function_type; | ||
} u; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to prefer a union here over a class hierarchy? I think it would be clearer for expositionary purposes to use a class hierarchy.
In the executable semantics, I don't think you ever mutate the tag of an expression after creating it, and generally treating AST nodes as immutable seems preferable, especially in the executable semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The use of enums and unions goes well with the use of switch statements to express dispatching on the kind of AST node. I much prefer switch statements to virtual functions for that purpose, because 1) it keeps the code together instead of spreading it out over many classes, 2) it is easier to access variables that are defined in the scope enclosing the switch statement, and 3) there is less extra boilerplate code compared to virtual functions.
Why do you think that a class hierarchy is clearer for expository purposes? Is it that you think that the readers will be more used to thinking in terms of object-oriented ways to organize code, or do you have other reasons in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The class hierarchy approach would avoid needing pointers everywhere (presumably done to keep the union members trivial) and would avoid the .u.foo
s all over the place. The fact that there's a "tag" field isn't part of the semantic model, it's an implementation detail, and one that we could hide with a class hierarchy approach.
With a small helper utility you can express dispatch on the node kind as something like:
visit(expr)
.Case([&] (VariableExpr &v) { ... handle variable case ... })
.Case([&] (GetFieldExpr &g) { ... handle get_field case ... })
.Default([&] { ... handle default case ... });
... which I think would handle your concerns (1), (2), and (3).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, Smalltalk style control flow. Fun! Yes, I can see that working just fine.
Does this approach provide exhaustiveness checking for the cases? That's pretty important
when adding new features to a language.
The approach you're proposing has the great advantage of being more type safe than the
old-school enum-struct-union approach.
The down side is that it adds some complexity to the code which affects readability and
reasoning about the semantics... e.g. one needs to completely understand order of evaluation
of the code to understand what the executable semantics is saying about the language being defined.
In this case I think the added complexity is relatively small so that's not a big down-side.
Regarding the pointers everywhere, yes, that was to keep the members trivial, which
was forced on me by Bison. I'd love to have a solution for that.
The `pattern` non-terminal is defined in terms of `expression`. It could instead | ||
be the other way around, defining `expression` in terms of `pattern`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a specification perspective, I'd prefer that we define a parameterized grammar. One way to do this: add an optional suffix to each production, and say that where the suffix is omitted, the implication is that the production is the same for all possible suffixes. So:
patexp ::= patexp
[
patexp]
...
patexpP ::= patexpP:
identifier
pattern ::= patexpP
expression ::= patexpE
where
patexp ::= patexp
[
patexp]
is shorthand for
patexpP ::= patexpP
[
patexpP]
patexpE ::= patexpE[
patexpE]
The ECMA-262 specification does something like this, though their approach is a bit more explicit.
The grammar rule | ||
|
||
```Bison | ||
expression: "fnty" tuple return_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I filed issue #188 "Repeated 'fnty' in curried function types".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The grammar rule | ||
|
||
```Bison | ||
expression: "fnty" tuple return_type | ||
``` | ||
|
||
is for function types. They are meant to play the role that function pointers | ||
play in C and that `std::function` plays in C++. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tracking issue for this question: #191
@jsiek This PR would have been fine to commit after being accepted. Can you commit it? |
@jonmeow I'm not sure what you mean by "commit" it. Do you mean "merge"? I see a button for that but it says I'm not authorized. |
Yes, merge. Can you resolve the README.md conflict? (merge changes from trunk to your branch, pre-commit run -a, add the generated changes) I'll check with Chandler if he had an intent behind restricting merges to the core team and mtainainers. |
Ok, looks like I've fixed the conflict in the README. Please go ahead and merge. |
I think you should have access to merge now: can you please try, to verify? (not quite set up how I'd like long-term, but should be good enough for now) |
I don't see a button for merge on this github.com page. How do I merge this PR into trunk? |
Yeah, the merge button appeared! |
👍 |
- [Proposal PR](#162) - [RFC topic](https://forums.carbon-lang.dev/t/rfc-basic-syntax/142) - [Decision request](https://forums.carbon-lang.dev/t/request-for-decision-basic-syntax-162/165) - [Decision announcement](https://forums.carbon-lang.dev/t/accepted-basic-syntax-162/170) Tracking issues filed as part of decision: - Should there be a function type? #191 - Should types be values? #192 Finalized on 2020-11-24
I just noticed the executable syntax code hasn't made it to a separate PR. Would it help if I pull that code together for commit? |
Hi Jon, that would be great. The code that exactly corresponds to the basic syntax proposal is no longer on github... I removed it as per your request. It's on my hard drive. How would you like me to communicate that to you? P.S. The executable-semantics branch on my fork of carbon-lang has a bunch of extra stuff, experiments, that were not in the basic syntax proposal. |
I've done a first pass pulling in the code from the previous state of this PR: I'll work on cleaning that up (still plenty to do, it's not compiling for me at present). If you think it's worth pulling in the rest at the same time, an easy way to share it would be to commit it to a branch on your fork and/or make a PR (could even close the PR once done). I could grab it via GitHub via either of those. An alternative would be that, once I get that cleaned up for commit, maybe it'll be easier for you to contribute more incrementally? Perhaps we can figure that out once this first lump is in? |
Ahh, good idea to grab the code from the previous state. I think it will be easier to go with the incremental approach. |
* first draft of proposal for basic syntax * rename proposal * formating * fixing typos * precendence and associativity * minor edit * added abstract syntax * some revisions based on feedback from meeting today * optional return type * oops, not optional for function declarations * replacing abbreviations with full names * name changes * Update proposals/p0162.md Co-authored-by: Dmitri Gribenko <[email protected]> * removed = from precedence table * for fun decl, back to optional return type, shorthand for void return * more fiddling with return types * change base case of statement_list to empty * added pattern non-terminal * added expression style function definitions * change && to and, || to or * change and to have equal precedence * updating text to match grammar, fix typo in grammar regarding pattern * adding trailing comma thing to tuples * removed 'alt' keyword, not necessary * flipped expression and pattern * added period for named arguments and documented the reason * change alternative syntax to use tuple instead of expression * comment about abstract syntax * code block language annotations * added alternative designs, some cleanup for pre-commit * spell checking * filled out the TOC * minor edits * minor edit * trying to fix pre-commit error * changes from pre-commit? * changes based on meeting today * describe alternatives regarding methods * more rationale in discussion of alternatives * addressing comments * typo fix, added text about next steps * removed *, changed a ! to not * Update proposals/p0162.md Co-authored-by: Geoff Romer <[email protected]> * edits from feedback * pre-commit * added second reason for period in field initializer * added executable semantics, fixed misunderstanding regarding associativity * update README * added a paragraph about tuples and tuple types * Update proposals/p0162.md Co-authored-by: josh11b <[email protected]> * addressing comments from josh11b * Update executable-semantics/README.md Co-authored-by: Dmitri Gribenko <[email protected]> * link to implicit parameters in generics proposal * added non-terminal for designator as per geoffromer's suggestion * changed handling of tuples in function call and similar places as per josh11b and zygoloid * moving code to separate PR Co-authored-by: Dmitri Gribenko <[email protected]> Co-authored-by: Geoff Romer <[email protected]> Co-authored-by: josh11b <[email protected]>