Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic Syntax #162

Merged
merged 59 commits into from
Dec 1, 2020
Merged

Basic Syntax #162

merged 59 commits into from
Dec 1, 2020

Conversation

jsiek
Copy link
Contributor

@jsiek jsiek commented Sep 19, 2020

@googlebot googlebot added the cla: yes PR meets CLA requirements according to bot. label Sep 19, 2020
@jsiek jsiek added proposal A proposal WIP labels Sep 19, 2020
@jsiek jsiek changed the title first draft of proposal for basic syntax Basic Syntax Sep 20, 2020
proposals/p0162.md Outdated Show resolved Hide resolved
Copy link
Contributor

@chandlerc chandlerc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still want to go through in more detail, but I had a few minor notes I captured as we were looking at this during the meeting this week and didn't want them to get lost so posting them right away.

proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
Copy link
Contributor

@josh11b josh11b left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take my comments as a starting point for discussion rather than demands. I think I have formed a picture in my head of where I thought the syntax was going and this PR has been great for (a) firming up those ideas, and (b) helping me examine the places where there are weaknesses. I fully expect there are others on the team who have different ideas on specific syntax points, and we have mostly been postponing resolving those differences until we have agreed on more of the rest of the design. I think this is a great forum where we can hash out things in a very concrete way.

proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
proposals/p0162.md Outdated Show resolved Hide resolved
Comment on lines +159 to +166
The grammar rule

```Bison
expression: "fnty" tuple return_type
```

is for function types. They are meant to play the role that function pointers
play in C and that `std::function` plays in C++.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's premature to include this. If the rest of the proposal doesn't depend on it, can we remove function types for now and reintroduce them in a later proposal if needed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with removing function types is that it significantly reduces the expressiveness of this little language. You would no longer be able to put functions inside of tuples and you would no longer be able to pass functions as arguments or return values of other functions. In other words, functions won't be "first class" values. That will reduce the scope of examples that we'll be able to handle with this subset. I'd prefer that we put functions types in right away, and then when we have a better alternative, it can be swapped in.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with including them is that (from my perspective) we often care more about an overload set than a specific function signature. Having machinery about function types sets us down the path of re-introducing that mistake.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracking issue for this question: #191

Comment on lines +271 to +272
precedence. Proposal 168 differs in that the operator groups are partially
ordered instead of being totally ordered.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think that necessarily follows. We won't be able to use bison's operator precedence parsing system. But you can implement operator precedence in other ways, for example using the precedence-climbing method (https://en.wikipedia.org/wiki/Operator-precedence_parser#Precedence_climbing_method) -- that's how the C++ specification describes operator precedence, for example -- and I would expect bison to be able to handle that. And I think the precedence-climbing method does support partial precedence orders such as the one suggested by #168. It also has the advantage of removing ambiguities from the formal grammar rather than doing so in a separate set of rules -- for that reason I'd prefer that our formal specification uses that formulation or one like it rather than a precedence table even if we don't use a partial precedence ordering.

Comment on lines +317 to +334
struct Expression {
ExpressionKind tag;
union {
struct { string* name; } variable;
struct { Expression* aggregate; string* field; } get_field;
struct { Expression* aggregate; Expression* offset; } index;
struct { string* name; Expression* type; } pattern_variable;
int integer;
bool boolean;
struct { vector<pair<string,Expression*> >* fields; } tuple;
struct {
Operator operator_;
vector<Expression*>* arguments;
} primitive_op;
struct { Expression* function; Expression* argument; } call;
struct { Expression* parameter; Expression* return_type;} function_type;
} u;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to prefer a union here over a class hierarchy? I think it would be clearer for expositionary purposes to use a class hierarchy.

In the executable semantics, I don't think you ever mutate the tag of an expression after creating it, and generally treating AST nodes as immutable seems preferable, especially in the executable semantics.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of enums and unions goes well with the use of switch statements to express dispatching on the kind of AST node. I much prefer switch statements to virtual functions for that purpose, because 1) it keeps the code together instead of spreading it out over many classes, 2) it is easier to access variables that are defined in the scope enclosing the switch statement, and 3) there is less extra boilerplate code compared to virtual functions.

Why do you think that a class hierarchy is clearer for expository purposes? Is it that you think that the readers will be more used to thinking in terms of object-oriented ways to organize code, or do you have other reasons in mind?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class hierarchy approach would avoid needing pointers everywhere (presumably done to keep the union members trivial) and would avoid the .u.foos all over the place. The fact that there's a "tag" field isn't part of the semantic model, it's an implementation detail, and one that we could hide with a class hierarchy approach.

With a small helper utility you can express dispatch on the node kind as something like:

visit(expr)
  .Case([&] (VariableExpr &v) { ... handle variable case ... })
  .Case([&] (GetFieldExpr &g) { ... handle get_field case ... })
  .Default([&] { ... handle default case ... });

... which I think would handle your concerns (1), (2), and (3).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh, Smalltalk style control flow. Fun! Yes, I can see that working just fine.
Does this approach provide exhaustiveness checking for the cases? That's pretty important
when adding new features to a language.
The approach you're proposing has the great advantage of being more type safe than the
old-school enum-struct-union approach.
The down side is that it adds some complexity to the code which affects readability and
reasoning about the semantics... e.g. one needs to completely understand order of evaluation
of the code to understand what the executable semantics is saying about the language being defined.
In this case I think the added complexity is relatively small so that's not a big down-side.

Regarding the pointers everywhere, yes, that was to keep the members trivial, which
was forced on me by Bison. I'd love to have a solution for that.

proposals/p0162.md Show resolved Hide resolved
Comment on lines +470 to +471
The `pattern` non-terminal is defined in terms of `expression`. It could instead
be the other way around, defining `expression` in terms of `pattern`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a specification perspective, I'd prefer that we define a parameterized grammar. One way to do this: add an optional suffix to each production, and say that where the suffix is omitted, the implication is that the production is the same for all possible suffixes. So:

patexp ::= patexp [ patexp ]
...
patexpP ::= patexpP : identifier
pattern ::= patexpP
expression ::= patexpE

where

patexp ::= patexp [ patexp ]

is shorthand for

patexpP ::= patexpP [ patexpP ]
patexpE ::= patexpE [ patexpE ]

The ECMA-262 specification does something like this, though their approach is a bit more explicit.

The grammar rule

```Bison
expression: "fnty" tuple return_type
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed issue #188 "Repeated 'fnty' in curried function types".

@jonmeow jonmeow added proposal accepted Decision made, proposal accepted and removed comment deadline labels Nov 10, 2020
This was referenced Nov 10, 2020
Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've filed #191 and #192 on behalf of the core team to track some important open questions arising from this proposal. However, please go ahead and commit this proposal as-is. #190 and opening tracking issues should be able to address any remaining open questions.

Comment on lines +159 to +166
The grammar rule

```Bison
expression: "fnty" tuple return_type
```

is for function types. They are meant to play the role that function pointers
play in C and that `std::function` plays in C++.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tracking issue for this question: #191

@jonmeow
Copy link
Contributor

jonmeow commented Nov 30, 2020

@jsiek This PR would have been fine to commit after being accepted. Can you commit it?

@jsiek
Copy link
Contributor Author

jsiek commented Nov 30, 2020

@jonmeow I'm not sure what you mean by "commit" it. Do you mean "merge"? I see a button for that but it says I'm not authorized.

@jonmeow
Copy link
Contributor

jonmeow commented Nov 30, 2020

Yes, merge. Can you resolve the README.md conflict? (merge changes from trunk to your branch, pre-commit run -a, add the generated changes) I'll check with Chandler if he had an intent behind restricting merges to the core team and mtainainers.

@jsiek
Copy link
Contributor Author

jsiek commented Nov 30, 2020

Ok, looks like I've fixed the conflict in the README. Please go ahead and merge.

@jonmeow
Copy link
Contributor

jonmeow commented Nov 30, 2020

I think you should have access to merge now: can you please try, to verify? (not quite set up how I'd like long-term, but should be good enough for now)

@jsiek
Copy link
Contributor Author

jsiek commented Dec 1, 2020

I don't see a button for merge on this github.com page. How do I merge this PR into trunk?

@jsiek jsiek merged commit e07d945 into carbon-language:trunk Dec 1, 2020
@jsiek
Copy link
Contributor Author

jsiek commented Dec 1, 2020

Yeah, the merge button appeared!

@jonmeow
Copy link
Contributor

jonmeow commented Dec 1, 2020

👍

jonmeow added a commit that referenced this pull request Dec 1, 2020
- [Proposal PR](#162)
- [RFC topic](https://forums.carbon-lang.dev/t/rfc-basic-syntax/142)
- [Decision request](https://forums.carbon-lang.dev/t/request-for-decision-basic-syntax-162/165)
- [Decision announcement](https://forums.carbon-lang.dev/t/accepted-basic-syntax-162/170)

Tracking issues filed as part of decision:

- Should there be a function type? #191
- Should types be values? #192

Finalized on 2020-11-24
@jonmeow
Copy link
Contributor

jonmeow commented Jan 13, 2021

I just noticed the executable syntax code hasn't made it to a separate PR. Would it help if I pull that code together for commit?

@jsiek
Copy link
Contributor Author

jsiek commented Jan 13, 2021

Hi Jon, that would be great. The code that exactly corresponds to the basic syntax proposal is no longer on github... I removed it as per your request. It's on my hard drive. How would you like me to communicate that to you?

P.S. The executable-semantics branch on my fork of carbon-lang has a bunch of extra stuff, experiments, that were not in the basic syntax proposal.

@jonmeow
Copy link
Contributor

jonmeow commented Jan 13, 2021

I've done a first pass pulling in the code from the previous state of this PR:
#237

I'll work on cleaning that up (still plenty to do, it's not compiling for me at present). If you think it's worth pulling in the rest at the same time, an easy way to share it would be to commit it to a branch on your fork and/or make a PR (could even close the PR once done). I could grab it via GitHub via either of those. An alternative would be that, once I get that cleaned up for commit, maybe it'll be easier for you to contribute more incrementally? Perhaps we can figure that out once this first lump is in?

@jsiek
Copy link
Contributor Author

jsiek commented Jan 13, 2021

Ahh, good idea to grab the code from the previous state.

I think it will be easier to go with the incremental approach.
Thanks!

chandlerc pushed a commit that referenced this pull request Jun 28, 2022
* first draft of proposal for basic syntax

* rename proposal

* formating

* fixing typos

* precendence and associativity

* minor edit

* added abstract syntax

* some revisions based on feedback from meeting today

* optional return type

* oops, not optional for function declarations

* replacing abbreviations with full names

* name changes

* Update proposals/p0162.md

Co-authored-by: Dmitri Gribenko <[email protected]>

* removed = from precedence table

* for fun decl, back to optional return type, shorthand for void return

* more fiddling with return types

* change base case of statement_list to empty

* added pattern non-terminal

* added  expression style function definitions

* change && to and, || to or

* change  and  to have equal precedence

* updating text to match grammar, fix typo in grammar regarding pattern

* adding trailing comma thing to tuples

* removed 'alt' keyword, not necessary

* flipped expression and pattern

* added period for named arguments and documented the reason

* change alternative syntax to use tuple instead of expression

* comment about abstract syntax

* code block language annotations

* added alternative designs, some cleanup for pre-commit

* spell checking

* filled out the TOC

* minor edits

* minor edit

* trying to fix pre-commit error

* changes from pre-commit?

* changes based on meeting today

* describe alternatives regarding methods

* more rationale in discussion of alternatives

* addressing comments

* typo fix, added text about next steps

* removed *, changed a ! to not

* Update proposals/p0162.md

Co-authored-by: Geoff Romer <[email protected]>

* edits from feedback

* pre-commit

* added second reason for period in field initializer

* added executable semantics, fixed misunderstanding regarding associativity

* update README

* added a paragraph about tuples and tuple types

* Update proposals/p0162.md

Co-authored-by: josh11b <[email protected]>

* addressing comments from josh11b

* Update executable-semantics/README.md

Co-authored-by: Dmitri Gribenko <[email protected]>

* link to implicit parameters in generics proposal

* added non-terminal for designator as per geoffromer's suggestion

* changed handling of tuples in function call and similar places as per josh11b and zygoloid

* moving code to separate PR

Co-authored-by: Dmitri Gribenko <[email protected]>
Co-authored-by: Geoff Romer <[email protected]>
Co-authored-by: josh11b <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes PR meets CLA requirements according to bot. proposal accepted Decision made, proposal accepted proposal rfc Proposal with request-for-comment sent out proposal A proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants