Basic Syntax #162

Merged: 59 commits, Dec 1, 2020
Showing changes from 14 commits

Commits
b43a9ce
first draft of proposal for basic syntax
jsiek Sep 19, 2020
53b8dd5
rename proposal
jsiek Sep 19, 2020
f048c11
formating
jsiek Sep 19, 2020
86cedc3
fixing typos
jsiek Sep 19, 2020
eed621c
precendence and associativity
jsiek Sep 21, 2020
e26781c
minor edit
jsiek Sep 21, 2020
c8d833e
added abstract syntax
jsiek Sep 22, 2020
974d91a
some revisions based on feedback from meeting today
jsiek Sep 23, 2020
2279899
optional return type
jsiek Sep 23, 2020
42068f7
oops, not optional for function declarations
jsiek Sep 23, 2020
c6daf89
replacing abbreviations with full names
jsiek Sep 24, 2020
556deab
name changes
jsiek Sep 24, 2020
0855d05
Update proposals/p0162.md
jsiek Sep 24, 2020
b02522b
removed = from precedence table
jsiek Sep 24, 2020
43c4b9f
for fun decl, back to optional return type, shorthand for void return
jsiek Sep 25, 2020
28bfaf5
more fiddling with return types
jsiek Sep 26, 2020
3ef030a
change base case of statement_list to empty
jsiek Sep 26, 2020
dd4cf52
added pattern non-terminal
jsiek Sep 27, 2020
34b50f2
added expression style function definitions
jsiek Sep 27, 2020
5f95ac3
change && to and, || to or
jsiek Sep 27, 2020
2cc4f67
change and to have equal precedence
jsiek Sep 27, 2020
e0afba2
updating text to match grammar, fix typo in grammar regarding pattern
jsiek Sep 27, 2020
fa2695c
adding trailing comma thing to tuples
jsiek Sep 27, 2020
dc80cbd
removed 'alt' keyword, not necessary
jsiek Sep 27, 2020
c611253
flipped expression and pattern
jsiek Sep 28, 2020
e780d81
added period for named arguments and documented the reason
jsiek Sep 30, 2020
24c2eac
change alternative syntax to use tuple instead of expression
jsiek Sep 30, 2020
6e8dc8a
comment about abstract syntax
jsiek Sep 30, 2020
097a10d
code block language annotations
jsiek Oct 1, 2020
c1a3e66
added alternative designs, some cleanup for pre-commit
jsiek Oct 1, 2020
c12a1e5
spell checking
jsiek Oct 1, 2020
58106f7
filled out the TOC
jsiek Oct 1, 2020
8472acf
minor edits
jsiek Oct 1, 2020
e1149fe
minor edit
jsiek Oct 1, 2020
5f6ed24
trying to fix pre-commit error
jsiek Oct 1, 2020
edc65bd
changes from pre-commit?
jsiek Oct 7, 2020
9555b53
changes based on meeting today
jsiek Oct 7, 2020
2386f28
describe alternatives regarding methods
jsiek Oct 7, 2020
246da20
more rationale in discussion of alternatives
jsiek Oct 7, 2020
0b6dd7a
addressing comments
jsiek Oct 8, 2020
4d66890
typo fix, added text about next steps
jsiek Oct 21, 2020
5918704
removed *, changed a ! to not
jsiek Oct 22, 2020
c2d963c
Update proposals/p0162.md
jsiek Oct 23, 2020
a76b975
edits from feedback
jsiek Oct 23, 2020
6aa6b6c
Merge branch 'basic-syntax' of https://github.com/jsiek/carbon-lang i…
jsiek Oct 23, 2020
9ca7249
pre-commit
jsiek Oct 23, 2020
df011b3
added second reason for period in field initializer
jsiek Oct 23, 2020
e62dac3
added executable semantics, fixed misunderstanding regarding associat…
jsiek Oct 24, 2020
0e38274
update README
jsiek Oct 24, 2020
49949e8
added a paragraph about tuples and tuple types
jsiek Oct 25, 2020
7292650
Update proposals/p0162.md
jsiek Oct 28, 2020
cb7efb6
addressing comments from josh11b
jsiek Oct 28, 2020
01045a8
Merge branch 'basic-syntax' of https://github.com/jsiek/carbon-lang i…
jsiek Oct 28, 2020
fa8a246
Update executable-semantics/README.md
jsiek Oct 28, 2020
6462c6f
link to implicit parameters in generics proposal
jsiek Oct 28, 2020
535d46f
added non-terminal for designator as per geoffromer's suggestion
jsiek Oct 29, 2020
1036bec
changed handling of tuples in function call and similar places as per…
jsiek Oct 29, 2020
2a81ff4
moving code to separate PR
jsiek Oct 30, 2020
6020b32
resolved merge conflict in proposals/README
jsiek Nov 30, 2020
373 changes: 373 additions & 0 deletions proposals/p0162.md
@@ -0,0 +1,373 @@
# Syntax of Basic Carbon Features

<!--
Part of the Carbon Language project, under the Apache License v2.0 with LLVM
Exceptions. See /LICENSE for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

[Pull request](https://github.com/carbon-language/carbon-lang/pull/0162)

## Table of contents

<!-- toc -->

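- [Problem](#problem)
- [Background](#background)
- [Proposal](#proposal)
- [Details](#details)
    - [Expressions](#expressions)
    - [Statements](#statements)
    - [Declarations](#declarations)
    - [Precedence and Associativity](#precedence-and-associativity)
    - [Abstract Syntax](#abstract-syntax)
        - [Abstract Syntax for Expressions](#abstract-syntax-for-expressions)
        - [Abstract Syntax for Statements](#abstract-syntax-for-statements)
        - [Abstract Syntax for Declarations](#abstract-syntax-for-declarations)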

<!-- tocstop -->

## Problem

The purpose of this proposal is to establish some basic syntactic
elements of the Carbon language and make sure that the grammar is
unambiguous and can be parsed by an LALR parser such as `yacc` or
`bison`. The grammar presented here has indeed been checked by
`bison`. The language features in this basic grammar include control
flow via `if` and `while`, functions, simple structures, choice, and
pattern matching. The main syntactic categories are `declaration`,
`statement`, and `expression`. Establishing these syntactic categories
should help the other proposals choose syntax that is compatible with
the rest of the language.

## Background

The grammar proposed here is based on the following proposals:

* Carbon language overview []()
* proposals for pattern matching
[#87](https://github.com/carbon-language/carbon-lang/pull/87),
* structs [#98](https://github.com/carbon-language/carbon-lang/pull/98),
* tuples [#111](https://github.com/carbon-language/carbon-lang/pull/111), and
* metaprogramming [#89](https://github.com/carbon-language/carbon-lang/pull/89).

There may be places where this grammar does not accurately capture what
was intended in those proposals, which should trigger some useful
discussion and revisions.

## Proposal

We summarize the three main syntactic categories here and define the
grammar in the next section.

* `declaration` includes function, structure, and choice definitions.

* `statement` includes local variable definitions, assignment, blocks, `if`,
`while`, `match`, `break`, `continue`, and `return`.

* `expression` plays three roles. In an initial attempt these roles
  were separate, with three different syntactic categories, but that
  led to ambiguities in the grammar. Folding them into one category
  resolved the ambiguities. The three roles will be teased apart in
  the static and dynamic semantics.

    1. The `expression` category plays the usual role of expressions
       that produce a value, such as integer literals and arithmetic
       expressions.

    2. To facilitate metaprogramming and reflection, `expression`
       also includes type expressions, such as function types
       and variables that are aliases for types.

    3. `expression` is also used for patterns, for example, in the
       `case` of a `match`, in the parameters of a function, and on
       the left-hand side of variable definitions.

The proposal also specifies the abstract syntax.

## Details

### Expressions

The following grammar defines the concrete syntax for
expressions. Below we comment on a few unusual aspects of the grammar.

    expression:
      identifier
    | expression '.' identifier
    | expression '[' expression ']'
    | expression ':' identifier
    | integer_literal
    | "true"
    | "false"
    | tuple
    | expression "==" expression
    | expression '+' expression
    | expression '-' expression
    | expression "&&" expression
    | expression "||" expression
    | '!' expression
    | '-' expression
    | expression tuple
    | "auto"
    | "fn" tuple "->" expression
    ;
    tuple:
      '(' field_list ')'
    ;
    field_list:
      /* empty */
    | field
    | field ',' field_list
    ;
    field:
      expression
    | identifier '=' expression
    ;

The grammar rule

    expression: expression ':' identifier

is for pattern variables. For example, in a variable definition such as

    var Int: x = 0;

the `Int: x` is parsed with the grammar rule for pattern variables.
In the above grammar rule, the `expression` to the left of the `:`
must evaluate to a type at compile time.
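
As a rough illustration, the AST that a parser might build for the pattern `Int: x` can be sketched in C++ as follows. The `Expression` type here is a simplified, hypothetical stand-in for the abstract syntax (owning pointers and plain members instead of a tagged union), and the helper names `MakeVariable` and `MakePatternVariable` are assumptions made for this sketch, not part of the proposal.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified, hypothetical stand-in for the proposal's Expression node.
enum class ExpressionKind { Variable, PatternVariable };

struct Expression {
  ExpressionKind tag;
  std::string name;                  // Variable or PatternVariable: the identifier
  std::unique_ptr<Expression> type;  // PatternVariable only: the type expression
};

// expression: identifier
std::unique_ptr<Expression> MakeVariable(std::string name) {
  auto e = std::make_unique<Expression>();
  e->tag = ExpressionKind::Variable;
  e->name = std::move(name);
  return e;
}

// expression: expression ':' identifier
// The left operand is an ordinary expression that must denote a type.
std::unique_ptr<Expression> MakePatternVariable(
    std::unique_ptr<Expression> type, std::string name) {
  auto e = std::make_unique<Expression>();
  e->tag = ExpressionKind::PatternVariable;
  e->name = std::move(name);
  e->type = std::move(type);
  return e;
}

// Usage: the pattern on the left-hand side of `var Int: x = 0;`.
void CheckPatternVariable() {
  auto pattern = MakePatternVariable(MakeVariable("Int"), "x");
  assert(pattern->tag == ExpressionKind::PatternVariable);
  assert(pattern->type->name == "Int");
}
```

Note that `Int` is parsed as a plain identifier here; deciding that it names a type is left to a later phase, in line with the single `expression` category.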

The grammar rule

    tuple: '(' field_list ')'

is primarily for constructing a tuple, but it is also used for
creating tuple types and tuple patterns, depending on the context in
which the expression occurs.

### Statements

The following grammar defines the concrete syntax for statements.

    statement:
      "var" expression '=' expression ';'
    | expression '=' expression ';'
    | expression ';'
    | "if" '(' expression ')' statement "else" statement
    | "while" '(' expression ')' statement
    | "break" ';'
    | "continue" ';'
    | "return" expression ';'
    | '{' statement_list '}'
    | "match" '(' expression ')' '{' clause_list '}'
    ;
    statement_list:
      statement
    | statement statement_list
    ;
    clause_list:
      /* empty */
    | clause clause_list
    ;
    clause:
      "case" expression "=>" statement
    | "default" "=>" statement
    ;

In the grammar rule for the variable definition statement

    statement: "var" expression '=' expression ';'

the left-hand-side `expression` is used as a pattern, so it would
typically evaluate to a variable pattern or some other kind of value
(such as a tuple) that contains variable patterns.

Likewise, in the rule for `case`

    clause: "case" expression "=>" statement

the `expression` is used as a pattern.


### Declarations

The following grammar defines the concrete syntax for declarations.

    declaration:
      "fn" identifier tuple return_type '{' statement_list '}'
    | "fn" identifier tuple "->" expression ';'
    | "struct" identifier '{' member_list '}'
    | "choice" identifier '{' alternative_list '}'
    ;
    return_type:
      /* empty */
    | "->" expression
    ;
    member:
      "var" expression ':' identifier ';'
    | "method" identifier expression "->" expression '{' statement_list '}'
    ;
    member_list:
      /* empty */
    | member member_list
    ;
    alternative:
      "alt" identifier expression ';'
    ;
    alternative_list:
      /* empty */
    | alternative alternative_list
    ;
    declaration_list:
      /* empty */
    | declaration declaration_list
    ;

In the grammar rule for function definitions

    declaration: "fn" identifier tuple return_type '{' statement_list '}'

the `tuple` is used as a pattern to describe the parameters of the
function, whereas the `expression` in the `return_type` must evaluate
to a type at compile time. The grammar for function definitions does
not currently include implicit parameters, but the intent is to add
them to the grammar in the future.

In the rule for field declarations

    member: "var" expression ':' identifier ';'

the `expression` must evaluate to a type at compile time.
The same is true for the `expression` in the grammar
rule for an alternative:

    alternative: "alt" identifier expression ';'

### Precedence and Associativity

The following defines the precedence and associativity of symbols
used in the grammar. The ordering is from lowest to highest
precedence. The main goal of the choices here is to stay close to
C++.

    nonassoc '{' '}'
    nonassoc ':' ','
    left "||"
    left "&&"
    left "=="
    left '+' '-'
    right '!' '*' '&'
    left '.' "->"
    nonassoc '(' ')' '[' ']'

For more information on operators and precedence, see proposal
[#168](https://github.com/carbon-language/carbon-lang/pull/168).
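
To make the effect of the table concrete, the following C++ sketch applies precedence climbing to the binary operators `||`, `&&`, `==`, `+`, and `-` and to the prefix operators `!` and `-`, and prints the resulting grouping with explicit parentheses. It is only an illustration under stated assumptions: the proposal's grammar is checked with `bison`, not with a hand-written parser; the names `BinaryPrecedence`, `Parser`, and `Parenthesize` are hypothetical; tokens must be separated by spaces; and postfix forms such as calls and indexing are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Binding strength of the left-associative binary operators in the table,
// lowest first. Returns 0 for anything that is not a binary operator.
int BinaryPrecedence(const std::string& token) {
  if (token == "||") return 1;
  if (token == "&&") return 2;
  if (token == "==") return 3;
  if (token == "+" || token == "-") return 4;
  return 0;
}

class Parser {
 public:
  // Assumption: tokens in the input are separated by whitespace.
  explicit Parser(const std::string& input) {
    std::istringstream in(input);
    for (std::string token; in >> token;) tokens_.push_back(token);
  }

  // Parses binary operators whose precedence is at least min_precedence.
  std::string ParseExpression(int min_precedence = 1) {
    std::string lhs = ParseUnary();
    while (pos_ < tokens_.size()) {
      const std::string op = tokens_[pos_];
      const int precedence = BinaryPrecedence(op);
      if (precedence == 0 || precedence < min_precedence) break;
      ++pos_;
      // Left associativity: the right operand may only contain operators
      // that bind strictly tighter than `op`.
      std::string rhs = ParseExpression(precedence + 1);
      lhs = "(" + lhs + " " + op + " " + rhs + ")";
    }
    return lhs;
  }

 private:
  // Prefix '!' and '-' are right-associative and bind tighter than any
  // binary operator handled here.
  std::string ParseUnary() {
    if (pos_ < tokens_.size() &&
        (tokens_[pos_] == "!" || tokens_[pos_] == "-")) {
      const std::string op = tokens_[pos_++];
      return "(" + op + ParseUnary() + ")";
    }
    return pos_ < tokens_.size() ? tokens_[pos_++] : "";
  }

  std::vector<std::string> tokens_;
  std::size_t pos_ = 0;
};

std::string Parenthesize(const std::string& input) {
  return Parser(input).ParseExpression();
}

void CheckPrecedence() {
  assert(Parenthesize("1 - 2 - 3") == "((1 - 2) - 3)");
  assert(Parenthesize("a || b && c == d + e") ==
         "(a || (b && (c == (d + e))))");
}
```

Because `-` is declared `left`, `1 - 2 - 3` groups as `((1 - 2) - 3)`, and because `&&` sits above `||` in the table, `a || b && c` groups as `(a || (b && c))`, which matches C++.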

### Abstract Syntax

The output of parsing is an abstract syntax tree. There are many ways
to define abstract syntax. Here we simply use C-style `struct`
definitions.


#### Abstract Syntax for Expressions

    enum ExpressionKind { Variable, PatternVariable, Dereference, Int, Bool,
                          PrimitiveOp, Call, Tuple, Index, GetField,
                          IntT, BoolT, TypeT, FunctionT, AutoT };
    enum Operator { Neg, Add, Sub, Not, And, Or, Eq };

    struct Expression {
      ExpressionKind tag;
      union {
        struct { string* name; } variable;
        struct { Expression* aggregate; string* field; } get_field;
        struct { Expression* aggregate; Expression* offset; } index;
        struct { string* name; Expression* type; } pattern_variable;
        int integer;
        bool boolean;
        struct { vector<pair<string,Expression*> >* fields; } tuple;
        struct {
          Operator operator_;
          vector<Expression*>* arguments;
        } primitive_op;
        struct { Expression* function; Expression* argument; } call;
        struct { Expression* parameter; Expression* return_type;} function_type;
      } u;
    };
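
As an illustration of how code that consumes this AST might dispatch on the `tag` field, the following C++ sketch prints a few node kinds back to concrete syntax. It uses a simplified, hypothetical `Expression` type (owning pointers and plain members rather than the union above) and covers only four of the kinds; the names `ToSource`, `MakeVariable`, `MakeInt`, `MakeGetField`, and `MakeIndex` are assumptions made for this sketch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified, hypothetical subset of the expression kinds.
enum class ExpressionKind { Variable, Int, GetField, Index };

struct Expression {
  ExpressionKind tag;
  std::string name;                       // Variable: identifier; GetField: field
  int integer = 0;                        // Int
  std::unique_ptr<Expression> aggregate;  // GetField, Index
  std::unique_ptr<Expression> offset;     // Index
};

using ExpressionPtr = std::unique_ptr<Expression>;

ExpressionPtr MakeVariable(std::string name) {
  auto e = std::make_unique<Expression>();
  e->tag = ExpressionKind::Variable;
  e->name = std::move(name);
  return e;
}

ExpressionPtr MakeInt(int value) {
  auto e = std::make_unique<Expression>();
  e->tag = ExpressionKind::Int;
  e->integer = value;
  return e;
}

// expression: expression '.' identifier
ExpressionPtr MakeGetField(ExpressionPtr aggregate, std::string field) {
  auto e = std::make_unique<Expression>();
  e->tag = ExpressionKind::GetField;
  e->aggregate = std::move(aggregate);
  e->name = std::move(field);
  return e;
}

// expression: expression '[' expression ']'
ExpressionPtr MakeIndex(ExpressionPtr aggregate, ExpressionPtr offset) {
  auto e = std::make_unique<Expression>();
  e->tag = ExpressionKind::Index;
  e->aggregate = std::move(aggregate);
  e->offset = std::move(offset);
  return e;
}

// Prints a node back to concrete syntax by switching on its tag.
std::string ToSource(const Expression& e) {
  switch (e.tag) {
    case ExpressionKind::Variable:
      return e.name;
    case ExpressionKind::Int:
      return std::to_string(e.integer);
    case ExpressionKind::GetField:
      return ToSource(*e.aggregate) + "." + e.name;
    case ExpressionKind::Index:
      return ToSource(*e.aggregate) + "[" + ToSource(*e.offset) + "]";
  }
  return "";
}

void CheckToSource() {
  auto e = MakeIndex(MakeGetField(MakeVariable("p"), "f"), MakeInt(0));
  assert(ToSource(*e) == "p.f[0]");
}
```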

The correspondence between most of the grammar rules and the abstract
syntax is straightforward. However, the parsing of the `field_list`
deserves some explanation. The fields can be labeled, with the grammar
rule

    field: identifier '=' expression

or unlabeled, with the rule

    field: expression

The unlabeled fields are given numbers (represented as strings) for
field labels, starting with 0 and going up from left to right.
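
The labeling step can be sketched in C++ as follows. This is a minimal illustration, not the proposal's implementation: `ParsedField` and `LabelFields` are hypothetical names, each parsed expression is represented by a placeholder string, and the sketch assumes that the counter advances only for unlabeled fields, which the text above does not state explicitly for lists that mix labeled and unlabeled fields.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical parse result for one `field`: an optional label plus a
// placeholder for the parsed expression (a string here, for brevity).
struct ParsedField {
  std::optional<std::string> label;
  std::string expression;
};

// Produces the (label, expression) pairs stored in a tuple node.
// Assumption: only unlabeled fields consume a number.
std::vector<std::pair<std::string, std::string>> LabelFields(
    const std::vector<ParsedField>& fields) {
  std::vector<std::pair<std::string, std::string>> labeled;
  int next_index = 0;
  for (const ParsedField& field : fields) {
    if (field.label) {
      labeled.emplace_back(*field.label, field.expression);
    } else {
      labeled.emplace_back(std::to_string(next_index++), field.expression);
    }
  }
  return labeled;
}

void CheckLabelFields() {
  // Corresponds to the field list of `(1, 2)`.
  std::vector<ParsedField> fields;
  fields.push_back(ParsedField{std::nullopt, "1"});
  fields.push_back(ParsedField{std::nullopt, "2"});
  auto labeled = LabelFields(fields);
  assert(labeled[0].first == "0");
  assert(labeled[1].first == "1");
}
```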

Regarding the rule for tuples:

    tuple: '(' field_list ')'

if the field list only has a single unlabeled item, then the parse
result for that item is returned. Otherwise a `tuple` AST node is
created containing the parse results for the fields.
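
A sketch of this rule in C++, under the same caveats as elsewhere in this document: `Node`, `Field`, `MakeInt`, and `MakeTupleOrUnwrap` are hypothetical names, the node type is reduced to integer leaves and tuples, and unlabeled fields are assumed to be numbered independently of labeled ones.

```cpp
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Simplified, hypothetical AST node: either an integer leaf or a tuple.
struct Node {
  bool is_tuple = false;
  int integer = 0;  // meaningful only when !is_tuple
  std::vector<std::pair<std::string, std::unique_ptr<Node>>> fields;
};

struct Field {
  std::optional<std::string> label;
  std::unique_ptr<Node> value;
};

std::unique_ptr<Node> MakeInt(int value) {
  auto node = std::make_unique<Node>();
  node->integer = value;
  return node;
}

// Action for `tuple: '(' field_list ')'`.
std::unique_ptr<Node> MakeTupleOrUnwrap(std::vector<Field> fields) {
  // A single unlabeled item is plain parenthesization: return the item.
  if (fields.size() == 1 && !fields[0].label) {
    return std::move(fields[0].value);
  }
  // Otherwise build a tuple node, including for the empty list `()`.
  auto tuple = std::make_unique<Node>();
  tuple->is_tuple = true;
  int next_index = 0;
  for (Field& field : fields) {
    std::string label =
        field.label ? *field.label : std::to_string(next_index++);
    tuple->fields.emplace_back(std::move(label), std::move(field.value));
  }
  return tuple;
}

void CheckTupleRule() {
  std::vector<Field> single;
  single.push_back(Field{std::nullopt, MakeInt(7)});
  auto unwrapped = MakeTupleOrUnwrap(std::move(single));  // `(7)`
  assert(!unwrapped->is_tuple && unwrapped->integer == 7);
}
```

One consequence of this rule is that `(7)` and `7` produce the same AST, while a single labeled field still yields a one-element tuple.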


#### Abstract Syntax for Statements

    enum StatementKind { ExpressionStatement, Assign, VariableDefinition,
                         Delete, If, Return, Sequence, Block, While, Break,
                         Continue, Match };

    struct Statement {
      StatementKind tag;
      union {
        Expression* exp;
        struct { Expression* lhs; Expression* rhs; } assign;
        struct { Expression* pat; Expression* init; } variable_definition;
        struct { Expression* cond; Statement* thn; Statement* els; } if_stmt;
        Expression* return_stmt;
        struct { Statement* stmt; Statement* next; } sequence;
        struct { Statement* stmt; } block;
        struct { Expression* cond; Statement* body; } while_stmt;
        struct {
          Expression* exp;
          list< pair<Expression*,Statement*> >* clauses;
        } match_stmt;
      } u;
    };
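
The `sequence` member suggests that a `statement_list` is stored as a chain of `Sequence` nodes. The following C++ sketch shows one way a parser might build such a chain. It rests on assumptions that the proposal does not spell out: the last node in the chain is marked by a null `next` pointer, and the names `MakeLeaf`, `MakeSequence`, and `Flatten`, along with the simplified `Statement` type, are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified, hypothetical statement node: a Sequence link or a leaf.
struct Statement {
  bool is_sequence = false;
  std::string text;                 // placeholder payload for a leaf statement
  std::unique_ptr<Statement> stmt;  // Sequence: the first statement
  std::unique_ptr<Statement> next;  // Sequence: the rest; null at the end
                                    // (assumption, not stated in the proposal)
};

std::unique_ptr<Statement> MakeLeaf(std::string text) {
  auto s = std::make_unique<Statement>();
  s->text = std::move(text);
  return s;
}

// Builds the right-nested chain that mirrors
//   statement_list: statement | statement statement_list
std::unique_ptr<Statement> MakeSequence(
    std::vector<std::unique_ptr<Statement>> statements) {
  std::unique_ptr<Statement> tail;  // starts as null: the end of the chain
  for (auto it = statements.rbegin(); it != statements.rend(); ++it) {
    auto link = std::make_unique<Statement>();
    link->is_sequence = true;
    link->stmt = std::move(*it);
    link->next = std::move(tail);
    tail = std::move(link);
  }
  return tail;
}

// Walks the chain and collects the leaf payloads in source order.
std::vector<std::string> Flatten(const Statement* sequence) {
  std::vector<std::string> out;
  for (; sequence != nullptr; sequence = sequence->next.get()) {
    out.push_back(sequence->stmt->text);
  }
  return out;
}

void CheckSequence() {
  std::vector<std::unique_ptr<Statement>> body;
  body.push_back(MakeLeaf("x = 1;"));
  body.push_back(MakeLeaf("return x;"));
  auto sequence = MakeSequence(std::move(body));
  assert(Flatten(sequence.get()).size() == 2);
}
```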

#### Abstract Syntax for Declarations

    struct FunctionDefinition {
      string name;
      Expression* param_pattern;
      Expression* return_type;
      Statement* body;
    };

    enum MemberKind { FieldMember };

    struct Member {
      MemberKind tag;
      union {
        struct { string* name; Expression* type; } field;
      } u;
    };

    struct StructDefinition {
      string* name;
      list<Member*>* members;
    };

    enum DeclarationKind { FunctionDeclaration, StructDeclaration,
                           ChoiceDeclaration };

    struct Declaration {
      DeclarationKind tag;
      union {
        struct FunctionDefinition* fun_def;
        struct StructDefinition* struct_def;
        struct {
          string* name;
          list<pair<string, Expression*> >* alts;
        } choice_def;
      } u;
    };