Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: fix #26137, change parsing of unary ops on parenthesized expressions #26154

Merged
merged 3 commits into from
Feb 27, 2018

Conversation

JeffBezanson
Copy link
Member

@JeffBezanson JeffBezanson commented Feb 21, 2018

Part 1: Unary operators

This addresses an ambiguity in the parsing of <unary op>( <expr>... ): is it a prefix function call to the operator, or a unary operator expression with a parenthesized argument?

Our current approach is to treat it as a unary operator expression most of the time, with a special case for +(x,y) to treat that as a 2-argument call, since calling a unary operator on a tuple is not useful, and to make method definitions on + easier. This approach has some bugs. One is that +((1,2)) is also treated as a 2-argument call. Another is that +(a=1, b=2) is not correctly parsed as a keyword argument call. Another is that -(x,y)^2 is parsed as -((x,y)^2), likely not what you want since you can't square a tuple.

Given -(x), one might wonder whether there is any difference between considering it a unary operator expression or a prefix function call. Both interpretations result in calling - with a single argument. However, there are two differences at parse time:

  • -(x)^2 needs to parse like -(x^2). Function calls have higher precedence than ^, but a leading unary operator has lower precedence.
  • Assignments stay as-is in operator expressions, but are interpreted as keyword arguments in function calls. This affects e.g. -(a=1).

This PR implements a look-ahead trick to try to improve the situation. I recommend reading the test cases to see what it does.

The basic idea is that, given -(....)^, I first parse just the parenthesized part. If it looks anything like an argument list, then this is interpreted as a prefix function call to -. If it's just a simple parenthesized expression, then it's treated as a unary operator. This also handles -((1,2)), which is now parsed as a one-argument call to - where the argument is a 2-tuple.

A weird case is -(a=1)^2, which assigns a=1 and then computes -1^2 (this is the same as what we did before).

Another weird case is -(x...)^2. This PR interprets it as a prefix function call, which seems more logical to me.

Clearly defining "looks like an argument list" is the subject of the next section.

Part 2: Parenthesized block syntax

There is a potential ambiguity between whether a parenthesized expression is a tuple ((1,2)) or a block of statements ((1;2)) when you write strange combinations of commas and semicolons. For example in 0.6 we have

julia> (1;2,)
2

julia> (1;2,4)
ERROR: syntax: unexpected semicolon in tuple

Huh?
The third commit cleans this up. One of the following must be true to trigger tuple parsing:

  • it is empty
  • there is a comma
  • it begins with x... and isn't just (x...)
  • it begins with (; (corresponding to the positional part being empty)

Anything beginning with (x; will be considered a block; you need a comma (x,;) to get tuple parsing. This could become relevant in the future if e.g. (2,; a=1) becomes valid syntax for some sort of named-tuple-like thing.

This is related to part 1 above because anything parsed as a tuple by these rules becomes an argument list in a prefix function call to the operator, while for anything else we use unary operator syntax.

Part 3: Two other misc changes

  • Add line numbers to blocks written as (a; b; c)
  • Preserve empty parameters blocks in parsing e.g. f(x;) (somewhat in line with AST Format Cleanup #21774)

@JeffBezanson JeffBezanson added breaking This change will break code parser Language parsing and surface syntax labels Feb 21, 2018
@JeffBezanson
Copy link
Member Author

fixes #26137

@JeffBezanson JeffBezanson added the triage This should be discussed on a triage call label Feb 22, 2018
@test Meta.parse("(x;y)") == Expr(:block, :x, LineNumberNode(1,:none), :y)
@test Meta.parse("(x...;)") == Expr(:tuple, Expr(:parameters), Expr(:(...), :x))
@test Meta.parse("(;x...)") == Expr(:tuple, Expr(:parameters, Expr(:(...), :x)))
@test Meta.parse("(x...;y)") == Expr(:tuple, Expr(:parameters, :y), Expr(:(...), :x))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this require a , like (1...) now does?

Copy link
Member Author

@JeffBezanson JeffBezanson Feb 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It parses, but lowering gives unexpected semicolon in tuple in this case.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Feb 22, 2018
@StefanKarpinski StefanKarpinski added the needs news A NEWS entry is required for this change label Feb 22, 2018
@StefanKarpinski StefanKarpinski added this to the 1.0 milestone Feb 22, 2018
@JeffBezanson
Copy link
Member Author

Ok, the vast majority of this is intact, but I had to go back on one of the corner cases. The problem is a case like (x;y,)->x, where we need to parse the parenthesized part before finding out that it will become an argument list for a function. All it prevents us from doing is giving more parse errors, and writing (x; y,z) to write a block whose second statement is a tuple.

… blocks

a parenthesized expression is a tuple when at least one of these is true:

- it is empty
- there is a comma
- it begins with `x...` and isn't just `(x...)`
- it begins with `(;` (corresponding to the positional part being empty)

add line numbers to blocks written as `(a; b; c)`
@JeffBezanson JeffBezanson removed the needs news A NEWS entry is required for this change label Feb 23, 2018
@JeffBezanson JeffBezanson merged commit 1e13b12 into master Feb 27, 2018
@JeffBezanson JeffBezanson deleted the jb/fix26137 branch February 27, 2018 04:27
@sbromberger
Copy link
Contributor

sbromberger commented Feb 27, 2018

Just FYI, this appears to have broken SimpleTraits.jl, which uses <-:

@traitimpl IsAnything{X} <- (x->true)(X)

ref mauro3/SimpleTraits.jl#54

This broke sometime between 4389 and 4401.

cc @mauro3; hat tip @ararslan, @andreasnoack, @KristofferC, and @MikeInnes .

@JeffBezanson
Copy link
Member Author

Hmm, ok, but in my defense there is no <- operator...

@ararslan
Copy link
Member

Arguably SimpleTraits should just stop relying on a < -b to mean something weird.

@sbromberger
Copy link
Contributor

sbromberger commented Feb 27, 2018

This impacts ~317 dependent packages: https://juliaobserver.com/packages?dependent_id=SimpleTraits

@JeffBezanson
Copy link
Member Author

I'll fix this. There are some other less-weird regressions too.

@mauro3
Copy link
Contributor

mauro3 commented Feb 27, 2018

I can also try and find something else which looks good and parses. But if it is ok to keep as was then I will not say no.

JeffBezanson added a commit that referenced this pull request Feb 27, 2018
This affects chains of unary operators, calls, and juxtaposition,
as in `-(f)(x)`.
JeffBezanson added a commit that referenced this pull request Feb 28, 2018
This affects chains of unary operators, calls, and juxtaposition,
as in `-(f)(x)`.
mbauman added a commit that referenced this pull request Mar 5, 2018
…gnment

* master:
  Make stdlib tests runnable standalone. (#26242)
  fix unary-related parsing regressions caused by #26154 (#26239)
  Formatting changes to new SSA IR devdocs [ci skip]
  use medium code model on PPC
  `retry` should support the check function proposed in the docstring. (#26138)
  mention axes in docs instead of size (#26233)
  exclude more CI files from source distro (#25906)
  Describe three-valued logic in docstrings
  deprecate using the value of `.=`. fixes #25954 (#26088)
  backport change to make CodegenPrepare work with isNoopAddrSpaceCast
  optimize the python version of abs2 in the microbenchmarks (#26223)
  Add notes for package maintainers (#25912)
  typo
  Fix broken links to functions in the manual (#26171)
  [NewOptimizer] Track inlining info
  Begin work on new optimizer framework
  add patch to make GC address spaces work on PPC
  also backport sover patch to LLVM 4.0
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking This change will break code parser Language parsing and surface syntax
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants