Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing Space Operator Precedence For Maths #10366

Closed
jdunkerley opened this issue Jun 25, 2024 · 11 comments · Fixed by #10597
Closed

Changing Space Operator Precedence For Maths #10366

jdunkerley opened this issue Jun 25, 2024 · 11 comments · Fixed by #10597
Assignees
Labels

Comments

@jdunkerley
Copy link
Member

Want to evaluate the complexity of making it so whitespace not affect the precedence of mathematical operators (arithmetic, bitwise and logical ops).

We would keep the space precedence for the other operators like . and ->.

a + b>4 . floor == ((a+b)>4).floor NOT (a+(b>4)).floor
a-> b-> a+b

Goal is to understand how hard to change on the engine/parser side and how hard to update all the library code needed.

@jdunkerley jdunkerley changed the title Space Operator Precedence Investigate How Changing Space Operator Precedence For Maths Jun 25, 2024
@JaroslavTulach JaroslavTulach moved this from ❓New to 📤 Backlog in Issues Board Jun 25, 2024
@wdanilo
Copy link
Member

wdanilo commented Jun 26, 2024

My idea would be to introduce another precedence level in parser. I don't think this involves any part of engine beside parser (please correct me if I'm wrong). Now, we hierarchical precedence discovery - first level is "spaced" and "non-spaced". Then in these groups we apply precedences of operators. We could add another group to the top level, so we will have "spaced", "non-spaced", "math" and then just resolve precedences in these groups. Just my idea, might not be the best one, but I shared it because maybe it will be useful to someone :)

@kazcw kazcw removed the -compiler label Jun 26, 2024
@kazcw
Copy link
Contributor

kazcw commented Jun 26, 2024

ExpressionInterpretation (current)Interpretation (Solution 1)Interpretation (Solution 2)

a * b+c . floor

a * ((b + c) . floor)

((a * b) + c) . floor1

a * b+c.floor

a * (b + (c . floor))

((a * b) + (c . floor))1

x+y * z

(x + y) * z

x + (y * z)1

f x+y * z

(f (x + y)) * z

f (x + (y * z))1

unchanged

f x + y

(f x) + y

f (x + y)2

unchanged

f +y

f (+y)

unchanged

f x +y

((f x) (+y))

unchanged

Footnotes

  1. In this case a warning may be emitted that spacing is inconsistent with effective precedence. 2 3 4

  2. The interpretation of this expression has changed, but no warning is emitted. This may take some getting used to for library developers.

@kazcw
Copy link
Contributor

kazcw commented Jun 26, 2024

Solution 0: Math-consistent spacing

Maintain the uniform application of space-precedence rules, but warn if the result is inconsistent with mathematical precedence.

Solution 1: High precedence math

Place math operators in a precedence level between unspaced non-math and spaced non-math.

Current logic:

  • Reduce each sequence of terms that contains no spaces.
  • Reduce the remainder.

New logic:

  • Reduce each sequence of terms that contains no spaces or symmetrically-unspaced binary mathematical operators (but if a binary math operator has an unspaced term on exactly one side, reduce it to an operator section).
  • Reduce each sequence consisting of X (F X)+ (regex notation), where X is a non-operator term and F is a binary mathematical operator.
  • Reduce the remainder.

Solution 2: Space-invariant math precedence

Redefine the effect of spacing in terms of precedence-modification.

Current logic (after refactoring):

  • When an operator is unspaced, its precedence is modified to be higher than any spaced operator.

New logic:

  • When an operator is unspaced, its precedence is modified to be higher than any spaced operator, but when comparing precedence of mathematical operators unmodified precedence is used.

Discussion

Complexity for language users:

  • Solution 0 allows the simplest language rules, but users unfamiliar with space-precedence may be surprised by a warning if they use spacing inconsistent with mathematical precedence.
  • Solution 1 can be explained in terms of a precedence hierarchy, but it would no longer be possible for any operators (or function application) to have a higher precedence than mathematical operators.
  • Solution 2 achieves space-invariant precedence within mathematical subexpressions without losing the high precedence of function application, though it requires a "modified precedence" concept that may be less intuitive than the precedence hierarchy concept of Solution 1.

Implementation

Parsing changes

  • Solution 1: 3 points
  • Solution 2: 5 points

Warnings

  • Parser: Each solution would require different logic for detecting unexpected spacing. The parser has no mechanism for emitting warnings; this would need this to be added to the JNI and JS bindings.
    • Solution 0/1: 2 points
    • Solution 2: 1 point
  • Compiler (? points): The compiler would receive a span->warnings map from the parser. I don't know whether it can make use of this data structure directly, or each warning needs to be attached to the IR with the corresponding span--the latter may take some work. The IDE doesn't need to receive these warnings from the compiler, since the IDE parses too; but the compiler should print them when run standalone, and maybe have a mode to treat parser warnings as errors when running tests etc.

Library changes

parse_all_enso_files.sh can be used to detect expressions whose interpretation have been changed. If there are few they could be updated by hand; if there are many, I could add refactoring suggestions to the warnings, and create a tool that can apply them (2 points).

Total (excluding compiler changes to print warnings)

  • Solution 0: 2 points
  • Solution 1: 7 points, assuming automated refactoring is needed
  • Solution 2: 6-8 points, depending on whether automated refactoring turns out to be needed

@kazcw kazcw modified the milestone: Future Release Jun 27, 2024
@JaroslavTulach
Copy link
Member

My favorite syntax in Enso is fib n-1 n-2 - I don't think it has any bad impact on mathematical precedence. It is natural to read it as: fib (n - 1) (n - 2). Do we foresee a solution where meaning of fib n-1 n-2 remains unchanged?

@kazcw
Copy link
Contributor

kazcw commented Jul 8, 2024

My favorite syntax in Enso is fib n-1 n-2 - I don't think it has any bad impact on mathematical precedence. It is natural to read it as: fib (n - 1) (n - 2). Do we foresee a solution where meaning of fib n-1 n-2 remains unchanged?

I don't think that will change. We are considering either grouping adjacent mathematical operations or ignoring whitespace when comparing precedence of mathematical operators, and neither would affect that.

@JaroslavTulach
Copy link
Member

mathematical operators

Another question which is not clear to me is: how do we differentiate between operators and mathematical operators? Aren't all operators the same in Enso?

@jdunkerley
Copy link
Member Author

The main ones, which I felt we should keep the space precedence on, were . and ->.
But I would guess we would also want <| and |>.

@kazcw
Copy link
Contributor

kazcw commented Jul 8, 2024

mathematical operators

Another question which is not clear to me is: how do we differentiate between operators and mathematical operators? Aren't all operators the same in Enso?

I think the distinction we would want could be described as value-level (math, logic, comparison) vs functional operators (lambda operator, application operators including space, dot operator). User-defined operators could be handled based on their precedence character, with the addition of recognizing <- and -> subsequences as indicating non-mathematical operations (as we already ascribe properties based on the presence of these sequences).

@farmaazon
Copy link
Contributor

Refinement notes:

  1. Let's add tests where we check if applying warnings' suggestions creates the same AST without warnings.
  2. As we assume to auto-fix node edits with warnings, let the warnings be called "(auto) formatting"
  3. But actually applying (or displaying) them in GUI is not in scope of this task.

@enso-bot
Copy link

enso-bot bot commented Jul 17, 2024

Keziah Wesley reports a new STANDUP for today (2024-07-16):

Progress: Implemented new precedence rules, reviewed changed parses. It should be finished by 2024-07-18.

Next Day: Next day I will be working on the #10366 task. Work on warnings.

@kazcw kazcw changed the title Investigate How Changing Space Operator Precedence For Maths Changing Space Operator Precedence For Maths Jul 18, 2024
@kazcw kazcw moved this from 📤 Backlog to 🔧 Implementation in Issues Board Jul 18, 2024
@kazcw kazcw moved this from 🔧 Implementation to 👁️ Code review in Issues Board Jul 18, 2024
@enso-bot
Copy link

enso-bot bot commented Jul 18, 2024

Keziah Wesley reports a new STANDUP for yesterday (2024-07-17):

Progress: Finished implementation; updated libraries; added new linter to run tool and CI. It should be finished by 2024-07-18.

Next Day: Next day I will be working on the #10473 task. Start on context errors

@mergify mergify bot closed this as completed in #10597 Jul 24, 2024
mergify bot pushed a commit that referenced this issue Jul 24, 2024
In a sequence of value-level operators, whitespace does not affect relative precedence. Functional operators still follow the space-precedence rules.

The "functional" operators are: `>> << |> |>> <| <<| : .`, application, and any operator containing `<-` or `->`. All other operators are considered value-level operators.

Asymmetric whitespace can still be used to form *operator sections* of value-level operators, e.g. `+2 * 3` is still equivalent to `x -> (x+2) * 3`.

Precedence of application is unchanged, so `f x+y` is still equivalent to `f (x + y)` and `f x+y * z` is still equivalent to `(f (x + y)) * z`.

Any attempt to use spacing to override value-level operator precedence will be caught by the new enso linter. Mixed spacing (for clarity) in value-operator expressions is allowed, as long as it is consistent with the precedences of the operators.

Closes #10366.

# Important Notes
Precedence warnings:
- The parser emits a warning if the whitespace in an expression is inconsistent with its effective precedence.
- A new enso linter can be run with `./run libraries lint`. It parses all `.enso` files in `distribution/lib` and `test`, and reports any errors or warnings. It can also be run on individual files: `cargo run --release --bin check_syntax -- file1 file2...` (the result may be easier to read than the `./run` output).
- The linter is also run as part of `./run lint`, so it is checked in CI.

Additional language change:
- The exponentiation operator (`^`) now has higher precedence than the multiplication class (`*`, `/`, `%`). This change did not affect any current enso files.

Library changes:
- The libraries have been updated. The new warnings were used to identify all affected code; the changes themselves have not been programmatically verified (in many cases their equivalence relies on the commutativity of string concatenation).
@github-project-automation github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board Jul 24, 2024
jdunkerley pushed a commit that referenced this issue Jul 24, 2024
In a sequence of value-level operators, whitespace does not affect relative precedence. Functional operators still follow the space-precedence rules.

The "functional" operators are: `>> << |> |>> <| <<| : .`, application, and any operator containing `<-` or `->`. All other operators are considered value-level operators.

Asymmetric whitespace can still be used to form *operator sections* of value-level operators, e.g. `+2 * 3` is still equivalent to `x -> (x+2) * 3`.

Precedence of application is unchanged, so `f x+y` is still equivalent to `f (x + y)` and `f x+y * z` is still equivalent to `(f (x + y)) * z`.

Any attempt to use spacing to override value-level operator precedence will be caught by the new enso linter. Mixed spacing (for clarity) in value-operator expressions is allowed, as long as it is consistent with the precedences of the operators.

Closes #10366.

# Important Notes
Precedence warnings:
- The parser emits a warning if the whitespace in an expression is inconsistent with its effective precedence.
- A new enso linter can be run with `./run libraries lint`. It parses all `.enso` files in `distribution/lib` and `test`, and reports any errors or warnings. It can also be run on individual files: `cargo run --release --bin check_syntax -- file1 file2...` (the result may be easier to read than the `./run` output).
- The linter is also run as part of `./run lint`, so it is checked in CI.

Additional language change:
- The exponentiation operator (`^`) now has higher precedence than the multiplication class (`*`, `/`, `%`). This change did not affect any current enso files.

Library changes:
- The libraries have been updated. The new warnings were used to identify all affected code; the changes themselves have not been programmatically verified (in many cases their equivalence relies on the commutativity of string concatenation).

(cherry picked from commit e5b85bf)
@farmaazon farmaazon moved this from 🟢 Accepted to 🗄️ Archived in Issues Board Aug 6, 2024
@jdunkerley jdunkerley moved this from 🗄️ Archived to 🟢 Accepted in Issues Board Oct 22, 2024
@AdRiley AdRiley moved this from 🟢 Accepted to 🗄️ Archived in Issues Board Oct 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: 🗄️ Archived
Development

Successfully merging a pull request may close this issue.

5 participants