Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Format Slice Expressions #5047

Merged
merged 2 commits into from
Jun 21, 2023
Merged

Format Slice Expressions #5047

merged 2 commits into from
Jun 21, 2023

Conversation

konstin
Copy link
Member

@konstin konstin commented Jun 13, 2023

This formats slice expressions and subscript expressions.

Spaces around the colons follows the same rules as black (https://black.readthedocs.io/en/stable/the_black_code_style/current_style.html#slices):

e00 = "e"[:]
e01 = "e"[:1]
e02 = "e"[: a()]
e10 = "e"[1:]
e11 = "e"[1:1]
e12 = "e"[1 : a()]
e20 = "e"[a() :]
e21 = "e"[a() : 1]
e22 = "e"[a() : a()]
e200 = "e"[a() : :]
e201 = "e"[a() :: 1]
e202 = "e"[a() :: a()]
e210 = "e"[a() : 1 :]

Comment placement is different due to our very different infrastructure. If we have explicit bounds (e.g. x[1:2]) all comments get assigned as leading or trailing to the bound expression. If a bound is missing [:], comments get marked as dangling and placed in the same section as they were originally in:

x = "x"[ # a
      # b
    :  # c
      # d
]

to

x = "x"[
    # a
    # b
    :
    # c
    # d
]

Except for the potential trailing end-of-line comments, all comments get formatted on their own line. This can be improved by keeping end-of-line comments after the opening bracket or after a colon as such but the changes were already complex enough.

I added tests for comment placement and spaces.

@konstin
Copy link
Member Author

konstin commented Jun 13, 2023

Current dependencies on/for this PR:

This comment was auto-generated by Graphite.

@konstin konstin force-pushed the format-expr-slice branch from adc3036 to aaaa79d Compare June 13, 2023 10:38
@github-actions
Copy link
Contributor

github-actions bot commented Jun 13, 2023

PR Check Results

Ecosystem

✅ ecosystem check detected no changes.

Benchmark

Linux

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.01      6.6±0.03ms     6.2 MB/sec    1.00      6.5±0.02ms     6.3 MB/sec
formatter/numpy/ctypeslib.py               1.01   1355.0±2.03µs    12.3 MB/sec    1.00   1341.3±3.36µs    12.4 MB/sec
formatter/numpy/globals.py                 1.00    129.8±0.98µs    22.7 MB/sec    1.00    130.0±1.55µs    22.7 MB/sec
formatter/pydantic/types.py                1.01      2.7±0.01ms     9.3 MB/sec    1.00      2.7±0.00ms     9.4 MB/sec
linter/all-rules/large/dataset.py          1.00     13.2±0.09ms     3.1 MB/sec    1.02     13.5±0.10ms     3.0 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      3.4±0.01ms     5.0 MB/sec    1.00      3.4±0.02ms     5.0 MB/sec
linter/all-rules/numpy/globals.py          1.00    423.7±0.90µs     7.0 MB/sec    1.00    425.8±1.00µs     6.9 MB/sec
linter/all-rules/pydantic/types.py         1.00      5.9±0.05ms     4.4 MB/sec    1.01      5.9±0.05ms     4.3 MB/sec
linter/default-rules/large/dataset.py      1.00      6.7±0.03ms     6.1 MB/sec    1.01      6.8±0.04ms     6.0 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.00   1467.2±7.27µs    11.3 MB/sec    1.01   1478.5±1.93µs    11.3 MB/sec
linter/default-rules/numpy/globals.py      1.00    164.2±0.87µs    18.0 MB/sec    1.01    166.3±0.31µs    17.7 MB/sec
linter/default-rules/pydantic/types.py     1.00      3.0±0.01ms     8.4 MB/sec    1.00      3.1±0.01ms     8.4 MB/sec

Windows

group                                      main                                   pr
-----                                      ----                                   --
formatter/large/dataset.py                 1.12     10.3±0.51ms     4.0 MB/sec    1.00      9.2±0.28ms     4.4 MB/sec
formatter/numpy/ctypeslib.py               1.07      2.1±0.08ms     8.1 MB/sec    1.00  1909.1±77.11µs     8.7 MB/sec
formatter/numpy/globals.py                 1.05    197.5±8.07µs    14.9 MB/sec    1.00    188.2±7.54µs    15.7 MB/sec
formatter/pydantic/types.py                1.05      4.1±0.14ms     6.2 MB/sec    1.00      3.9±0.13ms     6.5 MB/sec
linter/all-rules/large/dataset.py          1.00     18.6±0.39ms     2.2 MB/sec    1.00     18.6±0.40ms     2.2 MB/sec
linter/all-rules/numpy/ctypeslib.py        1.00      4.9±0.10ms     3.4 MB/sec    1.01      4.9±0.27ms     3.4 MB/sec
linter/all-rules/numpy/globals.py          1.03   616.1±19.68µs     4.8 MB/sec    1.00   595.4±21.85µs     5.0 MB/sec
linter/all-rules/pydantic/types.py         1.00      8.2±0.23ms     3.1 MB/sec    1.01      8.3±0.26ms     3.1 MB/sec
linter/default-rules/large/dataset.py      1.05      9.9±0.27ms     4.1 MB/sec    1.00      9.4±0.25ms     4.3 MB/sec
linter/default-rules/numpy/ctypeslib.py    1.05      2.1±0.06ms     7.9 MB/sec    1.00  1998.5±69.67µs     8.3 MB/sec
linter/default-rules/numpy/globals.py      1.07    243.2±9.55µs    12.1 MB/sec    1.00    227.1±8.76µs    13.0 MB/sec
linter/default-rules/pydantic/types.py     1.08      4.5±0.17ms     5.7 MB/sec    1.00      4.2±0.12ms     6.1 MB/sec

@konstin konstin requested a review from MichaReiser June 19, 2023 06:36
@konstin konstin added the formatter Related to the formatter label Jun 19, 2023
@konstin konstin force-pushed the format-expr-slice branch from aaaa79d to 0402d6e Compare June 19, 2023 06:44
@@ -833,6 +835,87 @@ fn handle_module_level_own_line_comment_before_class_or_function_comment<'a>(
}
}

/// Handles the attaching the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems incomplete

crates/ruff_python_formatter/src/comments/placement.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/comments/placement.rs Outdated Show resolved Hide resolved
@@ -439,6 +439,14 @@ impl<'a> DecoratedComment<'a> {
pub(super) fn text_position(&self) -> CommentTextPosition {
self.text_position
}

/// Change the position
pub(super) fn change_position(self, text_position: CommentTextPosition) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think taking a &mut self would be fine here. It could simplify some of your call sites .

crates/ruff_python_formatter/src/expression/expr_slice.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/expression/expr_slice.rs Outdated Show resolved Hide resolved
Comment on lines 70 to 73
if !lower_trailing_comments.is_empty() {
// make sure the colon is after the comments
hard_line_break().fmt(f)?;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the following also work? It would be cheaper because querying comments is somewhat expensive (not very but it requires a hash map lookup).

write!(f, [group(&format_args[lower.format(), soft_line_break(), first_colon_space, text(:)])]?;

Or does that not work because we want to keep the : on the same line as lower even if it expands?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that sounds like a really complex rewrite and i'd like to avoid that 😬

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could give it a try for a single expression AND/OR try it as a separate PR.

crates/ruff_python_formatter/src/expression/expr_slice.rs Outdated Show resolved Hide resolved
crates/ruff_python_formatter/src/expression/expr_slice.rs Outdated Show resolved Hide resolved
Comment on lines 121 to 122
if !dangling.is_empty() {
// put the colon and comments on their own lines
hard_line_break().fmt(f)?;
}
dangling_comments(dangling).fmt(f)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We shouldn't call into the dangling comments formatting if we know that there are none.

Suggested change
if !dangling.is_empty() {
// put the colon and comments on their own lines
hard_line_break().fmt(f)?;
}
dangling_comments(dangling).fmt(f)?;
if !dangling.is_empty() {
// put the colon and comments on their own lines
write!(f, [hard_line_break(), dangling_comments(dangling)])?;
}

@konstin konstin requested a review from MichaReiser June 19, 2023 11:10
@konstin konstin force-pushed the format-expr-slice branch from 3b225cb to c10d30c Compare June 21, 2023 09:42
second_colon
.as_ref()
.map_or(slice_dangling_comments.len(), |second_colon| {
slice_dangling_comments[first_colon_partition_index..]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Would it make sense to already split the halves to avoid the manual slicing e.g. on line 75.

let (dangling_first_colon_comments, slice_dangling_comments) = slice_dangling_comments.split_at(first_colon_partition_index);

It also removes the need to manually add the first_colon_partition_index offset to the computed second_colon_partition_index

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, i refactored the whole splitting

Comment on lines 69 to 73
lower.format().fmt(f)?;
if comments.has_trailing_comments(lower.as_ref()) {
// make sure the colon is after the comments
hard_line_break().fmt(f)?;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. You could also use a line_suffix_boundary here. That removes the need for checking if there's a comment. A line_suffix_boundary automatically enforces a hard line break if there's any pending line_suffix (trailing comment)

Suggested change
lower.format().fmt(f)?;
if comments.has_trailing_comments(lower.as_ref()) {
// make sure the colon is after the comments
hard_line_break().fmt(f)?;
}
write!(f, [left.format(), line_suffix_boundary()])?;

if !all_simple {
space().fmt(f)?;
}
text(":").fmt(f)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't like the write macro, do you ;)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

procedural code is kinda convenient 🙈

Comment on lines +253 to +259
fn leading_comments_spacing(
f: &mut PyFormatter,
leading_comments: &[SourceComment],
) -> FormatResult<()> {
if let Some(first) = leading_comments.first() {
if first.line_position().is_own_line() {
// Insert a newline after the colon so the comment ends up on its own line
hard_line_break().fmt(f)?;
} else {
// Insert the two spaces between the colon and the end-of-line comment after the colon
write!(f, [space(), space()])?;
}
}
Ok(())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: An alternative would be to format all end_of_line comments as trailing_comments which does the whitespace normalization for you but it may be more complicated in the end

Comment on lines 107 to 111
upper.format().fmt(f)?;
if comments.has_trailing_comments(upper.as_ref()) {
// make sure the colon is after the comments
hard_line_break().fmt(f)?;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: use a line suffix boundary to avoid checking comments for trailing comments


let comments = f.context().comments().clone();
let dangling_comments = comments.dangling_comments(item.as_any_node_ref());
debug_assert!(dangling_comments.len() <= 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for this assertion? What happens if it is violated in production? Is it necessary because we use trailing_comments instead of dangling_comments during formatting?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only dangling comment a subscript can have is the one after the [, the rest has been assigned to the slice expression.

What happens if it is violated in production?

Bad, potentially unstable formatting. I can see no case where it would be violated, and there is none in the cpython code base.

@konstin konstin force-pushed the format-expr-slice branch from 9e568e0 to ad00427 Compare June 21, 2023 14:22
@konstin konstin enabled auto-merge (squash) June 21, 2023 15:03
@konstin konstin merged commit 6155fd6 into main Jun 21, 2023
@konstin konstin deleted the format-expr-slice branch June 21, 2023 15:09
@konstin konstin mentioned this pull request Jun 22, 2023
72 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
formatter Related to the formatter
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants