Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to make expressions multiline based on number of subexpressions rather than character length #1143

Closed
3 of 5 tasks
Fizzixnerd opened this issue Sep 18, 2020 · 8 comments

Comments

@Fizzixnerd
Copy link
Contributor

I propose we add options for controlling when to make certain constructs multiline based on the logical size of those constructs (as opposed to their character width). See below for examples.

The existing way of Fantomas deals with this problem is using character width as the primary method of determining when to split expressions on multiple lines.

Pros and Cons

The advantages of making this adjustment to Fantomas are that code becomes more consistent stylistically, especially for similarly structured expressions:

// input
let xs = [ 1; 2; 3; 4; 5; 6 ]
let ys = [ "one"; "two"; "three"; "four"; "five"; "six" ]
let zs = [ 1 .. 10 ]
ys |> List.zip zs |> List.reverse |> someFunctionExpr
ys |> List.zip xs |> List.head |> someFunctionExpr

// current Fantomas output
let xs = [ 1; 2; 3; 4; 5; 6 ]

let ys =
    [ "one"
      "two"
      "three"
      "four"
      "five"
      "six" ]

let zs = [ 1 .. 10 ]

ys
|> List.zip zs
|> List.reverse
|> someFunctionExpr

ys |> List.zip xs |> List.head |> someFunctionExpr

// proposed Fantomas output
let xs =
    [ 1
      2
      3
      4
      5
      6 ]

let ys =
    [ "one"
      "two"
      "three"
      "four"
      "five"
      "six" ]

let zs = [ 1 .. 10 ]

ys
|> List.zip zs
|> List.reverse
|> someFunctionExpr

ys
|> List.zip xs
|> List.head
|> someFunctionExpr

I (and the people I work with) believe these "parallel expressions" being formatted differently due to line length makes the code harder to understand on a structural level.

The disadvantages of making this adjustment to Fantomas are the added complexity of supporting more different output formats, as well as the configuration options to enable the alternative formatting behavior.

Examples

Records

Desired Formatting

// desired to be single line, since only one field
type A = { x: int }

let a = { x = 0 }

// desired to be multiline, since multiple fields
type B =
    { y: int
      z: string }

// desired to be multiline, since multiple fields
let b =
   { y = 0
     z = "" }

// desired to be single line, even though long because single field
let a' = { a with x = AReallyLongExpressionThatIsMuchLongerThan50Characters }

// desired to be multiline, even though short because multiple fields
let b' =
    { b with
          y = b.y + 1
          z = "hello" }

Current Formatting (fantomas online)

// desired to be single line, since only one field
type A = { x: int }

let a = { x = 0 }

// desired to be multiline, since multiple fields
type B = { y: int; z: string }

// desired to be multiline, since multiple fields
let b = { y = 0; z = "" }

// desired to be single line, even though long because single field
let a' =
    { a with
          x = AReallyLongExpressionThatIsMuchLongerThan50Characters }

// desired to be multiline, even though short because multiple fields
let b' = { b with y = b.y + 1; z = "hello" }

Rules applied: If the number of fields in the expression/statement is > 1 then force a multiline version. If the number of fields in the expression/statement is = 1 then force a single line version, unless exceeding the maximum line length.

Note: I don't think the first line break in the a' expression is part of the issue I'm talking about. It's more the line break within the record expression. I am unsure at this point whether they are related -- it may be that Fantomas forces a line break there when making the record expression multiline.

Lists and Arrays

Desired Formatting

// oneline preferred because only one element
let xs = [ AReallyLongExpressionThatIsMuchLongerThan50Characters ]

// multiline preferred because multiple elements
let ys =
    [ aShortExpression
      anotherOne ]

// and the same for arrays ...

Current Formatting

// oneline preferred because only one element
let xs =
    [ AReallyLongExpressionThatIsMuchLongerThan50Characters ]

// multiline preferred because multiple elements
let ys = [ aShortExpression; anotherOne ]

// and the same for arrays ...

Rules applied: If the number of elements in the expression is > 1 then force a multiline version. If the number of elements in the expression = 1 then force a single line version, unless exceeding the maximum line length.

(Pipeline-like) Infix Operator Expressions

Desired Formatting

// multiline even though short, because of multiple infix operators
x
|> f
|> g
|> anotherOne

// singleline even though long, because of single infix operator
x |> AReallyLongExpressionThatIsMuchLongerThan50Characters

Current Formatting

// multiline even though short, because of multiple infix operators
x |> f |> g |> anotherOne

// singleline even though long, because of single infix operator
x
|> AReallyLongExpressionThatIsMuchLongerThan50Characters

Rules applied: If the number of (pipeline) operators (i.e., |>, >>, >>=, etc) is > 1 then foce a multiline version. If the number of (pipeline) operators is = 1 then force a single line version, unless exceeding the maximum line length

Note: This does not have to be restricted to pipeline operators, but it does need to exclude the logical operators and the mathematical operators at the very least. Nobody wants to see

x
+ y
+ z

More...

Fluent API calls, maybe more...

Extra information

Estimated cost (XS, S, M, L, XL, XXL): For each instance, S-M. Overall, L.

Related suggestions:

Affidavit (please submit!)

Please tick this by placing a cross in the box:

Please tick all that apply:

  • This is not a breaking change to Fantomas
  • I or my company would be willing to help implement and/or test this
  • This suggestion is part of the Microsoft style guide (please add a link to section if so)
  • This suggestion is part of the G-Research style guide (please add a link to section if so)
@nojaf
Copy link
Contributor

nojaf commented Sep 19, 2020

Hello @Fizzixnerd, thanks for this extensive description for this feature request.
I think this might be easiest to tackle as three related requests, so I'll give my feedback on the first topic.
The main challenge with this request as a whole is that it will touch multiple existing settings and one will have to out rule the other. This would be the first time Fantomas has such a notion.

Lists and Arrays

Imagine we name the new setting to format according to the rule you explained: fsharp_multiple_elements_in_array_or_list_make_multiline. (We can pick another name but just for explanation sake).

In the example of let a = [1;2], three settings are already determining the output of this.
There is fsharp_max_value_binding_width to determine if there should be a new line and indent after the = sign.
Then fsharp_max_array_or_list_width checks if the list should be formatted as a one-liner or each element on a new line.
In the case of multiline list, fsharp_multiline_block_brackets_on_same_column determines how it looks like.

fsharp_multiple_elements_in_array_or_list_make_multiline would in that example take over of fsharp_max_array_or_list_width I think.

let a =
    [ 1
      2 ]

But in order to respect the other rules:

let v = [ "looooooooooooooooooooooooonnnnnnnnnnnnnngString" ]

should still be formatted as

let v =
    [ "looooooooooooooooooooooooonnnnnnnnnnnnnngString" ]

So that is the first thing I want to point out. The scope of the impact should be clear before implementing.

What happens when lists are passed as arguments?

List.map (fun x -> x * x) [1;2;] 

does that look like this then?

List.map (fun x -> x * x)
    [ 1
      2 ]

And when it comes to arguments there is also another thing to consider: Elmish support.
Any expression that makes an Elmish shape will be formatted differently.

For example:

p [] [str "x"; str "y"]

should be

p [] [
    str "x"
    str "y"
]

Even though, that didn't cross the fsharp_max_elmish_width. So some people will be confused here. My p expression was still short enough to have in one line, but fsharp_multiple_elements_in_array_or_list_make_multiline dictates that the second list should be multiline.

So in conclusion, if we want to introduce fsharp_multiple_elements_in_array_or_list_make_multiline, we should carefully considerate when it overrules what other settings.

@Fizzixnerd
Copy link
Contributor Author

Imagine we name the new setting to format according to the rule you explained: fsharp_multiple_elements_in_array_or_list_make_multiline. (We can pick another name but just for explanation sake).

Should there also be a second setting (such as fsharp_multiple_elements_in_array_or_list_make_multiline_cutoff or whatever,) that determines how many elements is "too many" to make single line? This would further complicate things, is not required for our use-case (we would choose 1), and could be added at a later date, so I'm tempted to say no at this point.

In the example of let a = [1;2], three settings are already determining the output of this.
There is fsharp_max_value_binding_width to determine if there should be a new line and indent after the = sign.
Then fsharp_max_array_or_list_width checks if the list should be formatted as a one-liner or each element on a new line.
In the case of multiline list, fsharp_multiline_block_brackets_on_same_column determines how it looks like.

fsharp_multiple_elements_in_array_or_list_make_multiline would in that example take over of fsharp_max_array_or_list_width I think.

This is what I was thinking as well. I was wondering if it would be more future-proof and easier to work with if the option to enable this was actually a string-enumeration, such as:

fsharp_array_or_list_multiline_formatter = "character_count" | "number_of_elements"

with the default being "character_count". This way, if a later person wished to add a similar type of change as an option, there wouldn't be a bool-precedence problem to worry about, and wouldn't lead to a proliferation of new config option labels.

let a =
    [ 1
      2 ]

But in order to respect the other rules:

let v = [ "looooooooooooooooooooooooonnnnnnnnnnnnnngString" ]

should still be formatted as

let v =
    [ "looooooooooooooooooooooooonnnnnnnnnnnnnngString" ]

So that is the first thing I want to point out. The scope of the impact should be clear before implementing.

What happens when lists are passed as arguments?

List.map (fun x -> x * x) [1;2;] 

does that look like this then?

List.map (fun x -> x * x)
    [ 1
      2 ]

This:

List.map (fun x -> x * x)
    [ oneTwoThreeFourFiveSix
      sevenEightNineTen ]

is how Fantomas currently formats a "too long" equivalent expression. In the interest in changing as little as possible about Fantomas (at least in a single request ;) ), I think what you have given above is perfectly fine, and consistent with the rest of Fantomas.

And when it comes to arguments there is also another thing to consider: Elmish support.
Any expression that makes an Elmish shape will be formatted differently.

For example:

p [] [str "x"; str "y"]

should be

p [] [
    str "x"
    str "y"
]

Even though, that didn't cross the fsharp_max_elmish_width. So some people will be confused here. My p expression was still short enough to have in one line, but fsharp_multiple_elements_in_array_or_list_make_multiline dictates that the second list should be multiline.

I agree this could become confusing, but I wonder if this isn't exactly the same thing as the fsharp_max_value_binding_width instance above, in that first we check the (value-binding | elmish)-width and make a decision of whether to multiline that, and then we decide whether to multiline the subexpressions based on their own rules. I may be oversimplifying something here, however, and would like to know if I'm understanding correctly.

My personal opinion is that what you have above is "correct" in the sense that that is how I would expect it to be formatted.

So in conclusion, if we want to introduce fsharp_multiple_elements_in_array_or_list_make_multiline, we should carefully considerate when it overrules what other settings.

I agree, though I believe this override logic should be made as explicit as possible in the config file itself (i.e., by a string-enumeration I suggested above, or some other method), otherwise it could lead to very complicated rules that are difficult to understand/use for users and difficult to debug for developers.

@nojaf
Copy link
Contributor

nojaf commented Oct 2, 2020

Sorry, I linked the wrong issue in another PR. My bad.

@nojaf
Copy link
Contributor

nojaf commented Oct 30, 2020

Hey @Fizzixnerd, were you able to give the beta a try? Any feedback?

One thing I noticed was when using fsharp_multiline_infix_multiline_formatter=number_of_items is is a bit weird that other operators are go multiline because of character width and multiline ones do not.
My proposal would here be to go multiline as well when the fsharp_max_infix_operator_expression is reached for multiline operator expressions. Now we only do this is the max_width boundary is crossed.
I hope that explanation makes sense ;) what do you think?

@Fizzixnerd
Copy link
Contributor Author

Hi @nojaf,

Here are some repros.

Settings:

[*.fs]
fsharp_multiline_infix_multiline_formatter=number_of_items
fsharp_record_multiline_formatter=number_of_items
fsharp_array_or_list_multiline_formatter=number_of_items
fsharp_max_value_binding_width=120
fsharp_max_function_binding_width=120
fsharp_max_elmish_width=120
fsharp_single_argument_web_mode=true
fsharp_align_function_signature_to_indentation=true

Case 1: Multiline-formatting when it should be single-line

a <| ffffffffffffffffffffffffffffffffffffffffffffffff

becomes

a
<| ffffffffffffffffffffffffffffffffffffffffffffffff

fantomas-online

Believe this is happening because it is over 50 characters long. I believe this is doing what you explained above in your comment as wanting to do, but it seems it is already being done. The above expression is 53 characters long, and a little shorter means it isn't line-broken.

Changing a setting to

fsharp_max_infix_operator_expression=120

gives the behavior I would like, but I don't really think multiline infix operator expressions should be split on the non-multiline width by default. We can discuss this with more examples, if you'd like.

Case 2: Mixed expressions

This may actually be the same issue, since it has an identical fix of changing the above setting, but it seemed different at the time I discovered it, so I'll add it here just in case.

fantomas-online

sprintf "DC-%s" (randomNumberGenerator.NextDouble() * double BigNumber.MaxValue |> uint64)

becomes

sprintf
    "DC-%s"
    (randomNumberGenerator.NextDouble()
     * double BigNumber.MaxValue
     |> uint64)

I think it's again splitting on character count, which is undesirable in my mind.

Conclusions

At first I didn't know that changing the setting could recover the behavior I wanted. I can see that if we change the default behavior to what I want it to be, there will be no way to recover the alternative behavior you think is better (that is, splitting on multiline with character count as well). However, keeping it the way it is, I just need to change a setting to get what I want. On the other hand, I think having this setting act slightly differently from all the other number_of_items settings is probably not the best, and the fix of changing a completely different setting seems a bit obscure. It is possible this could be alleviated by documentation, but it just seems sort of off to me.

Since I can recover the behaviour needed for our use cases in either case, I am not super picky, just adding my two cents. Please let me know what you think.

@nojaf
Copy link
Contributor

nojaf commented Nov 8, 2020

Hey Matt, the more I think about it the more I think we are creating a monster here.

Case 1

This is actually correct behaviour, as <| is not part of our current set of let internal newLineInfixOps = set [ "|>"; "||>"; "|||>"; ">>"; ">>=" ] newline operators.
Unfortunately, this can open the discussion of which operators are multiline and which are not.
I have no interest in having those kinds of discussions in this project and wish to rely on the guidance of the style guides when it comes to this.
The set we have today is a legacy set and the debates should be brought up in the style guide spaces.

Case 2

This is also correct behaviour as from AST point of view it can be seen as expr1 |> expr2. expr1 is multiline so it makes the entire expression multiline.

See fsharp_max_infix_operator_expression=65

Conclusions

I recently wrote a good portion of our Contribution Guidelines and the more I think about it this last multiline infix setting just fits the section of What are we not looking for?. There is too much ambiguity in it and I can't say for sure who is really going to benefit from this (besides yourself).

The lack of style guide input is a red flag and I don't want this anymore in the project.
fsharp_record_multiline_formatter and fsharp_array_or_list_multiline_formatter can stay as these are well defined and do not violate anything.
But I don't want to make fsharp_multiline_infix_multiline_formatter canon as it will act as a stylistic precedent.
Thicking of the checkbox of I or my company would be willing to help implement and/or test this should not mean a contributor gets a free pass on what gets into the project.
I'm not referring to your attitude in any way, you never claimed this whatsoever but I'm seeing more and more people suggestion something they want, offering to do the work and then they disappear with the northern sun.
At the end of the day, the sad truth is that is just me maintaining this project and is why I only to stick as close as possible to the two style guides.

I hope you can understand my decision and I will remove all code related to fsharp_multiline_infix_multiline_formatter next week. You can still see a change in behaviour of infix operator formatting in this project if it ends up in one of the two style guides first.

@Fizzixnerd
Copy link
Contributor Author

Hi @nojaf,

While I certainly disagree with your decision, I think I can understand where you are coming from. Thank you for allowing the other two contributions. I apologize for not being more available in the last few weeks, and though I was not planning on maintaining the project, I also had no intention of simply disappearing forever.

Thank you again for all of your hard work in maintaining this project.

@nojaf
Copy link
Contributor

nojaf commented Nov 9, 2020

Hi Matt, sorry again for how things transpired here. I should have handled this better from my side.
I've just closed 5 other feature requests that don't fit in the style guides. So I guess I'll be disappointing multiple people tonight.

I would also like to address that your availability was not the issue here. Apologies for expressing myself poorly.
There were always mixed feelings about messing with infix operators from my side and I should have been very vocal about this from the start. The cases you raised just showed me that I need to be very conscious about this matter.

Where does this leave us? Well, as a consolation prize I can give some pointers:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants