Formatter hangs on deeply nested function calls #261

nkrkv · 2021-02-12T12:36:32Z

Steps to reproduce

Put the following in a file (say, Hang.res):

let foo = () =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x =>
  bar(x => x)
  ))))))))))))))

Run:

bsc -format Hang.res

Expected result

The source is indented properly and printed to stdout

Actual result

The process hangs, loads CPU to 100%, eats one GB of RAM after another, then after 1m30s or so crashes with out-of-memory.

The source (if put back into the context), compiles fine. Just formatting causes the problem.

Machine info

$ node --version                                               
v12.16.3

$ bsc -version        
BuckleScript 9.0.0 ( Using OCaml:4.06.1+BS )

$ uname -a            
Linux archxx 5.10.6-arch1-1 #1 SMP PREEMPT Sat, 09 Jan 2021 18:22:35 +0000 x86_64 GNU/Linux

The text was updated successfully, but these errors were encountered:

IwanKaramazow · 2021-02-12T12:42:22Z

Thanks for the report. I will investigate this over the weekend. This code is probably generated by a ppx?

cristianoc · 2021-02-12T12:44:37Z

Part involved: https://github.com/rescript-lang/syntax/blob/master/src/res_printer.ml#L4059

nkrkv · 2021-02-12T12:46:00Z

This code is probably generated by a ppx?

Nope, it’s hand-written (simplified version) to convert 15-tuple of Result.t to Result.t of 15-tuple

bobzhang · 2021-05-27T01:30:09Z

might be there is some exponential complexity iN the algorithm

cristianoc · 2021-05-27T03:26:24Z

We already found the issue some time back.
It is an exponential blowup.
There are 3 possible formatting s for callbacks, and the first one that fits line width is chosen.

With n nested callbacks it's 3^n cases. The cases are generated statically, and the explosion happens at case generation time.

One needs to be a little creative here. By Eg prioritize one case when there are nested callbacks. Or use some other creative solution.

leeor · 2022-06-21T09:40:15Z

Started running into this issue more now that I'm removing let-syntax from our code-base. It hits tests files pretty hard because they always involve at least two callbacks before any "real" code (describe + test boilerplate). Some of which I have to break into multiple files just because they can't be formatted.

Out of interest, how is this different from how prettier works for JS? From my experience, formatting there is super fast.

Also, I'm assuming that the algorithm formats depth-last, and if so, can it be simplified to make a decision based on the specific "call-site" instead of how it affects deeper callbacks? In other words, each callback will be formatted based on it's own location, ignoring any additional nested callbacks?

cristianoc · 2022-06-21T12:56:21Z

Found back where the relevant code is:

  (* Sometimes one of the non-callback arguments will break.
   * There might be a single line comment in there, or a multiline string etc.
   * showDialog(
   *   `
   *   Do you really want to leave this workspace?
   *   Some more text with detailed explanations...
   *   `,
   *   ~danger=true,
   *   // comment   --> here a single line comment
   *   ~confirmText="Yes, I am sure!",
   *   ~onConfirm={() => ()},
   *  )
   * In this case, we always want the arguments broken over multiple lines,
   * like a normal function call.
   *)
  if Doc.willBreak printedArgs then breakAllArgs
  else Doc.customLayout [fitsOnOneLine; arugmentsFitOnOneLine; breakAllArgs]

This will print the first case out of the 3 cases [fitsOnOneLine; arugmentsFitOnOneLine; breakAllArgs] that fits.
This is exponential on directly nested callbacks.

This is where fit is computed:

      | CustomLayout docs ->
        let rec findGroupThatFits groups =
          match groups with
          | [] -> Nil
          | [lastGroup] -> lastGroup
          | doc :: docs ->
            if fits (width - pos) ((ind, Flat, doc) :: rest) then doc
            else findGroupThatFits docs
        in
        let doc = findGroupThatFits docs in
        process ~pos lineSuffices ((ind, Flat, doc) :: rest))

cristianoc · 2022-06-21T12:58:35Z

Happy to get PRs if someone is interested in contributing.

One obvious fix is to do Doc.customLayout [fitsOnOneLine] which will, very quickly, return the solution that probably does not look very good.

Another fix is to preserve the original heuristic as much as possible, but cut down space space exploration in the nested calls (say level 2 or 3).

Another fix is to come up with a different heuristic that does not require trying so many things. Guess that corresponds to what prettier does.

leeor · 2022-06-23T08:20:03Z

Results of some tinkering with the formatter:

Vanilla:

time ./_build/install/default/bin/rescript -print res <test file>
./_build/install/default/bin/rescript -print res   23.13s user 0.75s system 95% cpu 24.929 total

Using only arugmentsFitOnOneLine:

index a2780c79e3dc..befe8d9870f8 100644
--- a/src/res_printer.ml
+++ b/src/res_printer.ml
@@ -4011,18 +4011,7 @@ and printArgumentsWithCallbackInLastPosition ~uncurried args cmtTbl =
       let argDoc = printArgument arg cmtTbl in
       loop (Doc.line :: Doc.comma :: argDoc :: acc) args
   in
-  let printedArgs, callback, callback2 = loop [] args in
-
-  (* Thing.map(foo, (arg1, arg2) => MyModuleBlah.toList(argument)) *)
-  let fitsOnOneLine =
-    Doc.concat
-      [
-        (if uncurried then Doc.text "(." else Doc.lparen);
-        printedArgs;
-        callback;
-        Doc.rparen;
-      ]
-  in
+  let printedArgs, _callback, callback2 = loop [] args in
 
   (* Thing.map(longArgumet, veryLooooongArgument, (arg1, arg2) =>
    *   MyModuleBlah.toList(argument)
@@ -4063,7 +4052,7 @@ and printArgumentsWithCallbackInLastPosition ~uncurried args cmtTbl =
    * like a normal function call.
    *)
   if Doc.willBreak printedArgs then breakAllArgs
-  else Doc.customLayout [fitsOnOneLine; arugmentsFitOnOneLine; breakAllArgs]
+  else Doc.customLayout [arugmentsFitOnOneLine]
 
 and printArguments ~uncurried
     (args : (Asttypes.arg_label * Parsetree.expression) list) cmtTbl =

Result:

time ./_build/install/default/bin/rescript -print res <test file>
./_build/install/default/bin/rescript -print res   17.10s user 0.11s system 95% cpu 17.952 total

Refactor breakAllArgs to only be computed when needed:

index a2780c79e3dc..4eaefbb9f698 100644
--- a/src/res_printer.ml
+++ b/src/res_printer.ml
@@ -4011,18 +4011,7 @@ and printArgumentsWithCallbackInLastPosition ~uncurried args cmtTbl =
       let argDoc = printArgument arg cmtTbl in
       loop (Doc.line :: Doc.comma :: argDoc :: acc) args
   in
-  let printedArgs, callback, callback2 = loop [] args in
-
-  (* Thing.map(foo, (arg1, arg2) => MyModuleBlah.toList(argument)) *)
-  let fitsOnOneLine =
-    Doc.concat
-      [
-        (if uncurried then Doc.text "(." else Doc.lparen);
-        printedArgs;
-        callback;
-        Doc.rparen;
-      ]
-  in
+  let printedArgs, _callback, callback2 = loop [] args in
 
   (* Thing.map(longArgumet, veryLooooongArgument, (arg1, arg2) =>
    *   MyModuleBlah.toList(argument)
@@ -4045,7 +4034,6 @@ and printArgumentsWithCallbackInLastPosition ~uncurried args cmtTbl =
    *   (param1, parm2) => doStuff(param1, parm2)
    * )
    *)
-  let breakAllArgs = printArguments ~uncurried args cmtTblCopy2 in
 
   (* Sometimes one of the non-callback arguments will break.
    * There might be a single line comment in there, or a multiline string etc.
@@ -4062,8 +4050,8 @@ and printArgumentsWithCallbackInLastPosition ~uncurried args cmtTbl =
    * In this case, we always want the arguments broken over multiple lines,
    * like a normal function call.
    *)
-  if Doc.willBreak printedArgs then breakAllArgs
-  else Doc.customLayout [fitsOnOneLine; arugmentsFitOnOneLine; breakAllArgs]
+  if Doc.willBreak printedArgs then printArguments ~uncurried args cmtTblCopy2
+  else Doc.customLayout [arugmentsFitOnOneLine]
 
 and printArguments ~uncurried
     (args : (Asttypes.arg_label * Parsetree.expression) list) cmtTbl =

time ./_build/install/default/bin/rescript -print res <test file>
./_build/install/default/bin/rescript -print res   0.53s user 0.01s system 42% cpu 1.262 total

Maybe what's needed here to to refactor the custom layout algorithm to be lazy?

cristianoc · 2022-06-23T09:13:47Z

That would solve the perf issue.
Another question is about design: what should the output look like. Perhaps this is a good time to re-explore a bit, make sure all the choices are intentional.
E.g. what does prettier do? And if the behaviour is different, is the difference in language explaining it?
Based on the answer, there are follow-on questions.

cristianoc · 2022-06-28T01:55:19Z

Sorry should have put these comment on the PR. Now deleted and move them there now.

cristianoc · 2022-06-28T12:09:12Z

Will go with this one and clean up the various PRs #591

A more ambitious change for the future involves making the doc language more expressive, so that the logic used to say "if X fits then do Y else do Z" can be expressed directly.

leeor · 2022-06-28T18:44:32Z

@cristianoc any chance of seeing this in a new release soon? Would be very helpful :-)

cristianoc · 2022-06-28T19:11:43Z

It's merged so, yes in the upcoming release.

nkrkv mentioned this issue Feb 12, 2021

[bug] bsc -format hangs on deeply nested function calls rescript-lang/rescript#4948

Closed

cristianoc mentioned this issue May 27, 2021

Formatting is very slow for deeply nested callbacks #426

Closed

leeor mentioned this issue Jun 23, 2022

improve printing algorithm performance for many nested callbacks #576

Closed

cristianoc closed this as completed Jun 28, 2022

cristianoc mentioned this issue Sep 20, 2022

Create a v10.0 version of docs. rescript-lang/rescript-lang.org#552

Closed

29 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Formatter hangs on deeply nested function calls #261

Formatter hangs on deeply nested function calls #261

nkrkv commented Feb 12, 2021

IwanKaramazow commented Feb 12, 2021

cristianoc commented Feb 12, 2021

nkrkv commented Feb 12, 2021

bobzhang commented May 27, 2021

cristianoc commented May 27, 2021

leeor commented Jun 21, 2022

cristianoc commented Jun 21, 2022

cristianoc commented Jun 21, 2022 •

edited

Loading

leeor commented Jun 23, 2022

cristianoc commented Jun 23, 2022

cristianoc commented Jun 28, 2022

cristianoc commented Jun 28, 2022 •

edited

Loading

leeor commented Jun 28, 2022

cristianoc commented Jun 28, 2022

Formatter hangs on deeply nested function calls #261

Formatter hangs on deeply nested function calls #261

Comments

nkrkv commented Feb 12, 2021

Steps to reproduce

Expected result

Actual result

Machine info

IwanKaramazow commented Feb 12, 2021

cristianoc commented Feb 12, 2021

nkrkv commented Feb 12, 2021

bobzhang commented May 27, 2021

cristianoc commented May 27, 2021

leeor commented Jun 21, 2022

cristianoc commented Jun 21, 2022

cristianoc commented Jun 21, 2022 • edited Loading

leeor commented Jun 23, 2022

cristianoc commented Jun 23, 2022

cristianoc commented Jun 28, 2022

cristianoc commented Jun 28, 2022 • edited Loading

leeor commented Jun 28, 2022

cristianoc commented Jun 28, 2022

cristianoc commented Jun 21, 2022 •

edited

Loading

cristianoc commented Jun 28, 2022 •

edited

Loading