
Convert Array_Like_Helpers.map to a builtin to reduce stack size #11363

Merged
merged 39 commits on Nov 6, 2024

Conversation

Akirathan
Member

@Akirathan Akirathan commented Oct 18, 2024

Fixes #11329

Pull Request Description

The ultimate goal is to reduce the method calls necessary for Vector.map.

Important Notes

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

  • Run (at least engine) benchmarks to ensure no regressions.
  • The documentation has been updated, if necessary.
  • All code follows the Scala and Java style guides.
  • Unit tests have been written where possible.

@Akirathan Akirathan added the "CI: No changelog needed" label (do not require a changelog entry for this PR) Oct 18, 2024
@Akirathan Akirathan self-assigned this Oct 18, 2024
@Akirathan
Member Author

The experiment in Stack_Size_Spec.enso generates code for nested Vector.map calls and invokes an enso subprocess with a custom -Xss option (and with the Truffle compiler disabled). The current output is:

{nesting: 3, stack_size: 256k} SUCCEEDED
{nesting: 6, stack_size: 256k} FAILED with StackOverflow
{nesting: 9, stack_size: 256k} SKIPPED (SO already encountered with the same stack size in lower nesting levels)
{nesting: 3, stack_size: 300k} SUCCEEDED
{nesting: 6, stack_size: 300k} SUCCEEDED
{nesting: 9, stack_size: 300k} FAILED with StackOverflow

This means that, e.g., a Vector.map call on the nested [[[[[[42]]]]]] vector with -Xss256k failed with a StackOverflowError.
The code for "nesting: 3" looks like this:

from Standard.Base import all
main =
    vec = [[[42]]]
    vec.map e0->
        e0.map e1->
            e1.map e2->
                e2 + 1

I am not sure yet what to do with that experiment. We should probably integrate it somehow later as a regression test.
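For illustration, the nested-map sources used in the experiment can be generated mechanically for any nesting level. The generator below is a hypothetical sketch (not the actual Stack_Size_Spec code); it reproduces the shape of the "nesting: 3" program above:

```java
public class NestedMapGen {
    // Builds Enso source equivalent to the "nesting: 3" example above,
    // for an arbitrary nesting level n >= 1.
    static String generate(int nesting) {
        StringBuilder sb = new StringBuilder("from Standard.Base import all\nmain =\n");
        // A literal like [[[42]]] for nesting == 3.
        sb.append("    vec = ")
          .append("[".repeat(nesting)).append("42").append("]".repeat(nesting))
          .append("\n");
        sb.append("    vec.map e0->\n");
        // One nested .map per remaining level, each indented one step further.
        for (int i = 1; i < nesting; i++) {
            sb.append("    ".repeat(i + 1))
              .append("e").append(i - 1).append(".map e").append(i).append("->\n");
        }
        // Innermost body: increment the scalar element.
        sb.append("    ".repeat(nesting + 1))
          .append("e").append(nesting - 1).append(" + 1\n");
        return sb.toString();
    }
}
```

A regression test could then feed the generated source, for increasing nesting levels, to a subprocess with a fixed -Xss.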

@Akirathan
Member Author

990ee6a introduces a regression test Stack_Size_Spec that invokes enso as a subprocess with specific JVM options, counts the Java stack frames between invocations of Vector.map, and asserts that the count is less than 115. On develop, this test fails: there are 150 stack frames between calls. In this PR, it succeeds: there are 103 stack frames. With @Tail_Call annotations added to Vector.map and Array_Like_Helpers.map:

diff --git a/distribution/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso b/distribution/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso
index 4ce7c5bdb4..27c9bed438 100644
--- a/distribution/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso
+++ b/distribution/lib/Standard/Base/0.0.0-dev/src/Data/Vector.enso
@@ -700,7 +700,7 @@ type Vector a
              [1, 2, 3] . map +1
     map : (Any -> Any) -> Problem_Behavior | No_Wrap -> Vector Any
     map self function on_problems:(Problem_Behavior | No_Wrap)=..Report_Error =
-        Array_Like_Helpers.map self function on_problems
+        @Tail_Call Array_Like_Helpers.map self function on_problems
 
     ## ICON union
        Applies a function to each element of the vector, returning the `Vector`
diff --git a/distribution/lib/Standard/Base/0.0.0-dev/src/Internal/Array_Like_Helpers.enso b/distribution/lib/Standard/Base/0.0.0-dev/src/Internal/Array_Like_Helpers.enso
index aaf2442940..416be3d4a5 100644
--- a/distribution/lib/Standard/Base/0.0.0-dev/src/Internal/Array_Like_Helpers.enso
+++ b/distribution/lib/Standard/Base/0.0.0-dev/src/Internal/Array_Like_Helpers.enso
@@ -229,7 +229,7 @@ transpose vec_of_vecs =
             Vector.from_polyglot_array proxy
 
 map vector function on_problems =
-    vector_from_function vector.length (i-> function (vector.at i)) on_problems
+    @Tail_Call vector_from_function vector.length (i-> function (vector.at i)) on_problems
 
 map_with_index vector function on_problems =
     vector_from_function vector.length (i-> function i (vector.at i)) on_problems

the number of stack frames drops to 22, with the disadvantage that the Vector.map call is no longer visible in Runtime.get_stack_trace. In the end, I decided not to add the @Tail_Call annotations for now. I think that a reduction of 35 Java stack frames per Vector.map method call is a sufficient improvement.
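The measurement idea behind Stack_Size_Spec (counting Java frames between two points on the stack) can be sketched in plain Java. This is a hypothetical illustration using StackWalker, not the actual test code; it shows that a plain Java call adds one frame per level, whereas each guest-level Vector.map call on develop added around 150:

```java
public class FrameCounter {
    // Depth of the current Java stack, counted with StackWalker.
    static int stackDepth() {
        return StackWalker.getInstance().walk(frames -> (int) frames.count());
    }

    // Recurse n times and record the stack depth at the innermost level.
    static int depthAtRecursion(int n, int[] out) {
        if (n == 0) {
            out[0] = stackDepth();
            return 0;
        }
        return depthAtRecursion(n - 1, out);
    }

    // Frames added per recursion level: measure at two depths and divide
    // by the difference in recursion counts.
    static int framesPerLevel() {
        int[] shallow = new int[1];
        int[] deep = new int[1];
        depthAtRecursion(1, shallow);
        depthAtRecursion(4, deep);
        return (deep[0] - shallow[0]) / 3;
    }
}
```

The real test does the analogous computation on the stack trace of a nested Vector.map invocation.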

Member

@JaroslavTulach JaroslavTulach left a comment


A few code comments inlined.

In the end, I decided not to add the @Tail_Call annotations for now. I think that a reduction of 35 Java stack frames per Vector.map method call is a sufficient improvement.

I don't think I agree with that decision. I am more concerned about performance and avoiding StackOverflowError than about the shape of the stack. Moreover ...

the number of stack frames drops to 22, with the disadvantage that the Vector.map call is no longer visible in Runtime.get_stack_trace.

...if we want to see something that resembles Vector.map in the stack, then rename Array_Like_Helpers to Vector_Impl and vector_from_function to map, and we'll get reasonable stack names while keeping the stack size down to 22.

@Cached BranchProfile errorEncounteredProfile) {
var ctx = EnsoContext.get(this);
var onProblems = processOnProblemsArg(onProblemsAtom, typesLib);
var len = Math.toIntExact(length);
Member


What will happen on ArithmeticException? It shouldn't crash the interpreter.
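Math.toIntExact throws ArithmeticException when the long does not fit into an int, so the conversion needs a guard. A minimal plain-Java sketch of the pattern (the error type and message here are hypothetical; the real builtin would raise a guest-language error via EnsoContext rather than a Java exception):

```java
public class SafeLength {
    // Converts a long length to int, translating overflow into a
    // recoverable error instead of letting ArithmeticException escape
    // and crash the interpreter.
    static int toIntLength(long length) {
        try {
            return Math.toIntExact(length);
        } catch (ArithmeticException e) {
            // Stand-in for raising an Enso dataflow error / panic.
            throw new IllegalArgumentException(
                "Vector length " + length + " does not fit into an int", e);
        }
    }
}
```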

import org.enso.interpreter.dsl.BuiltinType;
import org.enso.interpreter.node.expression.builtin.Builtin;

@BuiltinType(name = "Standard.Base.Data.Vector.No_Wrap")
Member


Off-topic, but I'd like to know the answer: why is the name attribute optional? It is specified on some of the builtins, but not on all. Shouldn't it be specified all the time? I am asking because

the proposed Standard.Prelude should include all the builtins and their methods, and we want their names to be exactly the same as in Standard.Base. Can we generate such a Prelude.enso from these annotations?
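Generating names from the annotations seems feasible in principle: the name attribute is readable via reflection (or, at build time, an annotation processor). A minimal sketch with a stand-in annotation (the real @BuiltinType lives in org.enso.interpreter.dsl; the classes below are hypothetical):

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

public class BuiltinNames {
    // Stand-in for org.enso.interpreter.dsl.BuiltinType, with the same
    // optional `name` attribute.
    @Retention(RetentionPolicy.RUNTIME)
    @interface BuiltinType {
        String name() default "";
    }

    @BuiltinType(name = "Standard.Base.Data.Vector.No_Wrap")
    static class NoWrap {}

    @BuiltinType // `name` left at its default
    static class Anonymous {}

    // Returns the declared name, or a fallback derived from the class when
    // the attribute was omitted -- which is exactly why an always-present
    // `name` would make generating a Prelude.enso simpler.
    static String builtinName(Class<?> cls) {
        BuiltinType ann = cls.getAnnotation(BuiltinType.class);
        String name = ann.name();
        return name.isEmpty() ? cls.getSimpleName() : name;
    }
}
```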

test/Base_Tests/src/Runtime/Stack_Size_Spec.enso (outdated review thread, resolved)
@Akirathan
Member Author

Akirathan commented Oct 28, 2024

(links to scheduled benchmark runs)

@Akirathan
Member Author

Akirathan commented Oct 29, 2024

Another round of benchmarks scheduled at:

(links to the scheduled benchmark runs)

@Akirathan
Member Author

Caching the onProblems argument apparently helped:

[benchmark results screenshot]

@Akirathan
Member Author

Locally verified a regression of IfVSCaseBenchmarks.ifBench6In:

[benchmark results screenshot]

local develop results: 1.685 ± 0.534
local results on this PR: 2.259 ± 0.320

@JaroslavTulach
Member

Locally verified regression of IfVSCaseBenchmarks.ifBench6In:...
local develop results: 1.685 ± 0.534 local results on this PR: 2.259 ± 0.320

I'd like to point out that #11365 is going to change the way we do if/then/else. Once #11365 is in, we will no longer try to invoke the if_then_else method on the condition. However, take a look at the ifBench6In benchmark:

        if_bench_6_in vec =
            vec.fold 0 acc-> curr->
                curr.f1.not.if_then_else acc <|
                    curr.f2.not.if_then_else acc <|
                        curr.f3.not.if_then_else acc <|
                            curr.f4.not.if_then_else acc <|
                                curr.f5.not.if_then_else acc <|
                                    curr.f6.not.if_then_else acc <|
                                        acc + 1

The ifBench6In benchmark is doing exactly that! It invokes the if_then_else method.

If the slowdown in the ifBench6In benchmark is the only blocker, then I suggest ignoring it.

@Akirathan
Member Author

Another locally verified regression, this time in the org.enso.interpreter.bench.benchmarks.semantic.TypePatternBenchmarks.matchOverAny benchmark, after increasing the dataset size (so that one iteration takes more than 10E-6 ms):

diff --git a/engine/runtime-benchmarks/src/main/java/org/enso/interpreter/bench/benchmarks/semantic/TypePatternBenchmarks.java b/engine/runtime-benchmarks/src/main/java/org/enso/interpreter/bench/benchmarks/semantic/TypePatternBenchmarks.java
index 9343bd153..97a742657 100644
--- a/engine/runtime-benchmarks/src/main/java/org/enso/interpreter/bench/benchmarks/semantic/TypePatternBenchmarks.java
+++ b/engine/runtime-benchmarks/src/main/java/org/enso/interpreter/bench/benchmarks/semantic/TypePatternBenchmarks.java
@@ -66,7 +66,7 @@ public class TypePatternBenchmarks {
 
     Function<String, Value> getMethod = (name) -> module.invokeMember(Module.EVAL_EXPRESSION, name);
 
-    var length = 100;
+    var length = 100_000;
     this.vec = getMethod.apply("gen_vec").execute(length, 1.1);
     switch (SrcUtil.findName(params)) {
       case "matchOverAny" -> this.patternMatch = getMethod.apply("match_any");

On develop: 0.213 ± 0.020
On this PR: 0.411 ± 0.004

@Akirathan
Member Author

Regressions resolved. It turns out I was caching the Atom onProblems argument incorrectly. Fixed by converting the No_Wrap builtin type to a uniquely constructible builtin and caching the AtomConstructor directly in d2dcc04. That commit also introduces a LoopConditionProfile, which speeds up the benchmark tremendously. Local results of org.enso.interpreter.bench.benchmarks.semantic.TypePatternBenchmarks.matchOverAny are now 2x faster than on develop.
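The general pattern behind the fix, caching the resolved constructor once instead of re-resolving it on every call, can be sketched in plain Java with hypothetical names (the real code caches an AtomConstructor via @Cached in a Truffle specialization):

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class ConstructorCache {
    // Counts slow-path resolutions, to show the cache only pays once.
    static final AtomicInteger lookups = new AtomicInteger();

    // Stand-in for the slow path: resolving a constructor by name.
    static Object slowLookup(Map<String, Object> scope, String name) {
        lookups.incrementAndGet();
        return scope.get(name);
    }

    private Object cached;

    // Fast path: resolve once, then reuse the cached constructor on every
    // subsequent call, mirroring a cached field in a specialization node.
    Object resolve(Map<String, Object> scope, String name) {
        if (cached == null) {
            cached = slowLookup(scope, name);
        }
        return cached;
    }
}
```

In a Truffle interpreter this matters because the cached value becomes a compilation constant, so the partial evaluator can fold the lookup away entirely.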

@Akirathan
Member Author

Wait for the tests to be green, and then let's schedule another round of benchmarks.

@JaroslavTulach
Member

Wait for the tests to be green, and then let's schedule another round of benchmarks.

Benchmarks may run even before the tests are fully green. Trying to run the benchmarks right now might speed things up.

@Akirathan
Member Author

Akirathan commented Nov 5, 2024

Benchmarks scheduled:

Benchmark Engine · 2cbc1e9
Benchmark Standard Libraries · 2cbc1e9

@Akirathan
Member Author

All (stable) benchmarks are either the same or better; let's integrate.

@Akirathan Akirathan added the "CI: Ready to merge" label (this PR is eligible for automatic merge) Nov 6, 2024
@mergify mergify bot merged commit 701bba6 into develop Nov 6, 2024
39 of 42 checks passed
@mergify mergify bot deleted the wip/akirathan/11329-map-reduce-stack branch November 6, 2024 11:14
GregoryTravis pushed a commit that referenced this pull request Nov 6, 2024
…11363)

The ultimate goal is to reduce the method calls necessary for `Vector.map`.

# Important Notes
- I managed to reduce the number of Java stack frames needed for each `Vector.map` call from **150** to **22** (see #11363 (comment))
- Introduced a `Stack_Size_Spec` regression test that ensures the number of Java stack frames needed for a `Vector.map` method call does not exceed **40**.
Labels
  • CI: Clean build required (CI runners will be cleaned before and after this PR is built.)
  • CI: No changelog needed (Do not require a changelog entry for this PR.)
  • CI: Ready to merge (This PR is eligible for automatic merge.)